discussion: submit pandera to pyOpenSci review process #49

Closed
cosmicBboy opened this issue Jun 2, 2019 · 3 comments

@cosmicBboy
Collaborator

cosmicBboy commented Jun 2, 2019

It might be good to get pandera the pyOpenSci community's stamp of approval, which should bring in more contributors, users, and feedback and speed up development and adoption of the tool:

https://www.pyopensci.org/dev_guide/intro.html

@mastersplinter
Collaborator

mastersplinter commented Jun 3, 2019

Turning the OpenSci requirements into a to-do list:

Requirements

This section has descriptions of all the packaging requirements for pyOpenSci. I've converted the Good/Better/Best recommendations into their own bullets where appropriate as we should be aiming for best and the good/better points are fairly small.

README must include:

  • The package name
  • Badges for:
      • continuous integration
      • test coverage
      • the badge for pyOpenSci peer-review once it has started (see below)
      • a repostatus.org badge
      • any other badges.
    If the README has many more badges, you might want to consider using a table for badges, see this example, that one and that one. Such a table should be wider than it is tall.
  • Short description of the goals of the package, with descriptive links to all vignettes (rendered, i.e. readable, cf. the documentation website section) unless the package is small and there’s only one vignette repeating the README.
  • Installation instructions
  • Any additional setup required (authentication tokens, etc) (not required)
  • Brief demonstration usage
  • Direction to more detailed documentation (e.g. your documentation files or website).
  • If applicable, how the package compares to other similar packages and/or how it relates to other packages
  • Citation information (not required)

Documentation

  • All external package functions, classes, and methods should be fully documented with examples.
  • ReadTheDocs hosted Sphinx Docs with auto generated documentation from docstrings using autodoc

Testing
All packages should have a test suite that covers major functionality of the package. The tests should also cover the behavior of the package in case of errors.

  • Unit tests for all functions
  • Test coverage at least 75%
  • Consider using tox to test your package with multiple versions of Python 2 and 3.
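
As a rough illustration of the "behavior of the package in case of errors" point, a test suite can assert both the happy path and that invalid input raises the expected exception. The sketch below uses only the standard library; `validate_positive` and the test class are hypothetical stand-ins for illustration, not pandera functions.

```python
import unittest


def validate_positive(values):
    """Hypothetical validator: return the values unchanged if all are > 0."""
    bad = [v for v in values if v <= 0]
    if bad:
        # Error path: this is the behavior the second test below pins down.
        raise ValueError(f"non-positive values found: {bad}")
    return values


class TestValidatePositive(unittest.TestCase):
    def test_valid_input_passes_through(self):
        # Major functionality: valid data is returned as-is.
        self.assertEqual(validate_positive([1, 2, 3]), [1, 2, 3])

    def test_invalid_input_raises(self):
        # Error handling: invalid data must raise, not pass silently.
        with self.assertRaises(ValueError):
            validate_positive([1, -2, 3])
```

Saved in a test module, this can be run with `python -m unittest`; the same structure carries over to pytest if that is the preferred runner.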

Continuous Integration
All pyOpenSci packages must use some form of continuous integration.

  • For Linux and Mac OSX, we suggest Travis CI.
  • For Windows, we suggest AppVeyor CI.
  • CI for all platforms when they contain:
      • Compiled code (?)
      • Java dependencies
      • Dependencies on other languages
      • Packages with system calls
      • Text munging / user input error handling, e.g. getting people’s names (in order to find encoding issues).
      • Anything with file system / path calls
  • CI service with status badge in README.
  • Integrated code coverage and linting.
  • CI for all platforms: Linux, Mac OSX, and Windows.

License

  • pyOpenSci projects should use an open source software license that is approved by the Open Source Initiative (OSI). OSI’s website has a list of popular licenses, and GitHub has a handy tool for choosing a license.

Code Style
pyOpenSci encourages authors to consult PEP 8 for information on how to style your code.

  • Complete PEP8 Compliance

Linting

  • An automatic linter (e.g. flake8) can help ensure your code is clean and free of syntax errors. These can be integrated with your CI.

@mastersplinter
Collaborator

mastersplinter commented Jun 7, 2019

From the OpenSci checklist: If applicable, how the package compares to other similar packages and/or how it relates to other packages

Here are alternatives to pandera and how they compare:

So I think the key differentiators for pandera are:

  • Input/Output decorators enable seamless integration into existing code
  • Checks provide huge flexibility
  • Hypotheses provide a tidy-first interface for hypothesis testing
  • Other libraries have some useful 'out-of-the box' string validators, and sometimes centre the entire library on the 'validator' itself
  • Documentation for the other libraries is often sporadic and not easy for first-time users. pandera's was straightforward and comprehensive from the start, and per your point in "create GitHub organisation for repo?" (#62), comprehensive, easy-to-use docs are a differentiator
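
To make the first differentiator concrete, here is a minimal plain-Python sketch of the general input/output decorator pattern: validation is attached to an existing function without touching its body. This is not pandera's implementation or API; `validate_input`, `validate_output`, and `add_age_group` are hypothetical names, and plain lists of dicts stand in for dataframes.

```python
import functools


def validate_input(check, error="input failed validation"):
    """Hypothetical decorator: run `check` on the first positional argument."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(data, *args, **kwargs):
            if not check(data):
                raise ValueError(error)
            return func(data, *args, **kwargs)
        return wrapper
    return decorator


def validate_output(check, error="output failed validation"):
    """Hypothetical decorator: run `check` on the function's return value."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not check(result):
                raise ValueError(error)
            return result
        return wrapper
    return decorator


# Existing code stays as it was; the checks are layered on from outside.
@validate_input(lambda rows: all(r["age"] >= 0 for r in rows))
@validate_output(lambda rows: all("age_group" in r for r in rows))
def add_age_group(rows):
    return [
        {**r, "age_group": "adult" if r["age"] >= 18 else "minor"}
        for r in rows
    ]
```

Calling `add_age_group([{"age": 30}])` returns the enriched rows, while `add_age_group([{"age": -1}])` raises `ValueError` before the function body runs.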

@cosmicBboy
Collaborator Author

The package was accepted into the pyOpenSci ecosystem!

pyOpenSci/software-submission#12

cosmicBboy pushed a commit that referenced this issue May 12, 2023