CI: Check for inconsistent pandas namespace usage #37188

Merged
merged 18 commits into from
Oct 21, 2020

Conversation

@dsaxton (Member) commented Oct 17, 2020

Adding a CI check that we aren't (for instance) using Series(...) and pd.Series(...) in the same file. This is kept intentionally small in scope (checking only DataFrame and Series for one file name right now) since this is very common in the code base and I'm not sure if this is something we'd actually want to enforce.
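
As an illustration of the kind of check described above, here is a minimal Python sketch. It is not the code from this PR (whose change touches ci/code_checks.sh and is not reproduced here); the function name `find_inconsistent_usage` and the regular expressions are assumptions made for the example.

```python
import re


def find_inconsistent_usage(source, names=("Series", "DataFrame")):
    """Return the names that are called both as `pd.Name(` and bare `Name(`.

    Hypothetical helper for illustration; not part of pandas or its CI.
    """
    inconsistent = []
    for name in names:
        # Namespaced call, e.g. `pd.Series(`.
        prefixed = re.search(rf"\bpd\.{name}\(", source)
        # Bare call, e.g. `Series(`: not preceded by a dot or identifier char.
        bare = re.search(rf"(?<![\w.]){name}\(", source)
        if prefixed and bare:
            inconsistent.append(name)
    return inconsistent


mixed = (
    "import pandas as pd\n"
    "from pandas import Series\n"
    "a = Series([1])\n"
    "b = pd.Series([2])\n"
    "c = pd.DataFrame({'x': [1]})\n"
)
print(find_inconsistent_usage(mixed))
```

Only `Series` is reported for this input, because `DataFrame` is used in a single style. A plain-text search like this also matches occurrences inside comments and strings.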

@dsaxton dsaxton added CI Continuous Integration Code Style Code style, linting, code_checks labels Oct 17, 2020
@jreback (Contributor) left a comment

Yeah, we had an issue where we talked about this a while back. There was an analysis and IIRC we were basically split between these usages. The discussion basically concluded that an individual file should be consistent (either pd.* or just Series/DataFrame), but that we wouldn't make the effort to clean everything up.

@jorisvandenbossche was for not cleaning up;
@jbrockmendel and I thought we should just standardize on Series/DataFrame

IIRC.

I am still in favor of using Series/DataFrame everywhere in tests (and not pd.*)

@jbrockmendel (Member) commented

I agree with @jreback: small preference towards not pd., preference for within-file consistency, not worth bikeshedding on this topic

@jreback jreback added this to the 1.2 milestone Oct 20, 2020
@jreback (Contributor) commented Oct 20, 2020

Yeah, I think this is OK to add. We will have to figure out the best way to expand this coverage. Maybe an includes file listing the test files to check (or a config-like file)?

@dsaxton (Member, Author) commented Oct 20, 2020

> Yeah, I think this is OK to add. We will have to figure out the best way to expand this coverage. Maybe an includes file listing the test files to check (or a config-like file)?

The diffs will be very large but it may be easier just to fix this globally for the different classes. I did this for Series just now (can revert if the PR is too large) and could do for DataFrame, etc. if worthwhile.

@jreback (Contributor) commented Oct 20, 2020

> > Yeah, I think this is OK to add. We will have to figure out the best way to expand this coverage. Maybe an includes file listing the test files to check (or a config-like file)?
>
> The diffs will be very large but it may be easier just to fix this globally for the different classes. I did this for Series just now (can revert if the PR is too large) and could do for DataFrame, etc. if worthwhile.

Oh, I think this is fine (yeah, do DataFrame separately); the only issue is that we have to make sure everything is passing (in case we are missing imports). CI is a bit cranky right now.

@jreback (Contributor) commented Oct 20, 2020

And actually, merge master once more (as we just merged a big PR). Ping when you are ready.

@dsaxton (Member, Author) commented Oct 21, 2020

> And actually, merge master once more (as we just merged a big PR). Ping when you are ready.

Looks to be mostly green. The only annoying thing that can happen is false positives due to Series(...) being used in comments but not in code, so I changed those as well.
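
For illustration, one way a check could avoid the comment false positives mentioned above is to look at Python tokens rather than raw text. This is only a sketch of an alternative, not what the PR does (the PR edits the comments instead); the helper name `namespace_styles` is hypothetical, and it relies on the standard-library `tokenize` module.

```python
import io
import tokenize


def namespace_styles(source, name="Series"):
    """Return the call styles ('bare', 'pd') used for `name` in code tokens.

    Hypothetical helper for illustration. Comments and string literals are
    separate token types, so text inside them is never counted.
    """
    skipped = {
        tokenize.COMMENT,
        tokenize.NL,
        tokenize.NEWLINE,
        tokenize.INDENT,
        tokenize.DEDENT,
    }
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type not in skipped
    ]
    styles = set()
    for i, tok in enumerate(tokens):
        if tok.type != tokenize.NAME or tok.string != name:
            continue
        # Only count calls such as Series(...), not imports or annotations.
        if i + 1 >= len(tokens) or tokens[i + 1].string != "(":
            continue
        if i >= 1 and tokens[i - 1].string == ".":
            # Attribute access: count it only when the prefix is `pd`.
            if i >= 2 and tokens[i - 2].string == "pd":
                styles.add("pd")
        else:
            styles.add("bare")
    return styles


commented = (
    "import pandas as pd\n"
    "# Series([1]) appears only in this comment\n"
    "x = pd.Series([1])\n"
)
print(namespace_styles(commented))
```

A file would count as inconsistent only when both styles are present in the returned set; here the comment is ignored, so only the `pd` style is found.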

@jreback jreback merged commit baddf02 into pandas-dev:master Oct 21, 2020
@jreback (Contributor) commented Oct 21, 2020

thanks @dsaxton

@dsaxton dsaxton deleted the ci-unwanted-pattern branch October 21, 2020 00:54
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Oct 26, 2020
* CI: Check for inconsistent pandas namespace usage

* Make a function

* Remove

* Try making it fail

* Add message

* Fix

* Make pass

* Try something

* Remove sed

* Edit

* Switch file

* Revert

* Global Series fix

* More

* Remove
kesmit13 pushed a commit to kesmit13/pandas that referenced this pull request Nov 2, 2020