DOC/CI: Fixes to make validate_docstrings.py not generate warnings or unwanted output #23552

Merged 1 commit on Nov 8, 2018

Conversation

datapythonista
Member

Follow-up to #23514. This makes validate_docstrings.py not generate warnings or other unwanted output, mainly when called with --format=json. This includes preventing matplotlib from opening windows with the plots, and suppressing the output from flake8.
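
A minimal sketch of the two ideas above, not the script's actual code: pick a non-interactive matplotlib backend so that no plot windows open, and capture whatever a noisy helper prints so that only the JSON report reaches stdout. The `Agg` backend choice and the names `run_quietly` and `noisy_check` are assumptions made for illustration.

```python
import contextlib
import io
import json

try:
    import matplotlib
    # Assumption: any non-interactive backend avoids opening windows;
    # "Agg" renders to memory only.
    matplotlib.use("Agg")
except ImportError:
    matplotlib = None  # the sketch still runs without matplotlib installed


def run_quietly(func, *args, **kwargs):
    """Call func while capturing anything it prints to stdout or stderr."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        result = func(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()


def noisy_check(name):
    """Hypothetical stand-in for a linter call that prints its own report."""
    print("E501 line too long")
    return {"name": name, "errors": [["EX03", "flake8 error in examples"]]}


result, captured_out, captured_err = run_quietly(noisy_check, "Series.head")
# Only the machine-readable report is meant to be printed by the caller.
report = json.dumps(result)
```

The captured text is kept rather than discarded, so it can still be inspected or attached to the report if needed.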

Also, fixed some bugs, and corrected some tests that weren't being run.

The script should now be ready to be added to the CI (and to generate the JSON file with the current state of the docstrings).

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

CC @TomAugspurger

@datapythonista datapythonista added Docs CI Continuous Integration labels Nov 7, 2018
@pep8speaks

Hello @datapythonista! Thanks for submitting the PR.

@TomAugspurger
Contributor

Changes look good to me.

The script should now be ready to be added to the CI (and to generate the JSON file with the current state of the docstrings).

Can you remind me what the end-goal of this is?

@datapythonista
Member Author

The json file is for me (or anyone else interested), to know what needs to be fixed in the docstrings.

The end goal is to have ./scripts/validate_docstrings.py in the CI, which will fail if there is anything wrong in the docstrings (wrongly documented parameters, formats, pep8 in examples...).

My next PR will be something like adding ./scripts/validate_docstrings.py --prefix=Series --errors=EX03 to the CI, together with fixes for the pep8 issues in all the examples in Series docstrings, so that the CI passes. From there, we keep adding checks until the 900 current errors are fixed, and everything is validated in the CI for new changes.
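
The incremental plan can be sketched roughly as follows. This is an illustration under stated assumptions, not the real script: the `RESULTS` data, the `select_errors` helper and the `ZZ99` code are hypothetical, and only the `--prefix` and `--errors` option names and the `EX03` code come from the comment above.

```python
import argparse

# Hypothetical validation results: (object name, error code, message).
# EX03 is the code mentioned above; ZZ99 is a made-up placeholder.
RESULTS = [
    ("Series.head", "EX03", "flake8 error in examples"),
    ("Series.tail", "ZZ99", "placeholder error"),
    ("DataFrame.head", "EX03", "flake8 error in examples"),
]


def select_errors(results, prefix=None, errors=None):
    """Keep only the results matching the name prefix and the error codes."""
    selected = []
    for name, code, message in results:
        if prefix is not None and not name.startswith(prefix):
            continue
        if errors is not None and code not in errors:
            continue
        selected.append((name, code, message))
    return selected


def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--prefix", default=None)
    parser.add_argument("--errors", default=None,
                        help="comma-separated error codes")
    args = parser.parse_args(argv)
    codes = args.errors.split(",") if args.errors else None
    found = select_errors(RESULTS, args.prefix, codes)
    for name, code, message in found:
        print(f"{name}:{code}:{message}")
    # A non-zero exit status is what makes the CI job fail.
    return 1 if found else 0


status = main(["--prefix=Series", "--errors=EX03"])
```

Restricting the check to one prefix and one error code at a time lets the CI enforce what is already clean while the remaining errors are fixed in later PRs.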

@codecov

codecov bot commented Nov 7, 2018

Codecov Report

Merging #23552 into master will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master   #23552   +/-   ##
=======================================
  Coverage   92.24%   92.24%           
=======================================
  Files         161      161           
  Lines       51224    51224           
=======================================
  Hits        47254    47254           
  Misses       3970     3970

Flag        Coverage Δ
#multiple   90.63% <ø> (ø) ⬆️
#single     42.28% <ø> (ø) ⬆️

Impacted Files               Coverage Δ
pandas/core/generic.py       96.81% <ø> (ø) ⬆️
pandas/core/panel.py         97.91% <ø> (ø) ⬆️
pandas/plotting/_misc.py     38.98% <ø> (ø) ⬆️
pandas/errors/__init__.py    100% <ø> (ø) ⬆️
pandas/core/strings.py       98.58% <ø> (ø) ⬆️
pandas/core/indexes/base.py  96.45% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 28a42da...a054769. Read the comment docs.

@TomAugspurger
Contributor

TomAugspurger commented Nov 7, 2018 via email

@@ -1875,35 +1875,31 @@ def get_duplicates(self):

Works on different Index of types.

- >>> pd.Index([1, 2, 2, 3, 3, 3, 4]).get_duplicates()
+ >>> pd.Index([1, 2, 2, 3, 3, 3, 4]).get_duplicates() # doctest: +SKIP

Member

Why does this need to be skipped?

Member

Ah, you are skipping the deprecated things, to avoid warnings?

Member Author

yes, that's correct
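
To illustrate why the directive avoids the warning: an example marked `# doctest: +SKIP` is never executed, so a deprecated call inside it cannot emit anything. The sketch below uses only the standard `doctest` module; the local `get_duplicates` function is a hypothetical stand-in for a deprecated method, not the pandas implementation.

```python
import doctest
import warnings

calls = []


def get_duplicates():
    """Hypothetical stand-in for a deprecated method."""
    calls.append("called")
    warnings.warn("get_duplicates is deprecated", FutureWarning)
    return [2, 3]


DOCSTRING = """
>>> get_duplicates()  # doctest: +SKIP
[2, 3]
>>> 1 + 1
2
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(
    DOCSTRING, {"get_duplicates": get_duplicates}, "example", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

# The skipped example was never run, so the function was never called
# and no FutureWarning was raised; only the second example was attempted.
print(calls, results.failed, results.attempted)
```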

1 ba
2 -e
3 dc
dtype: object

Member

I think here it is actually useful to see the difference? (although it is deprecated?)

cc @h-vetinari

Member Author

I think it makes more sense to remove it. But if we restore it, I think the page should make clear that this is deprecated behavior. The way it was, it felt like it was encouraging users to use join=None.

Member

Yes, it is true that it should then be clearer that it is deprecated, and point out the difference.

Contributor

@datapythonista @jorisvandenbossche
I didn't see this in time, I guess.

Indeed, the main intention of this was to show the difference, while relying on the fact that users would see the deprecation warning if they actually used the code (but then, that was before all that docstring-validation jazz ;-)).
So, I think that example should go back in, albeit with a clearer note about the deprecation. How to avoid problems with the emitted warning? A # doctest: +SKIP marker like below?

@@ -206,7 +206,7 @@ def radviz(frame, class_column, ax=None, color=None, colormap=None, **kwds):
... 'versicolor', 'setosa', 'virginica',
... 'setosa']
... })
- >>> rad_viz = pd.plotting.radviz(df, 'Category')
+ >>> rad_viz = pd.plotting.radviz(df, 'Category') # doctest: +SKIP

Member

But is the plot still shown then in the html documentation?

Member Author

Yes, it's just skipped in the doctests. Just generated the page to confirm, and it's rendered as expected.

Member

But there are many other plots in the docs? Why only those two? And from reading further it seems you are deactivating matplotlib anyway in the script?

Member Author

I'm not skipping it because it's a plot. I'm skipping it because it generates a warning, something about the projection (I'm not sure whether the warning only happens with the Template backend).

Member

OK. I don't see any warning locally, though.
(The problem with skipping it in general is that you don't catch typos or other mistakes in this line when using the validation script.)
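
The trade-off can be shown with the standard `doctest` module alone. In this sketch the misspelled name `radvis` is a deliberate, hypothetical typo: with `+SKIP` the mistake goes unnoticed, and without it the example fails with a `NameError`.

```python
import doctest

# Same (misspelled) example, once skipped and once executed.
DOCSTRING_SKIPPED = """
>>> rad_viz = radvis(df, 'Category')  # doctest: +SKIP
"""
DOCSTRING_RUN = """
>>> rad_viz = radvis(df, 'Category')
"""


def run(docstring):
    """Run the examples in a docstring and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        docstring, {}, "example", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


skipped = run(DOCSTRING_SKIPPED)
executed = run(DOCSTRING_RUN)
print(skipped, executed)
```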

@datapythonista
Member Author

I agree that skipping is not ideal, but I think in this case it is better than having the warnings, and I don't see other options.

But if you prefer, I can remove the +SKIP. The important parts of this PR are not rendering with matplotlib, removing the output of flake8, and the minor changes to stdout/stderr and the exit status. Once those are merged, we can move forward with the automatic validation of docstrings.

@jorisvandenbossche
Member

It's fine for this one. I just think we might need to think of a better general way to deal with this, as there can also be genuine cases where you want a warning (showing a use case that gives a certain warning). It would be annoying to always have to skip those.
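
One possible direction, sketched here purely as an assumption and not something this PR implements, is to record warnings while the examples run instead of skipping them. The example is then still executed (so typos are caught), and the warning is collected rather than printed. `old_api` is a hypothetical deprecated function.

```python
import doctest
import warnings


def old_api():
    """Hypothetical deprecated function used by a docstring example."""
    warnings.warn("old_api is deprecated", FutureWarning)
    return 42


DOCSTRING = """
>>> old_api()
42
"""

test = doctest.DocTestParser().get_doctest(
    DOCSTRING, {"old_api": old_api}, "example", None, 0
)
runner = doctest.DocTestRunner(verbose=False)

# Record warnings instead of letting them reach stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = runner.run(test)

categories = [w.category for w in caught]
print(results.failed, results.attempted, categories)
```

A validation script could then decide per example whether a recorded warning is expected or should be reported as an error.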

@datapythonista
Member Author

Totally agree. I think all the skips except the plot one are for deprecations. And personally I think it's reasonable to skip them from the deprecation until the removal.

The plot one would probably be worth investigating. But with all the work pending in the docstrings, I prefer to skip that test and move forward with all the fixes.

Let me know if this can be merged, or if there is anything that needs to be changed.

@jorisvandenbossche jorisvandenbossche merged commit adc54fe into pandas-dev:master Nov 8, 2018
thoo added a commit to thoo/pandas that referenced this pull request Nov 10, 2018
…fixed

* upstream/master: (47 commits)
  CLN: remove values attribute from datetimelike EAs (pandas-dev#23603)
  DOC/CI: Add linting to rst files, and fix issues (pandas-dev#23381)
  PERF: Speeds up creation of Period, PeriodArray, with Offset freq (pandas-dev#23589)
  PERF: define is_all_dates to shortcut inadvertent copy when slicing an IntervalIndex (pandas-dev#23591)
  TST: Tests and Helpers for Datetime/Period Arrays (pandas-dev#23502)
  Update description of Index._values/values/ndarray_values (pandas-dev#23507)
  Fixes to make validate_docstrings.py not generate warnings or unwanted output (pandas-dev#23552)
  DOC: Added note about groupby excluding Decimal columns by default (pandas-dev#18953)
  ENH: Support writing timestamps with timezones with to_sql (pandas-dev#22654)
  CI: Auto-cancel redundant builds (pandas-dev#23523)
  Preserve EA dtype in DataFrame.stack (pandas-dev#23285)
  TST: Fix dtype mismatch on 32bit in IntervalTree get_indexer test (pandas-dev#23468)
  BUG: raise if invalid freq is passed (pandas-dev#23546)
  remove uses of (ts)?lib.(NaT|iNaT|Timestamp) (pandas-dev#23562)
  BUG: Fix error message for invalid HTML flavor (pandas-dev#23550)
  ENH: Support EAs in Series.unstack (pandas-dev#23284)
  DOC: Updating DataFrame.join docstring (pandas-dev#23471)
  TST: coverage for skipped tests in io/formats/test_to_html.py (pandas-dev#22888)
  BUG: Return KeyError for invalid string key (pandas-dev#23540)
  BUG: DatetimeIndex slicing with boolean Index raises TypeError (pandas-dev#22852)
  ...
JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Labels
CI Continuous Integration Docs

5 participants