DOC: fix PR07 errors in docstrings #57420

Closed
jordan-d-murphy opened this issue Feb 14, 2024 · 7 comments

@jordan-d-murphy
Contributor

pandas has a script for validating docstrings, scripts/validate_docstrings.py.

Currently, some methods fail its PR07 check and are listed as exceptions in ci/code_checks.sh:

pandas/ci/code_checks.sh

Lines 144 to 258 in 5fff2cd

MSG='Partially validate docstrings (PR07)' ; echo $MSG
$BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=PR07 --ignore_functions \
    pandas.DataFrame.align\
    pandas.DataFrame.get\
    pandas.DataFrame.rolling\
    pandas.DataFrame.to_hdf\
    pandas.DatetimeIndex.indexer_between_time\
    pandas.DatetimeIndex.mean\
    pandas.HDFStore.append\
    pandas.HDFStore.get\
    pandas.HDFStore.put\
    pandas.Index\
    pandas.Index.append\
    pandas.Index.copy\
    pandas.Index.difference\
    pandas.Index.drop\
    pandas.Index.get_indexer\
    pandas.Index.get_indexer_non_unique\
    pandas.Index.get_loc\
    pandas.Index.get_slice_bound\
    pandas.Index.insert\
    pandas.Index.intersection\
    pandas.Index.join\
    pandas.Index.reindex\
    pandas.Index.slice_indexer\
    pandas.Index.symmetric_difference\
    pandas.Index.take\
    pandas.Index.union\
    pandas.IntervalIndex.get_indexer\
    pandas.IntervalIndex.get_loc\
    pandas.MultiIndex.append\
    pandas.MultiIndex.copy\
    pandas.MultiIndex.drop\
    pandas.MultiIndex.get_indexer\
    pandas.MultiIndex.get_loc\
    pandas.MultiIndex.get_loc_level\
    pandas.MultiIndex.sortlevel\
    pandas.PeriodIndex.from_fields\
    pandas.RangeIndex\
    pandas.Series.add\
    pandas.Series.align\
    pandas.Series.cat\
    pandas.Series.div\
    pandas.Series.eq\
    pandas.Series.floordiv\
    pandas.Series.ge\
    pandas.Series.get\
    pandas.Series.gt\
    pandas.Series.le\
    pandas.Series.lt\
    pandas.Series.mod\
    pandas.Series.mul\
    pandas.Series.ne\
    pandas.Series.pow\
    pandas.Series.radd\
    pandas.Series.rdiv\
    pandas.Series.rfloordiv\
    pandas.Series.rmod\
    pandas.Series.rmul\
    pandas.Series.rolling\
    pandas.Series.rpow\
    pandas.Series.rsub\
    pandas.Series.rtruediv\
    pandas.Series.sparse.from_coo\
    pandas.Series.sparse.to_coo\
    pandas.Series.str.decode\
    pandas.Series.str.encode\
    pandas.Series.sub\
    pandas.Series.to_hdf\
    pandas.Series.truediv\
    pandas.Series.update\
    pandas.Timedelta\
    pandas.Timedelta.max\
    pandas.Timedelta.min\
    pandas.Timedelta.resolution\
    pandas.TimedeltaIndex.mean\
    pandas.Timestamp\
    pandas.Timestamp.max\
    pandas.Timestamp.min\
    pandas.Timestamp.replace\
    pandas.Timestamp.resolution\
    pandas.api.extensions.ExtensionArray._concat_same_type\
    pandas.api.extensions.ExtensionArray.insert\
    pandas.api.extensions.ExtensionArray.isin\
    pandas.api.types.infer_dtype\
    pandas.api.types.is_dict_like\
    pandas.api.types.is_file_like\
    pandas.api.types.is_iterator\
    pandas.api.types.is_named_tuple\
    pandas.api.types.is_re\
    pandas.api.types.is_re_compilable\
    pandas.api.types.pandas_dtype\
    pandas.arrays.ArrowExtensionArray\
    pandas.arrays.SparseArray\
    pandas.arrays.TimedeltaArray\
    pandas.core.groupby.DataFrameGroupBy.boxplot\
    pandas.core.resample.Resampler.quantile\
    pandas.io.formats.style.Styler.set_table_attributes\
    pandas.io.formats.style.Styler.set_uuid\
    pandas.io.json.build_table_schema\
    pandas.merge\
    pandas.merge_asof\
    pandas.merge_ordered\
    pandas.pivot\
    pandas.pivot_table\
    pandas.plotting.parallel_coordinates\
    pandas.plotting.scatter_matrix\
    pandas.plotting.table\
    pandas.qcut\
    pandas.testing.assert_index_equal\
    pandas.testing.assert_series_equal\
    pandas.unique\
    pandas.util.hash_array\
    pandas.util.hash_pandas_object # There should be no backslash in the final line, please keep this comment in the last ignored function
RET=$(($RET + $?)) ; echo $MSG "DONE"
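
For anyone unsure what PR07 flags: in numpydoc-style validation it is reported when a parameter is listed in the Parameters section without a description. Below is a rough sketch of a failing and a fixed docstring. The functions `scale_bad` and `scale_good` and the tiny checker `params_without_description` are hypothetical and written only for illustration; the checker is a simplification and is not how scripts/validate_docstrings.py works internally.

```python
import inspect


def scale_bad(values, factor):
    """
    Multiply each value by a factor.

    Parameters
    ----------
    values : list of float
    factor : float
        Number to multiply each value by.

    Returns
    -------
    list of float
        The scaled values.
    """
    return [v * factor for v in values]


def scale_good(values, factor):
    """
    Multiply each value by a factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float
        Number to multiply each value by.

    Returns
    -------
    list of float
        The scaled values.
    """
    return [v * factor for v in values]


def params_without_description(func):
    """Return names in the Parameters section that have no indented description."""
    lines = inspect.getdoc(func).splitlines()
    if "Parameters" not in lines:
        return []
    start = lines.index("Parameters") + 2  # skip the header and its underline
    entries = []  # each item is [name, has_description]
    for i, line in enumerate(lines[start:], start):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if nxt and set(nxt.strip()) == {"-"}:
            break  # `line` is the header of the next section
        if not line.strip():
            continue
        if line[0].isspace():
            if entries:
                entries[-1][1] = True  # indented text describes the last entry
        else:
            entries.append([line.split(":")[0].strip(), False])
    return [name for name, has_desc in entries if not has_desc]


if __name__ == "__main__":
    print(params_without_description(scale_bad))
    print(params_without_description(scale_good))
```

In `scale_bad`, `values` has a type but no description line, which is the kind of entry PR07 points at; `scale_good` adds the missing line.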

The task here is:

  • take 2-4 methods
  • run: scripts/validate_docstrings.py --format=actions --errors=PR07 method-name
  • check whether docstring validation passes for those methods and, if necessary, fix the docstrings according to whatever error is reported
  • remove those methods from code_checks.sh
  • commit, push, open pull request

Please don't comment "take", as multiple people can work on this issue. You also don't need to ask for permission to work on this; just comment with the methods you are going to work on.

If you're a new contributor, please check the contributing guide.

@jordan-d-murphy
Contributor Author

I don't have permission to add them, but this could probably use some labels:

  • CI
  • Docs
  • good first issue

@jordan-d-murphy
Contributor Author

Addresses #57357

@jordan-d-murphy
Contributor Author

opened a fix for pandas.DataFrame.align

@jordan-d-murphy
Contributor Author

opening a fix for

pandas.DataFrame.get
pandas.DataFrame.rolling
pandas.DataFrame.to_hdf

@jordan-d-murphy
Contributor Author

jordan-d-murphy commented Mar 17, 2024

Hi all, if "CI: speedup docstring check consecutive runs" (#57826) gets merged, I might rework our approach here. That would mean closing the following issues:

DOC: fix GL08 errors in docstrings
DOC: fix PR01 errors in docstrings
DOC: fix PR07 errors in docstrings
DOC: fix SA01 errors in docstrings
DOC: fix RT03 errors in docstrings
DOC: fix PR02 errors in docstrings

And opening a new issue to address these based on the new approach.

tl;dr

the work can still be done, but probably under a new ticket once #57826 is merged in

@jordan-d-murphy
Contributor Author

Opened "DOC: Enforce Numpy Docstring Validation (Parent Issue)" (#58063) as a parent issue for fixing docstrings based on the refactoring in code_checks.sh.

Feel free to swing by and help out! 🙂
