Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Altered capitalization validation script to handle edge cases #48100

Closed
wants to merge 12 commits into from

Conversation

INDIG0N
Copy link
Contributor

@INDIG0N INDIG0N commented Aug 15, 2022

references #32550

I fixed the capitalization validation script to stop considering urls.

I also altered the script to consider hyphenated words as a single word. Previously this was asking you to change the capitalization of packages that had "Pandas" in the name, and adding that to the exceptions list would have let the script pass through instances where Pandas DID need to be changed to lowercase.

@pep8speaks
Copy link

pep8speaks commented Aug 15, 2022

Hello @INDIG0N! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-08-23 04:15:11 UTC

Copy link
Member

@phofl phofl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you run pre-commit locally? The check is failing currently

@@ -206,7 +206,7 @@ invoked with the following command
D-Tale integrates seamlessly with Jupyter notebooks, Python terminals, Kaggle
& Google Colab. Here are some demos of the `grid <http://alphatechadmin.pythonanywhere.com/dtale/main/1>`__.

`hvplot <https://hvplot.holoviz.org/index.html>`__
`Hvplot <https://hvplot.holoviz.org/index.html>`__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we use hvPlot here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I'll change that real quick

@INDIG0N
Copy link
Contributor Author

INDIG0N commented Aug 16, 2022

@phofl I'm a bit new to this, how can I run pre-commit locally, and will doing that automatically resolve the failed check here or do I need to do something else?

@mroeschke mroeschke added the Code Style Code style, linting, code_checks label Aug 16, 2022
@datapythonista
Copy link
Member

@INDIG0N you can simply run the validation script and make sure it's happy locally: ./scripts/validate_rst_title_capitalization.py (or any other script that pre-commit calls and fails in the CI, such as black, isort...

@datapythonista datapythonista changed the title Altered capitalization validation script to handle edge cases DOC: Altered capitalization validation script to handle edge cases Aug 17, 2022
@INDIG0N
Copy link
Contributor Author

INDIG0N commented Aug 23, 2022

@datapythonista oh, the script worked fine locally, even used it to find some capitalization errors and fix them. I see there's a conflict though, I'll take care of that and then the branch should be good to merge

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work on this, nice fixes, I added few comments, but looks good.

I guess we're currently ignoring some titles in the docs that we can stop ignoring here in this PR, no?

Comment on lines +193 to +205
word_list = re.split(r"\W", correct_title)

# Recombine hyphenated words
for word in correct_title.split():
if '-' in word:
lst = word.split('-')
first = lst[0]
for idx, val in enumerate(word_list):
if val == first:
for _ in range(len(lst)):
del word_list[idx]
word_list.insert(idx, '-'.join(lst))
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
word_list = re.split(r"\W", correct_title)
# Recombine hyphenated words
for word in correct_title.split():
if '-' in word:
lst = word.split('-')
first = lst[0]
for idx, val in enumerate(word_list):
if val == first:
for _ in range(len(lst)):
del word_list[idx]
word_list.insert(idx, '-'.join(lst))
break
word_list = re.split(r"[^a-zA-Z0-9_-]", correct_title)

I think by changing the regular expression you can get rid of this function. I think the regex [\W-] should also work and it's way shorter, but in a quick test it didn't seem like it worked, but feel free to research a better pattern.

@@ -184,12 +189,20 @@ def correct_title_capitalization(title: str) -> str:
# first word character.
correct_title: str = re.sub(r"^\W*", "", title).capitalize()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you didn't introduce this (maybe it was me), but don't you think correct_title is misleading here? I'd probably overwrite title instead. Also, it may help read this function is in these steps we have a comment with an example of what each step is doing.

I guess removing this line temporary should make the script fail with the examples that this is making pass. I personally think it'd be helpful if for the non trivial steps we had comments with what it does.

Like - foo bar foo-bar, then foo bar foo-bar, then ['foo', 'bar', 'foo-bar']...

correct_title_capitalization(title)}" """
f"""{filename}:{line_number}:{err_msg} "{
removed_https_title.strip()}" to "{
correct_title_capitalization(removed_https_title).strip()}" """
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a second thought, I'd probably just force all urls to be lowercase. The code will be simpler, and it's probably a good idea anyway, even if some people make them follow their branding, as urls are case insensitive.

@datapythonista
Copy link
Member

@INDIG0N do you have time to address the last comments?

@github-actions
Copy link
Contributor

github-actions bot commented Oct 2, 2022

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Oct 2, 2022
@mroeschke
Copy link
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@mroeschke mroeschke closed this Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks Docs Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants