DOC: Altered capitalization validation script to handle edge cases #48100

INDIG0N · 2022-08-15T22:53:32Z

references #32550

I fixed the capitalization validation script so that it no longer considers URLs.

I also altered the script to treat hyphenated words as a single word. Previously the script asked you to change the capitalization of packages that had "Pandas" in the name, and adding "Pandas" to the exceptions list would have let the script pass instances where Pandas DID need to be changed to lowercase.
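To illustrate the false positive described above, here is a minimal sketch. It is not the actual validation script: the `EXCEPTIONS` set, the package name `Pandas-Genomics` as an exception entry, and the helper `flagged` are hypothetical, and the check is simplified to "any capitalized word that is not an exception gets flagged".

```python
import re

# Hypothetical exceptions list for illustration; the real script keeps its own.
EXCEPTIONS = {"Pandas-Genomics", "Python"}


def flagged(title: str, keep_hyphens: bool) -> list[str]:
    """Return capitalized words in `title` that no exception covers."""
    # With keep_hyphens, "-" is not a separator, so "Pandas-Genomics" stays whole.
    pattern = r"[^a-zA-Z0-9_-]" if keep_hyphens else r"\W"
    words = [word for word in re.split(pattern, title) if word]
    return [word for word in words if word != word.lower() and word not in EXCEPTIONS]


title = "using Pandas-Genomics with Python"
print(flagged(title, keep_hyphens=False))  # the package name is split and flagged
print(flagged(title, keep_hyphens=True))   # the package name matches its exception
```

In this simplified model, splitting on hyphens flags both halves of the package name, while keeping the hyphenated word whole lets a single exception entry cover it without whitelisting "Pandas" everywhere.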

pep8speaks · 2022-08-15T22:53:35Z

Hello @INDIG0N! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-08-23 04:15:11 UTC

phofl

Could you run pre-commit locally? The check is failing currently

phofl · 2022-08-16T08:35:10Z

doc/source/ecosystem.rst

@@ -206,7 +206,7 @@ invoked with the following command
 D-Tale integrates seamlessly with Jupyter notebooks, Python terminals, Kaggle
 & Google Colab. Here are some demos of the `grid <http://alphatechadmin.pythonanywhere.com/dtale/main/1>`__.

-`hvplot <https://hvplot.holoviz.org/index.html>`__
+`Hvplot <https://hvplot.holoviz.org/index.html>`__


Shouldn't we use hvPlot here too?

good catch, I'll change that real quick

INDIG0N · 2022-08-16T16:25:42Z

@phofl I'm a bit new to this, how can I run pre-commit locally, and will doing that automatically resolve the failed check here or do I need to do something else?

datapythonista · 2022-08-17T08:51:57Z

@INDIG0N you can simply run the validation script and make sure it's happy locally: ./scripts/validate_rst_title_capitalization.py (or any other script that pre-commit calls and that fails in the CI, such as black, isort...)

INDIG0N · 2022-08-23T04:00:03Z

@datapythonista oh, the script worked fine locally, even used it to find some capitalization errors and fix them. I see there's a conflict though, I'll take care of that and then the branch should be good to merge

datapythonista

Thanks for the work on this, nice fixes. I added a few comments, but it looks good.

I guess we're currently ignoring some titles in the docs that we can stop ignoring here in this PR, no?

datapythonista · 2022-08-23T06:45:16Z

scripts/validate_rst_title_capitalization.py

+    word_list = re.split(r"\W", correct_title)
+
+    # Recombine hyphenated words
+    for word in correct_title.split():
+        if '-' in word:
+            lst = word.split('-')
+            first = lst[0]
+            for idx, val in enumerate(word_list):
+                if val == first:
+                    for _ in range(len(lst)):
+                        del word_list[idx]
+                    word_list.insert(idx, '-'.join(lst))
+                    break


Suggested change

-    word_list = re.split(r"\W", correct_title)
-
-    # Recombine hyphenated words
-    for word in correct_title.split():
-        if '-' in word:
-            lst = word.split('-')
-            first = lst[0]
-            for idx, val in enumerate(word_list):
-                if val == first:
-                    for _ in range(len(lst)):
-                        del word_list[idx]
-                    word_list.insert(idx, '-'.join(lst))
-                    break
+    word_list = re.split(r"[^a-zA-Z0-9_-]", correct_title)

I think by changing the regular expression you can get rid of this block. I thought the regex [\W-] would also work, and it's way shorter, but it didn't seem to work in a quick test. Feel free to research a better pattern.
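A quick comparison of the candidate patterns, as a sketch only. The sample title and the helper `split_words` are made up for illustration. `[\W-]` adds the hyphen to the set of separators (where `\W` already includes it), which is why it behaves like plain `\W`. Negating the class instead, `[^\w-]`, keeps hyphens inside words. Note that `\w` is Unicode-aware for `str` patterns, so `[^\w-]` and the explicit ASCII class can differ on non-ASCII letters.

```python
import re


def split_words(pattern: str, text: str) -> list[str]:
    """Split `text` on `pattern`, dropping empty strings left by adjacent separators."""
    return [word for word in re.split(pattern, text) if word]


title = "Group by: split-apply-combine"

plain = split_words(r"\W", title)                  # hyphen is a separator
shorter = split_words(r"[\W-]", title)             # still splits on the hyphen
explicit = split_words(r"[^a-zA-Z0-9_-]", title)   # the suggested pattern
negated = split_words(r"[^\w-]", title)            # shorter equivalent for ASCII text

print(plain)
print(shorter)
print(explicit)
print(negated)
```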

datapythonista · 2022-08-23T06:51:48Z

scripts/validate_rst_title_capitalization.py

@@ -184,12 +189,20 @@ def correct_title_capitalization(title: str) -> str:
    # first word character.
    correct_title: str = re.sub(r"^\W*", "", title).capitalize()


I know you didn't introduce this (maybe it was me), but don't you think correct_title is misleading here? I'd probably overwrite title instead. Also, it may make this function easier to read if each of these steps has a comment with an example of what the step is doing.

I guess removing this line temporarily should make the script fail with the examples that it is currently making pass. I personally think it'd be helpful if the non-trivial steps had comments explaining what they do.

Like "- foo bar foo-bar", then "foo bar foo-bar", then ['foo', 'bar', 'foo-bar']...
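One way the commenting style suggested above could look, as a sketch rather than the script's real code. The function name `tokenize_title` is hypothetical. It reuses the `re.sub(r"^\W*", ...)` step and `.capitalize()` call from the diff, plus the split pattern from the earlier suggestion, and overwrites `title` instead of introducing `correct_title`.

```python
import re


def tokenize_title(title: str) -> list[str]:
    """Illustrative only: normalize a heading and split it into words."""
    # Strip leading non-word characters:
    # "- foo bar foo-bar" -> "foo bar foo-bar"
    title = re.sub(r"^\W*", "", title)

    # Capitalize the first character and lowercase the rest:
    # "foo bar foo-bar" -> "Foo bar foo-bar"
    title = title.capitalize()

    # Split on anything that is not a letter, digit, underscore or hyphen:
    # "Foo bar foo-bar" -> ['Foo', 'bar', 'foo-bar']
    return re.split(r"[^a-zA-Z0-9_-]", title)


print(tokenize_title("- foo bar foo-bar"))
```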

datapythonista · 2022-08-23T06:55:29Z

scripts/validate_rst_title_capitalization.py

-                    correct_title_capitalization(title)}" """
+                    f"""{filename}:{line_number}:{err_msg} "{
+                    removed_https_title.strip()}" to "{
+                    correct_title_capitalization(removed_https_title).strip()}" """


On second thought, I'd probably just force all URLs to be lowercase. The code will be simpler, and it's probably a good idea anyway, even if some people make them follow their branding, as URLs are case insensitive.
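A minimal sketch of what forcing URLs to lowercase could look like. The pattern `URL_PATTERN` and the helper `lowercase_urls` are assumptions for illustration, not code from the PR. One caveat: only the scheme and host of a URL are guaranteed to be case insensitive, while paths may be case sensitive on some servers, so lowercasing the whole URL is a simplification.

```python
import re

# Assumed pattern: an http(s) URL runs until whitespace or a closing ">",
# which covers reStructuredText links such as `name <https://...>`__.
URL_PATTERN = re.compile(r"https?://[^\s>]+", re.IGNORECASE)


def lowercase_urls(title: str) -> str:
    """Lowercase every URL in `title`, leaving the rest of the text untouched."""
    return URL_PATTERN.sub(lambda match: match.group(0).lower(), title)


print(lowercase_urls("See HTTPS://Pandas.PyData.org/Docs for Details"))
```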

datapythonista · 2022-09-01T20:13:42Z

@INDIG0N do you have time to address the last comments?

github-actions · 2022-10-02T00:10:16Z

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

mroeschke · 2022-10-04T18:29:08Z

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

INDIG0N added 7 commits August 12, 2022 15:55

Started fixing capitalization. Added STUMPY and IDE to exceptions.

8455def

Merge https://github.com/pandas-dev/pandas into main

cc39a0f

Merge https://github.com/pandas-dev/pandas into main

291f70f

main function now handles removing links to ensure they are not considered for fixing capitalization.

e92d4a5

Merge https://github.com/pandas-dev/pandas into main

4cba271

handled edge cases for hyphenated words, and fixed some capitalizations.

63e344a

fixed to combine words with multiple hyphens

2670d8b

INDIG0N added 2 commits August 16, 2022 00:44

Merge https://github.com/pandas-dev/pandas into main

5872e2e

shortened lines

5e8cd0d

phofl reviewed Aug 16, 2022

mroeschke added the Code Style Code style, linting, code_checks label Aug 16, 2022

datapythonista added the Docs label Aug 17, 2022

datapythonista changed the title ~~Altered capitalization validation script to handle edge cases~~ DOC: Altered capitalization validation script to handle edge cases Aug 17, 2022

INDIG0N added 3 commits August 23, 2022 00:05

deleted conflicting files to prepare merge

c20c0c1

Merge https://github.com/pandas-dev/pandas

c062cf9

fixed merge conflicts and pushed new changes

e7a407a

datapythonista reviewed Aug 23, 2022

github-actions bot added the Stale label Oct 2, 2022

mroeschke closed this Oct 4, 2022
