-
-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: Altered capitalization validation script to handle edge cases #48100
Conversation
…dered for fixing capitalization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you run pre-commit locally? The check is failing currently
@@ -206,7 +206,7 @@ invoked with the following command | |||
D-Tale integrates seamlessly with Jupyter notebooks, Python terminals, Kaggle | |||
& Google Colab. Here are some demos of the `grid <http://alphatechadmin.pythonanywhere.com/dtale/main/1>`__. | |||
|
|||
`hvplot <https://hvplot.holoviz.org/index.html>`__ | |||
`Hvplot <https://hvplot.holoviz.org/index.html>`__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we use hvPlot here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, I'll change that real quick
@phofl I'm a bit new to this, how can I run pre-commit locally, and will doing that automatically resolve the failed check here or do I need to do something else? |
@INDIG0N you can simply run the validation script and make sure it's happy locally: |
@datapythonista oh, the script worked fine locally, even used it to find some capitalization errors and fix them. I see there's a conflict though, I'll take care of that and then the branch should be good to merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work on this, nice fixes, I added few comments, but looks good.
I guess we're currently ignoring some titles in the docs that we can stop ignoring here in this PR, no?
word_list = re.split(r"\W", correct_title) | ||
|
||
# Recombine hyphenated words | ||
for word in correct_title.split(): | ||
if '-' in word: | ||
lst = word.split('-') | ||
first = lst[0] | ||
for idx, val in enumerate(word_list): | ||
if val == first: | ||
for _ in range(len(lst)): | ||
del word_list[idx] | ||
word_list.insert(idx, '-'.join(lst)) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
word_list = re.split(r"\W", correct_title) | |
# Recombine hyphenated words | |
for word in correct_title.split(): | |
if '-' in word: | |
lst = word.split('-') | |
first = lst[0] | |
for idx, val in enumerate(word_list): | |
if val == first: | |
for _ in range(len(lst)): | |
del word_list[idx] | |
word_list.insert(idx, '-'.join(lst)) | |
break | |
word_list = re.split(r"[^a-zA-Z0-9_-]", correct_title) |
I think by changing the regular expression you can get rid of this function. I think the regex [\W-]
should also work and it's way shorter, but in a quick test it didn't seem like it worked, but feel free to research a better pattern.
@@ -184,12 +189,20 @@ def correct_title_capitalization(title: str) -> str: | |||
# first word character. | |||
correct_title: str = re.sub(r"^\W*", "", title).capitalize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you didn't introduce this (maybe it was me), but don't you think correct_title
is misleading here? I'd probably overwrite title instead. Also, it may help read this function is in these steps we have a comment with an example of what each step is doing.
I guess removing this line temporary should make the script fail with the examples that this is making pass. I personally think it'd be helpful if for the non trivial steps we had comments with what it does.
Like - foo bar foo-bar
, then foo bar foo-bar
, then ['foo', 'bar', 'foo-bar']
...
correct_title_capitalization(title)}" """ | ||
f"""{filename}:{line_number}:{err_msg} "{ | ||
removed_https_title.strip()}" to "{ | ||
correct_title_capitalization(removed_https_title).strip()}" """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a second thought, I'd probably just force all urls to be lowercase. The code will be simpler, and it's probably a good idea anyway, even if some people make them follow their branding, as urls are case insensitive.
@INDIG0N do you have time to address the last comments? |
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
references #32550
I fixed the capitalization validation script to stop considering urls.
I also altered the script to consider hyphenated words as a single word. Previously this was asking you to change the capitalization of packages that had "Pandas" in the name, and adding that to the exceptions list would have let the script pass through instances where Pandas DID need to be changed to lowercase.