Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression fix: leave R prefixes capitalization alone #2285

Merged
merged 2 commits into from
Jun 9, 2021

Conversation

ichard26
Copy link
Collaborator

black.strings.get_string_prefix used to lowercase the extracted
prefix before returning it. This is wrong because 1) it ignores the
fact we should leave R prefixes alone because of MagicPython, and 2)
there is dedicated prefix casing handling code that fixes issue 1.
.lower is too naive.

This was originally fixed in 20.8b0, but was reintroduced since 21.4b0.

(Re)fixes GH-1244.

@ichard26 ichard26 added the F: strings Related to our handling of strings label May 30, 2021
@JelleZijlstra
Copy link
Collaborator

I think there's code in trans.py that relies on the result of this function being lowercased. See trans.py line 438 for example, where we check for "f" in prefix.

@ichard26 ichard26 marked this pull request as draft May 30, 2021 20:56
@felix-hilden
Copy link
Collaborator

felix-hilden commented Jun 2, 2021

Yeah after #2297 I was wondering why there are essentially two functions for getting docstring prefixes. Looking at their usage it seems that get_prefix is used more widely for just information about the prefix and normalize in one place to actually transform the lines. Maybe they should be combined in some way, or separated entirely into strictly getter and transformer. The dependencies to lowercase could very easily be implemented in the calling code.

But there is at least one other place which transforms prefixes, in string merging and splitting.

`black.strings.get_string_prefix` used to lowercase the extracted
prefix before returning it. This is wrong because 1) it ignores the
fact we should leave R prefixes alone because of MagicPython, and 2)
there is dedicated prefix casing handling code that fixes issue 1.
`.lower` is too naive.

This was originally fixed in 20.8b0, but was reintroduced since 21.4b0.

I also added proper prefix normalization for docstrings by using the
`black.strings.normalize_string_prefix` helper.

Some more test strings were added to make sure strings with capitalized
prefixes aren't treated differently (actually happened with my original
patch, Jelle had to point it out to me).
@ichard26 ichard26 force-pushed the fix-string-prefix-regression branch from ca7a34d to c2ce5cf Compare June 4, 2021 20:38
@ichard26 ichard26 marked this pull request as ready for review June 4, 2021 20:39
@@ -87,7 +87,7 @@ def get_string_prefix(string: str) -> str:
prefix = ""
prefix_idx = 0
while string[prefix_idx] in STRING_PREFIX_CHARS:
prefix += string[prefix_idx].lower()
prefix += string[prefix_idx]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could not repeatedly append to the string and just calculate the index, and then slice the whole prefix at once 🤔 Not really a comment to the PR but came to my mind anyway.

Copy link
Collaborator

@felix-hilden felix-hilden Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly min(s.find("'"), s.find('"')) doesn't quite work because no matches is returned as -1 😄 and it could be way less efficient

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm taking this as that no you have no nits or objections for this PR :)

Thanks for the review though! You're doing great adjusting to your new role!

@felix-hilden
Copy link
Collaborator

Apart from the comment above, the changes seem good to me at least. Leaving all processing to calling code seems appropriate in a function called get.


fstring = (
f"f-strings definitely make things more {difficult} than they need to be for"
" {black}. But boy they sure are handy. The problem is that some lines will need"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this style of test

@JelleZijlstra JelleZijlstra merged commit 00e7e12 into main Jun 9, 2021
@ichard26 ichard26 deleted the fix-string-prefix-regression branch June 9, 2021 00:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F: strings Related to our handling of strings
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Preserve r-string prefix capitalization
3 participants