Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge implicitly concatenated string literals that fit on one line #26

Open
max-sixty opened this issue Mar 15, 2018 · 16 comments

Comments

@max-sixty
Copy link

commented Mar 15, 2018

Black could make single-line strings over multiple lines (i.e. a number of single quotes strings on multiple lines surrounded by parentheses) more efficient, by resizing them to the full length of the line.

Even if that was overreach, there's a peculiar situation where you end up with multiple strings on the same line, like below:

-        warnings.warn('Dataset.sel_points is deprecated: use Dataset.sel()'
-                      'instead.', DeprecationWarning, stacklevel=2)
-
+        warnings.warn(
+            'Dataset.sel_points is deprecated: use Dataset.sel()' 'instead.',
+            DeprecationWarning,
+            stacklevel=2,
+        )
@ambv

This comment has been minimized.

Copy link
Collaborator

commented Mar 15, 2018

In this case Black is suggesting that you should merge the two strings into one and the result is more readable that way.

I don't do this automatically (yet?) because it gets complicated if the two strings don't share the same prefix (for example r'something' f'another thing'). This is where user action after the formatting is probably best.

@max-sixty

This comment has been minimized.

Copy link
Author

commented Mar 15, 2018

Right, that makes sense. I think whether the strings on the same line are merge-able is clear (i.e. do they have the same prefix), but yes it's a rare case; feel free to close

And changing beyond that may require discretion (e.g. turning 6 lines of 2/3-long lines into 4 full length lines)

@ambv

This comment has been minimized.

Copy link
Collaborator

commented Mar 15, 2018

There's another related problem: if I merged string literals, I am now making semantic changes to the AST. I'm not opposed to those but this will make safety checks after reformatting trickier.

Let's leave this open for the time being, it's an interesting problem.

@ambv ambv changed the title Handling long strings Merge implicitly concatenated string literals that fit on one line Mar 15, 2018

@ambv ambv added the enhancement label Mar 16, 2018

@zsol

This comment has been minimized.

Copy link
Collaborator

commented Apr 3, 2018

I'm not even sure what I would expect black to do with code that implicit-concatenates two differently prefixed strings to be honest. I think the path of least surprise is just leaving them alone.

@ambv

This comment has been minimized.

Copy link
Collaborator

commented Apr 3, 2018

Yeah, it they are different prefixes, leave them alone. If they share a prefix and they end up on the same line, they should be merged.

If you really want to be correct here the implementation is going to be hard in the following edge case:

  • two strings like "STR1" "STR2" don't fit on one line because the closing quote of STR1, the space, and the opening quote of STR2 are the 3 characters that cause the entire thing to not fit in a single line. So you will keep them on two lines.
  • but if you knew that it's safe to concatenate them, it would fit in a single line (without those 3 extra characters).

I'm inclined not to touch this edge case since that makes it tricky where to perform the merge.

Another small edge case which I'm inclined to avoid is this:

a = (
    "a"
    "bb"
    "ccc"
    "dddd"
    "eeeee"
    "ffffff"
    "ggggggg"
    "hhhhhhhh"
    "iiiiiiiii"
    "jjjjjjjjjj"
    "kkkkkkkkkkk"
    "llllllllllll"
    "mmmmmmmmmmmmmmmm"
)

Technically Black could implement the "fill" algorithm for this case that Prettier also has for JSX. But I think what it currently does is fine for simplicity and obvious for users to recognize.

@aldanor

This comment has been minimized.

Copy link

commented Jul 15, 2018

Another related case I've managed to hit with black is when it joins \-split string into one, and by doing so it violates line length limit. In this case (at least for now, while it's not implemented still), it'd be probably better if it did nothing? E.g.:

$ black -S -l30 --diff long_str.py

--- long_str.py	2018-07-15 12:24:14 +0000
+++ long_str.py	2018-07-15 12:24:41.221434 +0000
@@ -1,5 +1,2 @@
-s = '111111111111111111111' \
-    '222222222222222222222' \
-    '333333333333333333333' \
-    '444444444444444444444'
+s = '111111111111111111111' '222222222222222222222' '333333333333333333333' '444444444444444444444'
@e3krisztian

This comment has been minimized.

Copy link

commented Aug 1, 2018

@aldanor line length violation should be a bug (I have also seen it with black==18.6b4). I think it deserves a separate issue.

@graingert

This comment has been minimized.

Copy link

commented Aug 20, 2018

@ambv Python 3.7 has AST level constant folding: https://bugs.python.org/issue29469 so implicit string concatenation - or lack thereof - would be invisible to the AST check

@ambv

This comment has been minimized.

Copy link
Collaborator

commented Aug 20, 2018

I know, @graingert, but we can't require Python 3.7+ for all Black users. Not yet at least. What I'm pondering is if we should rather switch to do the AST post-check using typed-ast which would make it work exactly the same on both Python 2 and Python 3. And the same on all Python 3 versions.

@hugovk

This comment has been minimized.

Copy link
Contributor

commented Aug 20, 2018

By the way, here's the pip installs for Black from PyPI for July 2018:

python_version percent download_count
3.6 86.07% 15,408
3.7 13.21% 2,364
3.5 0.41% 73
2.7 0.22% 39
3.4 0.08% 14
3.8 0.01% 2
3.3 0.01% 1
2.6 0.01% 1
Total 17,902

Source: pypinfo --start-date 2018-07-01 --end-date 2018-07-31 --percent --markdown black pyversion

@ambv

This comment has been minimized.

Copy link
Collaborator

commented Aug 20, 2018

Haha, one of those 3.8 downloads is me :-)

@bertjwregeer

This comment has been minimized.

Copy link

commented Oct 16, 2018

Having Black complete the concatenation would be great.

@ofek

This comment has been minimized.

Copy link
Contributor

commented Dec 2, 2018

In the time being can we add a warning when this happens so we can manually resolve it?

@davidism

This comment has been minimized.

Copy link

commented Feb 18, 2019

I'd definitely like to see a warning for this, maybe something like "Implicit string concatenation in line N not merged." An issue I've run into is that someone writing multiple similar strings, in tests for example, will add continuations for all the strings so they look the same, even though some would fit on a single line. Then Black moves them back to a single line but leaves the continuation sitting in the middle. The user was trying to satisfy the formatting rules, but ended up producing less ideal formatting without knowing it.

@luxcem

This comment has been minimized.

Copy link

commented Mar 28, 2019

What's your view on this example? Black left it unchanged.

def foo():
    return "Some long string cut in half," " this is really a long string"

def bar(text):
    return text

bar(("Some long string cut in half," " this is really a long string"))

sirosen added a commit to sirosen/globus-cli that referenced this issue Jul 19, 2019

Fix blackened string concatenation
Several string literals got moved to one line by black, but didn't get
merged -- the result is odd and undesirable.
Although people want Black to make the jump to editing AST for cases
exactly like this ( psf/black#26 ), it's
doubtful that will happen anytime soon.

I would not be surprised, based on that thread, if black just waits for
py3.6 to EOL and to support 3.7+ with this.

sirosen added a commit to sirosen/globus-cli that referenced this issue Jul 19, 2019

Fix blackened string concatenation
Several string literals got moved to one line by black, but didn't get
merged -- the result is odd and undesirable.
Although people want Black to make the jump to editing AST for cases
exactly like this ( psf/black#26 ), it's
doubtful that will happen anytime soon.

I would not be surprised, based on that thread, if black just waits for
py3.6 to EOL and to support 3.7+ with this.
@peterjc

This comment has been minimized.

Copy link

commented Aug 9, 2019

Is there an open issue for the doing the opposite? I've found when black has left long lines in my code, it is usually overly long strings (mostly error messages, and when defining command line arguments).

Black could break long strings over multiple lines with implicit continuation (e.g. at spaces, or hyphens). I appreciate this would mean black having to set a convention for if the break point space should be trailing at the end of a truncated line, or leading at the start of a continued line.

Found it: See #182

jdufresne added a commit to jdufresne/isort that referenced this issue Aug 17, 2019

Combine adjacent strings into one string
Black has a bug such that strings that were previously multiple lines
sometimes end up as adjacent strings on one line. See:

psf/black#26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.