Improve String Handling #1132

bbugyi200 · 2019-11-02T22:02:45Z

This pull request's main intention is to wrap long strings (as requested by #182); however, it also provides better string handling in general and, in doing so, closes the following issues:

Closes #26
Closes #182
Closes #933
Closes #1183
Closes #1243

Examples

f-strings will be split if they are too long (just like a normal long string would be), and the f prefix will be dropped from any resulting segment that does not need it:

##### INPUT
fstring = f"f-strings definitely make things more {difficult} than they need to be for {{black}}. But boy they sure are handy. The problem is that some lines will need to have the 'f' whereas others do not. This {line}, for example, needs one."

##### OLD OUTPUT
fstring = f"f-strings definitely make things more {difficult} than they need to be for {{black}}. But boy they sure are handy. The problem is that some lines will need to have the 'f' whereas others do not. This {line}, for example, needs one."

##### NEW OUTPUT
fstring = (
    f"f-strings definitely make things more {difficult} than they need to be for"
    " {black}. But boy they sure are handy. The problem is that some lines will need"
    f" to have the 'f' whereas others do not. This {line}, for example, needs one."
)
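To make the prefix rule concrete, here is a minimal sketch of the per-segment decision. The helper names `has_replacement_field` and `normalize_segment` are hypothetical and are not taken from this PR's code; the sketch also assumes the segment body contains no double quotes. A segment keeps the `f` only if it contains a real replacement field, and when the prefix is dropped, doubled braces have to be un-doubled so that the string value stays the same (which is why `{{black}}` becomes `{black}` above).

```python
def has_replacement_field(body: str) -> bool:
    """Return True if an f-string body contains at least one `{expr}` field."""
    # Remove escaped braces first so "{{" and "}}" are not mistaken for fields.
    stripped = body.replace("{{", "").replace("}}", "")
    return "{" in stripped


def normalize_segment(body: str) -> str:
    """Return source text for one segment of a split f-string.

    Illustration only: assumes `body` has no double quotes that need escaping.
    """
    if has_replacement_field(body):
        return 'f"' + body + '"'
    # Without the prefix, doubled braces are no longer escapes, so undo them.
    return '"' + body.replace("{{", "{").replace("}}", "}") + '"'


segments = [
    "f-strings definitely make things more {difficult} than they need to be for",
    " {{black}}. But boy they sure are handy.",
]
for segment in segments:
    print(normalize_segment(segment))
```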

Manual user splits will be respected when it is possible to do so without violating the line length limit:

##### INPUT
good_split_func(
    xxx, yyy, zzz,
    long_string_kwarg="But what should happen when code has already "
                      "been formatted but in the wrong way? Like "
                      "with a space at the end instead of the "
                      "beginning. Or what about when it is split too "
                      "soon?",
)

##### OLD OUTPUT
good_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg="But what should happen when code has already "
    "been formatted but in the wrong way? Like "
    "with a space at the end instead of the "
    "beginning. Or what about when it is split too "
    "soon?",
)

##### NEW OUTPUT
good_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg=(
        "But what should happen when code has already "
        "been formatted but in the wrong way? Like "
        "with a space at the end instead of the "
        "beginning. Or what about when it is split too "
        "soon?"
    ),
)

If a manual user split violates the line length limit, however, it will NOT be respected:

##### INPUT
bad_split_func(
    xxx, yyy, zzz,
    long_string_kwarg="But what should happen when code has already been formatted but in the wrong way? Like with "
                      "a space at the end instead of the beginning. Or what about when it is split too soon?",
)

##### OLD OUTPUT
bad_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg="But what should happen when code has already been formatted but in the wrong way? Like with "
    "a space at the end instead of the beginning. Or what about when it is split too soon?",
)

##### NEW OUTPUT
bad_split_func(
    xxx,
    yyy,
    zzz,
    long_string_kwarg=(
        "But what should happen when code has already been formatted but in the wrong"
        " way? Like with a space at the end instead of the beginning. Or what about"
        " when it is split too soon?"
    ),
)
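The difference between the last two examples comes down to whether the user's own split already fits once the string has been re-indented. A rough sketch of that check, assuming Black's default line length of 88 (the function `split_fits` is a hypothetical name used only for illustration, not the PR's implementation):

```python
LINE_LENGTH = 88  # Black's default line length


def split_fits(source: str, limit: int = LINE_LENGTH) -> bool:
    """Return True if every physical line of `source` is within `limit`."""
    return all(len(line) <= limit for line in source.splitlines())


# The user's split from good_split_func, after re-indentation.
good = '''\
    long_string_kwarg=(
        "But what should happen when code has already "
        "been formatted but in the wrong way? Like "
    ),
'''

# The first line of the user's split from bad_split_func, after re-indentation.
bad = '''\
    long_string_kwarg=(
        "But what should happen when code has already been formatted but in the wrong way? Like with "
    ),
'''

print(split_fits(good), split_fits(bad))
```

When the check fails, as it does for `bad`, the custom split is discarded and the string is split again from scratch.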

Line continuation backslashes inside of strings will no longer be preserved; the string is joined back together and then split like any other long string:

##### INPUT
bad_split = "\
But what should happen when code has already\
 been formatted but in the wrong way? Like\
 with a space at the beginning instead of the\
 end. Or what about when it is split too\
 soon? In the case of a split that is too\
 short, black will try to honer the custom\
 split.\
"

##### OLD OUTPUT
bad_split = "\
But what should happen when code has already\
 been formatted but in the wrong way? Like\
 with a space at the beginning instead of the\
 end. Or what about when it is split too\
 soon? In the case of a split that is too\
 short, black will try to honer the custom\
 split.\
"

##### NEW OUTPUT
bad_split = (
    "But what should happen when code has already been formatted but in the wrong way?"
    " Like with a space at the beginning instead of the end. Or what about when it is"
    " split too soon? In the case of a split that is too short, black will try to honer"
    " the custom split."
)
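This rewrite is safe because a backslash followed by a newline inside a string literal contributes nothing to the string's value. A quick way to convince yourself, using a shortened version of the example (the sample text is abbreviated for illustration):

```python
import ast

# Source text of a literal that uses backslash line continuations.
OLD = r'''"\
But what should happen\
 when split too soon?\
"'''

# Source text of the same value written as parenthesized implicit concatenation.
NEW = '("But what should happen"\n " when split too soon?")'

old_value = ast.literal_eval(OLD)
new_value = ast.literal_eval(NEW)
print(old_value == new_value)
```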

Unnecessary surrounding parens will be stripped from strings and adjacent short strings will be merged when possible:

##### INPUT
func_call(
    sss=(
        "Some "
        '"short" '
        f"{string}."
    )
)

##### OLD OUTPUT
func_call(sss=("Some " '"short" ' f"{string}."))

##### NEW OUTPUT
func_call(sss=f'Some "short" {string}.')
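Note that the merged string switches to single quotes: its body contains double quotes, and Black prefers double quotes only when that does not force extra escaping. A small sketch of both points, where `pick_quote` is a hypothetical helper (not this PR's code) and `string` is a placeholder value:

```python
def pick_quote(body: str) -> str:
    """Prefer double quotes unless that would force escaping."""
    if '"' in body and "'" not in body:
        return "'"
    return '"'


string = "example"  # placeholder value so the f-strings can be evaluated

# The old output (adjacent literals) and the new output (one merged literal).
old = ("Some " '"short" ' f"{string}.")
new = f'Some "short" {string}.'

print(old == new, pick_quote('Some "short" {string}.'))
```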

This commit significantly dirties up the code introduced in this PR (psf#1132). I plan to clean all of this up considerably before merging with master.

ichard26 · 2020-07-20T01:21:48Z

@bbugyi200, sorry for the ping. I have a question about whether this formatting is intended.

(black) richard-26@ubuntu-laptop:~/programming/black$ black test.py --diff --color
--- test.py	2020-07-20 00:40:48.843094 +0000
+++ test.py	2020-07-20 00:40:50.391909 +0000
@@ -10,11 +10,6 @@
-raise ValueError(
-    'Invalid input:\n'
-    f' * x={x}\n'
-    f' * y={y}\n'
-    f' * z={z}'
-)
+raise ValueError(f"Invalid input:\n * x={x}\n * y={y}\n * z={z}")

At first, I assumed this wouldn't happen because of "Manual user splits will be respected when it is possible to do so without violating the line length limit". Is this string literal merging related to "Unnecessary surrounding parens will be stripped from strings and adjacent short strings will be merged when possible"? Or is the "manual user splits" protection only for assignments?

Asking since I am a bit surprised at Black's output from this issue: #1540

Thanks!

bbugyi200 · 2020-07-22T19:31:05Z

@ichard26 No problem. Let's look at how black handled this example before compared to how it does now:

##### INPUT
raise ValueError(
    'Invalid input:\n'
    f' * x={x}\n'
    f' * y={y}\n'
    f' * z={z}'
)

##### OLD OUTPUT
raise ValueError("Invalid input:\n" f" * x={x}\n" f" * y={y}\n" f" * z={z}")

##### NEW OUTPUT
raise ValueError(f"Invalid input:\n * x={x}\n * y={y}\n * z={z}")

So the only thing that #1132 has changed about this output is that the strings are merged together.
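For reference, the rule requested in #1540 (do not merge when a piece ends with `\n`) could be sketched roughly as below. This is not something #1132 implements, and both function names are hypothetical. The check works on the source text of each literal and ignores an `n` that follows an escaped backslash.

```python
def ends_with_newline_escape(literal: str) -> bool:
    """True if a string literal's source text ends with an unescaped \\n."""
    body = literal[:-1]  # drop the closing quote
    if not body.endswith("n"):
        return False
    # Count the backslashes immediately before the final "n".
    before = body[:-1]
    backslashes = len(before) - len(before.rstrip("\\"))
    return backslashes % 2 == 1


def has_intentional_split(literals: list[str]) -> bool:
    """True if any piece other than the last one ends with \\n."""
    return any(ends_with_newline_escape(lit) for lit in literals[:-1])


# Source text of the four adjacent literals from the example above.
PARTS = [
    r"'Invalid input:\n'",
    r"f' * x={x}\n'",
    r"f' * y={y}\n'",
    r"f' * z={z}'",
]

print(has_intentional_split(PARTS))
```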

bbugyi200 changed the title ~~Wrap long strings~~ [WIP] Wrap long strings Nov 2, 2019

bbugyi200 force-pushed the 182-wrap-long-strings branch from 274715a to 8592d46 Compare November 3, 2019 18:09

bbugyi200 mentioned this pull request Nov 9, 2019

feature_request(formatting): wrap long strings #182

Closed

peterjc mentioned this pull request Dec 1, 2019

Poor handling of multi-line string literals using slash continuation #1183

Closed

bbugyi200 force-pushed the 182-wrap-long-strings branch from c74192d to 4d42b81 Compare December 7, 2019 20:42

bbugyi200 force-pushed the 182-wrap-long-strings branch from 2a93aa3 to e0eec91 Compare December 29, 2019 22:03

bbugyi200 added 23 commits January 19, 2020 22:42

[feat] Improve regex for matching strings

f4455e6

[ref] Better name for regex group

0050244

[ref] Move long_tuple under "Regression Tests" section

585301f

[ref] Use "RE" prefix for regular expressions instead of "STRING"

324375b

[ref] Clean up variable names

188b212

[fix] Handle backslashes better

dcf6362

[ref] Cleanup code for finding an even number of backslashes

200f422

[ref] Change variable name to RE_NOT_QUOTE_FMT

ec5ab65

[ref] Make RE_NOT_QUOTE a format function

bcd8124

[ref] Remove unnecessary '.replace(...)' call

eee4150

[ref] Remove '_REGEXP' suffix from variable names

ebba5ee

Merge remote-tracking branch 'upstream/master' into 182-wrap-long-strings

d63c20e

[sty] Blacken test_black.py

e8c7d1b

[meta-fix] Remove 'NamedTuple' import

d2fbc02

[ref] Rename variable to 'RE_EOL'

280bf9f

[test] Add tests for paren stripper with old-style formatting

377c306

[fix] Bug with old-style formatting strippers

78a2391

[ref] Factor 'get_first_unmatched_rpar_idx(...)' out of StringStrippers

9e6bd22

[fix] Fix bug with merging strings that contain backslashed quotes

c320f2d

[fix] Don't match internal strings when stripping parens

12e05fc

[fix] Break index must be > 1

f9ab316

[fix] + operator was being stripped from string expressions sometimes

d0befe2

[fix] Don't try to merge strings in comments

2a5b19c

ichard26 mentioned this pull request Jul 1, 2020

Long string remains unchanged, can black make them in multine representation [help] #1528

Closed

pradyunsg mentioned this pull request Jul 5, 2020

Format code with Black pypa/pip#7084

Closed

chrahunt mentioned this pull request Jul 16, 2020

Change when we warn about dependency conflicts during pip install pypa/pip#8590

Merged

ichard26 mentioned this pull request Jul 20, 2020

When splitting lines, \n is ignored #1467

Open

bbugyi200 mentioned this pull request Jul 22, 2020

Black shouldn't merge multi-line string when last character on the line is '\n' #1540

Open

skshetry mentioned this pull request Aug 17, 2020

upgrade to isort5 iterative/dvc#4399

Merged

JelleZijlstra mentioned this pull request Aug 18, 2020

Disable string splitting/merging by default #1609

Merged

bnavigator mentioned this pull request Aug 19, 2020

Added IPython LaTeX representation method for StateSpace objects python-control/python-control#450

Merged

bbugyi200 mentioned this pull request Sep 6, 2020

Fix crash on assert and parenthesized % format (fixes #1597, fixes #1605) #1681

Merged

lyz-code mentioned this pull request Oct 26, 2020

Long string remains unchanged #1787

Closed

bbugyi200 mentioned this pull request Nov 6, 2020

Black produced invalid code: invalid syntax with --experimental-string-processing #1807

Closed

kierdavis mentioned this pull request Mar 25, 2021

Long strings not broken? #2062

Closed

felix-hilden mentioned this pull request Apr 14, 2021

Document experimental string processing and docstring indentation #2106

Merged

bbugyi200 mentioned this pull request Jun 5, 2021

Line too long: Long triple-quoted string #1617

Open

peterjc mentioned this pull request Jul 10, 2021

Extend to trailing/leading space on split strings [Enhancement] flake8-implicit-str-concat/flake8-implicit-str-concat#12

Open

bbugyi200 mentioned this pull request Mar 11, 2022

Enable feature string_processing to be default #2188

Open

yilei mentioned this pull request Jul 19, 2022

Add parens around implicit string concatenations where increases readability #3162

Merged

Hendler mentioned this pull request Nov 14, 2023

Cannot parse f-string. #4046

Closed

JelleZijlstra mentioned this pull request Feb 5, 2024

Future of string_processing #4208

Open
