Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support named escapes (\N{...}) in string processing #2319

merged 12 commits into from Jun 9, 2021


Copy link

@jack1142 jack1142 commented Jun 9, 2021

Fixes #1468 (or at least the issue that's in the comment there since they're not as closely related as I initially thought)

Not an awfully big amount of tests added here but I'm not sure if there's a need for more.

Copy link

@felix-hilden felix-hilden left a comment

Hi, thanks for submitting! I'm not familiar with this part of the code base so take it with a grain of salt, but I left some thoughts below. I'd like for the more experienced maintainers to take a look too.


src/black/ Outdated Show resolved Hide resolved
src/black/ Show resolved Hide resolved
tests/data/ Show resolved Hide resolved
src/black/ Outdated Show resolved Hide resolved
Copy link

@felix-hilden felix-hilden left a comment

In my opinion this looks much better now, thanks a ton for being so speedy to modify!


Copy link

@JelleZijlstra JelleZijlstra commented Jun 9, 2021

I wonder if this could be generalized into other string pieces that can't be split, so we don't need special-case logic for the different kinds. Instead, we could just generate a single list of unsplittable slices.


Copy link

@felix-hilden felix-hilden commented Jun 9, 2021

Yep it should definitely be done!


Copy link
Contributor Author

@jack1142 jack1142 commented Jun 9, 2021

I wonder if this could be generalized into other string pieces that can't be split, so we don't need special-case logic for the different kinds. Instead, we could just generate a single list of unsplittable slices.

I can look into this but before I do I would like to know if we care that some spans might overlap? Because I could either just make a list of slices from all functions that generate the slices (so _get_nameescape_slices and fexpr_slices at the current time) and just use that or I could go further - sort it and make a new list of non-overlapping spans.


Copy link

@JelleZijlstra JelleZijlstra commented Jun 9, 2021

Overlap seems fine with how we're using these slices. Actually it may be more efficient to generate a set of illegal indices, since the current code looks quadratic to me. Which reminds me I should run some profiling for #2314.


Copy link
Contributor Author

@jack1142 jack1142 commented Jun 9, 2021

I went with the set idea as that indeed seems like a more performant way and isn't really hard to do either.

Sadly this causes the diff to be more complicated, if that's a problem I can split this into a separate PR that can be reviewed separately after this one is merged.


src/black/ Outdated Show resolved Hide resolved
Co-authored-by: Jelle Zijlstra <>
Copy link

@felix-hilden felix-hilden commented Jun 9, 2021

If we wanted, I bet we could construct a single function that loops the string through once, cycling through N escape and f-string modes and generating the index ranges. But if this works, then it could be enough for this PR!


@JelleZijlstra JelleZijlstra merged commit 62402a3 into psf:main Jun 9, 2021
36 checks passed
Copy link

@ichard26 ichard26 commented Jun 9, 2021

Thank you so much for your contribution! This project is only possible by contributions like these 🖤. You're awesome, @jack1142. Many thanks for pretty much beta testing the experimental string processing handling extremely early. I'm sure you're the reason why this feature will be a lot less buggy. Which is great since we want to push a release with it enabled by default soon.


@jack1142 jack1142 deleted the fix_splitting_for_escape_sequences branch Jun 9, 2021
BD103 added a commit to BD103/2black that referenced this issue Jul 27, 2021
* Fix and test docs on Windows (psf#2262)

There's some weird interaction between Click and
sphinxcontrib-programoutput on Windows that leads to an encoding error
during the printing of black-primer's help text.

Also symlinks aren't well supported on Windows so let's just use
includes which actually work because we now use MyST :D

* Use latest Python in uploading binaries (psf#2260)

* Use latest Python in uploading binaries

* Don't pin version at all

* Add changelog entry

* Issue templates: use HTML comments (psf#2269)

This commit makes use of HTML comments inside GitHub issue templates
to make sure that even if they aren't removed by the issue author they won't be shown
in the rendered output.

The goal is to simply make the issues less noisy by removing template messages.

* Add --experimental-string-processing to future changes (psf#2273)

* add esp to future style

* changelog

* fix label

* Fix path_empty() (psf#2275)

Behavior other than output shouldn't depend on the verbose/quiet option. As far as I can tell this currently has no visible effect, since code after this function is called handles an empty list gracefully.

* Add @zzzeek testimonial to README and docs

* ptr nolong requires changes (psf#2276)

- I worked on this project yesterday and must have fixed the formatting

* add discussion of magic comments to FAQ (psf#2272)

Co-authored-by: Cooper Lees <>

* Fix --experiemental-string-processing crash when matching parens not found (psf#2283)

Fixes psf#2271

* Make sure to split lines that start with a string operator (psf#2286)

Fixes psf#2284

* Fix regular expression that black uses to identify f-expressions (psf#2287)

Fixes psf#1469

* Update for 21.5b2 release (psf#2290)

* Update for 21.5b2 release

* Correct max string length calculation when there are string operators (psf#2292)

PR psf#2286 did not fix the edge-cases (e.g. when the string is just long
enough to cause a line to be 89 characters long). This PR corrects that

* Add `version` to github action (and rewrite the whole thing while at it) (psf#1940)

Commit history before merge:

* Add black_version to github action
* Merge upstream/main into this branch
* Add version support for the Black action pt.2

  Since we're moving to a composite based action, quite a few changes
  were made. 1) Support was added for all OSes (Windows was painful). 
  2) Isolation from the rest of the workflow had to be done manually
  with a virtual environment.

  Other noteworthy changes:

  - Rewrote basically all of the logic and put it in a Python script
    for easy testing (not doing it here tho cause I'm lazy and I can't
    think of a reasonable way of testing it).
  - Renamed `black_version` to `version` to better fit the existing
    input naming scheme.
  - Added support for log groups, this makes our action's output a
    bit more fancy (I may or may have not added some debug output too).

* Add more to and sorta rewrite the Action's docs

  Reflect compatability and gotchas.

* Add CHANGELOG entry
* Merge main into this branch
* Remove debug; address typos; clean up action.yml

Co-authored-by: Richard Si <>

* Bump urllib3 from 1.26.4 to 1.26.5 (psf#2298)

Bumps [urllib3]( from 1.26.4 to 1.26.5.
- [Release notes](
- [Changelog](
- [Commits](urllib3/urllib3@1.26.4...1.26.5)

- dependency-name: urllib3
  dependency-type: indirect

Signed-off-by: dependabot[bot] <>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]>

* Code Flag Options (psf#2259)

Properly handles the diff, color, and fast option when black is run with
 the `--code` option.

Closes psf#2104, closes psf#1801.

* Move `--code` psf#2259 change log to correct unlreased section of

* remove unnecessary docs changelog

* don't uvloop.install on import (psf#2303)

* Add option to require a specific version to be running (psf#2300)

Closes psf#1246: This PR adds a new option (and automatically a toml entry, hooray for existing configuration management 🎉) to require a specific version of Black to be running.

For example: `black --required-version 20.8b -c "format = 'this'"`

Execution fails straight away if it doesn't match `__version__`.

* Go back to single core for test suite on CI (psf#2305)

The random asyncio bug is just too frequent and annoying to be
worth the speed improvements. Our test suite is already quite fast.
Random test failures hurt for 3 reasons, 1) they are discouraging for
new contributors who won't understand it's out of their control, 2)
it's annoying and time consuming to rerun the workflow, and 3) it
makes single job failures feel less important (even they should be
treated as important!).

* Account for += assignment when deciding whether to split string (psf#2312)

Fixes psf#2294

* Fix incorrect custom breakpoint indices when string group contains fake f-strings (psf#2311)

Fixes psf#2293

* [primer] Enable everything (psf#2288)

See if we pass all our repos with experimental string processing enabled.
Django probably needed:
- Ignores >= 3.8 only

We could support PEP440 version specifiers, but that would introduce the packaging module as a dependency that I'd like to avoid ... Or I could implement a poor persons version or vendor

Commit history before merge:
 * [primer] Enable everything
 * Add exclude extend to django CLI args for primer
 * Change default timeout to from 5 to 10 mins for a primer project
 * Skip string normalization for Django
 * Limit Django to >= 3.8 due to := operator

* Possible fix for issue with indentation and fmt: skip (psf#2281)

Not sure the fix is right.  Here is what I found: issue is connected
with line

    first.prefix = prefix[comment.consumed :]

in ``.  `first.prefix` is a prefix of the line, that ends
with `# fmt: skip`, but `comment.consumed` is the length of the
`"  # fmt: skip"` string.  If prefix length is greater than 14,
`first.prefix` will grow every time we apply formatting.

Fixes psf#2254

* Mention comment non-processing in documentation (psf#2306)

This commit adds a short section discussing the non-processing of docstrings
besides spacing improvements, mentions comment moving and links to the
AST equivalence discussion. I also added a simple spacing test for good

Commit history before merge:

* Mention comment non-processing in documentation, add spacing test
* Mention special cases for comment spacing
* Add all special cases, improve wording

* Regression fix: leave R prefixes capitalization alone (psf#2285)

`black.strings.get_string_prefix` used to lowercase the extracted
prefix before returning it. This is wrong because 1) it ignores the
fact we should leave R prefixes alone because of MagicPython, and 2)
there is dedicated prefix casing handling code that fixes issue 1.
`.lower` is too naive.

This was originally fixed in 20.8b0, but was reintroduced since 21.4b0.

I also added proper prefix normalization for docstrings by using the
`black.strings.normalize_string_prefix` helper.

Some more test strings were added to make sure strings with capitalized
prefixes aren't treated differently (actually happened with my original
patch, Jelle had to point it out to me).

* Fix flake8 configuration by switching from extend-ignore to ignore (psf#2320)

* Support named escapes (`\N{...}`) in string processing (psf#2319)

Co-authored-by: Felix Hildén <>
Co-authored-by: Jelle Zijlstra <>

* Don't run Docker workflow on push to forks (psf#2324)

* Add coverage files to gitignore (psf#2323)

* Update for 21.6b0 release (psf#2325)

* Fix incorrect document referance (psf#2326)

* Add STDIN test to primer (psf#2315)

* Add STDIN test to primer

- Check that out STDIN black support stays working
- Add asyncio.subprocess STDIN pip via communicate
- We just check we format python code from primer's ``

Fixes psf#2310

* Find pyproject from vim relative to current file (psf#1871)

Commit history before merge:

* Find pyproject from vim relative to current file
* Merge remote-tracking branch 'upstream/main' into find-pyproject-vim
* Finish and fix this patch (thanks Matt Wozniski!)

Both the existing code and the proposed code are broken.
The vim.eval() call (whether it's vim.eval("@%") or
vim.eval("fnamemodify(getcwd(), ':t')) returns a string, and it passes
that string to find_pyproject_toml, which expects a sequence of strings,
not a single string, and - since a string is a sequence of single
character strings - it gets turned into a list of ridiculous paths. I
tested with a file called, and added a print(path_srcs) into
find_project_root, which printed out:


This does work for an unnamed buffer, too - we wind up calling
black.find_pyproject_toml(("",)), and that winds up prepending the
working directory to any relative paths, so "" just gets turned into
the current working directory.

Note that find_pyproject_toml needs to be passed a 1-tuple, not a
list, because it requires something hashable (thanks to
functools.lru_cache being used)

Co-authored-by: Matt Wozniski <>

* I forgot the CHANGELOG entry ... again
* I'm really bad at dealing with merge conflicts sometimes
* Be more correct describing search behaviour

Co-authored-by: Austin Glaser <>
Co-authored-by: Richard Si <>
Co-authored-by: Matt Wozniski <>

* Vim plugin fix string normalization option (psf#1869)

This commit fixes parsing of the skip-string-normalization option in vim
plugin. Originally, the plugin read the string-normalization option,
which does not exist in help (--help) and it's not respected by black
on command line.

Commit history before merge:

* fix string normalization option in vim plugin
* fix string normalization option in vim plugin
* Finish and fix patch (thanks Matt Wozniski!)

FYI: this is totally the work and the comments below of Matt (AKA godlygeek)

This fixes two entirely different problems related to how pyproject.toml
files are handled by the vim plugin.

=== Problem #1 ===

The plugin fails to properly read boolean values from pyproject.toml.
For instance, if you create this pyproject.toml:

quiet = true

the Black CLI is happy with it and runs without any messages, but the
:Black command provided by this plugin fails with:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 102, in Black
  File "<string>", line 150, in get_configs
  File "<string>", line 150, in <dictcomp>
  File "/usr/lib/python3.6/distutils/", line 311, in strtobool
    val = val.lower()
AttributeError: 'bool' object has no attribute 'lower'

That's because the value returned by the toml.load() is already a
bool, but the vim plugin incorrectly tries to convert it from a str to a bool.

The value returned by toml_config.get() was always being passed to
flag.cast(), which is a function that either converts a string to an
int or a string to a bool, depending on the flag. vim.eval()
returns integers and strings all as str, which is why we need the cast,
but that's the wrong thing to do for values that came from toml.load().
We should be applying the cast only to the return from vim.eval()
(since we know it always gives us a string), rather than casting the
value that toml.load() found - which is already the right type.

=== Problem psf#2 ===

The vim plugin fails to take the value for skip_string_normalization
from pyproject.toml. That's because it looks for a string_normalization
key instead of a skip_string_normalization key, thanks to this line
saying the name of the flag is string_normalization:

black/autoload/black.vim (line 25 in 05b54b8)
 Flag(name="string_normalization", cast=strtobool),

and this dictcomp looking up each flag's name in the config dict:

black/autoload/black.vim (lines 148 to 151 in 05b54b8)
 return {
   flag.var_name: flag.cast(toml_config.get(, vim.eval(flag.vim_rc_name)))
   for flag in FLAGS

For the second issue, I think I'd do a slightly different patch. I'd
keep the change to invert this flag's meaning and change its name that
this PR proposes, but I'd also change the handling of the
g:black_skip_string_normalization and g:black_string_normalization
variables to make it clear that g:black_skip_string_normalization is
the expected name, and g:black_string_normalization is only checked
when the expected name is unset, for backwards compatibility.

My proposed behavior is to check if g:black_skip_string_normalization
is defined and to define it if not, using the inverse of
g:black_string_normalization if that is set, and otherwise to the
default of 0. The Python code in autoload/black.vim runs later, and
will use the value of g:black_skip_string_normalization (and ignore
g:black_string_normalization; it will only be used to set
g:black_skip_string_normalization if it wasn't already set).


Co-authored-by: Matt Wozniski <>

* Fix plugin/black.vim (need to up my vim game)

Co-authored-by: Matt Wozniski <>

Co-authored-by: Richard Si <>
Co-authored-by: Matt Wozniski <>
Co-authored-by: Matt Wozniski <>

* Fix internal error when FORCE_OPTIONAL_PARENTHESES feature is enabled (psf#2332)

Fixes psf#2313.

* Add EOF and trailing whitespace fixer to pre-commit config (psf#2330)

* Docs: no space is inserted to empty docstrings (psf#2249) (psf#2333)

* Chat on Discord instead of Freenode (psf#2336)

Now that we've moved, let's direct our users to Discord in the
documentation and readme.

* Add Duolingo to list of users (psf#2341)

* Update pre-commit config (psf#2331)

via `pre-commit autoupdate`

... updating 3.9.0 -> 3.9.2.
... updating v0.812 -> v0.902.
... updating v2.2.1 -> v2.3.1.

Signed-off-by: SADIK KUZU <>

* Add necessary typeshed packages to requirements

Co-authored-by: Richard Si <>

* Get `click` types from main repo (psf#2344)

Click types have been moved to click repo itself. See pallets/click#1856

I've had some issues with typeshed types being outdated in another project
so might be good to avoid that here.

Commit history before merge:

* Get `click` types from main repo
* Fix mypy errors
* Require click v8 for type annotations
* Update Pipfile

* Accept empty stdin (close psf#2337) (psf#2346)

Commit history before merge:

* Accept empty stdin (close psf#2337)
* Update tests/
* Add changelog
* Assert Black reformats an empty string to an empty string (psf#2337) (psf#2346)
* fix

* fix typo (psf#2358)

* Avoid src being marked as optional in help (psf#2356)

* Use setuptools.find_packages in setup (psf#2363)

* Use setuptools.find_packages in setup

* Address mypy errors

* Second run of tox -e py results in a test error for test marked with no_python2 (psf#2369)

Fixes psf#2367

* Add LocalStack and Twisted to projects using Black

* Switch `toml` TOML library for `tomli` (psf#2301)

toml unfortunately has a lack of maintainership issue right now. It's
evident by the fact toml only supports TOML v0.5.0. TOML v1.0.0 has
been recently released and right now Black crashes hard on its usage.

tomli is a brand new parse only TOML library. It supports TOML
v1.0.0. Although TBH we're switching to this one mostly because
pip is doing the same.

*The upper bound was included at the library maintainer's request.

Co-authored-by: Łukasz Langa <>
Co-authored-by: Taneli Hukkinen <>

* Improve AST safety parsing error message (psf#2304)

Co-authored-by: Hasan Ramezani <>

* Don't include profiling/ to cut down sdist by ~2x (psf#2362)

They seem to be used as test cases for a specific region of formatting
that was slow. Now performance testing is probably something end users
won't be needing to do, so this is an easy way of reducing the sdist
size sigificantly.

* Create Docker tag 'latest_release' (psf#2374)

Docker images created during release process will have extra tag 'latest_release'.

This closes psf#2373.

* Update for 21.7b0 release (psf#2376)

* Update for 21.7b0 release

* move some changes to the right section

* another one

Co-authored-by: Jelle Zijlstra <>

* Use platformdirs over appdirs (psf#2375)

Signed-off-by: Bernát Gábor <>
Signed-off-by: Bernát Gábor <>

* add context manager to temporarily change the cwd (psf#2377)

Commit history before merge:

* add context manager to temporarily change the cwd
* Iterator, not Iterable

* isort docs have changed urls (psf#2390)

* Clarify contributing docs (psf#2398)

"as configurable as gofmt" means little to people who haven't used gofmt.

* Add ESP to sqlalchemy for black-primer (psf#2400)

The crash has been fixed for a little while now. Tentatively assuming
that this will lead to changes.

Co-authored-by: Richard Si <>
Co-authored-by: Felix Hildén <>
Co-authored-by: Matteo Bertucci <>
Co-authored-by: Jelle Zijlstra <>
Co-authored-by: Łukasz Langa <>
Co-authored-by: Cooper Lees <>
Co-authored-by: Bryan Bugyi <>
Co-authored-by: Stefan Foulis <>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]>
Co-authored-by: Hassan Abouelela <>
Co-authored-by: Sergey Vartanov <>
Co-authored-by: jack1142 <>
Co-authored-by: Ryan McPartlan <>
Co-authored-by: Austin Glaser <>
Co-authored-by: Austin Glaser <>
Co-authored-by: Matt Wozniski <>
Co-authored-by: Bartosz Telenczuk <>
Co-authored-by: Matt Wozniski <>
Co-authored-by: Art Chaidarun <>
Co-authored-by: SADIK KUZU <>
Co-authored-by: Taneli Hukkinen <>
Co-authored-by: simaki <>
Co-authored-by: Marco Edward Gorelli <>
Co-authored-by: Hasan Ramezani <>
Co-authored-by: pszlazak <>
Co-authored-by: Bernát Gábor <>
Co-authored-by: David Szotten <>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants