A wrapper for black, adding pre- and post-processing to better align with Globality conventions.
globality-black performs the following steps:
- pre-processing: to protect code from black actions
- black
- post-processing: to revert / correct black actions
Note: if you are not familiar with black (or need a refresh), please read our Black refresh.
- Globality black
- Table of contents
- Installation
- Usage
- Features
- Pending / Future work
- Black refresh
- FAQ
- Tests
- Contributing
pip install globality-black
There are two ways to use globality-black: via the CLI, or by importing the helpers in the library.
Next, we show some typical use cases:
To see the command line arguments, run globality-black --help.
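For scripting (for example in a pre-commit hook), the CLI can be driven from Python. The following is a minimal sketch, assuming globality-black is on the PATH; the helper names build_command and run_globality_black are hypothetical and not part of the library. It only uses the --check and --diff flags mentioned in this README.

```python
import subprocess
from typing import List, Sequence


def build_command(paths: Sequence[str], check: bool = False, diff: bool = False) -> List[str]:
    """Build the argument list for a globality-black invocation (hypothetical helper)."""
    command = ["globality-black", *paths]
    if check:
        command.append("--check")  # check the files without formatting them
    if diff:
        command.append("--diff")  # show what would be changed
    return command


def run_globality_black(paths: Sequence[str], check: bool = False, diff: bool = False) -> int:
    """Run globality-black (assumed to be on PATH) and return its exit code."""
    completed = subprocess.run(build_command(paths, check=check, diff=diff))
    return completed.returncode


if __name__ == "__main__":
    # Print the command that would check the whole repo, as in the PyCharm setup.
    print(" ".join(build_command(["."], check=True)))
```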
To use globality-black in PyCharm, go to PyCharm -> Preferences... -> Tools -> External Tools -> Click + symbol to add a new external tool.
Recommended configuration to format the current file:
- Program: path to globality-black, e.g. /Users/marty-mcfly/miniconda3/envs/gb/bin/globality-black
- Arguments: $FilePath$
- Working directory: $ProjectFileDir$
Recommended configuration to check the whole repo (without formatting it):
- Program: path to globality-black, e.g. /Users/marty-mcfly/miniconda3/envs/gb/bin/globality-black
- Arguments: . --check
- Working directory: $ProjectFileDir$
Next, configure a keymap, as in here.
We can leverage this extension, with a custom formatter. Here we explain how to get the following options:
There are two ways to apply globality-black, see left-hand side, or by clicking on the button next to "Code". We will configure the extension to apply the isort + globality-black pipeline when clicking that button.
To do so, install the extension, generate the config for jupyter lab and edit it:
pip install jupyterlab_code_formatter
jupyter lab --generate-config
vim ~/.jupyter/jupyter_lab_config.py
You might already have some config in jupyter_lab_config. If so, you might want to omit the second command above, and edit the existing file (vim) instead.
In any case, we will add the following code:
from jupyterlab_code_formatter.formatters import SERVER_FORMATTERS
from globality_black.jupyter_formatter import GlobalityBlackFormatter
SERVER_FORMATTERS['globality-black'] = GlobalityBlackFormatter(line_length=100)
Then, go to the extension preferences, and add:
{
"preferences": {
"default_formatter": {
"python": [
"isort",
"globality-black",
],
}
},
"isort": {
"combine_as_imports": true,
"force_grid_wrap": 4,
"force_to_top": "true",
"include_trailing_comma": true,
"known_third_party": ["wandb", "tqdm"],
"line_length": 100,
"lines_after_imports": 2,
"multi_line_output": 3,
}
}
Notes:
- The last step above translates into user settings saved in ~/.jupyter/lab/user-settings/@ryantam626/.
- The extension is applied to all cells in the notebook. It can be configured to be applied just to the current cell, if interested.
- The extension is applied to each cell in isolation. Hence, if multiple imports appear in different cells, they won't be merged together on top of the notebook.
To use globality-black in VScode, install the extension External formatters.
Then, go to Preferences: Settings (JSON). A file settings.json will open. Add this to the file:
"[python]": {
"editor.codeActionsOnSave": {
"source.organizeImports": true
},
"editor.defaultFormatter": "SteefH.external-formatters",
},
"isort.args":["--profile", "black"],
This will configure isort to run when saving, and globality-black as the default formatter. Also add this:
"externalFormatters.languages": {
    "python": {
        "command": "$PATH_TO_ENV/bin/globality-black",
        "arguments": [
            "-",
        ]
    }
},
"isort.interpreter": [
    "$PATH_TO_ENV/bin/python"
],
setting or replacing $PATH_TO_ENV with whatever you need to get to the globality-black Python env.
Important: these two paths have to be absolute paths, and need to be the same everywhere you use VScode. This means that if you use VScode on remotes (e.g. an EC2 instance), the same paths need to exist there. Obviously, that's not the case out of the box, since e.g. in EC2 you'll have /home/ubuntu/... and on your machine /Users/john/... . A workaround for this is to create a symbolic link in your instance, e.g.:
sudo ln -s /home/ubuntu /Users/john
To configure shortcuts, go to Preferences: Keyboard Shortcuts (JSON) from the Palette (command+shift+p). The file keybindings.json will open.
Add to this file:
[
{
"key": "cmd+shift+j",
"command": "editor.action.formatDocument",
"when": "editorHasDocumentFormattingProvider && editorTextFocus && !editorReadonly && !inCompositeEditor"
}
]
This will allow you to run globality-black on the currently open file, passing the content of the file via stdin.
To format notebooks, there is a shortcut
{
"key": "shift+alt+f",
"command": "notebook.formatCell",
"when": "editorHasDocumentFormattingProvider && editorTextFocus && inCompositeEditor && notebookEditable && !editorReadonly && activeEditor == 'workbench.editor.notebook'",
}
that you can modify if you don't like this combination. This will format the current cell with the default formatter. I did not find a way to run isort though. There is a "format notebook on save" option, but it's not exactly what we configured for python files. That would run isort + glo-black.
Black would remove those blank lines after wandb and scikit-learn below:
graph.use(
    "wandb",

    "scikit-learn",

    # we love pandas
    "pandas",
)
globality-black protects those, assuming the developer added them for readability.
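To illustrate the pre-processing / post-processing idea, here is a toy sketch of how blank lines inside brackets could be protected: each one is swapped for a placeholder comment before formatting (black keeps comments) and swapped back afterwards. This is only an illustration under that assumption, not the actual globality-black implementation; PLACEHOLDER and both function names are hypothetical.

```python
PLACEHOLDER = "# __BLANK_LINE__"  # hypothetical marker, not the library's own


def protect_blank_lines(source: str) -> str:
    """Replace blank lines inside brackets with a placeholder comment."""
    depth = 0
    protected = []
    for line in source.splitlines():
        if not line.strip() and depth > 0:
            protected.append(PLACEHOLDER)
        else:
            protected.append(line)
        # Naive bracket tracking: brackets inside strings or comments are not handled.
        depth += sum(line.count(char) for char in "([{")
        depth -= sum(line.count(char) for char in ")]}")
    return "\n".join(protected)


def restore_blank_lines(source: str) -> str:
    """Turn placeholder comments back into blank lines."""
    return "\n".join(
        "" if line.strip() == PLACEHOLDER else line
        for line in source.splitlines()
    )


if __name__ == "__main__":
    code = 'graph.use(\n    "wandb",\n\n    "scikit-learn",\n)'
    protected = protect_blank_lines(code)
    # A formatter would run on `protected` here; we skip that step in this sketch.
    print(restore_blank_lines(protected) == code)
```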
In a similar fashion to the "blank lines" feature, "dotted chains" allows us to keep blocks such as
return (
    df_field[COLUMNS_PER_FIELD[name]]
    .dropna(subset=["column"])
    .reset_index(drop=True)
    .assign(mapped_type=MAP_DICT[name])
)

LABELS = set(
    df[df.labels.apply(len) > 0]
    .flag.apply(curate)
    .apply(normalize)
    .unique()
)
the same. In this feature, we don't explode anything, but rather protect code, assuming it was written this way on purpose for readability.
This is a very simple and specific feature. Black (at least up to 21.9b0) has a bug so that tuples with one element are compressed as in
x = (
    3,
)
becomes
x = (3,)
See psf/black#1139 (comment). With globality-black, we protect these.
Explode comprehensions
- all dict comprehensions
- any comprehension with an if
- any comprehension with multiple for loops (see examples below)
- list / set comprehensions where the element:
  - has a ternary operator (see examples below)
  - has another comprehension
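The rules above can be expressed as a small predicate over Python's ast module. This is a sketch of the criteria as listed, not the library's actual code: should_explode is a hypothetical name, treating generator expressions as "any comprehension" is an assumption, and "has" is read as "contains anywhere in the element".

```python
import ast

COMPREHENSIONS = (ast.ListComp, ast.SetComp, ast.DictComp, ast.GeneratorExp)


def should_explode(node: ast.AST) -> bool:
    """Return True if the comprehension matches one of the listed criteria."""
    if not isinstance(node, COMPREHENSIONS):
        return False
    if isinstance(node, ast.DictComp):
        return True  # all dict comprehensions
    if len(node.generators) > 1:
        return True  # multiple for loops
    if any(generator.ifs for generator in node.generators):
        return True  # any comprehension with an if
    if isinstance(node, (ast.ListComp, ast.SetComp)):
        # The element contains a ternary operator or another comprehension.
        return any(
            isinstance(child, (ast.IfExp, *COMPREHENSIONS))
            for child in ast.walk(node.elt)
        )
    return False


def check(code: str) -> bool:
    """Parse a single expression and apply the predicate to it."""
    return should_explode(ast.parse(code, mode="eval").body)


if __name__ == "__main__":
    for snippet in ["[3 for _ in range(10)]", "[3 for i in range(10) if i < 4]"]:
        print(snippet, "->", check(snippet))
```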
For everything else, we rely on black. Examples:
[3 for _ in range(10)]
[3 for i in range(10) if i < 4]
{"a": 3 for _ in range(4)}
{"a": 3 for _ in range(4) if i < 4}
["odd" if i % 2 == 0 else "even" for _ in range(10)]
double_comp1 = [3*i*j for i in range(10) for j in range(4)]
double_comp2 = [[i for i in range(7) if i < 5] for j in range(10)]
double_comp3 = {i: [i for i in range(7) if i < 5] for j in range(10) if i < 2}
becomes
[3 for _ in range(10)]
[
    3
    for i in range(10)
    if i < 4
]
{
    "a": 3
    for _ in range(4)
}
{
    "a": 3
    for _ in range(4)
    if i < 4
}
[
    "odd" if i % 2 == 0 else "even"
    for _ in range(10)
]
double_comp1 = [
    3 * i * j
    for i in range(10)
    for j in range(4)
]
double_comp2 = [
    [i for i in range(7) if i < 5]
    for j in range(10)
]
double_comp3 = {
    i: [i for i in range(7) if i < 5]
    for j in range(10)
    if i < 2
}
Note that in the last two comprehensions, the nested comprehensions are not exploded even though they have an if. This is a limitation of globality-black, but we believe it is not very frequent in everyday cases. If you really want to explode those and make globality-black respect it, please use the feature explained next.
If you see some block where you don't want to apply globality-black, wrap it with # fmt: off and # fmt: on and it will be ignored. Note that this is the same syntax as for black. For example, for readability you might want to do something like:
# fmt: off
files_to_read = [
    (f"{key1}_{key2}", key1, key2, key1 + key2)
    for key1 in range(10)
]
# fmt: on
Note that by default (same as black), globality-black will write the expression above as a one-liner.
- Explode ternary operators under some criteria
- Nested comprehensions
- Magic comma for single element subscripts, due to this
Please give us feedback if you find any issues, and check known_failed
black is an opinionated Python formatter that tries to save as much vertical space as possible. In this regard, it compresses lines up to the maximum character length that has been configured. black's default is 88, whereas in globality-black we use a default of 100 characters, as agreed for Globality repos globally. If you want to have a custom max character length, add a pyproject.toml file to the root of your repo. This works the same way as in black, and globality-black will take your config from there.
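As a rough sketch of what "the same way as in black" means, black reads its line length from the line-length key of the [tool.black] table in pyproject.toml. The snippet below parses such a file and falls back to 100 when the key is missing. It assumes globality-black looks at that same table; configured_line_length is a hypothetical helper, not part of the library.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older interpreters

DEFAULT_LINE_LENGTH = 100  # globality-black default stated above

EXAMPLE_PYPROJECT = """
[tool.black]
line-length = 120
"""


def configured_line_length(pyproject_text: str) -> int:
    """Return the line length from [tool.black], or the default if it is not set."""
    config = tomllib.loads(pyproject_text)
    return config.get("tool", {}).get("black", {}).get("line-length", DEFAULT_LINE_LENGTH)


if __name__ == "__main__":
    print(configured_line_length(EXAMPLE_PYPROJECT))
```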
See how black works in their README. It is especially useful to review this section, where important recent features are explained.
black added a feature at the end of 2020 that we used to call the "magic comma". It's one of the first examples where black gives the developer a bit of freedom over what the final code will look like (apart from fmt:off and fmt:on to ignore black entirely). Read more about it here.
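As a small illustration of the magic comma: a call that fits on one line is normally kept compact, but if the developer writes it exploded with a trailing comma after the last argument, black keeps one argument per line. The two spellings below are the same program; train, model and epochs are made-up names for the example.

```python
import ast

# Without a trailing comma, black keeps (or collapses to) the one-line form.
COMPACT = "train(model, epochs=10)\n"

# With a trailing comma after the last argument, black keeps the exploded form.
EXPLODED = "train(\n    model,\n    epochs=10,\n)\n"


def same_program(first: str, second: str) -> bool:
    """Compare two snippets by their syntax trees, ignoring layout."""
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


if __name__ == "__main__":
    print(same_program(COMPACT, EXPLODED))
```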
Here we list a number of questions and solutions raised when presenting this project to other teams:
I like this project, but this would destroy all our git history and git blames
Our recommendation is:
- Create a big PR for your whole repo, and make the effort of reviewing the changes just once.
- Add a .git-blame-ignore-revs file to your repo, ignoring the bulk commit where globality-black is applied. See here for more details.
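A minimal sketch of the second step, assuming you know the full 40-character hash of the bulk formatting commit: the file lists one commit hash per line, and lines starting with # are comments. The helper add_ignored_revision is hypothetical. After writing the file, each clone still has to run git config blame.ignoreRevsFile .git-blame-ignore-revs for git blame to honour it.

```python
from pathlib import Path

IGNORE_FILE = ".git-blame-ignore-revs"


def add_ignored_revision(repo_root: str, commit_sha: str, note: str) -> Path:
    """Append a commit hash (with a comment) to .git-blame-ignore-revs, once."""
    path = Path(repo_root) / IGNORE_FILE
    existing = path.read_text() if path.exists() else ""
    if commit_sha in existing:
        return path  # already listed, nothing to do
    separator = "" if not existing or existing.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(f"{separator}# {note}\n{commit_sha}\n")
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder hash for the demo; use the real bulk commit hash in practice.
        created = add_ignored_revision(tmp, "0" * 40, "Apply globality-black to the whole repo")
        print(created.read_text())
```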
I like most of the changes, but in some places I really prefer the way I write the code
No problem, for those specific cases where you prefer your own style, just wrap the block with fmt:off and fmt:on, see the Partially disable Globality Black section.
100 characters per line is too short / too long for me
Just add a pyproject.toml to the root of your repo (like the one in this very project) and specify your preferred length, see the Black refresh section.
I want to know what will be changed before applying the changes
Please use the --diff option from the CLI, see the CLI section.
I want to explode a list of arguments, but globality-black is compressing them into one line
Please use the magic comma feature, see Magic comma.
Run the tests as in
bash entrypoint.sh test
or simply
pytest .
Some options:
- -s to show prints and be able to debug
- --pdb to trigger the debugger when an exception is raised
- pytest route_to_test to test a specific test file
- pytest route_to_test::test_function to test a specific test function
- pytest route_to_test::test_function[test_case] to test a specific test case
- --cov-report term to show coverage
You might find other code inspectors in entrypoint.sh. Note that these are run against your code when opening a pull request.
All contributions, bug reports, security issues, bug fixes, documentation improvements, enhancements, and ideas are welcome. This section is adapted and simplified from pandas contribution guide.
Bug reports, security issues, and enhancement requests are an important part of making open-source software more stable and are curated through GitHub issues. When reporting an issue or request, please fill out the issue form fully to ensure others and the core development team can fully understand the scope of the issue.
The issue will then show up to the community and be open to comments/ideas from others.
globality-black is hosted on GitHub, and to contribute, you will need to sign up for a free GitHub account. We use Git for version control to allow many people to work together on the project. If you are new to Git, you can reference some of the resources in the pandas contribution guide cited above.
Also, the project follows a standard forking workflow whereby contributors fork the repository, create a feature branch, make changes, push them, and then create a pull request. To avoid redundancy, please follow all the instructions in the pandas contribution guide cited above.
As contributors and maintainers to this project, you are expected to abide by the code of conduct. More information can be found at the Contributor Code of Conduct.