# Python Formatters

This tutorial looks at four code formatters: ```autopep8```, ```isort```, ```black``` and ```ruff```.

On Windows the string ```os.name``` is ```'nt'```, whereas on Linux and macOS it is ```'posix'```:

In [4]:
import os
os.name

'nt'

## AutoPEP8 Formatter

Suppose the following script file is created twice, so that one copy is kept as the untouched original:

In [58]:
%%writefile script1.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script1.py


In [59]:
%%writefile script2.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script2.py


Notice that the spacing is deliberately sloppy around the assignment operator and around delimiters. This code violates [PEP 8 Whitespace in Expressions and Statements](https://peps.python.org/pep-0008/#whitespace-in-expressions-and-statements), which explains how to use whitespace to emphasise the structure of Python code. For example, whitespace should be placed around an operator such as the assignment operator, except around the ```=``` of a keyword argument in a function call, where whitespace should instead be placed only after each comma to visually separate the input arguments.
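
As a small illustration (not part of the original notebook), the sketch below uses the standard ```ast``` module to show that the sloppy and the tidy spelling of one statement from the script parse to the same syntax tree. Whitespace of this kind only changes how the code reads, which is why a formatter can rewrite it safely. The names ```sloppy```, ```tidy``` and ```same_tree``` are chosen for this example only.

```python
import ast

# One statement from the script, before and after whitespace clean-up.
sloppy = 'now=datetime.datetime(year = 2023,month=12 ,day=1)'
tidy = 'now = datetime.datetime(year=2023, month=12, day=1)'

# ast.dump omits line and column positions by default, so only the
# structure of the code is compared, not its layout.
same_tree = ast.dump(ast.parse(sloppy)) == ast.dump(ast.parse(tidy))
print(same_tree)
```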

This code also violates [PEP 8 Imports](https://peps.python.org/pep-0008/#imports), which states that imports should be at the top of the file, each import should be on a separate line, and standard library imports should come before third-party or custom modules.

To rectify this ```autopep8``` can be used. 

In VSCode the AutoPEP8 extension can be installed:

<img src='images\img_001.png' alt='img_001' width='350'/>

Once installed, the Format Document and Format Document with... commands will be available from the Command Palette when a Python script file is opened. Press ```Ctrl```, ```⇧``` and ```p``` to open the command palette:

<img src='images\img_002.png' alt='img_002' width='450'/>

Notice that the imports are placed at the top of the file, with standard module imports placed before third-party library imports. Notice that the spacing has mainly been addressed:

<img src='images\img_003.png' alt='img_003' width='450'/>

The Format Document with... command allows selection of other formatters if installed:

<img src='images\img_004.png' alt='img_004' width='450'/>

It also allows for configuration of the default formatter:

<img src='images\img_005.png' alt='img_005' width='450'/>

For an interactive Python notebook file, there is the equivalent command Format notebook:

<img src='images\img_006.png' alt='img_006' width='450'/>

This makes the spacing PEP 8 compliant but does not sort imports:

<img src='images\img_007.png' alt='img_007' width='450'/>

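AutoPEP8 can also be run from the command line. Without the ```-i``` option the formatted code is printed and the file itself is left unchanged: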

In [7]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\autopep8.exe script2.py
else:
    !autopep8 script2.py

import os
import sys
import itertools
import collections
import datetime
import pandas as pd
import numpy as np
var1 = 'Hello'
var2 = "World"
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({'x': x, "y": y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1,  2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a


To make the changes in place:

In [8]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\autopep8.exe -i script2.py
else:
    !autopep8 -i script2.py

The updated file can be viewed using:

In [9]:
if os.name == 'nt':
    !powershell type script2.py
else:
    !cat script2.py

import os
import sys
import itertools
import collections
import datetime
import pandas as pd
import numpy as np
var1 = 'Hello'
var2 = "World"
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({'x': x, "y": y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1,  2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
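
Because ```script1.py``` was kept as an untouched copy, the two files can be compared to see exactly what the formatter changed. The following is a minimal sketch using the standard ```difflib``` and ```pathlib``` modules. The function name ```formatting_diff``` is hypothetical, and the comparison at the end only runs if both files exist in the working directory.

```python
import difflib
from pathlib import Path


def formatting_diff(original, formatted):
    """Return a unified diff between two script files as a single string."""
    before = original.read_text(encoding='utf-8').splitlines(keepends=True)
    after = formatted.read_text(encoding='utf-8').splitlines(keepends=True)
    return ''.join(
        difflib.unified_diff(
            before, after, fromfile=str(original), tofile=str(formatted)
        )
    )


# Compare the untouched copy with the formatted copy, if both are present.
if Path('script1.py').exists() and Path('script2.py').exists():
    print(formatting_diff(Path('script1.py'), Path('script2.py')))
```

Lines starting with ```-``` come from the original file and lines starting with ```+``` come from the formatted file. Reading the files in Python also works identically on Windows, Linux and macOS.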


## Import Sort Formatter

AutoPEP8 has placed all the imports at the start of the file, with the standard modules placed before the third-party modules. However, the modules are not sorted alphabetically within each group. To rectify this, the import sort formatter ```isort``` can be used. Let's return to the starting code (3 identical files will be created):

In [10]:
%%writefile script3.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script3.py


In [131]:
%%writefile script4.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script4.py


In [134]:
%%writefile script5.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script5.py


In VSCode the isort extension can be installed. Once installed, the Organize Imports command displays:

<img src='images\img_008.png' alt='img_008' width='350'/>

This command can be used on a Python script file:

<img src='images\img_009.png' alt='img_009' width='450'/>

It seems however that isort does not work very well on a file that has not previously been processed by autopep8:

<img src='images\img_010.png' alt='img_010' width='450'/>

The import sort formatter ```isort``` can be run directly on this script from the command line and by default operates in place:

In [132]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\isort.exe script4.py
else:
    !isort script4.py

Fixing C:\Users\Philip\Documents\GitHub\python-notebooks\autopep8_module\script4.py


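Unfortunately the results aren't great when the code has not previously been processed with autopep8. By default isort only sorts each run of consecutive import statements and does not move imports that are separated by other code to the top of the file:

In [133]:
if os.name == 'nt':
    !powershell type script4.py
else:
    !cat script4.py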

var1= 'Hello'
var2 ="World"
import numpy as np

x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd

df=pd.DataFrame({'x':x,"y":y})
import datetime

now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections

counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools

cycle=itertools.cycle([1,2,3])
import os
import sys

sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string
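
The result above is easier to understand once the script is viewed as a series of separate import blocks. The sketch below is an illustration using the standard ```ast``` module, not isort's actual implementation: it groups consecutive import statements and shows that in the sloppy script almost every import forms a block of its own, so there is little to sort. The function name ```import_blocks``` and the shortened ```source``` string are assumptions made for this example.

```python
import ast


def import_blocks(code):
    """Group consecutive top-level import statements into blocks."""
    blocks, current = [], []
    for node in ast.parse(code).body:
        if isinstance(node, ast.Import):
            current.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            current.append(node.module or '.')
        elif current:
            # Any other statement ends the current run of imports.
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# A shortened version of the sloppy script (only parsed, never executed).
source = '''\
var1= 'Hello'
import numpy as np
x=np.array([0,1,2,3,4])
import pandas as pd
df=pd.DataFrame({'x':x})
import sys, os
os.name
import string
'''
print(import_blocks(source))
```

Only the ```sys, os``` block contains more than one module, which matches the single visible reordering in the output above.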


Returning to the starting code and using ```autopep8``` and ```isort``` together:

In [135]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\autopep8.exe -i script5.py
    !powershell ~\anaconda3\Scripts\isort.exe script5.py
else:
    !autopep8 -i script5.py
    !isort script5.py

Fixing C:\Users\Philip\Documents\GitHub\python-notebooks\autopep8_module\script5.py


This gives better results:

In [15]:
if os.name == 'nt':
    !powershell type script5.py
else:
    !cat script5.py

import collections
import datetime
import itertools
import os
import sys

import numpy as np
import pandas as pd

var1 = 'Hello'
var2 = "World"
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({'x': x, "y": y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1,  2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a


The quotation style above is inconsistent and has not been amended by ```autopep8```. No amendment has been made because [PEP 8 String Quotes](https://peps.python.org/pep-0008/#string-quotes) acknowledges that the Python community is divided on quotation style and does not recommend single quotes over double quotes or vice versa.
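
To make the point concrete, the short sketch below (an illustration, not part of the original notebook) shows that the two quotation styles produce the same syntax tree, so the choice is purely one of style. The variable names are chosen for this example only.

```python
import ast

single = "var1 = 'Hello'"
double = 'var1 = "Hello"'

# Both spellings parse to an identical tree holding the string Hello.
same_value = ast.dump(ast.parse(single)) == ast.dump(ast.parse(double))
print(same_value)

# Python itself displays the string with single quotes.
print(repr("World"))
```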

## Black Formatter

```black``` is an opinionated formatter that applies additional formatting choices to the script. Let's return to the starting code, creating 3 identical files:

In [136]:
%%writefile script6.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script6.py


In [137]:
%%writefile script7.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script7.py


In [138]:
%%writefile script8.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Writing script8.py


There is a VSCode extension for black, which makes the black formatter available under the Format Document with... command:

<img src='images\img_012.png' alt='img_012' width='350'/>

<img src='images\img_013.png' alt='img_013' width='450'/>

Unfortunately black doesn't support import sorting and relies on isort to do this. Therefore autopep8, isort and then black should be used:

<img src='images\img_014.png' alt='img_014' width='450'/>

```black``` can be run on the script file from the command line and changes are made in place by default:

In [139]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\black.exe script7.py
else:
    !black script7.py

reformatted script7.py

All done! ✨ 🍰 ✨
1 file reformatted.


The result looks like the following:

In [140]:
if os.name == 'nt':
    !powershell type script7.py
else:
    !cat script7.py

var1 = "Hello"
var2 = "World"
import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
import pandas as pd

df = pd.DataFrame({"x": x, "y": y})
import datetime

now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
import collections

counts = collections.Counter([1, 2, 2, 2, 3, 3])
import itertools

cycle = itertools.cycle([1, 2, 3])
import sys, os

sys.sizeof(cycle)
os.environ["USERPROFILE"]
num1 = 0xABB4AB8A
import string


```black``` does not sort the imports and therefore ```autopep8``` and ```isort``` should be used before using ```black```. Running these three formatters gives the following results:

In [141]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\autopep8.exe -i script8.py
    !powershell ~\anaconda3\Scripts\isort.exe script8.py
    !powershell ~\anaconda3\Scripts\black.exe script8.py
else:
    !autopep8 -i script8.py
    !isort script8.py
    !black script8.py

Fixing C:\Users\Philip\Documents\GitHub\python-notebooks\autopep8_module\script8.py


reformatted script8.py

All done! ✨ 🍰 ✨
1 file reformatted.


In [142]:
if os.name == 'nt':
    !powershell type script8.py
else:
    !cat script8.py

import collections
import datetime
import itertools
import os
import string
import sys

import numpy as np
import pandas as pd

var1 = "Hello"
var2 = "World"
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({"x": x, "y": y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1, 2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ["USERPROFILE"]
num1 = 0xABB4AB8A


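Notice that all the quotes are now consistent. However, the opinionated ```black``` preferences (double quotes and upper-case hexadecimal digits) are inconsistent with the way the IPython interpreter and the Python documentation display strings and hexadecimal values: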

In [143]:
'Hello World!'

'Hello World!'

In [144]:
"Hello World!"

'Hello World!'

In [145]:
hex(2880744330)

'0xabb4ab8a'
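
The autopep8, isort and black sequence used in this section can be wrapped in a small helper so that the operating system check is not needed. This is a sketch under stated assumptions: it assumes all three packages are installed in the environment of the running interpreter and starts each one with ```python -m```. The names ```PIPELINE```, ```formatter_command``` and ```format_file``` are hypothetical.

```python
import subprocess
import sys

# Formatter module names with the options used in this tutorial, in order.
PIPELINE = [
    ('autopep8', ['-i']),
    ('isort', []),
    ('black', []),
]


def formatter_command(module, options, path):
    """Build a command that runs a formatter with the current interpreter."""
    return [sys.executable, '-m', module, *options, path]


def format_file(path):
    """Run each formatter on the file in turn, stopping at the first failure."""
    for module, options in PIPELINE:
        subprocess.run(formatter_command(module, options, path), check=True)


# Example call (not run here): format_file('script8.py')
```

Using ```sys.executable``` means the formatters come from the same environment as the notebook kernel, so the same call works on Windows, Linux and macOS.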

## Ruff Formatter

Ruff is a fast, configurable linter and formatter implemented in Rust. It is in the early stages of development and unfortunately is not yet preinstalled in the Anaconda base Python environment. However, as it is a small package with few dependencies, it can be installed directly in the base Python environment, for example with ```pip install ruff``` or ```conda install -c conda-forge ruff```.

Once this is done, return to the starting code:

In [164]:
%%writefile script9.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Overwriting script9.py


In [155]:
%%writefile script10.py
var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a
import string

Writing script10.py


There is a third-party extension for Ruff in VSCode:

<img src='images\img_015.png' alt='img_015' width='350'/>

This displays warnings for problems:

<img src='images\img_016.png' alt='img_016' width='450'/>

The commands Ruff: Format Imports and Ruff: Format Document are available. Ruff: Format Imports seems to be reliant on isort and does not work well unless autopep8 has previously been run. Ruff: Format Document can be used to apply opinionated formatting:

<img src='images\img_017.png' alt='img_017' width='450'/>

Ruff can also check the file from the command line. This displays the same warnings that the VSCode extension displays when a Python script file is opened:

In [156]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\ruff.exe check script10.py
else:
    !ruff check script10.py

script10.py:3:1: E402 Module level import not at top of file
script10.py:6:1: E402 Module level import not at top of file
script10.py:8:1: E402 Module level import not at top of file
script10.py:11:1: E402 Module level import not at top of file
script10.py:13:1: E402 Module level import not at top of file
script10.py:15:1: E401 Multiple imports on one line
script10.py:15:1: E402 Module level import not at top of file
script10.py:19:1: E402 Module level import not at top of file
script10.py:19:8: F401 [*] `string` imported but unused
Found 9 errors.
[*] 1 fixable
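
Each diagnostic above has the form path, line, column, rule code and message. As a rough sketch (it assumes this plain-text layout and is not an official Ruff interface), the rule codes can be tallied with ```re``` and ```collections.Counter```. The variable names are chosen for this example, and the ```output``` string simply repeats the diagnostics shown above.

```python
import re
from collections import Counter

# The diagnostics reported for script10.py, without colour codes.
output = '''\
script10.py:3:1: E402 Module level import not at top of file
script10.py:6:1: E402 Module level import not at top of file
script10.py:8:1: E402 Module level import not at top of file
script10.py:11:1: E402 Module level import not at top of file
script10.py:13:1: E402 Module level import not at top of file
script10.py:15:1: E401 Multiple imports on one line
script10.py:15:1: E402 Module level import not at top of file
script10.py:19:1: E402 Module level import not at top of file
script10.py:19:8: F401 [*] `string` imported but unused
'''

# path:line:column: CODE message
pattern = re.compile(r'^[^:]+:\d+:\d+: (?P<code>[A-Z]+\d+) ', re.MULTILINE)
counts = Counter(match['code'] for match in pattern.finditer(output))
print(counts)
```

The tally shows that nearly all of the 9 errors are E402, the misplaced imports.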

Ruff seems to have limited support for fixing these errors and only seems to be able to fix 1 error in this case:

In [157]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\ruff.exe check --fix script10.py
else:
    !ruff check --fix script10.py

script10.py:3:1: E402 Module level import not at top of file
script10.py:6:1: E402 Module level import not at top of file
script10.py:8:1: E402 Module level import not at top of file
script10.py:11:1: E402 Module level import not at top of file
script10.py:13:1: E402 Module level import not at top of file
script10.py:15:1: E401 Multiple imports on one line
script10.py:15:1: E402 Module level import not at top of file
Found 8 errors (1 fixed, 7 remaining).


The unused import statement is removed:

In [158]:
if os.name == 'nt':
    !powershell type script10.py
else:
    !cat script10.py

var1= 'Hello'
var2 ="World"
import numpy as np
x=np.array([0,1,2,3,4])
y=np.array([0,2,4, 6 ,8])
import pandas as pd
df=pd.DataFrame({'x':x,"y":y})
import datetime
now=datetime.datetime(year = 2023,month=12 ,day=1)
hour=datetime.timedelta(hours=1)
import collections
counts=collections.Counter([1,  2,2  ,2,3,3])
import itertools
cycle=itertools.cycle([1,2,3])
import sys, os
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a


The other errors are resolved by using ```autopep8``` and ```isort```:

In [159]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\autopep8.exe -i script10.py
    !powershell ~\anaconda3\Scripts\isort.exe script10.py
else:
    !autopep8 -i script10.py
    !isort script10.py

Fixing C:\Users\Philip\Documents\GitHub\python-notebooks\autopep8_module\script10.py


In [160]:
if os.name == 'nt':
    !powershell type script10.py
else:
    !cat script10.py

import collections
import datetime
import itertools
import os
import sys

import numpy as np
import pandas as pd

var1 = 'Hello'
var2 = "World"
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({'x': x, "y": y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1,  2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xabb4ab8a


Now no errors are found. This means that ruff can be used to format the file:

In [161]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\ruff.exe format script10.py
else:
    !ruff format script10.py

1 file reformatted


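By default ruff, like black, unfortunately prefers double quotations. The output below nevertheless shows single quotations, because the ```ruff.toml``` file introduced in the next step already existed when this cell was run: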

In [162]:
if os.name == 'nt':
    !powershell type script10.py
else:
    !cat script10.py

import collections
import datetime
import itertools
import os
import sys

import numpy as np
import pandas as pd

var1 = 'Hello'
var2 = 'World'
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({'x': x, 'y': y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1, 2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xABB4AB8A


However the quote style is easily configurable using a ```toml``` file:

In [153]:
%%writefile ruff.toml
[format]
quote-style = 'single'
#case = 'lower'

Writing ruff.toml


Now when the file is formatted single quotations are preferred:

In [124]:
if os.name == 'nt':
    !powershell ~\anaconda3\Scripts\ruff.exe format script10.py
else:
    !ruff format script10.py

1 file reformatted


In [163]:
if os.name == 'nt':
    !powershell type script10.py
else:
    !cat script10.py

import collections
import datetime
import itertools
import os
import sys

import numpy as np
import pandas as pd

var1 = 'Hello'
var2 = 'World'
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 2, 4, 6, 8])
df = pd.DataFrame({'x': x, 'y': y})
now = datetime.datetime(year=2023, month=12, day=1)
hour = datetime.timedelta(hours=1)
counts = collections.Counter([1, 2, 2, 2, 3, 3])
cycle = itertools.cycle([1, 2, 3])
sys.sizeof(cycle)
os.environ['USERPROFILE']
num1 = 0xABB4AB8A


With the current version there seems to be no option to set hexadecimal values to lower case.
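
If lower-case hexadecimal literals are wanted, one possible workaround is a small post-processing step run after the formatter. The sketch below is an assumption-laden illustration, not a Ruff feature: it uses the standard ```tokenize``` module to lower-case only numeric literals that start with ```0x```, leaving strings and comments untouched. The function name ```lowercase_hex_literals``` is hypothetical, and running the formatter again would undo the change.

```python
import io
import tokenize


def lowercase_hex_literals(code):
    """Return the code with hexadecimal number literals in lower case."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(code).readline):
        if tok.type == tokenize.NUMBER and tok.string[:2].lower() == '0x':
            # Lower-casing keeps the length, so token positions stay valid.
            tok = tok._replace(string=tok.string.lower())
        tokens.append(tok)
    return tokenize.untokenize(tokens)


print(lowercase_hex_literals('num1 = 0xABB4AB8A\n'))
```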