### Seven Tips To Clean Code With Python
All examples are taken from the following article
https://medium.com/analytics-vidhya/seven-tips-to-clean-code-with-python-24930d35927f

1. [String formatting with f-strings](#string)
2. [Platform-independent directory delimiters](#delimiters)
3. [Variable unpacking](#unpacking)
4. [`.get` instead of `[key]` for dictionary iterations](#dict_iter)
5. [Loop two iterators with the zip function](#loop_two)
6. [List comprehensions](#comp)
7. [Multiple assignment with `*` and `**`](#multi)

[Putting it all together](#together)


<a id='string'></a>
## 1. String formatting with f-strings

Python 3.6 introduced a new way of formatting strings: the formatted string literal, or f-string. String formatting in Python has come a long way; below is a walkthrough.

In [30]:
purpose = "string formatting"
percent_sign = "% operator"
dot_format_method = ".format() method"
string_literal = "f-string"
quantity, revenue = 100, 25
d = {"costs": 300}
list_operations = ['slicing and other things', 'indexing', 'key references']
limitations = ["be empty", "contain comments", "contain backslashes"]

In [31]:
print("Before we used the %s for %s." % (percent_sign, purpose))

Before we used the % operator for string formatting.


In [32]:
print("Since Python 2.6 we are using the {method} for {0}, enabling us {1:^8s}."\
      .format(purpose, "to do fun things", method=dot_format_method))

Since Python 2.6 we are using the .format() method for string formatting, enabling us to do fun things.


In [7]:
print(f"But now with Python 3.6 come the {string_literal}!")

But now with Python 3.6 come the f-string!


In [8]:
print(f"With f-strings we can do arithmetic expressions: price €{quantity / revenue} per unit.")

With f-strings we can do arithmetic expressions: price €4.0 per unit.


In [11]:
print(f"We can do {list_operations[0][0:7]}, {list_operations[1]} and {list_operations[2]}:\
cost €{d['costs'] / quantity} per unit.")

We can do slicing, indexing and key references:cost €3.0 per unit.


In [12]:
print(f"f-strings have three limitations: {*limitations,}.")

f-strings have three limitations: ('be empty', 'contain comments', 'contain backslashes').


In [13]:
limitations

['be empty', 'contain comments', 'contain backslashes']

F-strings evaluate everything within { curly brackets } as an expression, so besides simple arithmetic we can also call functions and methods!

In [16]:
s = "This string"
print(f"The length of |{s}| is {len(s)} characters and contains {len(s.split())} tokens.")

The length of |This string| is 11 characters and contains 2 tokens.
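
Expressions inside the curly brackets can also be followed by a format specifier, just like with `.format()`. The snippet below is a minimal sketch; the variable names and values are made up for illustration.

```python
# Hypothetical values, chosen only to illustrate format specifiers.
unit_price = 3.14159
share = 0.256
label = "total"

# A format specifier follows the expression after a colon.
rounded = f"{unit_price:.2f}"   # '3.14': two decimals
percent = f"{share:.1%}"        # '25.6%': percentage with one decimal
padded = f"|{label:>10}|"       # '|     total|': right-aligned, width 10
quoted = f"{label!r}"           # "'total'": !r applies repr() to the value

print(rounded, percent, padded, quoted)
```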


<a id='delimiters'></a>
## 2. Platform-independent directory delimiters

Making your Python code as re-usable as possible should be one of your main concerns. But what if you’re working on a Unix platform and your colleague is working on Windows?

The path delimiter on Windows is `\`, but on Linux or macOS it is `/`. Avoid dealing with these nuances by using the `os` module from the standard library:

In [17]:
import os

directory = os.path.join('main_dir', 'sub_dir')
file = 'example.json'

print(os.path.join(directory, file))

main_dir\sub_dir\example.json
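
The standard library's `pathlib` module is an object-oriented alternative to `os.path.join`. The sketch below uses the "pure" path classes so that both flavours can be shown on any operating system; in everyday code you would more likely use `pathlib.Path`, which picks the flavour of the current platform.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The "pure" classes only manipulate path strings and never touch the
# file system, so both flavours work regardless of the platform.
posix_path = PurePosixPath('main_dir') / 'sub_dir' / 'example.json'
windows_path = PureWindowsPath('main_dir') / 'sub_dir' / 'example.json'

print(posix_path)    # main_dir/sub_dir/example.json
print(windows_path)  # main_dir\sub_dir\example.json
```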


<a id='unpacking'></a>
## 3. Variable unpacking

Variable unpacking is probably most often used for functions that return multiple values, such as in the example below.

In [18]:
def two_strings():
    return 'first', 'second'
x, y = two_strings()

But it is also useful for data types that contain multiple items. The one thing to keep in mind is that a function returning several comma-separated values actually returns a single `tuple`, which is then unpacked into the individual names.

In [25]:
name, age, job = ['Pietje Puk', 27, 'Data Scientist']  # unpack lists & tuples
name

'Pietje Puk'

In [29]:
a, b, c = '123'  # unpack strings
print(a)
print(b)
print(c)

1
2
3


In [30]:
a, b, c = (i ** 2 for i in range(2, 5))  # unpack generator
print(a)
print(b)
print(c)

4
9
16


In [32]:
person = {'name': 'Pietje Puk', 'age': 27, 'profession': 'Data Scientist'}
a, b, c = person   # unpacking dictionary keys
print(a)
print(b)
print(c)

name
age
profession


In [33]:
a, b, c = person.values()   # unpacking dictionary values
print(a)
print(b)
print(c)

Pietje Puk
27
Data Scientist


In [34]:
a, b, c = person.items()   # unpacking (key-value pairs)
print(a)
print(b)
print(c)

('name', 'Pietje Puk')
('age', 27)
('profession', 'Data Scientist')


The `_` name is a throwaway variable: by convention it marks a value that you’re not interested in and won’t be doing anything with, for instance in the following case:

In [35]:
a, b, _ = 'important', 'second important', 'not important'

In [38]:
person = {'name': 'Pietje Puk', 'age': 27, 'profession': 'Data Scientist'}
_, age, _ = person.items()
age

('age', 27)
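
Two details are worth adding as a sketch: unpacking also works in the target of a `for` loop, and the number of names on the left has to match the number of items on the right. The `person` dictionary is the one from the cells above.

```python
person = {'name': 'Pietje Puk', 'age': 27, 'profession': 'Data Scientist'}

# Each item is a (key, value) tuple, which the for clause unpacks.
lines = [f"{key}: {value}" for key, value in person.items()]
print(lines)

# Unpacking requires as many names as there are items.
try:
    a, b = person  # three keys, but only two names
except ValueError as error:
    message = str(error)
    print(message)
```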

## 4. `.get` instead of `[key]` for dictionary iterations
<a id='dict_iter'></a>

Dictionaries are a great data type for storing `values` under a lookup field known as the `key`, in so-called `key-value` pairs. When reading values from a dictionary, you can avoid `KeyError` exceptions by using the `.get` method instead of the more traditional `[key]` lookup. The `.get` method returns a default value (`None`, unless you pass one as the second argument) if the key is not present.

In [39]:
persons = [
  {'first_name': 'Louis', 'last_name': 'de Bruijn', 'age': 26, 'profession': 'Data Scientist', 'company': 'Data Analytics, an Ortec Finance company'},
  {'first_name': 'Pietje', 'last_name': 'Puk', 'age': 18, 'profession': 'student', 'university': 'University of Groningen'},
] # note that the first dictionary contains a key 'company', but the second dictionary does not contain the key 'company'

In [41]:
for cnt, person in enumerate(persons):
  print("Person {0}'s occupation is {1} at {2}.".format(cnt, person['profession'], person['company']))

Person 0's occupation is Data Scientist at Data Analytics, an Ortec Finance company.


KeyError: 'company'

In [45]:
for cnt, person in enumerate(persons):
  print("Person {0}'s occupation is {1} at {2}."\
        .format(cnt, person.get('profession', 'unknown'), person.get('company', 'undefined')))

Person 0's occupation is Data Scientist at Data Analytics, an Ortec Finance company.
Person 1's occupation is student at undefined.


In the first loop, our program stops running because of the `KeyError` exception. In the second loop, it continues, using the default `'undefined'` string that is passed as the second argument of the `.get` method.

Whenever a sensible default value exists for a key that may be missing, use the `.get` method to keep your program from stopping prematurely, especially in production.
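
As a small sketch of how the default behaves (the dictionary below is a shortened, hypothetical version of the second person):

```python
person = {'first_name': 'Pietje', 'profession': 'student'}

# Without a second argument, .get returns None for a missing key.
company = person.get('company')

# With a second argument, that value is returned instead.
company_or_default = person.get('company', 'undefined')

# .get never modifies the dictionary: the key is still absent afterwards.
still_missing = 'company' not in person

print(company, company_or_default, still_missing)  # None undefined True
```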

## 5. Loop two iterators with the zip function
<a id='loop_two'></a>

The `zip` function can save countless (nested) loops. It is mostly used for iterating over two iterables at the same time, pairing the elements that share the same index.

In [4]:
%time
list1 = [102, 306, 918, 2754]
list2 = [1, 3, 9, 27]

averages = []
for idx1, el1 in enumerate(list1):  # first for-loop
  for idx2, el2 in enumerate(list2):  # second for-loop
    if idx1 == idx2:  # check whether the indexes of the first and second for-loop match
      y_intercept = el1/el2
      averages.append(y_intercept)
averages

Wall time: 0 ns


[102.0, 102.0, 102.0, 102.0]

In [5]:
%time
averages = []
for el1, el2 in zip(list1, list2):  # loop through both lists using zip
  y_intercept = el1/el2
  averages.append(y_intercept)
averages

Wall time: 0 ns


[102.0, 102.0, 102.0, 102.0]

You can do this with any iterable, including generators. For instance, you can create a dictionary from two separate lists without writing a loop:

In [6]:
keys = ['a', 'b', 'c']
values = [1, 2, 3]
dictionary = dict(zip(keys, values))
dictionary

{'a': 1, 'b': 2, 'c': 3}

Later on, we’ll introduce the `*` operator, which in combination with the `zip` function is very powerful!
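
One caveat when the inputs differ in length: `zip` stops at the shortest one. The sketch below contrasts this with `itertools.zip_longest`; `short_list` is a hypothetical input added for illustration.

```python
from itertools import zip_longest

list1 = [102, 306, 918, 2754]
short_list = [1, 3]  # hypothetical, deliberately shorter input

# zip stops as soon as the shortest iterable is exhausted.
truncated = list(zip(list1, short_list))
print(truncated)  # [(102, 1), (306, 3)]

# zip_longest pads the shorter iterable with a fill value instead.
padded = list(zip_longest(list1, short_list, fillvalue=None))
print(padded)  # [(102, 1), (306, 3), (918, None), (2754, None)]
```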

## 6. List comprehensions
<a id='comp'></a>

List, set, and dictionary comprehensions are ways to code more concisely: do the same in fewer lines of code. (There is no tuple comprehension; parentheses create a generator expression.)

In [8]:
# lists: [element for element in iterable]
elements = []
for i in range(1, 6):
  elements.append(i)
elements

[1, 2, 3, 4, 5]

In [9]:
elements = [i for i in range(1, 6)]
elements

[1, 2, 3, 4, 5]

In [10]:
# dictionaries: {key: value for element in iterable}
alphabet = 'abcde'

In [14]:
dictionary = {}
for i in range(1, 6):
  dictionary[i] = alphabet[i - 1]
dictionary

{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

In [17]:
dictionary = {i: alphabet[i - 1] for i in range(1, 6)}
dictionary

{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

This removes explicit loops and results in a cleaner codebase.
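
Comprehensions can also filter elements with a trailing `if` clause, and the same syntax exists for sets and generators. A minimal sketch with made-up numbers:

```python
numbers = range(1, 11)

# A trailing if clause keeps only the elements that pass the test.
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print(even_squares)  # [4, 16, 36, 64, 100]

# Curly braces without a key: value pair build a set.
remainders = {n % 3 for n in numbers}
print(remainders)  # {0, 1, 2}

# Parentheses create a lazy generator expression, not a tuple.
total = sum(n for n in numbers)
print(total)  # 55
```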

## 7. Multiple assignment with `*` and `**`
<a id='multi'></a>

In Python 3, the `*` prefix operator was added to the multiple assignment syntax, allowing us to collect the remaining items of an iterable in a list.

In [18]:
first, second, *rest = [1, 2, 3, 4, 5, 6]

In [19]:
first

1

In [20]:
second

2

In [22]:
rest

[3, 4, 5, 6]
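
The starred name does not have to come last, and it always receives a list, even when nothing is left over. A short sketch:

```python
# The starred name can sit in the middle of the targets.
first, *middle, last = [1, 2, 3, 4, 5, 6]
print(middle)  # [2, 3, 4, 5]

# Whatever the source type, the starred name receives a list.
head, *tail = 'ab'
print(tail)  # ['b']

# If no items remain, the list is simply empty.
only, *nothing = [1]
print(nothing)  # []
```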

The `**` operator does something similar, but with keyword arguments. The `**` operator allows us to take a dictionary of key-value pairs and unpack it into keyword arguments in a function call.


In [23]:
persons = [
  {'first_name': 'Louis', 'last_name': 'de Bruijn', 'age': 26, 'profession': 'Data Scientist'},
  {'first_name': 'Pietje', 'last_name': 'Puk', 'age': 18, 'profession': 'student'},
]

In [24]:
for cnt, person in enumerate(persons):
  print("Person {0}'s occupation is {profession}.".format(cnt, **person))

Person 0's occupation is Data Scientist.
Person 1's occupation is student.
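
The same unpacking works for calls to your own functions, as long as the dictionary keys match the parameter names. The `describe` function below is a hypothetical helper, written only to illustrate this.

```python
def describe(first_name, last_name, profession, age=None):
    """Hypothetical helper, used only to illustrate ** in a function call."""
    return f"{first_name} {last_name} works as a {profession}."

person = {'first_name': 'Pietje', 'last_name': 'Puk', 'age': 18, 'profession': 'student'}

# Every key is passed as a keyword argument with the same name.
sentence = describe(**person)
print(sentence)  # Pietje Puk works as a student.

# A key without a matching parameter raises a TypeError.
try:
    describe(**person, city='Groningen')
    unexpected_key_rejected = False
except TypeError:
    unexpected_key_rejected = True
print(unexpected_key_rejected)  # True
```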


## Putting it all together
<a id='together'></a>

The `zip` function, `*` and `**` operators and list comprehension are powerful on their own, but they become especially interesting when combined. Below are three examples of a combination of these operators, functions, and comprehensions.

In [25]:
# Example 1: shuffle data to ensure random class distribution in train/test split
import random
documents = ["positive tweet message", "negative tweet message"]
labels = ["pos", "neg"]

tuples = [(doc, label) for doc, label in zip(documents, labels)]
random.shuffle(tuples)
X, Y = zip(*tuples)

In [27]:
# Example 2: merging two dictionaries
first_dictionary = {"A": 1, "B": 2}
second_dictionary = {"C": 3, "D": 4}
merged_dictionary = {**first_dictionary, **second_dictionary}
merged_dictionary

{'A': 1, 'B': 2, 'C': 3, 'D': 4}

In [29]:
# Example 3: dropping unnecessary function variables
def return_stuff():
  """Example function that returns data."""
  return "This", "is", "interesting", "This", "is", "not"
a, b, c, *_ = return_stuff()

In the first example, we need to keep the documents and labels, which come from two separate lists, together by their index in tuples of `(document, label)`. The `zip` function provides the perfect solution, and a list comprehension ties this all together in a single expression. `random.shuffle` then shuffles the tuple pairs in place. Finally, we need to unpack the list of tuples `[(first_doc, first_label), (second_doc, second_label)]`, and use the `*` operator together with `zip` for this. Variable `X` contains the documents and variable `Y` the labels, with corresponding indexes.
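
To check that the pairing survives the shuffle, the round trip can be sketched as follows. The third document and the fixed seed are assumptions added for illustration; they are not part of the original example.

```python
import random

# The third document and label are hypothetical additions.
documents = ["positive tweet message", "negative tweet message", "neutral tweet message"]
labels = ["pos", "neg", "neu"]

pairs = list(zip(documents, labels))  # [(doc, label), ...]
random.Random(42).shuffle(pairs)      # seeded only to make the run repeatable
X, Y = zip(*pairs)                    # "unzip" back into two tuples

# Every document still sits at the same index as its own label.
pairing_intact = dict(zip(X, Y)) == dict(zip(documents, labels))
print(pairing_intact)  # True
```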

In the second example, we’re able to merge two dictionaries in a single expression, without having to instantiate a new empty dictionary and iterate over the first and second dictionary separately.
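
When both dictionaries contain the same key, the value from the dictionary unpacked last wins. A minimal sketch with hypothetical `defaults` and `overrides` dictionaries:

```python
defaults = {"A": 1, "B": 2}
overrides = {"B": 20, "C": 3}  # "B" appears in both dictionaries

# Later unpackings overwrite earlier ones for duplicate keys.
merged = {**defaults, **overrides}
print(merged)  # {'A': 1, 'B': 20, 'C': 3}

# The merge builds a new dictionary; the inputs are left untouched.
print(defaults)  # {'A': 1, 'B': 2}
```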


In the third example, we call a function of which we only need part of the returned values, and discard the rest. The `*_` target collects the remaining items of the iterable, without having to explicitly define the number of items. This is useful when you don’t know how many items or returned variables there are!