# Code Like a Pythonista: Idiomatic Python

Inspired by the book Writing Idiomatic Python by Jeff Knupp.

### What?
● Idioms in a programming language are a sort of
lingua franca (a bridge or common language) that lets future readers know exactly
what we’re trying to accomplish.
 
● We may document our code extensively, write
exhaustive unit tests, and hold code reviews three
times a day, but the fact remains: when someone
else needs to make changes, the code is king.

**Always code as if the guy who ends up maintaining your code
will be a violent psychopath who knows where you live.**
-- John Woods, comp.lang.c++

### Why?
● Helps maintainability
● Easier to spot bugs
● Easier to write and read
● Often faster
● Often shorter

## Avoid placing multiple statements on a single line

In [17]:
#Bad
N = 100; M = 20; print(N + M);


120


In [18]:
#Good
N = 100
M = 20
print(N + M)

120


## Avoid comparing directly to True, False, or None

In [2]:
#BAD
x = bool()
x = True
if x == True:
    print("BAD")

BAD


In [3]:
#GOOD
x = bool()
x = True
if x:
    print("GOOD")

GOOD
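
The heading also mentions None, which the cells above do not show. A minimal sketch of that case, assuming a hypothetical helper named `describe` (not part of the original notebook): because None is a singleton, the usual check is identity (`is None`), which keeps falsy values such as 0 from being mistaken for "no value".

```python
def describe(value=None):
    # `describe` is a hypothetical helper, used only for illustration.
    # Check for None by identity rather than with == or plain truthiness,
    # so that falsy values such as 0 still count as real values.
    if value is None:
        return 'no value given'
    return 'got {!r}'.format(value)

print(describe())
print(describe(0))
```

A plain `if not value:` would treat 0 and None the same way, which is why the identity check is preferred when None means "missing".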


## Avoid repeating variable name in compound if statement

In [4]:
#Bad
is_generic_name = False
name = 'Tom'
if name == 'Tom' or name == 'Dick' or name == 'Harry':
    is_generic_name = True
print(is_generic_name)


True


In [6]:
#Good
name = 'Tom'
is_generic_name = name in ('Tom', 'Dick', 'Harry')
print(is_generic_name)

True


## Avoid placing conditional branch code on the same line as the colon

**Use indentation to indicate scope, as you already do everywhere else in Python.**

In [1]:
#Bad
name = 'Jorge'
address = 'Chicago, IL'
if name: print(name); print(address)


Jorge
Chicago, IL


In [22]:
#Good
name = 'Jorge'
address = 'Chicago, IL'
if name:
    print(name)
    print(address)

Jorge
Chicago, IL


## Use the in keyword to iterate over an iterable

In [23]:
#Bad
my_list = ['Larry', 'Moe', 'Curly']
index = 0
while index < len(my_list):
    print(my_list[index])
    index += 1


Larry
Moe
Curly


In [24]:
#Good
my_list = ['Larry', 'Moe', 'Curly']
for element in my_list:
    print(element)

Larry
Moe
Curly


## Use the “enumerate” function in loops instead of creating an “index” variable


In [25]:
#Bad
my_container = ['Larry', 'Moe', 'Curly']
index = 0
for element in my_container:
    print('{} {}'.format(index, element))
    index += 1


0 Larry
1 Moe
2 Curly


In [26]:
#Good
my_container = ['Larry', 'Moe', 'Curly']
for index, element in enumerate(my_container):
    print('{} {}'.format(index, element))

0 Larry
1 Moe
2 Curly
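
enumerate also accepts a start argument, which helps when the numbering should not begin at 0. A small sketch reusing the same list; the names `numbered` and `position` are only illustrative.

```python
my_container = ['Larry', 'Moe', 'Curly']

# start=1 makes the counter begin at 1 instead of the default 0
numbered = ['{} {}'.format(position, element)
            for position, element in enumerate(my_container, start=1)]
print(numbered)
```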


## Use list comprehension to create a transformed version of an existing list

* Listcomps are clear & concise, up to a point. 
* You can have multiple for-loops and if-conditions in a listcomp
* If the conditions are complex, use regular for loops instead.
* Applying the Zen of Python, choose the more readable way.

In [27]:
#Bad
original_list = range(10)
new_list = list()
for element in original_list:
    if element % 2:
        new_list.append(element + 5)
print(new_list)

[6, 8, 10, 12, 14]


In [29]:
#Good
original_list = range(10)
new_list = [element + 5 for element in original_list if element % 2]
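
As a sketch of the "multiple for-loops and if-conditions" point above, the following listcomp flattens a nested list and filters it in one expression. The data and the names `rows` and `even_values` are made up for illustration.

```python
rows = [[1, 2, 3], [4, 5, 6]]

# The clauses read left to right, in the same order as the
# equivalent nested for loops: outer loop first, then inner, then filter.
even_values = [value for row in rows for value in row if value % 2 == 0]
print(even_values)
```

Beyond two for clauses this tends to get hard to read, at which point a regular loop is probably the clearer choice.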

## Generator Expressions

* Generator expressions ("genexps") are just like list comprehensions, except that where listcomps are greedy, generator expressions are lazy.
* Listcomps compute the entire result list all at once, as a list. 
* Generator expressions compute one value at a time, when needed, as individual values. 
* This is especially useful for long sequences where the computed list is just an intermediate step and not the final result.

* The difference in syntax is that listcomps have square brackets, but generator expressions don't. 
* Generator expressions sometimes do require enclosing parentheses though, so you should always use them.
* Rule of thumb:
 * Use a list comprehension when a computed list is the desired end result.
 * Use a generator expression when the computed list is just an intermediate step.

In [32]:
# For example, if we were summing the squares of several billion integers, we'd run out of memory with list comprehensions!

#total = sum(num * num for num in range(1, 1000000000))  - DO NOT RUN

#Example:

month_codes = dict((fn(i+1), code) for i, code in enumerate('FGHJKMNQUVXZ') for fn in (int, str))

print(month_codes)

{'6': 'M', 1: 'F', 2: 'G', 3: 'H', 4: 'J', 5: 'K', 6: 'M', 7: 'N', '2': 'G', 9: 'U', 10: 'V', 11: 'X', 12: 'Z', '5': 'K', '9': 'U', '4': 'J', '1': 'F', '10': 'V', '12': 'Z', '7': 'N', 8: 'Q', '11': 'X', '3': 'H', '8': 'Q'}
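
A smaller, runnable sketch of the greedy versus lazy difference, using sys.getsizeof as a rough indicator of memory use (the exact sizes depend on the Python implementation, so only the comparison is shown). The variable names are illustrative.

```python
import sys

squares_list = [num * num for num in range(100000)]   # builds every element now
squares_gen = (num * num for num in range(100000))    # builds nothing yet

# The list stores all of its elements; the generator stores only its state.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

# Values are produced one at a time while sum() consumes them.
total = sum(squares_gen)
print(total == sum(squares_list))

# A generator can be consumed only once: a second pass yields nothing.
second_pass = sum(squares_gen)
print(second_pass)
```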


## Generator expression

* Use a generator expression instead of a function if:
 * You only need the function in one place
 * You are just going to iterate once through the values

In [63]:
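def grep(lines, searchtext):
    for line in lines:
        if searchtext in line:
            yield line

# The same filter written as a one-off generator expression.
# lines must be a sequence of lines, not a single string
# (iterating over a string yields its characters).
lines = "line 1\nline 2\nline 3".splitlines()
searchtext = '2'
matching_lines = (line for line in lines if searchtext in line)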

## Generators - complex functions

* The yield keyword turns a function into a generator. 
* When you call a generator function, instead of running the code immediately Python returns a generator object.
* The generator object is an iterator; it has a `__next__` method, which the built-in next() function calls.

**This is how a for loop really works. Python looks at the sequence supplied after the in keyword. 
If it's a simple container (such as a list, tuple, dictionary, set, or user-defined container) Python converts it into an iterator. If it's already an iterator, Python uses it directly.**
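
The description above can be sketched by hand with the built-in iter() and next() functions. This is only an approximation of what a for loop does internally; the names `iterator` and `collected` are illustrative.

```python
my_list = ['Larry', 'Moe', 'Curly']

iterator = iter(my_list)           # obtain an iterator from the container
collected = []
while True:
    try:
        element = next(iterator)   # ask the iterator for its next value
    except StopIteration:          # raised once the iterator is exhausted
        break
    collected.append(element)      # this is the "loop body"

print(collected)
```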


In [30]:
def my_range_generator(stop):
    value = 0
    while value < stop:
        yield value
        value += 1

for i in my_range_generator(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


## Use the default parameter of dict.get to provide default values


In [37]:
#Bad
class_grades = {'id':1001, 'grade':5}
log_severity = None
if 'severity' in class_grades:
    log_severity = class_grades['severity']
else:
    log_severity = 'Info'
print(log_severity)

Info


In [39]:
#Good
log_severity = class_grades.get('severity', 'Info')
print(log_severity)

Info
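
Another common use of the default parameter is counting, where a missing key should behave like 0. A minimal sketch with made-up data; `words` and `counts` are illustrative names.

```python
words = ['spam', 'egg', 'spam']
counts = {}
for word in words:
    # get() returns 0 the first time a word is seen, so no
    # separate "if word in counts" branch is needed.
    counts[word] = counts.get(word, 0) + 1
print(counts)
```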


## Use dict comprehension to build a dict clearly and efficiently

Filter a list to construct a dictionary!
(Recall that in list comprehension we filter a list to create another list)


In [41]:
#Bad
users_list = [('Jim','jim@a.com'),('Kim',''),('Frank','frank@a.com')]
user_with_email = {}
for user in users_list:
    if user[1]:
        user_with_email[user[0]] = user[1]
print(user_with_email)

{'Frank': 'frank@a.com', 'Jim': 'jim@a.com'}


In [43]:
#Good
users_list = [('Jim','jim@a.com'),('Kim',''),('Frank','frank@a.com')]
user_with_email = {user[0]: user[1] for user in users_list if user[1]}
print(user_with_email)

{'Frank': 'frank@a.com', 'Jim': 'jim@a.com'}


## Use set comprehension to generate sets concisely

* The syntax is identical to list comprehension, except for the enclosing characters
* A set behaves like a dictionary with keys but no values

In [48]:
# Bad
users = ['Jim Winter', 'Thomas Winter','Thomas Fall']
users_first_names = set()
for user in users:
    users_first_names.add(user.split()[0])
    
print(users_first_names)

{'Jim', 'Thomas'}


In [50]:
# Good
users = ['Jim Winter', 'Thomas Winter','Thomas Fall']
users_first_names = {user.split()[0] for user in users}

print(users_first_names)

{'Jim', 'Thomas'}


## Use sets to eliminate duplicate entries from Iterable containers

* Note that most often you do not need to convert the set back to a list
* A set is an Iterable just like a list, so you can use it in for loops, list comprehensions, etc.

In [52]:
#Bad
employee_surnames = ('jim','kim','jim','alec')
unique_surnames = []
for surname in employee_surnames:
    if surname not in unique_surnames:
        unique_surnames.append(surname)
print(unique_surnames)

['jim', 'kim', 'alec']


In [51]:
#Good
employee_surnames = ('jim','kim','jim','alec')
unique_surnames = set(employee_surnames)
print(unique_surnames)

{'kim', 'alec', 'jim'}
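
Note that a set does not keep the original order, as the output above shows. If the order of first appearance matters, one option is dict.fromkeys. This sketch assumes Python 3.7 or later, where dictionaries preserve insertion order; `unique_in_order` is an illustrative name.

```python
employee_surnames = ('jim', 'kim', 'jim', 'alec')

# Assumes Python 3.7+: dict keys are unique and keep insertion order,
# so duplicates are dropped while the first-seen order is preserved.
unique_in_order = list(dict.fromkeys(employee_surnames))
print(unique_in_order)
```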


## Understand and use the "set" mathematical set operations

* Union: A | B
* Intersection: A & B
* Difference: A - B (Note: order matters here. A - B is not necessarily the same as B - A).
* Symmetric Difference: A ^ B

In [55]:
# Bad
most_active_users = ('alec','steve','jim','fred')
most_popular_users = ('sam','steve','jim')
popular_and_active_users = []
for user in most_active_users:
    if user in most_popular_users:
        popular_and_active_users.append(user)
print(popular_and_active_users)

['steve', 'jim']


In [53]:
# Good
most_active_users = ('alec','steve','jim','fred')
most_popular_users = ('sam','steve','jim')
popular_and_active_users = set(most_active_users) & set(most_popular_users)
print(popular_and_active_users)

{'steve', 'jim'}
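
The cells above only show intersection. As a sketch, here are all four operators on the same data; the shortened names `active` and `popular` are illustrative, and the results are sorted only to make the printed order predictable.

```python
active = {'alec', 'steve', 'jim', 'fred'}
popular = {'sam', 'steve', 'jim'}

either = active | popular         # union: in at least one set
both = active & popular           # intersection: in both sets
only_active = active - popular    # difference: in active but not in popular
only_popular = popular - active   # difference is not symmetric
exactly_one = active ^ popular    # symmetric difference: in exactly one set

for result in (either, both, only_active, only_popular, exactly_one):
    print(sorted(result))
```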


## Chain string functions to make a simple series of transformations more clear

Too much chaining can make your code harder to follow.
“No more than three chained functions” is a good rule of thumb.


In [59]:
#Bad
book_info = ' The Three Musketeers: Alexandre Dumas'
formatted_book_info = book_info.strip()
formatted_book_info = formatted_book_info.upper()
formatted_book_info = formatted_book_info.replace(':', ' by')
print(formatted_book_info)

THE THREE MUSKETEERS by ALEXANDRE DUMAS


In [57]:
#Good
book_info = ' The Three Musketeers: Alexandre Dumas'
formatted_book_info = book_info.strip().upper().replace(':', ' by')
print(formatted_book_info)

THE THREE MUSKETEERS by ALEXANDRE DUMAS


## Prefer the format function for formatting strings


In [71]:
#Bad
def get_formatted_user_info_worst(name,age,sex):
    # Tedious to type and prone to conversion errors
    return 'Name: ' + name + ', Age: ' + str(age) + ', Sex: ' + sex
print(get_formatted_user_info_worst('James',30,'M'))

Name: James, Age: 30, Sex: M


In [72]:
#Bad
def get_formatted_user_info_slightly_better(name,age,sex):
    # No visible connection between the format string placeholders
    # and values to use. Also, why do I have to know the type?
    return 'Name: %s, Age: %i, Sex: %c' % (name, age, sex)

print(get_formatted_user_info_slightly_better('James',30,'M'))

Name: James, Age: 30, Sex: M


In [73]:
#Good
def get_formatted_user_info(name,age,sex):
    # Clear and concise. At a glance I can tell exactly what
    # the output should be.
    output = 'Name: {name}, Age: {age}, Sex: {sex}'.format(name=name, age=age, sex=sex)
    return output

print(get_formatted_user_info('James',30,'M'))

Name: James, Age: 30, Sex: M
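
Named placeholders can also carry a format specification after a colon, which covers alignment and number formatting without manual conversion. A small sketch with made-up values; `row` is an illustrative name.

```python
# {name:<10}   left-aligns the value in a field 10 characters wide
# {price:>8.2f} right-aligns a float in 8 characters with 2 decimal places
row = '{name:<10}|{price:>8.2f}'.format(name='tea', price=3.5)
print(row)
```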


## Use ''.join when creating a single string from list elements


In [70]:
#Bad
result_list = ['True', 'False', 'File not found']
result_string = ''
for result in result_list:
    result_string += result
print(result_string)

TrueFalseFile not found


In [68]:
#Good
result_list = ['True', 'False', 'File not found']
result_string = ''.join(result_list)
print(result_string)

TrueFalseFile not found
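
The string that join is called on acts as the separator, and every element must already be a string. A short sketch combining both points; `values` and `csv_line` are illustrative names.

```python
values = [1, 2, 3]

# join only accepts strings, so convert each element first;
# ', ' is placed between the elements, not after the last one.
csv_line = ', '.join(str(value) for value in values)
print(csv_line)
```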


## Use tuples to unpack data for multiple assignment


In [67]:
#Bad
l = ['dog', 'Fido', 10]
animal = l[0]
name = l[1]
age = l[2]
output = ('{name} the {animal} is {age} years old'.format(animal=animal, name=name, age=age))
print(output)

Fido the dog is 10 years old


In [66]:
#Good
l = ['dog', 'Fido', 10]
(animal, name, age) = l
output = ('{name} the {animal} is {age} years old'.format(animal=animal, name=name, age=age))
print(output)

Fido the dog is 10 years old
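
Unpacking also works directly in a for statement, and in Python 3 a starred name collects whatever is left over. A sketch with made-up data; `pets`, `lines`, `first` and `rest` are illustrative names.

```python
pets = [('dog', 'Fido', 10), ('cat', 'Tom', 3)]

lines = []
for animal, name, age in pets:    # each tuple is unpacked per iteration
    lines.append('{name} the {animal} is {age} years old'.format(
        animal=animal, name=name, age=age))
print(lines)

# A starred target receives the remaining items as a list.
first, *rest = ['dog', 'Fido', 10]
print(first, rest)
```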


## Avoid using a temporary variable when performing a swap of two values


In [64]:
#Bad
foo = 'Foo'
bar = 'Bar'
temp = foo
foo = bar
bar = temp

In [65]:
#Good
foo = 'Foo'
bar = 'Bar'
(foo, bar) = (bar, foo)