# The Wild World of Python

In [2]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Don't Repeat Yourself


If you find yourself copy-and-pasting code:

<Font size="+5">Don't!</Font>

Consider writing a function, class, or decorator instead.
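
As a minimal sketch of the idea, suppose the same cleanup expression had been pasted next to every name in a script. The `normalize` helper and the sample data below are hypothetical, made up only for illustration:

```python
# Hypothetical example: one function replaces a cleanup expression
# that would otherwise be copy-pasted for every value.
def normalize(name):
    """Return the name stripped of surrounding whitespace and title-cased."""
    return name.strip().title()

raw_names = ['  alice ', 'BOB', ' charlie']
clean_names = [normalize(name) for name in raw_names]
print(clean_names)  # ['Alice', 'Bob', 'Charlie']
```

If the cleanup rule changes later, there is only one place to edit.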

## Document, but Document the right things

Document design and intent, not implementation.

Bad:

In [None]:
# loop over movies in the list
for m in ml:
    # add genre to genre counts
    g.update(m.genre)

Better:

In [None]:
# takes a list of movie objects and returns a Counter of genre objects
def count_genres(movie_list):
    ...
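
One way the body of `count_genres` might be filled in is sketched below. It assumes each movie object has a `genre` attribute holding a list of genres (as the `g.update(m.genre)` call in the bad example suggests); the `Movie` namedtuple and the sample movies are hypothetical stand-ins.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for the movie objects; only `genre` matters here.
Movie = namedtuple('Movie', 'title genre')

def count_genres(movie_list):
    # takes a list of movie objects and returns a Counter of genres
    genre_counts = Counter()
    for movie in movie_list:
        # assumes movie.genre is a list of genre names, not a single string
        genre_counts.update(movie.genre)
    return genre_counts

movies = [Movie('Alien', ['horror', 'sci-fi']),
          Movie('Arrival', ['sci-fi', 'drama'])]
print(count_genres(movies))
```

With descriptive names, the per-line comments from the bad version are no longer needed.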

## Docstrings are even nicer

Docstrings are string literals, usually triple-quoted, placed as the first statement after a `def` or `class` line to describe what that function or class does.

Many tools expect and use this feature, e.g. Jupyter notebooks, which display the docstring as inline help.

In [2]:
def f(x):
    """
    this function multiplies by 2
    """
    return 2*x

In [3]:
f.__doc__

'\n    this function multiplies by 2\n    '

In [None]:
help(f)  # prints the docstring; calling f() with no argument would raise a TypeError

## Sometimes descriptive names are all you need

In [6]:
def double(number):
    return 2*number

## Python Truthiness

![](https://media.giphy.com/media/12QgPOiTa7Ab04/giphy.gif)

Rely on truthiness

In [7]:
numlist = []

if numlist:
    x = 1

# instead of

if len(numlist)>0:
    x = 1

In [4]:
bool([1])

True

In [9]:
bool(0)

False

In [10]:
bool(1)

True

In [11]:
bool(14)

True

In [12]:
bool('')

False

In [13]:
bool('false')

True

In [5]:
class weirdnumber(int):
    def __bool__(self):
        return (self != 5)
    
a = weirdnumber(5)
b = weirdnumber(0)

print(bool(a))
print(bool(b))

False
True
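
A class does not have to define `__bool__` to take part in truthiness. If `__bool__` is missing, Python falls back to `__len__`, which is why empty containers are falsy. The `Playlist` class below is a hypothetical example of that fallback:

```python
class Playlist:
    """Hypothetical container that defines __len__ but not __bool__."""
    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        return len(self.songs)

empty = Playlist([])
full = Playlist(['song a', 'song b'])

print(bool(empty))  # False, because len(empty) == 0
print(bool(full))   # True, because len(full) > 0
```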


## Think about your loops

`for` loops are great for almost any iterable. But there are some situations where they don't work as well.

In [3]:
l = 'mary had a little lamb'.split(' ')
print(l)


for i in range(len(l)):
    print(l[i])

['mary', 'had', 'a', 'little', 'lamb']
mary
had
a
little
lamb


In [18]:

for word in l:
    print(word)

mary
had
a
little
lamb


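If the list you are looping over changes inside the loop, this can cause problems. In the cell below, `l = ['foo'] + l` rebinds the name `l` to a new list, but the `for` loop keeps iterating over the original one, so the added `'foo'` items are never printed.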

In [19]:
for word in l:
    if word[0] == 'l':
        l = ['foo'] + l
    print(word)
        

mary
had
a
little
lamb


Perhaps a `while` loop will be better, because it re-reads `l` on every pass:

In [1]:
l = 'mary had a little lamb'.split(' ')

while l:
    word = l[0]
    if word[0] == 'l':
        l = ['foo'] + l
    l.remove(word)
    print(word)
    

mary
had
a
little
foo
lamb
foo
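
The same work-queue pattern can also be sketched with `collections.deque`, which is not what the cell above uses but avoids rebuilding the list and searching it with `remove()` on every pass. The names `queue` and `seen` are made up for this sketch:

```python
from collections import deque

queue = deque('mary had a little lamb'.split(' '))
seen = []

while queue:
    word = queue.popleft()          # take the next item off the front
    if word[0] == 'l':
        queue.appendleft('foo')     # new work goes to the front of the queue
    seen.append(word)
    print(word)
```

It prints the words in the same order as the `while` loop above.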


If you need to access the index of the list, perhaps for comparison, `enumerate()` is useful.

In [25]:
l = 'mary had a little lamb'.split(' ')

for i,word in enumerate(l):
    print(i,word)

0 mary
1 had
2 a
3 little
4 lamb
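
As a small sketch of the "for comparison" case, the index from `enumerate()` can be used to look at the previous item. The variable name `longer_than_previous` is made up for this example:

```python
l = 'mary had a little lamb'.split(' ')

longer_than_previous = []
for i, word in enumerate(l):
    # skip index 0: it has no previous word to compare against
    if i > 0 and len(word) > len(l[i - 1]):
        longer_than_previous.append((i, word))

print(longer_than_previous)  # [(3, 'little')]
```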


## Sorting

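`sort()` vs `sorted()`: `list.sort()` sorts the list in place and returns `None`, while `sorted()` returns a new sorted list and leaves its argument unchanged.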


In [6]:
alumni = [('bob',32,72000),
          ('alice',29,115000),
          ('charlie',25,95000)]

alumni.sort()
alumni

[('alice', 29, 115000), ('bob', 32, 72000), ('charlie', 25, 95000)]

In [7]:
alumni

[('alice', 29, 115000), ('bob', 32, 72000), ('charlie', 25, 95000)]

In [9]:
sorted(alumni)

[('alice', 29, 115000), ('bob', 32, 72000), ('charlie', 25, 95000)]
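
The cells above run `sort()` first, so the difference between the two is easy to miss. A short sketch that starts again from the unsorted list (the names `by_name` and `result` are made up here):

```python
alumni = [('bob', 32, 72000),
          ('alice', 29, 115000),
          ('charlie', 25, 95000)]

by_name = sorted(alumni)      # new list; `alumni` keeps its original order
print(alumni[0][0])           # bob
print(by_name[0][0])          # alice

result = alumni.sort()        # sorts in place and returns None
print(result)                 # None
print(alumni == by_name)      # True
```

A common slip is writing `alumni = alumni.sort()`, which replaces the list with `None`.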

In [28]:
alumni.sort(reverse=True)
alumni

[('charlie', 25, 95000), ('bob', 32, 72000), ('alice', 29, 115000)]

In [4]:
alumni = sorted(alumni,key=lambda x: x[1])
alumni

[('charlie', 25, 95000), ('alice', 29, 115000), ('bob', 32, 72000)]

But we can use `itemgetter()` instead of `lambda`

In [32]:
from operator import itemgetter

alumni.sort(key=itemgetter(2), reverse=True)
alumni

[('alice', 29, 115000), ('charlie', 25, 95000), ('bob', 32, 72000)]
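
`itemgetter()` also accepts several indices, in which case it returns a tuple that works as a multi-level sort key. The `people` records below are hypothetical, chosen so that two entries tie on age:

```python
from operator import itemgetter

# Hypothetical records: (name, age, salary), with a tie on age.
people = [('bob', 32, 72000),
          ('dana', 29, 99000),
          ('alice', 29, 115000)]

# Sort by age first, then by name to break ties.
people.sort(key=itemgetter(1, 0))
print(people)
# [('alice', 29, 115000), ('dana', 29, 99000), ('bob', 32, 72000)]
```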

## Errors should never pass silently

When something unexpected happens, have your code raise an exception.

In [15]:
class_list = [1,2,3]
if len(class_list) % 2 !=0:
    raise ValueError('list should have an even number of elements')

ValueError: list should have an even number of elements
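
In practice the check usually lives inside the function that depends on it, so the caller decides how to react. A minimal sketch, where `pair_up` is a hypothetical helper invented for this example:

```python
def pair_up(items):
    """Split items into consecutive pairs (hypothetical helper for illustration)."""
    if len(items) % 2 != 0:
        raise ValueError('list should have an even number of elements')
    return [(items[i], items[i + 1]) for i in range(0, len(items), 2)]

print(pair_up([1, 2, 3, 4]))  # [(1, 2), (3, 4)]

try:
    pair_up([1, 2, 3])
except ValueError as err:
    print('could not pair up:', err)
```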

## Don't use bare `except:` statements

Bad:

In [16]:
try:
    str(x)
except:
    print("¯\\_(ツ)_/¯")

¯\_(ツ)_/¯


Try instead:

In [17]:

try:
    str(x)
except TypeError:
    print("x could not be made a string")
    raise

NameError: name 'x' is not defined
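
The same idea in a case where the expected exception does occur: catch only the failure you know how to handle and let everything else surface. `to_int` is a hypothetical helper used only for this sketch:

```python
def to_int(text):
    """Hypothetical helper: convert text to int, or return None if it is not a number."""
    try:
        return int(text)
    except ValueError:
        # only the failure we expect is handled here
        return None

print(to_int('42'))         # 42
print(to_int('forty-two'))  # None

try:
    to_int(None)
except TypeError as err:
    # int(None) raises TypeError, which to_int deliberately does not catch
    print('unexpected input:', err)
```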

## Mutability and Reference

![](http://i.imgur.com/lVz0IlX.jpg)

### The White Knight's Song by Lewis Carroll

"You are sad", the Knight said in an anxious tone: "let me sing you a song to comfort you."
"Is it very long?" Alice asked, for she had heard a good deal of poetry that day.

"It's long," said the Knight, "but it's very, very beautiful. Everybody that hears me sing it - either it brings the tears into their eyes, or else -"

"Or else what?" said Alice, for the Knight had made a sudden pause.

"Or else it doesn't, you know. The name of the song is called *Haddocks' Eyes*."

"Oh, that's the name of the song, is it?" Alice said, trying to feel interested.

"No, you don't understand," the Knight said, looking a little vexed. "That is what the name is called. The name really is *The Aged Aged Man*."

"Then I ought to have said 'That's what the song is called?' " Alice corrected herself.

"No, you oughtn't: that's quite another thing! The song is called *Ways And Means*: but that's only what it's called, you know!"

"Well, what is the song, then?" said Alice, who was by this time completely bewildered.

"I was coming to that," the Knight said. "The song really is *A-sitting On A Gate*: and the tune's my own invention."

Variables are not boxes.

They are labels for objects. You can give the same object multiple labels, and you do that with assignment (`=`).

In [37]:
a = [1,2,3]
b = a
b.append(4)

print(a)
print(b)

[1, 2, 3, 4]
[1, 2, 3, 4]


`b = a` does not create a new object called `b`. It attaches `b` as another label to the same list that `a` refers to.
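
Whether two labels point at the same object can be checked with `is` (identity), as opposed to `==` (equal contents). A short sketch, with `c` added here purely for contrast:

```python
a = [1, 2, 3]
b = a            # second label on the same list
c = [1, 2, 3]    # a different list that happens to have equal contents

print(a is b)          # True
print(a is c)          # False
print(a == c)          # True
print(id(a) == id(b))  # True
```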

If you want a new object, you will have to call some kind of constructor:

In [38]:
a = [1,2,3]
b = list(a)
b.append(4)

print(a)
print(b)

[1, 2, 3]
[1, 2, 3, 4]


`[:]` is equivalent to calling `list()`: both make a shallow copy.

In [13]:
l1 = [3,[235,23],26,(2,3,4)]
l2 = l1[:]
l2

[3, [235, 23], 26, (2, 3, 4)]

In [16]:
l1 == l2

True

In [18]:
l1 is l2

False

In [17]:
l1[0] is l2[0]

True

In [22]:
l1[1] is l2[1]

True

In [23]:
l1.append(12)
l1[1].remove(23)
print(l1)
print(l2)

[3, [235], 26, (2, 3, 4), 12]
[3, [235], 26, (2, 3, 4)]


In [24]:
l2[1] += [86,36]
l2[3] += (1,1,4)
print(l1)
print(l2)

[3, [235, 86, 36], 26, (2, 3, 4), 12]
[3, [235, 86, 36], 26, (2, 3, 4, 1, 1, 4)]


In [25]:
from copy import deepcopy
l1 = [3,[235,23],26,(2,3,4)]
l2 = deepcopy(l1)
l1.append(12)
l1[1].remove(23)
print(l1)
print(l2)


[3, [235], 26, (2, 3, 4), 12]
[3, [235, 23], 26, (2, 3, 4)]
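
The difference between the two kinds of copy can be checked directly with `is`. This sketch uses `copy.copy` for the shallow copy; the names `original`, `shallow`, and `deep` are made up for the example:

```python
from copy import copy, deepcopy

original = [3, [235, 23], 26, (2, 3, 4)]
shallow = copy(original)      # same effect as original[:] or list(original)
deep = deepcopy(original)

print(shallow[1] is original[1])  # True: inner list is shared
print(deep[1] is original[1])     # False: inner list was copied too

original[1].append(99)
print(shallow[1])   # [235, 23, 99]
print(deep[1])      # [235, 23]
```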


# Collections

## DefaultDict

For when you are too lazy to initialize things.

In [26]:
count = {}
count['duck'] = 0

animals = ['duck','duck','duck','goose']

for animal in animals:
    count[animal] += 1
    print(animal)

count

duck
duck
duck


KeyError: 'goose'

In [27]:
count = {}

animals = ['duck','duck','duck','goose']

for animal in animals:
    try:
        count[animal] += 1
    except KeyError:
        count[animal] = 1

count

{'duck': 3, 'goose': 1}

In [28]:
from collections import defaultdict

count = defaultdict(int)
animals = ['duck','duck','duck','goose']

for animal in animals:
    count[animal] += 1
    
count

defaultdict(int, {'duck': 3, 'goose': 1})
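
The factory does not have to be `int`. As a sketch, `defaultdict(list)` gives every new key an empty list, which is handy for grouping. The animal list and the `by_letter` name are made up for this example:

```python
from collections import defaultdict

animals = ['duck', 'dog', 'goose', 'deer', 'gull']

by_letter = defaultdict(list)
for animal in animals:
    # a missing key starts out as an empty list, so append always works
    by_letter[animal[0]].append(animal)

print(dict(by_letter))
# {'d': ['duck', 'dog', 'deer'], 'g': ['goose', 'gull']}
```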

DefaultDicts can also be used to make quick tree structures

In [19]:
def tree():
    return defaultdict(tree)

dir(tree)

['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__kwdefaults__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

In [30]:
import json

taxonomy = tree()
taxonomy['Animalia']['Chordata']['Mammalia']['Carnivora']['Felidae']['Felis']['cat'] = 1
taxonomy['Animalia']['Chordata']['Mammalia']['Carnivora']['Felidae']['Panthera']['lion'] = 1
taxonomy['Animalia']['Chordata']['Mammalia']['Carnivora']['Canidae']['Canis']['dog'] = 1
taxonomy['Animalia']['Chordata']['Mammalia']['Carnivora']['Canidae']['Canis']['coyote'] = 1
taxonomy['Plantae']['Solanales']['Solanaceae']['Solanum']['tomato'] = 1
taxonomy['Plantae']['Solanales']['Solanaceae']['Solanum']['potato'] = 1
taxonomy['Plantae']['Solanales']['Convolvulaceae']['Ipomoea']['sweet potato'] =1

print(json.dumps(taxonomy))

{"Plantae": {"Solanales": {"Convolvulaceae": {"Ipomoea": {"sweet potato": 1}}, "Solanaceae": {"Solanum": {"tomato": 1, "potato": 1}}}}, "Animalia": {"Chordata": {"Mammalia": {"Carnivora": {"Canidae": {"Canis": {"coyote": 1, "dog": 1}}, "Felidae": {"Felis": {"cat": 1}, "Panthera": {"lion": 1}}}}}}}
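
One thing to watch with this kind of tree: simply looking up a missing key creates it. The sketch below shows which operations add keys and which do not; the `'Fungi'` and `'Protista'` keys are made up for illustration.

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

taxonomy = tree()
taxonomy['Animalia']['Chordata']['Mammalia'] = 1

print('Fungi' in taxonomy)       # False: membership tests do not create keys
taxonomy['Fungi']                # a plain lookup does
print('Fungi' in taxonomy)       # True
print(taxonomy.get('Protista'))  # None: .get() does not create keys either
print('Protista' in taxonomy)    # False
```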


## Named Tuple

Sometimes you want to create a class, but the class only needs to store data, and you are lazy.

You could put the data in a dictionary, but every instance has the same fixed set of fields, so a dictionary's flexibility buys you nothing.

You could put the data in a tuple, but then you need to remember the order.

What if you could have the simplicity of a tuple, but labels like a dictionary, and access methods like a class?

Welcome to namedtuples

In [23]:
from collections import namedtuple

Alumni = namedtuple('Alumni','name age gender degree title salary employer')

alice = Alumni(name='Alice',
               age=29,
               gender='F',
               degree ='PhD',
               title = 'Data Scientist',
               salary = 115000,
               employer = 'Thumbtack')

alice.age

29
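
A few more things a namedtuple instance can do, sketched with a shortened field list (an assumption made here to keep the example small): indexing and unpacking like a tuple, `_replace()` to build a modified copy, and `_asdict()` to get a mapping.

```python
from collections import namedtuple

Alumni = namedtuple('Alumni', 'name age salary')   # shortened field list for illustration
alice = Alumni(name='Alice', age=29, salary=115000)

print(alice[1])               # 29
name, age, salary = alice     # unpacks like any tuple
older = alice._replace(age=30)
print(older.age, alice.age)   # 30 29
print(dict(alice._asdict()))

try:
    alice.age = 30
except AttributeError:
    print('namedtuples are immutable')
```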

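namedtuples are actually classes, so they can be inherited from just like any other class.

A namedtuple will create a class with
* a property for each field, plus tuple-style indexing through `__getitem__`
* a constructor that accepts the fields by position or by keyword

And we can add functionality to it by subclassing.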

# DunderMethods

In Python there are many special methods whose names start and end with `__` (pronounced "dunder", short for double underscore).

Let's look at a `Point` class that defines a few of them.

In [35]:
class Point(object):
    def __init__(self,x,y):
        self.x = x
        self.y = y
    def __add__(self,b):
        return Point(self.x+b.x,self.y+b.y)
    def __sub__(self,b):
        return Point(self.x-b.x,self.y-b.y)
    def __bool__(self):
        return bool(self.x or self.y)
    
    def __repr__(self):
        return str((self.x,self.y))
    
a = Point(-1,5)
b = Point(1,-5)

print(a-b)


(-2, 10)
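
Operators and built-ins are translated into these method calls behind the scenes. The sketch below repeats a trimmed version of the class so it runs on its own; the `__eq__` method is an addition made for this example and is not part of the class above.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, b):
        return Point(self.x + b.x, self.y + b.y)
    def __bool__(self):
        return bool(self.x or self.y)
    def __eq__(self, b):
        # added for this sketch; the class above does not define it
        return (self.x, self.y) == (b.x, b.y)
    def __repr__(self):
        return str((self.x, self.y))

a = Point(-1, 5)
b = Point(1, -5)

print(a + b)                    # (0, 0)
print(a + b == a.__add__(b))    # True: `+` calls __add__
print(bool(a + b))              # False: the origin is falsy via __bool__
print(bool(a))                  # True
```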


In [36]:
x = 2
dir(x)

['__abs__',
 '__add__',
 '__and__',
 '__bool__',
 '__ceil__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floor__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__index__',
 '__init__',
 '__int__',
 '__invert__',
 '__le__',
 '__lshift__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__round__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 '__xor__',
 'bit_length',
 'conjugate',
 'denominator',
 'from_bytes',
 'imag',
 'numerator',
 'real',
 'to_bytes']

In [24]:
class Point(namedtuple('Point', 'x y')):
    def __add__(self,b):
        return Point(self.x+b.x,self.y+b.y)
    def __sub__(self,b):
        return Point(self.x-b.x,self.y-b.y)
    def __bool__(self):
        return bool(self.x or self.y)
    
a = Point(-1,5)
b = Point(1,-5)

print(a-b)
#bool(a+b)

Point(x=-2, y=10)


In [26]:
dir(a)

['__add__',
 '__bool__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '_asdict',
 '_fields',
 '_make',
 '_replace',
 '_source',
 'count',
 'index',
 'x',
 'y']

In [27]:
dir(f)

['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__kwdefaults__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

In [25]:
print(alice)

dir(alice)

Alumni(name='Alice', age=29, gender='F', degree='PhD', title='Data Scientist', salary=115000, employer='Thumbtack')


['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__subclasshook__',
 '_asdict',
 '_fields',
 '_make',
 '_replace',
 '_source',
 'age',
 'count',
 'degree',
 'employer',
 'gender',
 'index',
 'name',
 'salary',
 'title']