# Pythonic Code, By Example

## Intro

* Speaker: Michael Kennedy
* [Youtube Link](https://www.youtube.com/watch?v=o0LohkA3UR4)  
* [Link to code](https://github.com/mikeckennedy/pycon-sk-pythonic-talk)  

> The idea of writing idiomatic code that is most aligned with the language features and ideals is a key concept in Python.
>
> We call this idiomatic code Pythonic.

The definition of 'Pythonic' is a bit fuzzy, but the [Zen of Python](https://www.python.org/dev/peps/pep-0020/) is a good place to begin, even though it has a tension between convention and practicality built in.

As far as I can tell, Pythonic code tends to use the features of the language in a way that makes it obvious what's going on at a glance. Many of the most Pythonic practices tend to run more efficiently too (largely because CPython has been heavily tuned for its idiomatic constructs).

## [String Formatting](https://youtu.be/o0LohkA3UR4?t=5m32s)

There are several ways to format a string from pre-defined variables:

In [1]:
name = 'Michael'
age = 43

Plain concatenation is the generic, non-language-specific approach:

In [2]:
print("Hi, I'm " + name + " and I'm " + str(age) + " years old.")

Hi, I'm Michael and I'm 43 years old.


Or ````printf```` style formatting:

In [3]:
print("Hi, I'm %s and I'm %d years old." % (name, age))

Hi, I'm Michael and I'm 43 years old.


Or with the ````format```` method:

In [4]:
print("Hi, I'm {} and I'm {} years old.".format(name, age))
print("Hi, I'm {1} years old and my name is {0}, yeah {1}.".format(name, age))

Hi, I'm Michael and I'm 43 years old.
Hi, I'm 43 years old and my name is Michael, yeah 43.


Or with a dictionary:

In [5]:
data = {'day': 'Saturday', 'office': 'Home office', 'other': 'UNUSED'}
print("On {day} I was working in my {office}!".format(**data))

On Saturday I was working in my Home office!


Or, in Python 3.6+, there are f-strings, which are shorter and can embed arbitrary Python expressions:

In [6]:
print(f"Hi, I'm {name} and I'm {age+1} years old.")

Hi, I'm Michael and I'm 44 years old.


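Performance also tends to improve going down this list, with f-strings generally the fastest, though the exact ordering depends on the Python version.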
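The performance claim is easy to check locally. The following is a minimal sketch using the standard ````timeit```` module; the ````candidates```` dictionary and its labels are illustrative names, not something from the talk, and the absolute timings will differ between machines and Python versions.

```python
import timeit

name = 'Michael'
age = 43

# Each candidate builds the same sentence with a different formatting style.
candidates = {
    'concatenation': lambda: "Hi, I'm " + name + " and I'm " + str(age) + " years old.",
    'printf-style': lambda: "Hi, I'm %s and I'm %d years old." % (name, age),
    'str.format': lambda: "Hi, I'm {} and I'm {} years old.".format(name, age),
    'f-string': lambda: f"Hi, I'm {name} and I'm {age} years old.",
}

# Sanity check: all four styles should produce an identical string.
results = {label: build() for label, build in candidates.items()}

# Time each style over many repetitions; only the relative order is interesting.
timings = {
    label: timeit.timeit(build, number=20_000)
    for label, build in candidates.items()
}
for label, seconds in timings.items():
    print(f"{label:>15}: {seconds:.4f} s")
```

Wrapping each style in a lambda adds the same small call overhead to every measurement, so the comparison stays fair even though the absolute numbers are inflated.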

## [Merging Dictionaries](https://youtu.be/o0LohkA3UR4?t=8m8s)

(Python 3.5+ only)

Say we want to merge a few dictionaries together in a web app, and we need to control which one wins when keys collide.

In [7]:
route = {'id': 271, 'title': 'Fast apps'}
query = {'id': 1, 'render_fast': True}
post = {'email': 'j@j.com', 'name': 'Jeff'}

* ````route```` is the most important.
* ````query```` is the least important (the user can tamper with it most easily in the URL).

This is done by overwriting elements (so the last one has highest priority):

In [8]:
m1 = {}
for k in query:
    m1[k] = query[k]
for k in post:
    m1[k] = post[k]
for k in route:
    m1[k] = route[k]

print(m1)

{'id': 271, 'render_fast': True, 'email': 'j@j.com', 'name': 'Jeff', 'title': 'Fast apps'}


There are a few ways of doing it, but the 'most Pythonic' is:

In [9]:
m4 = {**query, **post, **route}
print(m4)

{'id': 271, 'render_fast': True, 'email': 'j@j.com', 'name': 'Jeff', 'title': 'Fast apps'}


This produces exactly the same result in far less code.
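Since these notes were written, Python 3.9 added a dictionary union operator (````|````) that follows the same "last one wins" rule. This was not covered in the talk; the sketch below assumes Python 3.9 or newer and reuses the three dictionaries from above.

```python
route = {'id': 271, 'title': 'Fast apps'}
query = {'id': 1, 'render_fast': True}
post = {'email': 'j@j.com', 'name': 'Jeff'}

# Later operands win on key collisions, just as with {**a, **b}.
merged = query | post | route
print(merged)

# The unpacking form gives an equal dictionary.
unpacked = {**query, **post, **route}
print(merged == unpacked)

# Neither form modifies the source dictionaries.
print(query['id'])
```

Both forms build a new dictionary, so the request data can be merged without side effects on the originals.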

## [Keyword Arguments](https://youtu.be/o0LohkA3UR4?t=10m8s)

(Python 3 only)

Where it's really important that each value in a function call is passed to the correct parameter, a bare ````*```` in the signature forces every parameter after it to be passed by keyword:

In [10]:
def connect_v2(*, user, server, replicate, use_ssl):
    print("Connect v2, called with: ")
    print(f"User = {user}")
    print(f"Server = {server}")
    print(f"Replicate = {replicate}")
    print(f"Use SSL = {use_ssl}")
    print()

connect_v2(user='mkennedy', server='db_svr', replicate=True, use_ssl=False)

Connect v2, called with: 
User = mkennedy
Server = db_svr
Replicate = True
Use SSL = False



In this example, we can avoid potential issues where replication and ssl are accidentally swapped (or have their order swapped during later development). I probably wouldn't use this *every* time but it seems really useful for key functions.
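To see the enforcement in action, the sketch below calls a keyword-only function positionally. The function body is simplified to return its arguments (an assumption made here so the result is easy to inspect); the signature matches the one above.

```python
def connect_v2(*, user, server, replicate, use_ssl):
    # Simplified body: just hand back what was received.
    return {'user': user, 'server': server,
            'replicate': replicate, 'use_ssl': use_ssl}


error_message = None
try:
    # Positional arguments are rejected because of the bare * in the signature.
    connect_v2('mkennedy', 'db_svr', True, False)
except TypeError as err:
    error_message = str(err)
    print(f"Rejected: {error_message}")

# Keyword arguments can be given in any order without changing the meaning.
settings = connect_v2(use_ssl=False, replicate=True,
                      server='db_svr', user='mkennedy')
print(settings)
```

The error is raised at call time, before any of the function body runs, so a swapped pair of booleans can never slip through silently.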

## [On demand computations with yield](https://youtu.be/o0LohkA3UR4?t=11m50s)

(````yield from```` requires Python 3.3+)

With big sequences, using ````yield```` rather than building a list and returning it means each item is computed only when the caller asks for it. This can help with recursive functions too:

In [11]:
import os

def get_files(folder):
    # Lazily yield the path of every file under folder, recursing into subfolders.
    for item in os.listdir(folder):
        full_item = os.path.join(folder, item)
        if os.path.isfile(full_item):
            yield full_item
        elif os.path.isdir(full_item):
            yield from get_files(full_item)

Using ````yield from```` delegates to the recursive call, yielding each of its results in turn as part of this generator's own output.
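The same pattern can be tried without touching the file system. The following sketch applies it to nested lists; ````flatten```` and ````tree```` are hypothetical names chosen for illustration.

```python
def flatten(nested):
    """Lazily yield every non-list item from an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, list):
            # Delegate to the recursive generator, re-yielding each of its items.
            yield from flatten(item)
        else:
            yield item


tree = [1, [2, [3, 4]], [5]]

lazy = flatten(tree)   # nothing has been computed yet
first = next(lazy)     # the body runs only far enough to produce one item
rest = list(lazy)      # consume whatever is left

print(first, rest)
```

Because items are produced one at a time, a caller that only needs the first few results never pays for walking the rest of the structure.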

Python 3 syntax features like these are a major reason to switch, and help offset the pull of libraries that are only available in Python 2.

## [Counting Iterable](https://youtu.be/o0LohkA3UR4?t=14m18s)

If we have an iterable that doesn't support ````len```` (such as a generator), the obvious way to count its items would be a for loop.

In [12]:
import collections
import uuid

Measurement = collections.namedtuple('Measurement', 'id x y value')

measurements = [
    Measurement(str(uuid.uuid4()), 1, 1, 72),
    Measurement(str(uuid.uuid4()), 2, 1, 40),
    Measurement(str(uuid.uuid4()), 3, 1, 11),
    Measurement(str(uuid.uuid4()), 2, 1, 90),
    Measurement(str(uuid.uuid4()), 2, 2, 60),
    Measurement(str(uuid.uuid4()), 2, 3, 73),
    Measurement(str(uuid.uuid4()), 3, 1, 40),
    Measurement(str(uuid.uuid4()), 3, 2, 44),
    Measurement(str(uuid.uuid4()), 3, 3, 90)
]

high_values = (
    m.value
    for m in measurements
    if m.value >= 70
)

Instead, passing a generator expression to ````sum```` is a lot quicker to write. By convention, a variable we don't care about is named with an underscore.

In [13]:
print(sum(1 for _ in high_values))
print(sum(1 for m in measurements if m.value >= 70))

4
4
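One caveat worth knowing: counting a generator consumes it, so a second pass finds nothing left. The sketch below uses a plain list of the measurement values (````values```` is a simplified stand-in for the named tuples above) to show this.

```python
values = [72, 40, 11, 90, 60, 73, 40, 44, 90]

# A generator expression can only be iterated once.
high_values = (v for v in values if v >= 70)

first_count = sum(1 for _ in high_values)
second_count = sum(1 for _ in high_values)  # generator is already exhausted

print(first_count)
print(second_count)
```

If the items are needed again after counting, either rebuild the generator or materialise it into a list first.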


## [Slicing infinity](https://youtu.be/o0LohkA3UR4?t=15m22s)

Python's slices (e.g. ````the_list[:5]````) are really useful, but they only work on objects that define ````__getitem__````, which generators do not. ````itertools.islice```` provides the equivalent for any iterable, including infinite generators.

In [14]:
def generator_fibonacci():
    current, nxt = 0, 1

    while True:
        current, nxt = nxt, nxt + current
        yield current

import itertools
print(list(itertools.islice(generator_fibonacci(), 5)))

[1, 1, 2, 3, 5]
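Like a regular slice, ````itertools.islice```` also accepts start, stop and step arguments. The following sketch repeats the generator from above so that it runs on its own; ````window```` and ````every_other```` are illustrative names.

```python
import itertools


def generator_fibonacci():
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, nxt + current
        yield current


# Equivalent of seq[5:10]: skip the first five items, then take the next five.
window = list(itertools.islice(generator_fibonacci(), 5, 10))

# Equivalent of seq[0:10:2]: every second item of the first ten.
every_other = list(itertools.islice(generator_fibonacci(), 0, 10, 2))

print(window)
print(every_other)
```

Unlike list slicing, ````islice```` does not accept negative indices, since an infinite generator has no end to count back from.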


## [Hacking Python's memory with slots](https://youtu.be/o0LohkA3UR4?t=17m25s)

[The blog post mentioned](http://tech.oyster.com/save-ram-with-python-slots/)

Declaring ````__slots__```` (a class attribute, not a method) when defining a class pre-defines a fixed set of instance attributes, replacing the default dynamic storage. The catch is that it completely fixes the attributes of each instance (you *can't* add more).

By default Python stores an instance's attributes in a dictionary, and creates a new dictionary for *each instance* of the class. This is normally not a problem, but at large scales the per-instance dictionaries add memory overhead that is surplus to requirements.

This is not great practice for general use as it's a big hit to maintainability: a case of weighing 'Although practicality beats purity.' against 'Special cases aren't special enough to break the rules.'.

In [15]:
class ImmutableThing:
    __slots__ = ['a', 'b', 'c', 'd']

    def __init__(self, a, b, c, d):
        self.a = a
        self.b = b
        self.c = c
        self.d = d

In the talk, the speaker ran a few tests, with the following results:

Test name       | Memory (MB) | Execution time (s)
----------------|-------------|-------------------
straight tuple  | 207         | 0.528455
named tuple     | 215         | 1.519358
class (dynamic) | 370         | 1.680248
slot class      | 120         | 1.438989

So it's not more time efficient than a straight tuple (as there's still more going on), but it's significantly more memory efficient.
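The trade-off behind these numbers can be seen in a small sketch. ````SlottedPoint```` and ````DynamicPoint```` are hypothetical classes written for this illustration, not the ones benchmarked in the talk.

```python
class SlottedPoint:
    __slots__ = ['x', 'y']  # the only attributes an instance may have

    def __init__(self, x, y):
        self.x = x
        self.y = y


class DynamicPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


slotted = SlottedPoint(1, 2)
dynamic = DynamicPoint(1, 2)

# A regular instance accepts new attributes at any time.
dynamic.z = 3

# A slotted instance rejects anything not listed in __slots__.
slot_error = None
try:
    slotted.z = 3
except AttributeError as err:
    slot_error = str(err)
    print(f"Rejected: {slot_error}")

# The memory saving comes from slotted instances having no per-instance dict.
print(hasattr(dynamic, '__dict__'))
print(hasattr(slotted, '__dict__'))
```

The missing ````__dict__```` is what saves memory, and it is also why code that relies on adding attributes on the fly stops working.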
