## Python fundamentals

**Efficient Python Code**

### Python Zen

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


### Range, enumerate, map

In [18]:
# Built-in practice: range(), enumerate(), map()
display([*range(1,13,2)])

[*range(10, 60, 10)]

[1, 3, 5, 7, 9, 11]

[10, 20, 30, 40, 50]

In [10]:
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']
display([(i,name) for i,name in enumerate(names,3)])
[*enumerate(names, 1)]

[(3, 'Jerry'), (4, 'Kramer'), (5, 'Elaine'), (6, 'George'), (7, 'Newman')]

[(1, 'Jerry'), (2, 'Kramer'), (3, 'Elaine'), (4, 'George'), (5, 'Newman')]

In [94]:
nums = [1, 2, 3, 4, 5]
list(map(lambda x: x ** 2, nums))
 

[1, 4, 9, 16, 25]

In [95]:
[*zip(names,nums)]

[('Jerry', 1), ('Kramer', 2), ('Elaine', 3), ('George', 4), ('Newman', 5)]
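The cells above unpack `range`, `enumerate`, `map`, and `zip` into lists in order to display them. A minimal sketch of why that is needed: `range` is a lazy sequence and `map` is an iterator, so neither holds its values until asked. The variable names below are illustrative only.

```python
# range is a lazy sequence: it supports len(), indexing and membership
# tests without building a list first.
odds = range(1, 13, 2)
print(len(odds))      # 6
print(odds[2])        # 5
print(11 in odds)     # True

# map returns an iterator; nothing is computed until it is consumed.
squares = map(lambda x: x ** 2, [1, 2, 3, 4, 5])
squares_list = [*squares]
print(squares_list)   # [1, 4, 9, 16, 25]

# enumerate accepts the starting index as a keyword argument.
pairs = [*enumerate(['Jerry', 'Kramer'], start=1)]
print(pairs)          # [(1, 'Jerry'), (2, 'Kramer')]
```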

In [16]:
# Numpy arrays
import numpy as np

nums = np.array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

# Print second row of nums
print(nums[1,:])

# Print all elements of nums that are greater than six
print(nums[nums > 6])

# Double every element of nums
nums_dbl = nums * 2
print(nums_dbl)

# Replace the third column of nums with its values incremented by 1
nums[:,2] = nums[:,2] + 1
print(nums)

[ 6  7  8  9 10]
[ 7  8  9 10]
[[ 2  4  6  8 10]
 [12 14 16 18 20]]
[[ 1  2  4  4  5]
 [ 6  7  9  9 10]]


### Magic commands

In [111]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %lprun  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%

### Code profiling: Time

In [29]:
# %time is a line magic: it times only the statement on the same line.
# Create a list of integers (0-50) using list comprehension
%time nums_list_comp = [num for num in range(51)]
print(nums_list_comp)

print(' ')
# Create a list of integers (0-50) by unpacking range
%time nums_unpack = [*range(51)]
print(nums_unpack)
print(' ')


CPU times: user 5 µs, sys: 1e+03 ns, total: 6 µs
Wall time: 11 µs
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
 
CPU times: user 4 µs, sys: 1 µs, total: 5 µs
Wall time: 8.82 µs
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
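Outside IPython, a comparable wall-clock measurement can be sketched with the standard library's `time.perf_counter`. The helper name `time_once` is hypothetical, and a single run this short is dominated by noise, so treat the printed numbers as rough.

```python
import time

def time_once(func):
    """Run func once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    return result, elapsed

# Time both ways of building the list of integers 0-50.
nums_list_comp, t_comp = time_once(lambda: [num for num in range(51)])
nums_unpack, t_unpack = time_once(lambda: [*range(51)])

print(f"list comprehension: {t_comp * 1e6:.2f} µs")
print(f"unpacking range:    {t_unpack * 1e6:.2f} µs")
```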
 


In [73]:
# Time loops: analyze the runtime of converting the heroes list into a set.
# Instead of relying on the default %timeit settings,
# use only 5 runs (-r5) with 25 loops per run (-n25).

heroes = []
with open('assets/learn_python/heroes.txt', 'r') as file:
    for i in file:
        heroes.append(i.split(',')[0][2:-1])

In [75]:
%timeit -r5 -n25 set(heroes)

39.5 µs ± 18.1 µs per loop (mean ± std. dev. of 5 runs, 25 loops each)
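The same measurement can be sketched without the magic command using `timeit.repeat`, where `repeat` plays the role of `-r` and `number` the role of `-n`. The `heroes` list below is a small stand-in, since the contents of `heroes.txt` are not shown here.

```python
import statistics
import timeit

# Stand-in data (an assumption); the real list is read from heroes.txt.
heroes = ['Batman', 'Superman', 'Wonder Woman', 'Batman']

# 5 runs (repeat), each executing the statement 25 times (number).
run_totals = timeit.repeat(lambda: set(heroes), repeat=5, number=25)

# Each entry is the total time for 25 loops, so divide to get per-loop time.
per_loop = [total / 25 for total in run_totals]
print(f"{statistics.mean(per_loop) * 1e6:.3f} µs ± "
      f"{statistics.stdev(per_loop) * 1e6:.3f} µs per loop")
```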


Python lets you create built-in data structures either by calling their formal name or by using literal syntax.<br>
Literals are generally faster because they avoid a name lookup and a function call.<br>
For example, prefer `[]` over `list()`, `{}` over `dict()`, and `()` over `tuple()`.<br>
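A quick way to check the literal-versus-name claim on your own machine is to time both forms with `timeit`. This is only a sketch: absolute numbers vary by interpreter and hardware.

```python
import timeit

# Time one million constructions of an empty list with each form.
literal_time = timeit.timeit('[]', number=1_000_000)
name_time = timeit.timeit('list()', number=1_000_000)

print(f"[]     : {literal_time:.4f} s")
print(f"list() : {name_time:.4f} s")

# Both forms produce equal results; only the construction cost differs.
same_result = ([] == list()) and ({} == dict()) and (() == tuple())
print(same_result)  # True
```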

### Code profiling: Time line by line

In [88]:
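def convert_units(heroes, heights, weights):
    """Convert heights (cm to in) and weights (kg to lb) for each hero."""

    new_hts = [ht * 0.39370  for ht in heights]  # centimeters to inches
    new_wts = [wt * 2.20462  for wt in weights]  # kilograms to pounds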

    hero_data = {}

    for i,hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data

heroes = ['Batman', 'Superman', 'Wonder Woman']
hts = np.array([188.0, 191.0, 183.0])
wts = np.array([ 95.0, 101.0,  74.0])

%timeit convert_units(heroes, hts, wts)

2.45 µs ± 9.33 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [80]:
import line_profiler

In [86]:
%load_ext line_profiler

The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler


In [90]:
%lprun -f convert_units convert_units(heroes, hts, wts)

### Code profiling: Memory

In [92]:
import sys
sys.getsizeof([*range(1000)])


9104
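Note that `sys.getsizeof` reports only the size of the container object itself, not the objects it references, and the exact byte counts depend on the Python version and platform. A small sketch comparing a materialized list with the lazy `range` it came from:

```python
import sys

lazy = range(1000)
materialized = [*lazy]

# The range object stores only start, stop and step, so its size is
# small and independent of how many values it represents.
print(sys.getsizeof(lazy))
print(sys.getsizeof(materialized))

# The list holds 1000 references, so it is far larger than the range.
list_is_bigger = sys.getsizeof(materialized) > sys.getsizeof(lazy)
print(list_is_bigger)  # True
```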

In [None]:
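%load_ext memory_profiler

# %mprun is memory_profiler's line-by-line magic (%lprun belongs to line_profiler).
# Note: %mprun only works on functions imported from a physical file,
# not on functions defined in the notebook session.
%mprun -f convert_units convert_units(heroes, hts, wts)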

### Combinations

In [97]:
pokemon = ['Geodude', 'Cubone', 'Lickitung', 'Persian', 'Diglett']

In [98]:
# Import combinations from itertools
from itertools import combinations

# Create a combination object with pairs of Pokémon
combos_obj = combinations(pokemon, 2)
print(type(combos_obj), '\n')

# Convert combos_obj to a list by unpacking
combos_2 = [*combos_obj]
print(combos_2, '\n')

# Collect all possible combinations of 4 Pokémon directly into a list
combos_4 = [*combinations(pokemon, 4)]
print(combos_4)

<class 'itertools.combinations'> 

[('Geodude', 'Cubone'), ('Geodude', 'Lickitung'), ('Geodude', 'Persian'), ('Geodude', 'Diglett'), ('Cubone', 'Lickitung'), ('Cubone', 'Persian'), ('Cubone', 'Diglett'), ('Lickitung', 'Persian'), ('Lickitung', 'Diglett'), ('Persian', 'Diglett')] 

[('Geodude', 'Cubone', 'Lickitung', 'Persian'), ('Geodude', 'Cubone', 'Lickitung', 'Diglett'), ('Geodude', 'Cubone', 'Persian', 'Diglett'), ('Geodude', 'Lickitung', 'Persian', 'Diglett'), ('Cubone', 'Lickitung', 'Persian', 'Diglett')]
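As a sanity check on the lengths of these lists, the number of k-item combinations of n items can be computed directly with `math.comb` (available from Python 3.8). A short sketch using the same five names:

```python
from itertools import combinations
from math import comb

pokemon = ['Geodude', 'Cubone', 'Lickitung', 'Persian', 'Diglett']

# comb(n, k) gives the number of k-element combinations of n items.
n_pairs = len([*combinations(pokemon, 2)])
n_quads = len([*combinations(pokemon, 4)])

print(n_pairs, comb(len(pokemon), 2))  # 10 10
print(n_quads, comb(len(pokemon), 4))  # 5 5
```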


In [103]:
ash_pokedex = ['Pikachu', 'Bulbasaur', 'Koffing', 'Spearow', 'Vulpix', 'Wigglytuff', 'Zubat', 'Rattata', 'Psyduck', 'Squirtle']

misty_pokedex = ['Krabby', 'Horsea', 'Slowbro', 'Tentacool', 'Vaporeon', 'Magikarp', 'Poliwag', 'Starmie', 'Psyduck', 'Squirtle']

### Sets

In [104]:
# Convert both lists to sets
ash_set = set(ash_pokedex)
misty_set = set(misty_pokedex)

# Find the Pokémon that exist in both sets
both = ash_set.intersection(misty_set)
print(both)

# Find the Pokémon that Ash has and Misty does not have
ash_only = ash_set.difference(misty_set)
print(ash_only)

# Find the Pokémon that are in only one set (not both)
unique_to_set = ash_set.symmetric_difference(misty_set)
print(unique_to_set)

{'Psyduck', 'Squirtle'}
{'Rattata', 'Zubat', 'Bulbasaur', 'Koffing', 'Wigglytuff', 'Spearow', 'Vulpix', 'Pikachu'}
{'Krabby', 'Horsea', 'Starmie', 'Rattata', 'Zubat', 'Poliwag', 'Bulbasaur', 'Tentacool', 'Slowbro', 'Magikarp', 'Koffing', 'Wigglytuff', 'Spearow', 'Vaporeon', 'Vulpix', 'Pikachu'}
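The same set methods also have operator forms. A minimal sketch on shortened versions of the two sets (the reduced contents are an assumption made to keep the output small); results are sorted only to make the printed order predictable:

```python
ash_set = {'Pikachu', 'Bulbasaur', 'Psyduck', 'Squirtle'}
misty_set = {'Krabby', 'Horsea', 'Psyduck', 'Squirtle'}

both = ash_set & misty_set            # same as intersection()
ash_only = ash_set - misty_set        # same as difference()
unique_to_set = ash_set ^ misty_set   # same as symmetric_difference()

print(sorted(both))            # ['Psyduck', 'Squirtle']
print(sorted(ash_only))        # ['Bulbasaur', 'Pikachu']
print(sorted(unique_to_set))   # ['Bulbasaur', 'Horsea', 'Krabby', 'Pikachu']

# Membership tests on a set are hash lookups rather than linear scans.
print('Psyduck' in ash_set)    # True
```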


### Zip

In [106]:
poke_names = ['Rattata', 'Zubat', 'Bulbasaur', 'Koffing', 'Wigglytuff']
poke_gens = [1,2,2,1,1]

# Collect Pokémon that belong to generation 1 or generation 2
gen1_gen2_pokemon = [name for name,gen in zip(poke_names, poke_gens) if gen < 3]

# Create a map object that stores the name lengths
name_lengths_map = map(len, gen1_gen2_pokemon)

# Combine gen1_gen2_pokemon and name_lengths_map into a list
gen1_gen2_name_lengths = [*zip(name_lengths_map, gen1_gen2_pokemon)]

print(gen1_gen2_name_lengths[:5])

[(7, 'Rattata'), (5, 'Zubat'), (9, 'Bulbasaur'), (7, 'Koffing'), (10, 'Wigglytuff')]


In [109]:
sorted(gen1_gen2_name_lengths, key=lambda x: x[0], reverse=True)[:3]

[(10, 'Wigglytuff'), (9, 'Bulbasaur'), (7, 'Rattata')]
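One caveat with the `zip` cell in this section: `name_lengths_map` is a one-shot iterator, so `zip` consumes it and it cannot be reused afterwards. A short sketch of that behavior, using a shortened list of the same names:

```python
poke_names = ['Rattata', 'Zubat', 'Bulbasaur']

name_lengths_map = map(len, poke_names)

# The first pass consumes the iterator while pairing it with the names.
first_pass = [*zip(name_lengths_map, poke_names)]
print(first_pass)   # [(7, 'Rattata'), (5, 'Zubat'), (9, 'Bulbasaur')]

# A second pass finds the map already exhausted, so zip yields nothing.
second_pass = [*zip(name_lengths_map, poke_names)]
print(second_pass)  # []
```

If the lengths are needed more than once, storing them in a list first avoids the problem.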

### Counter

In [110]:
# Import Counter
from collections import Counter

# Collect the count of each generation
gen_counts = Counter(poke_gens)
gen_counts

Counter({1: 3, 2: 2})
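`Counter` is most useful when values repeat. As a sketch, the example below counts name lengths for the same five names and asks for the most frequent one with `most_common`; the variable names are illustrative.

```python
from collections import Counter

poke_names = ['Rattata', 'Zubat', 'Bulbasaur', 'Koffing', 'Wigglytuff']

# Count how many names share each length.
length_counts = Counter(len(name) for name in poke_names)
print(length_counts[7])               # 2

# most_common(n) returns the n most frequent (value, count) pairs.
top_length = length_counts.most_common(1)
print(top_length)                     # [(7, 2)]
```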

### Pandas iterrows

In [114]:
import pandas as pd

pid = pd.DataFrame([ash_pokedex,misty_pokedex]).T
pid.columns = ['Ash','Misty']
pid

|   | Ash | Misty |
|---|---|---|
| 0 | Pikachu | Krabby |
| 1 | Bulbasaur | Horsea |
| 2 | Koffing | Slowbro |
| 3 | Spearow | Tentacool |
| 4 | Vulpix | Vaporeon |
| 5 | Wigglytuff | Magikarp |
| 6 | Zubat | Poliwag |
| 7 | Rattata | Starmie |
| 8 | Psyduck | Psyduck |
| 9 | Squirtle | Squirtle |


In [117]:
for i,row in pid.iterrows():
    print(i)
    print(row)
    print(type(row))

0
Ash      Pikachu
Misty     Krabby
Name: 0, dtype: object
<class 'pandas.core.series.Series'>
1
Ash      Bulbasaur
Misty       Horsea
Name: 1, dtype: object
<class 'pandas.core.series.Series'>
2
Ash      Koffing
Misty    Slowbro
Name: 2, dtype: object
<class 'pandas.core.series.Series'>
3
Ash        Spearow
Misty    Tentacool
Name: 3, dtype: object
<class 'pandas.core.series.Series'>
4
Ash        Vulpix
Misty    Vaporeon
Name: 4, dtype: object
<class 'pandas.core.series.Series'>
5
Ash      Wigglytuff
Misty      Magikarp
Name: 5, dtype: object
<class 'pandas.core.series.Series'>
6
Ash        Zubat
Misty    Poliwag
Name: 6, dtype: object
<class 'pandas.core.series.Series'>
7
Ash      Rattata
Misty    Starmie
Name: 7, dtype: object
<class 'pandas.core.series.Series'>
8
Ash      Psyduck
Misty    Psyduck
Name: 8, dtype: object
<class 'pandas.core.series.Series'>
9
Ash      Squirtle
Misty    Squirtle
Name: 9, dtype: object
<class 'pandas.core.series.Series'>


In [118]:
for x in pid.iterrows():
    print(x)

(0, Ash      Pikachu
Misty     Krabby
Name: 0, dtype: object)
(1, Ash      Bulbasaur
Misty       Horsea
Name: 1, dtype: object)
(2, Ash      Koffing
Misty    Slowbro
Name: 2, dtype: object)
(3, Ash        Spearow
Misty    Tentacool
Name: 3, dtype: object)
(4, Ash        Vulpix
Misty    Vaporeon
Name: 4, dtype: object)
(5, Ash      Wigglytuff
Misty      Magikarp
Name: 5, dtype: object)
(6, Ash        Zubat
Misty    Poliwag
Name: 6, dtype: object)
(7, Ash      Rattata
Misty    Starmie
Name: 7, dtype: object)
(8, Ash      Psyduck
Misty    Psyduck
Name: 8, dtype: object)
(9, Ash      Squirtle
Misty    Squirtle
Name: 9, dtype: object)


### Pandas itertuples

In [122]:
# Loop over the DataFrame rows as namedtuples and print each trainer's Pokémon
for row in pid.itertuples():
    print(row.Ash, row.Misty)

Pikachu Krabby
Bulbasaur Horsea
Koffing Slowbro
Spearow Tentacool
Vulpix Vaporeon
Wigglytuff Magikarp
Zubat Poliwag
Rattata Starmie
Psyduck Psyduck
Squirtle Squirtle
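`itertuples` yields one namedtuple per row, which is why the columns are available as attributes. A minimal sketch on a two-row stand-in for `pid` (the reduced data and the tuple name `Trainers` are assumptions for illustration), using the `index` and `name` parameters:

```python
import pandas as pd

pid = pd.DataFrame({'Ash': ['Pikachu', 'Psyduck'],
                    'Misty': ['Krabby', 'Psyduck']})

# index=False drops the index field; name sets the namedtuple class name.
rows = [*pid.itertuples(index=False, name='Trainers')]
print(rows[0])            # Trainers(Ash='Pikachu', Misty='Krabby')
print(rows[0].Ash)        # Pikachu

# Attribute access keeps row-wise comparisons easy to read.
shared = [row.Ash for row in rows if row.Ash == row.Misty]
print(shared)             # ['Psyduck']
```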


In [129]:
# Apply a function to every row (axis=1); len(row) is the number of columns
pid.apply(lambda row: len(row), axis=1)

0    2
1    2
2    2
3    2
4    2
5    2
6    2
7    2
8    2
9    2
dtype: int64