In [1]:
# importing libs
import pandas as pd
import numpy as np

## 1. Writing efficient Python code

### 1.1 Defining efficient 

* Minimal completion time (fast runtime)
* Minimal resource consumption (small memory footprint)
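
Both criteria can be measured directly. A minimal sketch using only the standard library; the variable names are illustrative, and the exact numbers depend on the machine and Python version:

```python
import sys
import timeit

# Memory footprint: a range object stores only start, stop and step,
# while a list stores a reference to every element.
lazy_nums = range(1_000_000)
eager_nums = list(lazy_nums)
print(sys.getsizeof(lazy_nums), "bytes for the range object")
print(sys.getsizeof(eager_nums), "bytes for the list (element objects not included)")

# Runtime: total time for 1,000 executions of a small expression.
elapsed = timeit.timeit("sum(range(1000))", number=1_000)
print(f"{elapsed:.4f} seconds for 1,000 runs")
```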

### 1.2 Defining Pythonic
* Focus on readability
* Using Python's constructs as intended

In [2]:
# Print the list created using the Non-Pythonic approach

names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

i = 0
new_list = []
while i < len(names):
    if len(names[i]) >= 6:
        new_list.append(names[i])
    i += 1

print("Print the list created using the Non-Pythonic approach")
print(new_list, "\n")

# Print the list created by looping over the contents of names
better_list = []
for name in names:
    if len(name) >= 6:
        better_list.append(name)
print("Print the list created by looping over the contents of names")
print(better_list, "\n")

# The most Pythonic approach is a list comprehension.
best_list = [name for name in names if len(name) >= 6]
print("The best Pythonic way of doing this is by using list comprehension.")
print(best_list)


Print the list created using the Non-Pythonic approach
['Kramer', 'Elaine', 'George', 'Newman'] 

Print the list created by looping over the contents of names
['Kramer', 'Elaine', 'George', 'Newman'] 

The best Pythonic way of doing this is by using list comprehension.
['Kramer', 'Elaine', 'George', 'Newman']


In [3]:
# Zen of Python
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

### 1.3 Building with built-ins

* Built-in types
  * list, tuple, set, dict, and others.
* Built-in functions
  * print(), len(), range(), round(), enumerate(), map(), zip(), and others.
* Built-in modules
  * os, sys, itertools, collections, math, and others.
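
A few of the built-ins listed above that the later cells do not demonstrate. This is a small sketch; the variable names are illustrative and the `names` list is borrowed from the earlier cell:

```python
import math
from collections import Counter
from itertools import combinations

names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# zip() pairs elements from several iterables position by position.
name_lengths = list(zip(names, map(len, names)))
print(name_lengths)

# collections.Counter tallies hashable items without an explicit loop.
length_counts = Counter(len(name) for name in names)
print(length_counts)

# itertools.combinations lazily yields every unordered pair.
pairs = list(combinations(names[:3], 2))
print(pairs)

# math provides numeric helpers such as sqrt().
root = math.sqrt(16)
print(root)
```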

#### 1.3.1 Built-in functions
### range() 
```codeblock
  # Explicitly typing a list of numbers 
  nums = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

  # Using range() to create the same list
  # Signature: range(start, stop)
  nums = range(0, 11)
  nums_list = list(nums) 
  # output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] 

  # Using range() with a step value
  even_nums = range(2, 11, 2) 
  even_nums_list = list(even_nums) 
  # output: [2, 4, 6, 8, 10] 
```

### enumerate()

Pairs each item of an iterable with its index:
```codeblock
  letters = ['a', 'b', 'c', 'd' ]
  indexed_letters = enumerate(letters)
  indexed_letters_list = list(indexed_letters)
  print(indexed_letters_list)
  # output: [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')] 
```

A start value for the index can be specified:
```codeblock
  letters = ['a', 'b', 'c', 'd' ]
  indexed_letters2 = enumerate(letters, start=5)
  indexed_letters2_list = list(indexed_letters2)
  # output: [(5, 'a'), (6, 'b'), (7, 'c'), (8, 'd')] 
```

### map()

Applies a function to every element of an iterable:
```codeblock
  nums = [1.5, 2.3, 3.4, 4.6, 5.0] 
  rnd_nums = list(map(round, nums))
  # output: [2, 2, 3, 5, 5]
```
With a lambda (anonymous function):
```codeblock
  nums = [1, 2, 3, 4, 5] 
  sqrd_nums = list(map(lambda x: x ** 2, nums))
  # output: [1, 4, 9, 16, 25]
```

In [4]:
# Example with range()

# Create a range object that goes from 0 to 5
nums = range(0, 6)
print(type(nums))

# Convert nums to a list
nums_list = list(nums)
print(nums_list)

# Create a list of odd numbers from 1 to 11 by unpacking a range object with the star operator (*)
nums_list2 = [*range(1, 12, 2)]
print(nums_list2)

<class 'range'>
[0, 1, 2, 3, 4, 5]
[1, 3, 5, 7, 9, 11]


In [5]:
# Example with enumerate()
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# Rewrite the for loop to use enumerate
indexed_names = []
for i, name in enumerate(names):
    index_name = (i,name)
    indexed_names.append(index_name) 
print(indexed_names)

# Rewrite the above for loop using list comprehension
indexed_names_comp = [(i, name) for i,name in enumerate(names)]
print(indexed_names_comp)

# Unpack an enumerate object with a starting index of one
indexed_names_unpack = [*enumerate(names, start=1)]
print(indexed_names_unpack)

[(0, 'Jerry'), (1, 'Kramer'), (2, 'Elaine'), (3, 'George'), (4, 'Newman')]
[(0, 'Jerry'), (1, 'Kramer'), (2, 'Elaine'), (3, 'George'), (4, 'Newman')]
[(1, 'Jerry'), (2, 'Kramer'), (3, 'Elaine'), (4, 'George'), (5, 'Newman')]


In [6]:
# Example with map()
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# Use map to apply str.upper to each element in names
names_map  = map(str.upper, names)

# Print the type of names_map
print(type(names_map))

# Unpack names_map into a list
names_uppercase = [*names_map]

# Print the list created above
print(names_uppercase)

<class 'map'>
['JERRY', 'KRAMER', 'ELAINE', 'GEORGE', 'NEWMAN']
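
A map object is an iterator, so it can be consumed only once: after list() or * unpacking has walked through it, a second pass yields nothing. A short sketch of that behavior (the names first_pass and second_pass are illustrative):

```python
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

names_map = map(str.upper, names)
first_pass = list(names_map)    # consumes every item
second_pass = list(names_map)   # nothing left to yield

print(first_pass)
print(second_pass)

# To reuse the results, store them once (or build a new map object).
names_uppercase = [*map(str.upper, names)]
print(names_uppercase)
```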


### 1.4 The power of NumPy arrays
NumPy arrays are a fast, memory-efficient alternative to Python lists.

```codeblock
    nums_list = list(range(5))    # output: [0, 1, 2, 3, 4]

    nums_np = np.array(range(5))  # output: array([0, 1, 2, 3, 4])
```

1. NumPy array homogeneity: every element shares a single data type
2. NumPy array broadcasting 
   ```codeblock
   # Python lists don't support broadcasting
   nums = [-2, -1, 0, 1, 2]
   # nums ** 2
   # raises TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

   # List comprehension (better option but not best)
   nums = [-2, -1, 0, 1, 2]
   sqrd_nums = [num ** 2 for num in nums]
   # output: [4, 1, 0, 1, 4]

   # NumPy array broadcasting for the win!
   nums_np = np.array([-2, -1, 0, 1, 2])
   nums_np ** 2
   # output: array([4, 1, 0, 1, 4])
   ```
3. Easier indexing
   ```codeblock
   # 2-D list

   # With a list
   nums2 = [[1, 2, 3], [4, 5, 6]]
   [row[0] for row in nums2]
   # output: [1, 4]

   # With a NumPy array
   nums2_np = np.array(nums2)
   nums2_np[:, 0]
   # output: array([1, 4])
   ```
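
The three properties above can be checked in one short sketch. The array values are illustrative, and the boolean-indexing line goes slightly beyond what the notes cover:

```python
import numpy as np

# Homogeneity: mixed inputs are upcast to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# Broadcasting: the scalar is applied to every element of the 2-D array.
nums2_np = np.array([[1, 2, 3], [4, 5, 6]])
doubled = nums2_np * 2
print(doubled)

# Indexing: rows and columns are selected in a single expression.
first_row = nums2_np[0, :]
last_two_cols = nums2_np[:, 1:]

# Boolean indexing keeps the elements where the condition holds.
greater_than_three = nums2_np[nums2_np > 3]
print(first_row, last_two_cols, greater_than_three, sep="\n")
```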


## 2. Timing and profiling code


IPython extension packages used:
pip install line_profiler memory_profiler

### 2.1 %timeit
* %timeit - times a single line of code (line magic).
* %%timeit - times multiple lines of code (cell magic).

Set the number of runs (-r) and/or loops per run (-n).

Save the output to a variable (-o).
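
The magics only work inside IPython. In a plain script, the standard-library timeit module gives similar control. A rough sketch, assuming a simple stand-in statement rather than the NumPy call used in the cells below:

```python
import timeit

# Rough equivalent of: %timeit -r2 -n10 sum(range(1000))
# repeat -> number of runs (-r), number -> loops per run (-n)
run_totals = timeit.repeat("sum(range(1000))", repeat=2, number=10)

# Each entry is the total time for one run of 10 loops, in seconds.
per_loop = [total / 10 for total in run_totals]
print("per-loop times:", per_loop)
print("best per-loop time:", min(per_loop))
```

Unlike %timeit, timeit.repeat() returns raw totals per run, so the per-loop figure has to be computed by hand.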


In [7]:
!pip install line_profiler
!pip install memory_profiler

%load_ext line_profiler
%load_ext memory_profiler





In [8]:
# %timeit
import numpy as np

%timeit rand_nums = np.random.rand(1000)
%timeit -r2 -n10 rand_nums = np.random.rand(1000) 

7.31 µs ± 213 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
The slowest run took 5.02 times longer than the fastest. This could mean that an intermediate result is being cached.
19.5 µs ± 13 µs per loop (mean ± std. dev. of 2 runs, 10 loops each)


In [9]:
# %timeit Saving the output to a variable
times = %timeit -o rand_nums = np.random.rand(1000) 

print("timings:", times.timings)
print("best time:", times.best)
print("worst time:", times.worst)


7.15 µs ± 107 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
timings: [7.332626000006712e-06, 7.143642000000909e-06, 7.1302379999997354e-06, 7.226225999993403e-06, 7.166563000000678e-06, 7.06107100000736e-06, 6.96833300000435e-06]
best time: 6.96833300000435e-06
worst time: 7.332626000006712e-06


In [10]:
# %timeit: difference between average run times

f_time = %timeit -o formal_dict = dict() 
l_time = %timeit -o literal_dict = {} 

diff = (f_time.average - l_time.average) * (10**9)
print('l_time better than f_time by {} ns'.format(diff))

50.4 ns ± 0.856 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
23.9 ns ± 0.603 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
l_time better than f_time by 26.580888571412032 ns
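
One plausible reason for the gap, shown as a sketch: in CPython the literal {} compiles to a single build instruction, while dict() needs a name lookup plus a function call. The helper opnames() is hypothetical, and the exact instruction names are a CPython implementation detail that varies between versions:

```python
import dis

def opnames(expression):
    """Return the bytecode instruction names for a source expression."""
    code = compile(expression, "<example>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]

literal_ops = opnames("{}")
formal_ops = opnames("dict()")

print("{}     ->", literal_ops)
print("dict() ->", formal_ops)
```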


### 2.2 Code profiling: line_profiler and memory_profiler

* Magic command for line-by-line run times (%lprun):
```codeblock
%lprun -f convert_units convert_units(heroes, hts, wts) 
```

* Magic command for detailed stats on memory consumption (%mprun); the profiled function needs to be imported from a separate file:
```codeblock
%mprun -f convert_units convert_units(heroes, hts, wts)
```


In [13]:
%load_ext line_profiler

heroes = ['Batman', 'Superman', 'Wonder Woman']
hts = np.array([188.0, 191.0, 183.0])
wts = np.array([95.0, 101.0,  74.0])


def convert_units(heroes, heights, weights):
    new_hts = [ht * 0.39370 for ht in heights]
    new_wts = [wt * 2.20462 for wt in weights]
    hero_data = {}
    for i, hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])

    return hero_data


%lprun -f convert_units convert_units(heroes, hts, wts)


In [12]:
%load_ext memory_profiler

# %mprun requires the profiled function to be imported from a file on disk
# (hero_funcs.py here); heroes, hts and wts are defined in the line_profiler cell above.
from hero_funcs import convert_units

%mprun -f convert_units convert_units(heroes, hts, wts)
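
The %mprun cell assumes a file named hero_funcs.py next to the notebook. Its contents are not shown in these notes; a minimal sketch, assuming it holds the same convert_units function as the line_profiler cell:

```python
# hero_funcs.py (assumed contents, copied from the notebook cell above)

def convert_units(heroes, heights, weights):
    """Scale heights by 0.39370 (cm to inches) and weights by 2.20462 (kg to pounds)."""
    new_hts = [ht * 0.39370 for ht in heights]
    new_wts = [wt * 2.20462 for wt in weights]
    hero_data = {}
    for i, hero in enumerate(heroes):
        hero_data[hero] = (new_hts[i], new_wts[i])
    return hero_data
```

With the file in place, heroes, hts and wts still have to be defined in the notebook before %mprun is called.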