
# Writing Efficient Python Code
In this course
- How to write clean, fast and efficient Python code.
- How to profile your code for bottlenecks.
- How to eliminate bottlenecks and bad design patterns.

Defining efficient
- Minimal completion time (fast runtime)
- Minimal resource consumption (small memory footprint)

Defining Pythonic
- Focus on readability.
- Using Python's constructs as intended (i.e., idiomatically)

```
# Non-Pythonic
doubled_numbers = []

for i in range(len(numbers)):
  doubled_numbers.append(numbers[i] * 2)

# Pythonic
doubled_numbers = [x * 2 for x in numbers]
```

In [None]:
names = ['Kramer', 'Elaine', 'George', 'Newman']

# Print the list created using the Non-Pythonic approach
i = 0
new_list= []
while i < len(names):
  if len(names[i]) >= 6:
    new_list.append(names[i])
  i += 1
print(new_list)

['Kramer', 'Elaine', 'George', 'Newman']


In [None]:
names = ['Kramer', 'Elaine', 'George', 'Newman']

# Print the list created by looping over the contents of names
better_list = []
for name in names:
  if len(name) >= 6:
    better_list.append(name)
print(better_list)

['Kramer', 'Elaine', 'George', 'Newman']


In [None]:
names = ['Kramer', 'Elaine', 'George', 'Newman']

# Print the list created by using list comprehension
best_list = [name for name in names if len(name) >= 6]
print(best_list)

['Kramer', 'Elaine', 'George', 'Newman']


## Building with built-ins
Python's built-in components are collectively known as the Python Standard Library.

- Python 3.6 Standard Library
  - Part of every standard Python installation.
- Built-in types
  - list, tuple, set, dict, and others.
- Built-in functions
  - print(), len(), range(), round(), enumerate(), map(), zip(), and others.
- Built-in modules
  - os, sys, itertools, collections, math, and others.

In [None]:
# Built-in Function: range()
# Explicitly typing a list of numbers.

nums = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Using range() to create the same list
# range(start, stop)
nums = range(0, 11)
num_list = list(nums)
print(num_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [None]:
# range(stop)
nums = range(11)
num_list = list(nums)
print(num_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [None]:
# Using range() with a step value
even_nums = range(2, 11, 2)

even_num_list = list(even_nums)
print(even_num_list)

[2, 4, 6, 8, 10]


In [None]:
# Built-in Function: enumerate()
# enumerate() creates an (index, item) pair for each item in the object provided.

letters = ['a', 'b', 'c', 'd']

indexed_letters = enumerate(letters)
indexed_letters_list = list(indexed_letters)
print(indexed_letters_list)

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]


In [None]:
letters = ['a', 'b', 'c', 'd']

# Start the index at 5
indexed_letters2 = enumerate(letters, start=5)

indexed_letters2_list = list(indexed_letters2)
print(indexed_letters2_list)

[(5, 'a'), (6, 'b'), (7, 'c'), (8, 'd')]


In [None]:
# Built-in Function: map()
# map() applies a function to each element of an iterable.
# It takes two arguments: the function to apply and the iterable to apply it to.
nums = [1.5, 2.3, 3.4, 4.6, 5.0]

rnd_nums = map(round, nums)

print(list(rnd_nums))

[2, 2, 3, 5, 5]


In [None]:
# map() with lambda (anonymous function)
nums = [1, 2, 3, 4, 5]

sqrd_nums = map(lambda x: x ** 2, nums)

print(list(sqrd_nums))

[1, 4, 9, 16, 25]


In [None]:
# Exercise 1
# Create a range object that goes from 0 to 5
nums = range(0, 6)
print(type(nums))

# Convert nums to a list
nums_list = list(nums)
print(nums_list)

# Create a new list of odd numbers from 1 to 11 by unpacking a range object
nums_list2 = [*range(1, 12, 2)]
print(nums_list2)

<class 'range'>
[0, 1, 2, 3, 4, 5]
[1, 3, 5, 7, 9, 11]


In [None]:
# Exercise 2
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# Rewrite the for loop to use enumerate
indexed_names = []
for i, name in enumerate(names):
    index_name = (i,name)
    indexed_names.append(index_name)
print(indexed_names)

# Rewrite the above for loop using list comprehension
indexed_names_comp = [(i, name) for i, name in enumerate(names)]
print(indexed_names_comp)

# Unpack an enumerate object with a starting index of one
indexed_names_unpack = [*enumerate(names, 1)]
print(indexed_names_unpack)

[(0, 'Jerry'), (1, 'Kramer'), (2, 'Elaine'), (3, 'George'), (4, 'Newman')]
[(0, 'Jerry'), (1, 'Kramer'), (2, 'Elaine'), (3, 'George'), (4, 'Newman')]
[(1, 'Jerry'), (2, 'Kramer'), (3, 'Elaine'), (4, 'George'), (5, 'Newman')]


In [None]:
# Exercise 3
names = ['Jerry', 'Kramer', 'Elaine', 'George', 'Newman']

# Use map to apply str.upper to each element in names
names_map  = map(str.upper, names)

# Print the type of the names_map
print(type(names_map))

# Unpack names_map into a list
names_uppercase = [*names_map]

# Print the list created above
print(names_uppercase)

<class 'map'>
['JERRY', 'KRAMER', 'ELAINE', 'GEORGE', 'NEWMAN']


## The power of NumPy arrays
NumPy arrays provide a fast and memory efficient alternative to Python lists.

- Alternative to Python lists

```
num_list = list(range(5))
```

```
import numpy as np

num_list = np.array(range(5))
```

NumPy arrays are homogeneous, which means they must contain elements of the same type.

```
nums_np_ints = np.array([1, 2, 3])

nums_np_ints.dtype
```



In [None]:
import numpy as np

num_list = np.array(range(5))

num_list

array([0, 1, 2, 3, 4])

In [None]:
# NumPy arrays are homogeneous
nums_np_ints = np.array([1, 2, 3])
print(nums_np_ints)

nums_np_ints.dtype

[1 2 3]


dtype('int64')

In [None]:
nums_np_float = np.array([1, 2.5, 3])

print(nums_np_float)
nums_np_float.dtype

[1.  2.5 3. ]


dtype('float64')

In [None]:
# NumPy array broadcasting
# Python lists don't support broadcasting
nums = [-2, -1, 0, 1, 2]
nums ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [None]:
# List approach
# For loop (inefficient option)
sqrd_nums = []
for num in nums:
  sqrd_nums.append(num ** 2)
print(sqrd_nums)

[4, 1, 0, 1, 4]


In [None]:
# List comprehension (better option but not the best)
sqrd_nums = [num ** 2 for num in nums]
print(sqrd_nums)

[4, 1, 0, 1, 4]


In [None]:
# NumPy array broadcasting
# NumPy array broadcasting for the win!

nums_np = np.array([-2, -1, 0, 1, 2])
nums_np ** 2

array([4, 1, 0, 1, 4])

In [None]:
# Basic 1-D indexing (lists)
nums = [-2, -1, 0, 1, 2]

print(nums[2])
print(nums[-1])
print(nums[1:4])

# Basic 1-D indexing (arrays)
nums_np = np.array(nums)

print(nums_np[2])
print(nums_np[-1])
print(nums_np[1:4])

# When comparing basic indexing, the capabilities are identical.

0
2
[-1, 0, 1]
0
2
[-1  0  1]


In [None]:
# 2-D list
nums2 = [[1, 2, 3], [4, 5, 6]]

print(nums2[0][1])
print([row[0] for row in nums2])

# 2-D arrays
nums2_np = np.array(nums2)

print(nums2_np[0, 1])
print(nums2_np[:, 0])

2
[1, 4]
2
[1 4]


In [None]:
# NumPy arrays boolean indexing
nums = [-2, -1, 0, 1, 2]
nums_np = np.array(nums)

# boolean indexing
nums_np > 0

array([False, False, False,  True,  True])

In [None]:
nums_np[nums_np > 0]

array([1, 2])

In [None]:
# No boolean indexing for lists
# For loop (inefficient option)
pos = []
for num in nums:
  if num > 0:
    pos.append(num)
print(pos)

[1, 2]


In [None]:
# List comprehension (better option but not the best)
pos = [num for num in nums if num > 0]
print(pos)

[1, 2]


In [None]:
# Exercise 1

nums = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
# Print second row of nums
print(nums[1,:])

# Print all elements of nums that are greater than six
print(nums[nums > 6])

# Double every element of nums
nums_dbl = nums * 2
print(nums_dbl)

# Replace the third column of nums
nums[:,2] = nums[:,2] + 1
print(nums)

[ 6  7  8  9 10]
[ 7  8  9 10]
[[ 2  4  6  8 10]
 [12 14 16 18 20]]
[[ 1  2  4  4  5]
 [ 6  7  9  9 10]]


In [None]:
# Exercise 2
# Create a list of arrival times
arrival_times = [*range(10,60,10)]

# Convert arrival_times to an array and update the times
arrival_times_np = np.array(arrival_times)
new_times = arrival_times_np - 3

# Use list comprehension and enumerate to pair guests to new times
guest_arrivals = [(names[i],time) for i,time in enumerate(new_times)]

print(guest_arrivals)

[('Jerry', 7), ('Kramer', 17), ('Elaine', 27), ('George', 37), ('Newman', 47)]


## Examining runtime

## Why should we time our code?
- Allows us to pick the optimal coding approach
- Faster code == more efficient code!

How can we time our code?
- Calculate runtime with the IPython magic command %timeit
- Magic commands: enhancements on top of normal Python syntax
  - Prefixed by the "%" character.
  - See all available magic commands with %lsmagic.

### Using %timeit
Code to be timed
```
import numpy as np
rand_nums = np.random.rand(1000)
```

Timing with %timeit
```
%timeit rand_nums = np.random.rand(1000)
```

### Specifying number of runs/loops
- The number of runs represents how many iterations you'd like to use to estimate the runtime.
- The number of loops represents how many times you'd like the code to be executed per run.

Setting the number of runs (-r) and/or loops (-n)
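
Outside IPython, the standard library's `timeit` module offers similar controls. The following is a minimal sketch, assuming a plain Python script rather than a notebook: `repeat` plays the role of `-r` and `number` the role of `-n`. The statement being timed is only an illustration.

```python
import timeit

# repeat=2 mirrors -r2 (two runs); number=10 mirrors -n10 (ten loops per run).
run_totals = timeit.repeat(
    stmt="nums = [x for x in range(10)]",
    repeat=2,
    number=10,
)

# Each entry is the total time for one run, so divide by the loop count
# to get a per-loop figure comparable to what %timeit reports.
per_loop = [total / 10 for total in run_totals]

print(f"{len(run_totals)} runs, best per-loop time: {min(per_loop):.2e} s")
```

Unlike `%timeit`, `timeit.repeat()` does not pick the number of loops automatically, so both values are set explicitly here.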

In [None]:
import numpy as np
rand_nums = np.random.rand(1000)

%timeit rand_nums = np.random.rand(1000)

10.8 µs ± 2.89 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [None]:
# Set number of runs to 2 (-r2)
# Set number of loops to 10 (-n10)

%timeit -r2 -n10 rand_nums = np.random.rand(1000)

The slowest run took 6.78 times longer than the fastest. This could mean that an intermediate result is being cached.
64.7 µs ± 48.1 µs per loop (mean ± std. dev. of 2 runs, 10 loops each)


In [None]:
# Using %timeit in line magic mode
# line magic (%timeit)

# single line of code
%timeit nums = [x for x in range(10)]

926 ns ± 320 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:
%%timeit
# Using %timeit in cell magic mode (%%timeit) for multiple lines of code.
# The cell magic must be the first line of the cell.
nums = []
for x in range(10):
  nums.append(x)

1.75 µs ± 98.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [None]:
# Saving output
# Saving the output to a variable (-o)

times = %timeit -o rand_nums = np.random.rand(1000)

9.56 µs ± 68.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [None]:
# The saved object exposes more detail, such as the time for each run
times.timings

[9.606058449999182e-06,
 9.682676109996465e-06,
 9.562995670003147e-06,
 9.573142919998645e-06,
 9.555966589996388e-06,
 9.524797049998597e-06,
 9.439050000000861e-06]

In [None]:
times.best

9.439050000000861e-06

In [None]:
times.worst

9.682676109996465e-06

## Let's try using %timeit with some of Python's built-in data structures

Comparing times

Python data structures can be created using their formal names
```
formal_list = list()
formal_dict = dict()
formal_tuple = tuple()
```

Python data structures can be created using literal syntax
```
literal_list = []
literal_dict = {}
literal_tuple = ()
```
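
One plausible reason for a speed difference between the two styles, offered as a sketch rather than a full explanation: the formal name has to be looked up and called like any other function, while the literal compiles directly to a build instruction. The standard `dis` module can show this. Exact opcode names vary between Python versions, so only `BUILD_MAP` is relied on here, and `opnames` is a helper written just for this illustration.

```python
import dis

def opnames(source):
    """Return the bytecode instruction names for a small expression."""
    code = compile(source, "<example>", "eval")
    return [instr.opname for instr in dis.get_instructions(code)]

formal_ops = opnames("dict()")
literal_ops = opnames("{}")

print("dict():", formal_ops)   # includes a name lookup and a call
print("{}:    ", literal_ops)  # builds the dict directly with BUILD_MAP
```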

In [None]:
# If we wanted to compare the runtime between creating a dict using the formal name and creating using the literal syntax
f_time = %timeit -o formal_dict = dict()
l_time = %timeit -o literal_dict = {}

diff = (f_time.average - l_time.average) * (10**9)
print('l_time better than f_time by {} ns'.format(diff))

129 ns ± 49.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
57.7 ns ± 18.6 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
l_time better than f_time by 71.32582678553133 ns


In [None]:
# Exercise 1
# compare the runtimes for creating a list of integers from 0 to 50 using list comprehension vs. unpacking the range object.
%timeit nums_list_comp = [num for num in range(51)]
%timeit nums_unpack = [*range(51)]

2.17 µs ± 537 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
589 ns ± 153 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


## Code profiling for runtime
Code profiling allows us to analyze code more efficiently
- Detailed stats on frequency and duration of function calls
- Line-by-line analyses
- Package used: ```line_profiler```

```
pip install line_profiler
```
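
`line_profiler` is a third-party package. Where installing it is not an option, the standard library's `cProfile` gives function-level (not line-by-line) statistics. The sketch below is an alternative technique, not the one used in this course. It assumes a plain-list version of a unit-conversion function named `convert_units`, with sample heights in cm and weights in kg.

```python
import cProfile
import io
import pstats

def convert_units(heroes, heights, weights):
    """Convert heights (cm -> in) and weights (kg -> lb) for each hero."""
    new_hts = [ht * 0.39370 for ht in heights]
    new_wts = [wt * 2.20462 for wt in weights]
    return {hero: (new_hts[i], new_wts[i]) for i, hero in enumerate(heroes)}

heroes = ['Batman', 'Superman', 'Wonder Woman']
hts = [188.0, 191.0, 183.0]
wts = [95.0, 101.0, 74.0]

# Profile a single call to the function.
profiler = cProfile.Profile()
profiler.enable()
hero_data = convert_units(heroes, hts, wts)
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

For a function this small the reported times will be close to zero. The call counts are usually the more informative part of the report.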

In [None]:
pip install line_profiler

Collecting line_profiler
  Downloading line_profiler-4.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (714 kB)
Installing collected packages: line_profiler
Successfully installed line_profiler-4.1.2


In [None]:
heroes = ['Batman', 'Superman', 'Wonder Woman']

hts = np.array([188.0, 191.0, 183.0])
wts = np.array([95.0, 101.0, 74.0])

def convert_unit(heroes, heights, weights):
  new_hts = [ht * 0.39370 for ht in heights]
  new_wts = [wt * 2.20462 for wt in weights]
  hero_data = {}

  for i, hero in enumerate(heroes):
    hero_data[hero] = (new_hts[i], new_wts[i])
  return hero_data

convert_unit(heroes, hts, wts)

{'Batman': (74.01559999999999, 209.4389),
 'Superman': (75.19669999999999, 222.66661999999997),
 'Wonder Woman': (72.0471, 163.14188)}

In [None]:
%timeit new_hts = [ht * 0.39370 for ht in hts]

1.92 µs ± 494 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:
%timeit new_wts = [wt * 2.20462 for wt in wts]

1.92 µs ± 539 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:
# Using line_profiler package
# First load it into the session.

%load_ext line_profiler

# magic command for line-by-line times
%lprun -f convert_unit convert_unit(heroes, hts, wts)

The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler


In [None]:
heroes = ['Batman', 'Superman', 'Wonder Woman']

heights = np.array([188.0, 191.0, 183.0])
weights = np.array([95.0, 101.0, 74.0])

def convert_units_broadcast(heroes, heights, weights):

  # Array broadcasting instead of list comprehension
  new_hts = heights * 0.39370
  new_wts = weights * 2.20462

  hero_data = {}

  for i,hero in enumerate(heroes):
    hero_data[hero] = (new_hts[i], new_wts[i])

  return hero_data
convert_units_broadcast(heroes, heights, weights)

{'Batman': (74.01559999999999, 209.4389),
 'Superman': (75.19669999999999, 222.66661999999997),
 'Wonder Woman': (72.0471, 163.14188)}

In [None]:
%lprun -f convert_units_broadcast convert_units_broadcast(heroes, heights, weights)

## Code profiling for memory usage

A quick and dirty approach is the built-in ```sys``` module: ```sys.getsizeof()``` returns the size of an object in bytes.

```
import sys

nums_list = [*range(1000)]
sys.getsizeof(nums_list)
```
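
One caveat worth keeping in mind: `sys.getsizeof()` is shallow. For a list it reports the list object itself (essentially its array of references), not the objects the list refers to. A rough sketch of the difference follows, where `deep_size` is a hypothetical helper written for this illustration that only handles flat lists.

```python
import sys

nums_list = [*range(1000)]

# Shallow size: the list object and its internal array of references.
shallow_size = sys.getsizeof(nums_list)

def deep_size(flat_list):
    """Hypothetical helper: list size plus the size of each element.

    A rough estimate, only meant for flat lists of simple objects such as ints.
    """
    return sys.getsizeof(flat_list) + sum(sys.getsizeof(x) for x in flat_list)

print(shallow_size)          # size of the list container only
print(deep_size(nums_list))  # container plus the integer objects
```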

Code profiling: memory
- Detailed stats on memory consumption
- Line-by-line analyses
- Package used: ```memory_profiler```

```
pip install memory_profiler
```

- Using the ```memory_profiler``` package

```
%load_ext memory_profiler
%mprun -f convert_units convert_units(heroes, hts, wts)
```

- Functions must be imported when using ```memory_profiler```
  - Let's say convert_units function was placed in a file named hero_funcs.py

```
from hero_funcs import convert_units

%load_ext memory_profiler
%mprun -f convert_units convert_units(heroes, hts, wts)
```

%mprun output caveats
- Inspects memory by querying the OS
- Results may differ between platforms and runs
  - Can still observe how each line of code compares to others based on memory consumption.

In [None]:
import sys
nums_list = [*range(1000)]
sys.getsizeof(nums_list)

8056

In [None]:
import numpy as np
nums_np = np.array(range(1000))
sys.getsizeof(nums_np)

8112

In [None]:
pip install memory_profiler

Collecting memory_profiler
  Downloading memory_profiler-0.61.0-py3-none-any.whl (31 kB)
Installing collected packages: memory_profiler
Successfully installed memory_profiler-0.61.0


In [None]:
%load_ext memory_profiler
%mprun -f convert_unit convert_unit(heroes, hts, wts)

The memory_profiler extension is already loaded. To reload it, use:
  %reload_ext memory_profiler
ERROR: Could not find file <ipython-input-100-3aab3182610d>
NOTE: %mprun can only be used on functions defined in physical files, and not in the IPython environment.



## Efficiently combining, counting, and iterating

In [1]:
# Combining objects

names = ['Bulbasaur', 'Charmander', 'Squirtle']
hps = [45, 39, 44]

combined = []
for i, pokemon in enumerate(names):
  combined.append((pokemon, hps[i]))

print(combined)

[('Bulbasaur', 45), ('Charmander', 39), ('Squirtle', 44)]


In [7]:
# Combining objects with zip
# The name "zip" describes how this function combines objects like a zipper (joining two separate things into one).

names = ['Bulbasaur', 'Charmander', 'Squirtle']
hps = [45, 39, 44]

combined_zip = zip(names, hps)
print(type(combined_zip))

combined_zip_list = [*combined_zip]
print(combined_zip_list)

<class 'zip'>
[('Bulbasaur', 45), ('Charmander', 39), ('Squirtle', 44)]


### The collections module
- Part of Python's standard library (built-in module)
- Specialized container datatypes
  - Alternatives to general purpose dict, list, set, and tuple
- Notable:
  - namedtuple: tuple subclasses with named fields
  - deque: list-like container with fast appends and pops.
  - Counter: dict for counting hashable objects.
  - OrderedDict: dict that retains order of entries.
  - defaultdict: dict that calls a factory function to supply missing values.
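
A short sketch of three of these containers in use. The Pokémon names, types, and HP values below are illustrative sample data, not the course dataset.

```python
from collections import defaultdict, deque, namedtuple

# namedtuple: fields are accessible by name as well as by position.
Pokemon = namedtuple('Pokemon', ['name', 'poke_type', 'hp'])
bulbasaur = Pokemon('Bulbasaur', 'Grass', 45)
print(bulbasaur.name, bulbasaur.hp)

# defaultdict: missing keys are created by calling the factory (list here).
names_by_type = defaultdict(list)
for name, poke_type in [('Bulbasaur', 'Grass'), ('Charmander', 'Fire'), ('Oddish', 'Grass')]:
    names_by_type[poke_type].append(name)
print(dict(names_by_type))

# deque: appends and pops are fast at both ends.
queue = deque(['Charmander', 'Squirtle'])
queue.appendleft('Bulbasaur')
first = queue.popleft()
print(first, list(queue))
```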

### Counting with loop
```
# Each Pokemon's type (720 Total)
poke_types = ['Grass', 'Dark', 'Fire', 'Fire', ...]
type_counts = {}
for poke_type in poke_types:
  if poke_type not in type_count:
    type_counts[poke_type] = 1
  else:
    type_counts[poke_type] += 1
print(type_counts)
```

### Using collections.Counter() is a more efficient approach
```
# Each Pokemon's type (720 Total)
poke_types = ['Grass', 'Dark', 'Fire', 'Fire', ...]
from collections import Counter
type_counts = Counter(poke_types)
print(type_counts)

# The counts match those from the loop above
```
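
A runnable sketch of both approaches on a small sample list (the course list of 720 types is not reproduced here), checking that they agree and showing `most_common()`:

```python
from collections import Counter

# Small illustrative sample, not the full course dataset.
poke_types = ['Grass', 'Dark', 'Fire', 'Fire', 'Water', 'Grass', 'Fire']

# Manual counting with a plain dict.
type_counts_loop = {}
for poke_type in poke_types:
    type_counts_loop[poke_type] = type_counts_loop.get(poke_type, 0) + 1

# Counter does the same work in one call.
type_counts = Counter(poke_types)

print(dict(type_counts) == type_counts_loop)  # both approaches give the same counts
print(type_counts.most_common(2))             # the two most frequent types
```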

### The itertools module
- Part of Python's standard library (built-in module)
- Functional tools for creating and using iterators.
- Notable:
  - Infinite iterators: count, cycle, repeat
  - Finite iterators: accumulate, chain, zip_longest, etc.
  - Combination generators: product, permutations, combinations
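
A brief sketch of one or two tools from each group, using small illustrative inputs:

```python
from itertools import accumulate, chain, count, product

names = ['Bulbasaur', 'Charmander', 'Squirtle']

# count: an infinite counter, so pair it with something finite.
numbered = [*zip(count(start=1), names)]
print(numbered)

# chain: treat several iterables as one sequence.
chained = [*chain(names, ['Pikachu'])]
print(chained)

# accumulate: running totals.
running_hp = [*accumulate([45, 39, 44])]
print(running_hp)

# product: every ordered pairing, including an item paired with itself.
pairs = [*product(names, repeat=2)]
print(len(pairs))
```

Note that `product` differs from `combinations`: it keeps both orderings of a pair and also pairs each item with itself.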

In [8]:
# Combinations with loop
# Nested for loop that iterates the poke_types list twice.

poke_types = ['Bug', 'Fire', 'Grass', 'Water', 'Ghost']
combos = []

for x in poke_types:
  for y in poke_types:
    if x == y:
      continue
    if (x, y) not in combos and (y, x) not in combos:
      combos.append((x, y))
print(combos)

# We want only one ordering of each pair, not both.

[('Bug', 'Fire'), ('Bug', 'Grass'), ('Bug', 'Water'), ('Bug', 'Ghost'), ('Fire', 'Grass'), ('Fire', 'Water'), ('Fire', 'Ghost'), ('Grass', 'Water'), ('Grass', 'Ghost'), ('Water', 'Ghost')]


In [10]:
# itertools.combinations()
poke_types = ['Bug', 'Fire', 'Grass', 'Water', 'Ghost']

from itertools import combinations

combos_obj = combinations(poke_types, 2)
print(type(combos_obj))

combos = [*combos_obj]
print(combos)

<class 'itertools.combinations'>
[('Bug', 'Fire'), ('Bug', 'Grass'), ('Bug', 'Water'), ('Bug', 'Ghost'), ('Fire', 'Grass'), ('Fire', 'Water'), ('Fire', 'Ghost'), ('Grass', 'Water'), ('Grass', 'Ghost'), ('Water', 'Ghost')]


### Exercise 1
1
```
# Combine names and primary_types
names_type1 = [*zip(names, primary_types)]

print(*names_type1[:5], sep='\n')
```
2
```
# Combine all three lists together
names_types = [*zip(names, primary_types, secondary_types)]

print(*names_types[:5], sep='\n')
```
3
```
# Combine five items from names and three items from primary_types
differing_lengths = [*zip(names[:5], primary_types[:3])]

print(*differing_lengths, sep='\n')
```
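
In step 3, `zip` stops as soon as the shortest input is exhausted, so the extra names are silently dropped. A small sketch with sample values (the course's `names` and `primary_types` lists are not reproduced here, and the `'unknown'` fill value is an arbitrary choice), contrasted with `itertools.zip_longest`:

```python
from itertools import zip_longest

# Illustrative sample values: five names but only three types.
sample_names = ['Abomasnow', 'Abra', 'Absol', 'Accelgor', 'Aerodactyl']
sample_types = ['Grass', 'Psychic', 'Dark']

# zip truncates to the shorter input.
truncated = [*zip(sample_names, sample_types)]
print(*truncated, sep='\n')

# zip_longest keeps every name and pads the missing types.
padded = [*zip_longest(sample_names, sample_types, fillvalue='unknown')]
print(*padded, sep='\n')
```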

### Exercise 2
1
```
# Collect the count of primary types
type_count = Counter(primary_types)
print(type_count, '\n')

# Collect the count of generations
gen_count = Counter(generations)
print(gen_count, '\n')

# Use list comprehension to get each Pokémon's starting letter
starting_letters = [name[0] for name in names]

# Collect the count of Pokémon for each starting_letter
starting_letters_count = Counter(starting_letters)
print(starting_letters_count)
```
2
```
# Import combinations from itertools
from itertools import combinations

# Create a combination object with pairs of Pokémon
combos_obj = combinations(pokemon, 2)
print(type(combos_obj), '\n')

# Convert combos_obj to a list by unpacking
combos_2 = [*combos_obj]
print(combos_2, '\n')

# Collect all possible combinations of 4 Pokémon directly into a list
combos_4 = combinations(pokemon, 4)
print([*combos_4])
```

## Set theory

- Branch of mathematics applied to collections of objects
  - i.e., sets
- Python has built-in set datatype with accompanying methods:
  - intersection(): all elements that are in both sets
  - difference(): all elements in one set but not the other
  - symmetric_difference(): all elements in exactly one set
  - union(): all elements that are in either set
- Fast membership testing
  - Check whether a value exists in a collection
  - Using the in operator
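
A rough sketch of the membership-testing difference, using the standard `timeit` module so it runs outside a notebook. The collection size and loop count are arbitrary choices for illustration, and the actual timings will vary by machine.

```python
import timeit

values_list = list(range(10_000))
values_set = set(values_list)
target = 9_999  # worst case for the list: the last element

# A list is scanned element by element; a set uses a hash lookup.
list_time = timeit.timeit(lambda: target in values_list, number=200)
set_time = timeit.timeit(lambda: target in values_set, number=200)

print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")
```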

```
# 720 Pokemon primary types corresponding to each Pokemon
primary_types = ['Grass', 'Psychic', 'Dark', 'Bug']

unique_types = []

for prim_type in primary_types:
  if prim_type not in unique_types:
    unique_types.append(prim_type)
print(unique_types)
```

```
# Better way
# 720 Pokemon primary types corresponding to each Pokemon
primary_types = ['Grass', 'Psychic', 'Dark', 'Bug']
unique_types_set = set(primary_types)
print(unique_types_set)
```

In [11]:
list_a = ['Bulbasaur', 'Charmander', 'Squirtle']
list_b = ['Caterpie', 'Pidgey', 'Squirtle']

in_common = []

for pokemon_a in list_a:
  for pokemon_b in list_b:
    if pokemon_a == pokemon_b:
      in_common.append(pokemon_a)

print(in_common)

['Squirtle']


In [13]:
list_a = ['Bulbasaur', 'Charmander', 'Squirtle']
list_b = ['Caterpie', 'Pidgey', 'Squirtle']

set_a = set(list_a)
print(set_a)

set_b = set(list_b)
print(set_b)

set_a.intersection(set_b)

{'Charmander', 'Squirtle', 'Bulbasaur'}
{'Caterpie', 'Squirtle', 'Pidgey'}


{'Squirtle'}

In [15]:
set_a.difference(set_b)

{'Bulbasaur', 'Charmander'}

In [16]:
set_b.intersection(set_a)

{'Squirtle'}

In [17]:
set_a.symmetric_difference(set_b)

{'Bulbasaur', 'Caterpie', 'Charmander', 'Pidgey'}

In [18]:
set_a.union(set_b)

{'Bulbasaur', 'Caterpie', 'Charmander', 'Pidgey', 'Squirtle'}
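
When both operands are already sets, the same operations can also be written with operators. A brief sketch using the same two collections of names:

```python
set_a = {'Bulbasaur', 'Charmander', 'Squirtle'}
set_b = {'Caterpie', 'Pidgey', 'Squirtle'}

in_both = set_a & set_b     # intersection
only_a = set_a - set_b      # difference
in_one = set_a ^ set_b      # symmetric difference
in_either = set_a | set_b   # union
print(in_both, only_a, in_one, in_either, sep='\n')

# The methods accept any iterable; the operators require sets on both sides.
from_list = set_a.intersection(['Squirtle', 'Pidgey'])
print(from_list)
```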

In [19]:
# Exercise 1
ash_pokedex = ['Pikachu', 'Bulbasaur', 'Koffing', 'Spearow', 'Vulpix', 'Wigglytuff', 'Zubat', 'Rattata', 'Psyduck', 'Squirtle']

misty_pokedex = ['Krabby', 'Horsea', 'Slowbro', 'Tentacool', 'Vaporeon', 'Magikarp', 'Poliwag', 'Starmie', 'Psyduck', 'Squirtle']

# Convert both lists to sets
ash_set = set(ash_pokedex)
misty_set = set(misty_pokedex)

# Find the Pokémon that exist in both sets
both = ash_set.intersection(misty_set)
print(both)

# Find the Pokémon that Ash has and Misty does not have
ash_only = ash_set.difference(misty_set)
print(ash_only)

# Find the Pokémon that are in only one set (not both)
unique_to_set = ash_set.symmetric_difference(misty_set)
print(unique_to_set)

{'Squirtle', 'Psyduck'}
{'Vulpix', 'Zubat', 'Koffing', 'Bulbasaur', 'Rattata', 'Wigglytuff', 'Pikachu', 'Spearow'}
{'Vulpix', 'Koffing', 'Starmie', 'Tentacool', 'Zubat', 'Magikarp', 'Bulbasaur', 'Vaporeon', 'Krabby', 'Rattata', 'Slowbro', 'Wigglytuff', 'Horsea', 'Pikachu', 'Spearow', 'Poliwag'}


In [21]:
# Exercise 2
brock_pokedex = ['Dugtrio', 'Tauros', 'Kabutops', 'Geodude', 'Omastar', 'Onix', 'Machop', 'Vulpix', 'Golem', 'Zubat']

# Convert Brock's Pokédex to a set
brock_pokedex_set = set(brock_pokedex)
print(brock_pokedex_set)

# Check if Psyduck is in Ash's list and Brock's set
print('Psyduck' in ash_pokedex)
print('Psyduck' in brock_pokedex_set)

# Check if Machop is in Ash's list and Brock's set
print('Machop' in ash_pokedex)
print('Machop' in brock_pokedex_set)

{'Vulpix', 'Zubat', 'Dugtrio', 'Onix', 'Tauros', 'Kabutops', 'Golem', 'Geodude', 'Machop', 'Omastar'}
True
False
False
True


### Exercise 3
```
# Use find_unique_items() to collect unique Pokémon names
uniq_names_func = find_unique_items(names)
print(len(uniq_names_func))

# Convert the names list to a set to collect unique Pokémon names
uniq_names_set = set(names)
print(len(uniq_names_set))

# Check that both unique collections are equivalent
print(sorted(uniq_names_func) == sorted(uniq_names_set))

# Use the best approach to collect unique primary types and generations
uniq_types = set(primary_types)
uniq_gens = set(generations)
print(uniq_types, uniq_gens, sep='\n')
```

## Eliminating loops
### Looping in Python
- Looping patterns:
  - for loop: iterate over a sequence piece-by-piece
  - while loop: repeat loop as long as condition is met
  - "nested" loops: use one loop inside another loop

### Benefits of eliminating loops
- Fewer lines of code
- Better code readability
  - "Flat is better than nested"
- Efficiency gains


```
# List of Pokemon's HP, ATK, DEF, SPD
poke_stats = [
  [90, 92, 75, 60],
  [25, 20, 15, 90],
  [65, 130, 60, 75],
  ...
]

# 1. For loop approach
totals = []
for row in poke_stats:
  totals.append(sum(row))

# 2. List comprehension // Fast
totals_comp = [sum(row) for row in poke_stats]

# 3. Built-in map() // Faster
totals_map = [*map(sum, poke_stats)]
```
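
A runnable sketch of the three approaches on the three sample rows shown above, timed with the standard `timeit` module. The function names and the loop count are choices made for this illustration. With so little data the differences are small and may vary between runs.

```python
import timeit

# The three sample rows of HP, ATK, DEF, SPD.
poke_stats = [
    [90, 92, 75, 60],
    [25, 20, 15, 90],
    [65, 130, 60, 75],
]

def totals_loop(rows):
    totals = []
    for row in rows:
        totals.append(sum(row))
    return totals

def totals_comp(rows):
    return [sum(row) for row in rows]

def totals_map(rows):
    return [*map(sum, rows)]

# Time each approach over the same number of calls.
for func in (totals_loop, totals_comp, totals_map):
    seconds = timeit.timeit(lambda: func(poke_stats), number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s for 10,000 calls")
```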

In [23]:
poke_types = ['Bug', 'Fire', 'Grass', 'Water', 'Ghost']

# Nested for loop
combos = []

for x in poke_types:
  for y in poke_types:
    if x == y:
      continue
    if (x, y) not in combos and (y, x) not in combos:
      combos.append((x, y))
print(combos)

# Better use this
from itertools import combinations
combos2 = [*combinations(poke_types, 2)]
print(combos2)

[('Bug', 'Fire'), ('Bug', 'Grass'), ('Bug', 'Water'), ('Bug', 'Ghost'), ('Fire', 'Grass'), ('Fire', 'Water'), ('Fire', 'Ghost'), ('Grass', 'Water'), ('Grass', 'Ghost'), ('Water', 'Ghost')]
[('Bug', 'Fire'), ('Bug', 'Grass'), ('Bug', 'Water'), ('Bug', 'Ghost'), ('Fire', 'Grass'), ('Fire', 'Water'), ('Fire', 'Ghost'), ('Grass', 'Water'), ('Grass', 'Ghost'), ('Water', 'Ghost')]


### Eliminate loops with NumPy
```
# List of Pokemon's HP, ATK, DEF, SPD
poke_stats = np.array([
  [90, 92, 75, 60],
  [25, 20, 15, 90],
  [65, 130, 60, 75],
  ...
])

avgs = []
for row in poke_stats:
  avg = np.mean(row)
  avgs.append(avg)
print(avgs)

# Same result, but faster without a loop
avgs_np = poke_stats.mean(axis=1)
print(avgs_np)
```
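
A self-contained version of the same idea, using only the three sample rows so that it runs as written. The `totals_np` line is an added illustration of the same `axis` argument applied to `sum`.

```python
import numpy as np

# The three sample rows of HP, ATK, DEF, SPD.
poke_stats = np.array([
    [90, 92, 75, 60],
    [25, 20, 15, 90],
    [65, 130, 60, 75],
])

# Loop approach: one np.mean call per row.
avgs = [np.mean(row) for row in poke_stats]

# Vectorized approach: axis=1 averages across the columns of each row.
avgs_np = poke_stats.mean(axis=1)
totals_np = poke_stats.sum(axis=1)

print(avgs_np)
print(totals_np)
```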

In [25]:
### Exercise 1
poke_names = ['Abomasnow', 'Abra', 'Absol', 'Accelgor', 'Aerodactyl', 'Aggron', 'Aipom', 'Alakazam', 'Alomomola', 'Altaria', 'Amaura', 'Ambipom', 'Amoonguss', 'Ampharos', 'Anorith', 'Arbok', 'Arcanine', 'Arceus', 'Archen', 'Archeops', 'Ariados', 'Armaldo', 'Aromatisse', 'Aron', 'Articuno', 'Audino', 'Aurorus', 'Avalugg', 'Axew', 'Azelf', 'Azumarill', 'Azurill', 'Bagon', 'Baltoy', 'Banette', 'Barbaracle', 'Barboach', 'Basculin', 'Bastiodon', 'Bayleef', 'Beartic', 'Beautifly', 'Beedrill', 'Beheeyem', 'Beldum', 'Bellossom', 'Bellsprout', 'Bergmite', 'Bibarel', 'Bidoof', 'Binacle', 'Bisharp', 'Blastoise', 'Blaziken', 'Blissey', 'Blitzle', 'Boldore', 'Bonsly', 'Bouffalant', 'Braixen', 'Braviary', 'Breloom', 'Bronzong', 'Bronzor', 'Budew', 'Buizel', 'Bulbasaur', 'Buneary', 'Bunnelby', 'Burmy', 'Butterfree', 'Cacnea', 'Cacturne', 'Camerupt', 'Carbink', 'Carnivine', 'Carracosta', 'Carvanha', 'Cascoon', 'Castform', 'Caterpie', 'Celebi', 'Chandelure', 'Chansey', 'Charizard', 'Charmander', 'Charmeleon', 'Chatot', 'Cherrim', 'Cherubi', 'Chesnaught', 'Chespin', 'Chikorita', 'Chimchar', 'Chimecho', 'Chinchou', 'Chingling', 'Cinccino', 'Clamperl', 'Clauncher', 'Clawitzer', 'Claydol', 'Clefable', 'Clefairy', 'Cleffa', 'Cloyster', 'Cobalion', 'Cofagrigus', 'Combee', 'Combusken', 'Conkeldurr', 'Corphish', 'Corsola', 'Cottonee', 'Cradily', 'Cranidos', 'Crawdaunt', 'Cresselia', 'Croagunk', 'Crobat', 'Croconaw', 'Crustle', 'Cryogonal', 'Cubchoo', 'Cubone', 'Cyndaquil', 'Darkrai', 'DarmanitanStandard Mode', 'DarmanitanZen Mode', 'Darumaka', 'Dedenne', 'Deerling', 'Deino', 'Delcatty', 'Delibird', 'Delphox', 'Dewgong', 'Dewott', 'Dialga', 'Diancie', 'Diggersby', 'Diglett', 'Ditto', 'Dodrio', 'Doduo', 'Donphan', 'Doublade', 'Dragalge', 'Dragonair', 'Dragonite', 'Drapion', 'Dratini', 'Drifblim', 'Drifloon', 'Drilbur', 'Drowzee', 'Druddigon', 'Ducklett', 'Dugtrio', 'Dunsparce', 'Duosion', 'Durant', 'Dusclops', 'Dusknoir', 'Duskull', 'Dustox', 'Dwebble', 'Eelektrik', 'Eelektross', 'Eevee', 
'Ekans', 'Electabuzz', 'Electivire', 'Electrike', 'Electrode', 'Elekid', 'Elgyem', 'Emboar', 'Emolga', 'Empoleon', 'Entei', 'Escavalier', 'Espeon', 'Espurr', 'Excadrill', 'Exeggcute', 'Exeggutor', 'Exploud', "Farfetch'd", 'Fearow', 'Feebas', 'Fennekin', 'Feraligatr', 'Ferroseed', 'Ferrothorn', 'Finneon', 'Flaaffy', 'Flabébé', 'Flareon', 'Fletchinder', 'Fletchling', 'Floatzel', 'Floette', 'Florges', 'Flygon', 'Foongus', 'Forretress', 'Fraxure', 'Frillish', 'Froakie', 'Frogadier', 'Froslass', 'Furfrou', 'Furret', 'Gabite', 'Gallade', 'Galvantula', 'Garbodor', 'Garchomp', 'Gardevoir', 'Gastly', 'Gastrodon', 'Genesect', 'Gengar', 'Geodude', 'Gible', 'Gigalith', 'Girafarig', 'Glaceon', 'Glalie', 'Glameow', 'Gligar', 'Gliscor', 'Gloom', 'Gogoat', 'Golbat', 'Goldeen', 'Golduck', 'Golem', 'Golett', 'Golurk', 'Goodra', 'Goomy', 'Gorebyss', 'Gothita', 'Gothitelle', 'Gothorita', 'Granbull', 'Graveler', 'Greninja', 'Grimer', 'Grotle', 'Groudon', 'GroudonPrimal Groudon', 'Grovyle', 'Growlithe', 'Grumpig', 'Gulpin', 'Gurdurr', 'Gyarados', 'Happiny', 'Hariyama', 'Haunter', 'Hawlucha', 'Haxorus', 'Heatmor', 'Heatran', 'Heliolisk', 'Helioptile', 'Heracross', 'Herdier', 'Hippopotas', 'Hippowdon', 'Hitmonchan', 'Hitmonlee', 'Hitmontop', 'Ho-oh', 'Honchkrow', 'Honedge', 'Hoothoot', 'Hoppip', 'Horsea', 'Houndoom', 'Houndour', 'Huntail', 'Hydreigon', 'Hypno', 'Igglybuff', 'Illumise', 'Infernape', 'Inkay', 'Ivysaur', 'Jellicent', 'Jigglypuff', 'Jirachi', 'Jolteon', 'Joltik', 'Jumpluff', 'Jynx', 'Kabuto', 'Kabutops', 'Kadabra', 'Kakuna', 'Kangaskhan', 'Karrablast', 'Kecleon', 'Kingdra', 'Kingler', 'Kirlia', 'Klang', 'Klefki', 'Klink', 'Klinklang', 'Koffing', 'Krabby', 'Kricketot', 'Kricketune', 'Krokorok', 'Krookodile', 'Kyogre', 'KyogrePrimal Kyogre', 'Kyurem', 'KyuremBlack Kyurem', 'KyuremWhite Kyurem', 'Lairon', 'Lampent', 'Lanturn', 'Lapras', 'Larvesta', 'Larvitar', 'Latias', 'Latios', 'Leafeon', 'Leavanny', 'Ledian', 'Ledyba', 'Lickilicky', 'Lickitung', 'Liepard', 'Lileep', 
'Lilligant', 'Lillipup', 'Linoone', 'Litleo', 'Litwick', 'Lombre', 'Lopunny', 'Lotad', 'Loudred', 'Lucario', 'Ludicolo', 'Lugia', 'Lumineon', 'Lunatone', 'Luvdisc', 'Luxio', 'Luxray', 'Machamp', 'Machoke', 'Machop', 'Magby', 'Magcargo', 'Magikarp', 'Magmar', 'Magmortar', 'Magnemite', 'Magneton', 'Magnezone', 'Makuhita', 'Malamar', 'Mamoswine', 'Manaphy', 'Mandibuzz', 'Manectric', 'Mankey', 'Mantine', 'Mantyke', 'Maractus', 'Mareep', 'Marill', 'Marowak', 'Marshtomp', 'Masquerain', 'Mawile', 'Medicham', 'Meditite', 'MeowsticFemale', 'MeowsticMale', 'Meowth', 'Mesprit', 'Metagross', 'Metang', 'Metapod', 'Mew', 'Mewtwo', 'Mienfoo', 'Mienshao', 'Mightyena', 'Milotic', 'Miltank', 'Mime Jr.', 'Minccino', 'Minun', 'Misdreavus', 'Mismagius', 'Moltres', 'Monferno', 'Mothim', 'Mr. Mime', 'Mudkip', 'Muk', 'Munchlax', 'Munna', 'Murkrow', 'Musharna', 'Natu', 'Nidoking', 'Nidoqueen', 'Nidoran♀', 'Nidoran♂', 'Nidorina', 'Nidorino', 'Nincada', 'Ninetales', 'Ninjask', 'Noctowl', 'Noibat', 'Noivern', 'Nosepass', 'Numel', 'Nuzleaf', 'Octillery', 'Oddish', 'Omanyte', 'Omastar', 'Onix', 'Oshawott', 'Pachirisu', 'Palkia', 'Palpitoad', 'Pancham', 'Pangoro', 'Panpour', 'Pansage', 'Pansear', 'Paras', 'Parasect', 'Patrat', 'Pawniard', 'Pelipper', 'Persian', 'Petilil', 'Phanpy', 'Phantump', 'Phione', 'Pichu', 'Pidgeot', 'Pidgeotto', 'Pidgey', 'Pidove', 'Pignite', 'Pikachu', 'Piloswine', 'Pineco', 'Pinsir', 'Piplup', 'Plusle', 'Politoed', 'Poliwag', 'Poliwhirl', 'Poliwrath', 'Ponyta', 'Poochyena', 'Porygon', 'Porygon-Z', 'Porygon2', 'Primeape', 'Prinplup', 'Probopass', 'Psyduck', 'Pupitar', 'Purrloin', 'Purugly', 'Pyroar', 'Quagsire', 'Quilava', 'Quilladin', 'Qwilfish', 'Raichu', 'Raikou', 'Ralts', 'Rampardos', 'Rapidash', 'Raticate', 'Rattata', 'Rayquaza', 'Regice', 'Regigigas', 'Regirock', 'Registeel', 'Relicanth', 'Remoraid', 'Reshiram', 'Reuniclus', 'Rhydon', 'Rhyhorn', 'Rhyperior', 'Riolu', 'Roggenrola', 'Roselia', 'Roserade', 'Rotom', 'RotomFan Rotom', 'RotomFrost Rotom', 'RotomHeat 
Rotom', 'RotomMow Rotom', 'RotomWash Rotom', 'Rufflet', 'Sableye', 'Salamence', 'Samurott', 'Sandile', 'Sandshrew', 'Sandslash', 'Sawk', 'Sawsbuck', 'Scatterbug', 'Sceptile', 'Scizor', 'Scolipede', 'Scrafty', 'Scraggy', 'Scyther', 'Seadra', 'Seaking', 'Sealeo', 'Seedot', 'Seel', 'Seismitoad', 'Sentret', 'Serperior', 'Servine', 'Seviper', 'Sewaddle', 'Sharpedo', 'Shedinja', 'Shelgon', 'Shellder', 'Shellos', 'Shelmet', 'Shieldon', 'Shiftry', 'Shinx', 'Shroomish', 'Shuckle', 'Shuppet', 'Sigilyph', 'Silcoon', 'Simipour', 'Simisage', 'Simisear', 'Skarmory', 'Skiddo', 'Skiploom', 'Skitty', 'Skorupi', 'Skrelp', 'Skuntank', 'Slaking', 'Slakoth', 'Sliggoo', 'Slowbro', 'Slowking', 'Slowpoke', 'Slugma', 'Slurpuff', 'Smeargle', 'Smoochum', 'Sneasel', 'Snivy', 'Snorlax', 'Snorunt', 'Snover', 'Snubbull', 'Solosis', 'Solrock', 'Spearow', 'Spewpa', 'Spheal', 'Spinarak', 'Spinda', 'Spiritomb', 'Spoink', 'Spritzee', 'Squirtle', 'Stantler', 'Staraptor', 'Staravia', 'Starly', 'Starmie', 'Staryu', 'Steelix', 'Stoutland', 'Stunfisk', 'Stunky', 'Sudowoodo', 'Suicune', 'Sunflora', 'Sunkern', 'Surskit', 'Swablu', 'Swadloon', 'Swalot', 'Swampert', 'Swanna', 'Swellow', 'Swinub', 'Swirlix', 'Swoobat', 'Sylveon', 'Taillow', 'Talonflame', 'Tangela', 'Tangrowth', 'Tauros', 'Teddiursa', 'Tentacool', 'Tentacruel', 'Tepig', 'Terrakion', 'Throh', 'Timburr', 'Tirtouga', 'Togekiss', 'Togepi', 'Togetic', 'Torchic', 'Torkoal', 'Torterra', 'Totodile', 'Toxicroak', 'Tranquill', 'Trapinch', 'Treecko', 'Trevenant', 'Tropius', 'Trubbish', 'Turtwig', 'Tympole', 'Tynamo', 'Typhlosion', 'Tyranitar', 'Tyrantrum', 'Tyrogue', 'Tyrunt', 'Umbreon', 'Unfezant', 'Unown', 'Ursaring', 'Uxie', 'Vanillish', 'Vanillite', 'Vanilluxe', 'Vaporeon', 'Venipede', 'Venomoth', 'Venonat', 'Venusaur', 'Vespiquen', 'Vibrava', 'Victini', 'Victreebel', 'Vigoroth', 'Vileplume', 'Virizion', 'Vivillon', 'Volbeat', 'Volcanion', 'Volcarona', 'Voltorb', 'Vullaby', 'Vulpix', 'Wailmer', 'Wailord', 'Walrein', 'Wartortle', 'Watchog', 'Weavile', 
'Weedle', 'Weepinbell', 'Weezing', 'Whimsicott', 'Whirlipede', 'Whiscash', 'Whismur', 'Wigglytuff', 'Wingull', 'Wobbuffet', 'Woobat', 'Wooper', 'WormadamPlant Cloak', 'WormadamSandy Cloak', 'WormadamTrash Cloak', 'Wurmple', 'Wynaut', 'Xatu', 'Xerneas', 'Yamask', 'Yanma', 'Yanmega', 'Yveltal', 'Zangoose', 'Zapdos', 'Zebstrika', 'Zekrom', 'Zigzagoon', 'Zoroark', 'Zorua', 'Zubat', 'Zweilous']
poke_gens = [4, 1, 3, 5, 1, 3, 2, 1, 5, 3, 6, 4, 5, 2, 3, 1, 1, 4, 5, 5, 2, 3, 6, 3, 1, 5, 6, 6, 5, 4, 2, 3, 3, 3, 3, 6, 3, 5, 4, 2, 5, 3, 1, 5, 3, 2, 1, 6, 4, 4, 6, 5, 1, 3, 2, 5, 5, 4, 5, 6, 5, 3, 4, 4, 4, 4, 1, 4, 6, 4, 1, 3, 3, 3, 6, 4, 5, 3, 3, 3, 1, 2, 5, 1, 1, 1, 1, 4, 4, 4, 6, 6, 2, 4, 3, 2, 4, 5, 3, 6, 6, 3, 1, 1, 2, 1, 5, 5, 4, 3, 5, 3, 2, 5, 3, 4, 3, 4, 4, 2, 2, 5, 5, 5, 1, 2, 4, 5, 5, 5, 6, 5, 5, 3, 2, 6, 1, 5, 4, 6, 6, 1, 1, 1, 1, 2, 6, 6, 1, 1, 4, 1, 4, 4, 5, 1, 5, 5, 1, 2, 5, 5, 3, 4, 3, 3, 5, 5, 5, 1, 1, 1, 4, 3, 1, 2, 5, 5, 5, 4, 2, 5, 2, 6, 5, 1, 1, 3, 1, 1, 3, 6, 2, 5, 5, 4, 2, 6, 1, 6, 6, 4, 6, 6, 3, 5, 2, 5, 5, 6, 6, 4, 6, 2, 4, 4, 5, 5, 4, 3, 1, 4, 5, 1, 1, 4, 5, 2, 4, 3, 4, 2, 4, 1, 6, 1, 1, 1, 1, 5, 5, 6, 6, 3, 5, 5, 5, 2, 1, 6, 1, 4, 3, 3, 3, 1, 3, 3, 5, 1, 4, 3, 1, 6, 5, 5, 4, 6, 6, 2, 5, 4, 4, 1, 1, 2, 2, 4, 6, 2, 2, 1, 2, 2, 3, 5, 1, 2, 3, 4, 6, 1, 5, 1, 3, 1, 5, 2, 1, 1, 1, 1, 1, 1, 5, 3, 2, 1, 3, 5, 6, 5, 5, 1, 1, 4, 4, 5, 5, 3, 3, 5, 5, 5, 3, 5, 2, 1, 5, 2, 3, 3, 4, 5, 2, 2, 4, 1, 5, 3, 5, 5, 3, 6, 5, 3, 4, 3, 3, 4, 3, 2, 4, 3, 3, 4, 4, 1, 1, 1, 2, 2, 1, 1, 4, 1, 1, 4, 3, 6, 4, 4, 5, 3, 1, 2, 4, 5, 2, 2, 1, 3, 3, 3, 3, 3, 6, 6, 1, 4, 3, 3, 1, 1, 1, 5, 5, 3, 3, 2, 4, 5, 3, 2, 4, 1, 4, 4, 1, 3, 1, 4, 5, 2, 5, 2, 1, 1, 1, 1, 1, 1, 3, 1, 3, 2, 6, 6, 3, 3, 3, 2, 1, 1, 1, 1, 5, 4, 4, 5, 6, 6, 5, 5, 5, 1, 1, 5, 5, 3, 1, 5, 2, 6, 4, 2, 1, 1, 1, 5, 5, 1, 2, 2, 1, 4, 3, 2, 1, 1, 1, 1, 3, 1, 4, 2, 1, 4, 4, 1, 2, 5, 4, 6, 2, 2, 6, 2, 1, 2, 3, 4, 1, 1, 1, 3, 3, 4, 3, 3, 3, 2, 5, 5, 1, 1, 4, 4, 5, 3, 4, 4, 4, 4, 4, 4, 4, 5, 3, 3, 5, 5, 1, 1, 5, 5, 6, 3, 2, 5, 5, 5, 1, 1, 1, 3, 3, 1, 5, 2, 5, 5, 3, 5, 3, 3, 3, 1, 4, 5, 4, 3, 4, 3, 2, 3, 5, 3, 5, 5, 5, 2, 6, 2, 3, 4, 6, 4, 3, 3, 6, 1, 2, 1, 2, 6, 2, 2, 2, 5, 1, 3, 4, 2, 5, 3, 1, 6, 3, 2, 3, 4, 3, 6, 1, 2, 4, 4, 4, 1, 1, 2, 5, 5, 4, 2, 2, 2, 2, 3, 3, 5, 3, 3, 5, 3, 2, 6, 5, 6, 3, 6, 1, 4, 1, 2, 1, 1, 5, 5, 5, 5, 5, 4, 2, 2, 3, 3, 4, 2, 4, 5, 3, 3, 6, 3, 5, 4, 5, 5, 2, 2, 6, 2, 6, 2, 5, 2, 2, 4, 5, 5, 
5, 1, 5, 1, 1, 1, 4, 3, 5, 1, 3, 1, 5, 6, 3, 6, 5, 1, 5, 1, 3, 3, 3, 1, 5, 4, 1, 1, 1, 5, 5, 3, 3, 1, 3, 2, 5, 2, 4, 4, 4, 3, 3, 2, 6, 5, 2, 4, 6, 3, 1, 5, 5, 3, 5, 5, 1, 5]
# Collect Pokémon that belong to generation 1 or generation 2
gen1_gen2_pokemon = [name for name,gen in zip(poke_names, poke_gens) if gen < 3]

# Create a map object that stores the name lengths
name_lengths_map = map(len, gen1_gen2_pokemon)

# Combine gen1_gen2_pokemon and name_lengths_map into a list
gen1_gen2_name_lengths = [*zip(gen1_gen2_pokemon, name_lengths_map)]

print(gen1_gen2_name_lengths[:5])

[('Abra', 4), ('Aerodactyl', 10), ('Aipom', 5), ('Alakazam', 8), ('Ampharos', 8)]


### Exercise 2
```
# Create a total stats array
total_stats_np = stats.sum(axis=1)

# Create an average stats array
avg_stats_np = stats.mean(axis=1)

# Combine names, total_stats_np, and avg_stats_np into a list
poke_list_np = [*zip(names, total_stats_np, avg_stats_np)]

print(poke_list_np == poke_list, '\n')
print(poke_list_np[:3])
print(poke_list[:3], '\n')
top_3 = sorted(poke_list_np, key=lambda x: x[1], reverse=True)[:3]
print('3 strongest Pokémon:\n{}'.format(top_3))
```

## Writing better loops
- Understand what is being done with each loop iteration
- Move one-time calculations outside (above) the loop
- Use holistic conversions outside (below) the loop
- Anything that only needs to be done once should be outside the loop

In [26]:
import numpy as np

names = ['Absol', 'Aron', 'Jynx', 'Natu', 'Onix']
attacks = np.array([130, 70, 50, 50, 45])
# Calculate total avg once (outside the loop)
total_attack_avg = attacks.mean()
for pokemon, attack in zip(names, attacks):
  if attack > total_attack_avg:
    print("{}'s attacks {} > avg: {}!"
    .format(pokemon, attack, total_attack_avg)
    )

Absol's attacks 130 > avg: 69.0!
Aron's attacks 70 > avg: 69.0!


In [28]:
# Using holistic conversions
names = ['Pikachu', 'Squirtle', 'Articuno']
legend_status = [False, False, True]
generations = [1, 1, 1]
poke_data_tuples = []
for poke_tuple in zip(names, legend_status, generations):
  poke_data_tuples.append(poke_tuple)
poke_data = [*map(list, poke_data_tuples)]
print(poke_data)

[['Pikachu', False, 1], ['Squirtle', False, 1], ['Articuno', True, 1]]


### Exercise
1
```
# Import Counter
from collections import Counter

# Collect the count of each generation
gen_counts = Counter(generations)

# Improve for loop by moving one calculation above the loop
total_count = len(generations)

for gen,count in gen_counts.items():
    gen_percent = round(count / total_count * 100, 2)
    print('generation {}: count = {:3} percentage = {}'
          .format(gen, count, gen_percent))
```
2
```
# Import combinations (needed for the call below)
from itertools import combinations

# Collect all possible pairs using combinations()
possible_pairs = [*combinations(pokemon_types, 2)]

# Create an empty list called enumerated_tuples
enumerated_tuples = []

# Append each enumerated_pair_tuple to the empty list above
for i,pair in enumerate(possible_pairs, 1):
    enumerated_pair_tuple = (i,) + pair
    enumerated_tuples.append(enumerated_pair_tuple)

# Convert all tuples in enumerated_tuples to a list
enumerated_pairs = [*map(list, enumerated_tuples)]
print(enumerated_pairs)
```
3
```
# Calculate the total HP avg and total HP standard deviation
hp_avg = hps.mean()
hp_std = hps.std()

# Use NumPy to eliminate the previous for loop
z_scores = (hps - hp_avg)/hp_std

# Combine names, hps, and z_scores
poke_zscores2 = [*zip(names, hps, z_scores)]
print(*poke_zscores2[:3], sep='\n')

# Use list comprehension with the same logic as the highest_hp_pokemon code block
# (loop variable named z_score so it does not shadow the z_scores array)
highest_hp_pokemon2 = [(name, hp, z_score) for name,hp,z_score in poke_zscores2 if z_score > 2]
print(*highest_hp_pokemon2, sep='\n')
```

## Intro to pandas DataFrame iteration
### Pandas
- Library used for data analysis
- Main data structure is the DataFrame
  - Tabular data with labeled rows and columns
  - Built on top of the NumPy array structure
- Chapter objective:
  - Best practice for iterating over a pandas DataFrame

Iterating with .iterrows() \
.iterrows() returns each DataFrame row as an (index, pandas Series) tuple.
```
import pandas as pd
import numpy as np

baseball_df = pd.read_csv('baseball_df.csv')

def calc_win_perc(wins, games_played):

  win_perc = wins / games_played

  return np.round(win_perc, 2)

win_perc_list = []

for i, row in baseball_df.iterrows():
  wins = row['W'] # col named W
  games_played = row['G'] # col named G

  win_perc = calc_win_perc(wins, games_played)
  win_perc_list.append(win_perc)

baseball_df['WP'] = win_perc_list
```
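
The loop above depends on `baseball_df.csv`, which is not included in these notes. As a minimal sketch of the same pattern, the block below uses a small made-up DataFrame (`demo_df` and its values are hypothetical, not course data) to show what each `.iterrows()` pair contains.

```python
import pandas as pd

# Hypothetical stand-in for baseball_df; the numbers are made up
demo_df = pd.DataFrame({
    'Team': ['AAA', 'BBB', 'CCC'],
    'W': [90, 81, 60],
    'G': [162, 162, 150],
})

win_percs = []
for idx, row in demo_df.iterrows():
    # idx is the row label; row is a pandas Series keyed by column name
    win_percs.append(round(row['W'] / row['G'], 2))

# Holistic conversion: attach the whole list once, after the loop
demo_df['WP'] = win_percs
print(demo_df)
```

Because every row is materialized as a Series, this approach tends to be the slowest of the ones covered in this chapter.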

### Exercise
```
# Create an empty list to store run differentials
run_diffs = []

def calc_run_diff(runs_scored, runs_allowed):

    run_diff = runs_scored - runs_allowed

    return run_diff

# Write a for loop and collect runs allowed and runs scored for each row
for i,row in giants_df.iterrows():
    runs_scored = row['RS']
    runs_allowed = row['RA']
    
    # Use the provided function to calculate run_diff for each row
    run_diff = calc_run_diff(runs_scored, runs_allowed)
    
    # Append each run differential to the output list
    run_diffs.append(run_diff)

giants_df['RD'] = run_diffs
print(giants_df)
```

## Another iterator method: .itertuples()
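.itertuples() is often more efficient than .iterrows(). \
We can use .itertuples() to loop over DataFrame rows instead. It returns each row as a namedtuple, so values are accessed by attribute (for example row.Team) rather than by key.

- To print a column from a DataFrame: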

```
## .iterrows()
teams_wins_df = pd.read_csv('teams_wins_df.csv')

for row_tuple in teams_wins_df.iterrows():
  print(row_tuple[1]['Team'])
```

```
## .itertuples()
teams_wins_df = pd.read_csv('teams_wins_df.csv')

for row_tuple in teams_wins_df.itertuples():
  print(row_tuple.Team)
```
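
To make the namedtuple structure concrete, here is a small sketch on made-up data (`demo_df` is a hypothetical stand-in for `teams_wins_df`). It shows that each row exposes the index as `Index` and each column as an attribute, and that positional access still works.

```python
import pandas as pd

# Hypothetical stand-in for teams_wins_df; the values are made up
demo_df = pd.DataFrame({'Team': ['AAA', 'BBB'], 'W': [90, 81]})

rows = list(demo_df.itertuples())
first = rows[0]

print(first)                            # the full namedtuple for the first row
print(first._fields)                    # field names: Index plus the column names
print(first.Index, first.Team, first.W) # attribute access
print(first[1])                         # positional access: position 0 is the index
```

Attribute access only works when a column name is a valid Python identifier; pandas renames other columns to positional names, so key-style lookups such as `row['Team']` do not apply here.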

### Exercise
1
```
# Loop over the DataFrame and print each row's Index, Year and Wins (W)

for row in rangers_df.itertuples():
  i = row.Index
  year = row.Year
  wins = row.W
  
  # Check if rangers made Playoffs (1 means yes; 0 means no)
  if row.Playoffs == 1:
    print(i, year, wins)
```
2
```
yankees_df = pd.read_csv('yankees_df.csv')

run_diffs = []

def calc_run_diff(runs_scored, runs_allowed):

  run_diff = runs_scored - runs_allowed

  return run_diff

# Loop over the DataFrame and calculate each row's run differential
for row in yankees_df.itertuples():
  
  runs_scored = row.RS
  runs_allowed = row.RA

  run_diff = calc_run_diff(runs_scored, runs_allowed)
  
  run_diffs.append(run_diff)

# Append new column
yankees_df['RD'] = run_diffs
print(yankees_df)
```

### pandas alternative to looping
This lesson explores alternatives to looping with .iterrows() and .itertuples().

### Looping with .iterrows() is not the most efficient way to add a new column
```
baseball_df = pd.read_csv('baseball_df.csv')

run_diffs_iterrows = []

def calc_run_diff(runs_scored, runs_allowed):

    run_diff = runs_scored - runs_allowed

    return run_diff

for i,row in baseball_df.iterrows():

    run_diff = calc_run_diff(row['RS'], row['RA'])
    
    run_diffs_iterrows.append(run_diff)

baseball_df['RD'] = run_diffs_iterrows
print(baseball_df)
```

### Use .apply() method
- Takes a function and applies it to a DataFrame
  - Must specify an axis to apply along (0 for columns; 1 for rows)
- Can be used with anonymous functions (lambda functions)

Example:
```
run_diffs_apply = baseball_df.apply(
    lambda row: calc_run_diff(row['RS'], row['RA']),
    axis=1)

baseball_df['RD'] = run_diffs_apply
print(baseball_df)
```
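
The effect of the `axis` argument is easier to see on a tiny table. The sketch below uses made-up numbers (`demo_df` is hypothetical) and contrasts applying a function per column with applying it per row.

```python
import pandas as pd

# Hypothetical runs scored / runs allowed; the values are made up
demo_df = pd.DataFrame({'RS': [700, 650, 800], 'RA': [600, 700, 750]})

# axis=0: the function receives each column, so we get one value per column
col_totals = demo_df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row, so we get one value per row
row_totals = demo_df.apply(lambda row: row.sum(), axis=1)

# Row-wise run differential, same shape as the calc_run_diff example
run_diffs = demo_df.apply(lambda row: row['RS'] - row['RA'], axis=1)

print(col_totals, row_totals, run_diffs, sep='\n\n')
```

With `axis=1` the result is a Series aligned to the DataFrame's index, so it can be assigned directly as a new column.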

### Exercise
1
```
# Gather total runs scored in all games per year
total_runs_scored = rays_df[['RS', 'RA']].apply(sum, axis=1)
print(total_runs_scored)
```
2
```
# Display the first five rows of the DataFrame
print(dbacks_df.head())

# Create a win percentage Series
win_percs = dbacks_df.apply(lambda row: calc_win_perc(row['W'], row['G']), axis=1)
print(win_percs, '\n')

# Append a new column to dbacks_df
dbacks_df['WP'] = win_percs
print(dbacks_df, '\n')

# Display dbacks_df where WP is greater than or equal to 0.50
print(dbacks_df[dbacks_df['WP'] >= 0.50])
```

### Optimal pandas iterating
pandas internals
- Eliminating loops applies to using pandas as well
- pandas is built on NumPy
  - Take advantage of NumPy array efficiencies

```
wins_np = baseball_df['W'].values
print(type(wins_np))
## prints <class 'numpy.ndarray'>
```

Power of vectorization
- Broadcasting (vectorizing) is extremely efficient!
- Instead of looping over a DataFrame row by row with .iterrows(), .itertuples(), or .apply(), we can perform the calculation on the underlying NumPy arrays all at once.

```
baseball_df['RS'].values - baseball_df['RA'].values
```

Run differentials with arrays
```
run_diffs_np = baseball_df['RS'].values - baseball_df['RA'].values
baseball_df['RD'] = run_diffs_np
print(baseball_df)
```
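
As a rough way to check the efficiency claim yourself, the sketch below runs the four approaches from this chapter on randomly generated data and times each one. The DataFrame, the helper function names, and the row count are all assumptions made for illustration; the timings will vary by machine and pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 1,000 rows of random runs scored / runs allowed
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    'RS': rng.integers(500, 900, size=1000),
    'RA': rng.integers(500, 900, size=1000),
})

def with_iterrows(df):
    return [row['RS'] - row['RA'] for _, row in df.iterrows()]

def with_itertuples(df):
    return [row.RS - row.RA for row in df.itertuples()]

def with_apply(df):
    return df.apply(lambda row: row['RS'] - row['RA'], axis=1).tolist()

def with_numpy(df):
    # Vectorized: one subtraction over the underlying arrays
    return (df['RS'].values - df['RA'].values).tolist()

approaches = (with_iterrows, with_itertuples, with_apply, with_numpy)

# All four approaches should agree on the run differentials
results = [f(demo_df) for f in approaches]
print(all(r == results[0] for r in results))

# Time each approach (three repetitions each)
for f in approaches:
    seconds = timeit.timeit(lambda: f(demo_df), number=3)
    print('{:<16} {:.4f} s'.format(f.__name__, seconds))
```

In a notebook, `%timeit` on each call gives the same kind of comparison with less code.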

### Exercise
1
```
# Use the W array and G array to calculate win percentages
win_percs_np = calc_win_perc(baseball_df['W'].values, baseball_df['G'].values)

# Append a new column to baseball_df that stores all win percentages
baseball_df['WP'] = win_percs_np

print(baseball_df.head())
```
2
```
win_perc_preds_loop = []

# 1. Use a loop and .itertuples() to collect each row's predicted win percentage
for row in baseball_df.itertuples():
    runs_scored = row.RS
    runs_allowed = row.RA
    win_perc_pred = predict_win_perc(runs_scored, runs_allowed)
    win_perc_preds_loop.append(win_perc_pred)
baseball_df['WP_preds_loop'] = win_perc_preds_loop

# 2. Apply predict_win_perc to each row of the DataFrame
win_perc_preds_apply = baseball_df.apply(lambda row: predict_win_perc(row['RS'], row['RA']), axis=1)
baseball_df['WP_preds_apply'] = win_perc_preds_apply

# 3. Calculate the win percentage predictions using NumPy arrays
win_perc_preds_np = predict_win_perc(baseball_df['RS'].values, baseball_df['RA'].values)
baseball_df['WP_preds'] = win_perc_preds_np
print(baseball_df.head())
```