### Efficiently combining, counting, and iterating


1. Efficiently combining, counting, and iterating

In this lesson, we'll cover combining, counting, and iterating over objects efficiently. Let's begin by exploring a dataset from the popular Nintendo video game Pokémon.

2. Pokémon Overview

The Pokémon game centers around players, called trainers, who try to collect fictional animals called Pokémon.
These Pokémon animals roam the fictional universe where the game takes place. When a trainer encounters a Pokémon, they try to capture that Pokémon to add to their collection.
If a trainer successfully captures a Pokémon, it is stored in a tool called a Pokédex.

3. Pokémon Description

Each Pokémon comes with its own set of metadata.
This metadata contains a name for each Pokémon. It also has the generation of each Pokémon specifying what version of the game the Pokémon appears in. Here, Squirtle, a Pokémon from generation one, is shown.
The metadata also includes the Pokémon's Type and whether or not it belongs to a special category called Legendary.
Each Pokémon has a set of statistics that are numerical values for certain categories like Health Points (called HP), Attack, and others. We'll use a dataset that contains pieces of this metadata for the remainder of the chapter.

4. Combining objects

Suppose we have two lists: one of Pokémon names and another of each Pokémon's Health Points. We want to combine these lists so that each Pokémon is stored next to its Health Points. We can iterate over the names list using enumerate and grab each Pokémon's corresponding Health Points using the index variable.

    > names = ['Bulbasaur', 'Charmander', 'Squirtle']
    
    > hps = [45, 39, 44]
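
The enumerate-based loop described above is not shown in these notes. A minimal sketch of what it might look like, using the same two lists (the variable name `combined` is an assumption):

```python
names = ['Bulbasaur', 'Charmander', 'Squirtle']
hps = [45, 39, 44]

combined = []  # hypothetical name for the output list

# enumerate yields (index, item) pairs, so the index can be used
# to look up the matching Health Points in the second list
for i, pokemon in enumerate(names):
    combined.append((pokemon, hps[i]))

print(combined)
```

This works, but it relies on manual indexing into the second list.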
    
5. Combining objects with zip

But Python's built-in function zip provides a more elegant solution. The name "zip" describes how this function combines objects like a zipper on a jacket (making two separate things become one). zip returns a zip object that must be unpacked into a list and printed to see the contents. Each item is a tuple of elements from the original lists.

    > combined_zip = zip(names,hps)
    
    > combined_zip_list = [*combined_zip]

    > print(combined_zip_list)

        >> [('Bulbasaur', 45), ('Charmander', 39), ('Squirtle', 44)]
    
6. The collections module

Python comes with a number of efficient built-in modules. The collections module contains specialized datatypes that can be used as alternatives to standard dictionaries, lists, sets, and tuples.
A few notable specialized datatypes are namedtuple, deque, Counter, OrderedDict, and defaultdict. Let's dig deeper into the Counter object.

7. Counting with loop

Our Pokémon dataset describes 720 characters. Here is a list of each Pokémon's corresponding type. We'd like to create a dictionary where each key is a Pokémon type, and each value is the count of characters that belong to that type. Using a standard dictionary approach, we have to instantiate an empty output dictionary. Then, we iterate over the poke_types list and check whether or not each poke_type exists within the type_counts dictionary. If the poke_type is not in the dictionary, we create a new key and initialize its count value as one. If the poke_type is already in the dictionary, we update the count by one.

    > poke_types = ['Grass', 'Dark', 'Fire', 'Fire', ...]
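
The standard dictionary approach described above could be sketched as follows. The short `poke_types` list here is a made-up sample standing in for the full (elided) list:

```python
# Hypothetical short sample; the real list has one entry per Pokémon
poke_types = ['Grass', 'Dark', 'Fire', 'Fire', 'Grass', 'Fire']

type_counts = {}

for poke_type in poke_types:
    if poke_type not in type_counts:
        # First time this type is seen: create the key with a count of one
        type_counts[poke_type] = 1
    else:
        # Type already seen: increase its count by one
        type_counts[poke_type] += 1

print(type_counts)
```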

8. collections.Counter()

Using Counter from the collections module is a more efficient approach. Just import Counter and provide the object to be counted. No need for a loop! Counter returns a Counter dictionary of key-value pairs. When printed, it's ordered from highest to lowest count. If we compared runtimes, we'd see that using Counter takes roughly half the time of the standard dictionary approach!

    > from collections import Counter
    
    > type_counts = Counter(poke_types)
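
As a small self-contained illustration of the Counter approach, here is a sketch on a made-up sample list (the sample values are assumptions, not the real dataset):

```python
from collections import Counter

# Hypothetical short sample standing in for the full list of types
poke_types = ['Grass', 'Dark', 'Fire', 'Fire', 'Grass', 'Fire']

type_counts = Counter(poke_types)

# A Counter behaves like a dictionary of item -> count
print(type_counts['Fire'])

# most_common(n) returns the n highest counts as (item, count) tuples
top_type = type_counts.most_common(1)
print(top_type)
```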
    
9. The itertools module

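Another built-in module, itertools, contains functional tools for working with iterators. These include infinite iterators (count, cycle, repeat), finite iterators (accumulate, chain, zip_longest), and combinatoric generators (product, permutations, combinations).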
We'll focus on one piece of this module: the combinatoric generators. These generators efficiently yield Cartesian products, permutations, and combinations of objects. Let's explore an example.

10. Combinations with loop

Suppose we want to gather all combination pairs of Pokémon types possible. We can do this with a nested for loop that iterates over the poke_types list twice. Notice that a conditional statement is used to skip pairs having the same type twice. For example, if x is 'Bug' and y is 'Bug', we want to skip this pair. Since we're interested in combinations (where order doesn't matter), another statement is used to ensure either order of the pair doesn't already exist within the combos list before appending it. For example, the pair ('Bug', 'Fire') is the same as the pair ('Fire', 'Bug'). We want one of these pairs, not both.

    > poke_types = ['Bug', 'Fire', 'Ghost', 'Grass', 'Water']
    
    > combos = []
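
The nested loop itself is not shown in these notes. One way it might be written, following the two conditions described above (a sketch, not necessarily the original code):

```python
poke_types = ['Bug', 'Fire', 'Ghost', 'Grass', 'Water']
combos = []

for x in poke_types:
    for y in poke_types:
        # Skip pairs that contain the same type twice, e.g. ('Bug', 'Bug')
        if x == y:
            continue
        # Keep the pair only if neither ordering has been collected yet
        if (x, y) not in combos and (y, x) not in combos:
            combos.append((x, y))

print(combos)
```

Five types produce ten unordered pairs, so `combos` ends up with ten tuples.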

11. itertools.combinations()

The combinations generator from itertools provides a more efficient solution. First, we import combinations and then create a combinations object by providing the poke_types list and the length of combinations we desire. combinations returns a combinations object, which we unpack into a list and print to see the result. If comparing runtimes, we'd see using combinations is significantly faster than the nested loop.

    > from itertools import combinations
    
    > combos_obj = combinations(poke_types, 2)
     
    > combos = [*combos_obj]
    
    > print(combos)
    
    >> [('Bug', 'Fire'), ('Bug', 'Ghost'), ('Bug', 'Grass'), ('Bug', 'Water'), ('Fire', 'Ghost'), ('Fire', 'Grass'), ('Fire', 'Water'), ('Ghost', 'Grass'), ('Ghost', 'Water'), ('Grass', 'Water')]
    
Let's practice

In [4]:
names = ['Bulbasaur', 'Charmander', 'Squirtle']
hps = [45, 39, 44]
combined_zip = zip(names,hps)
combined_zip_list = [*combined_zip]
print(combined_zip_list)

[('Bulbasaur', 45), ('Charmander', 39), ('Squirtle', 44)]


In [7]:
from itertools import combinations
poke_types = ['Bug', 'Fire', 'Ghost', 'Grass', 'Water']
combos_obj = combinations(poke_types, 2)
combos = [*combos_obj]
print(combos)

[('Bug', 'Fire'), ('Bug', 'Ghost'), ('Bug', 'Grass'), ('Bug', 'Water'), ('Fire', 'Ghost'), ('Fire', 'Grass'), ('Fire', 'Water'), ('Ghost', 'Grass'), ('Ghost', 'Water'), ('Grass', 'Water')]


### Exercise

#### Combining Pokémon names and types
Three lists have been loaded into your session from a dataset that contains 720 Pokémon:

The names list contains the names of each Pokémon.
The primary_types list contains the corresponding primary type of each Pokémon.
The secondary_types list contains the corresponding secondary type of each Pokémon (nan if the Pokémon has only one type).
We want to combine each Pokémon's name and types together so that you can easily see a description of each Pokémon. Practice using zip() to accomplish this task.

In [18]:
import numpy as np
names = [
 'Accelgor',
 'Aerodactyl',
 'Aggron',
 'Aipom',
 'Alakazam',
 'Alomomola',
 'Altaria',
 'Amaura',
 'Ambipom',
 'Amoonguss']
primary_types = ['Grass',
 'Psychic',
 'Dark',
 'Bug',
 'Rock',
 'Steel',
 'Normal',
 'Psychic',
 'Water',
 'Dragon']
secondary_types = ['Ice', np.nan, np.nan, np.nan, 'Flying', 'Rock', np.nan, np.nan, np.nan, 'Flying']

In [13]:
# Combine names and primary_types
names_type1 = [*zip(names, primary_types)]

print(*names_type1[:5], sep='\n')

('Accelgor', 'Grass')
('Aerodactyl', 'Psychic')
('Aggron', 'Dark')
('Aipom', 'Bug')
('Alakazam', 'Rock')


In [19]:
# Combine all three lists together
names_types = [*zip(names, primary_types, secondary_types)]

print(*names_types[:5], sep='\n')

('Accelgor', 'Grass', 'Ice')
('Aerodactyl', 'Psychic', nan)
('Aggron', 'Dark', nan)
('Aipom', 'Bug', nan)
('Alakazam', 'Rock', 'Flying')


In [17]:
# Combine five items from names and three items from primary_types
differing_lengths = [*zip(names[:5], primary_types[:3])]

print(*differing_lengths, sep='\n')

('Accelgor', 'Grass')
('Aerodactyl', 'Psychic')
('Aggron', 'Dark')


#### Counting Pokémon from a sample
A sample of 500 Pokémon has been generated, and three lists from this sample have been loaded into your session:

The names list contains the names of each Pokémon in the sample.
The primary_types list contains the corresponding primary type of each Pokémon in the sample.
The generations list contains the corresponding generation of each Pokémon in the sample.
You want to quickly gather a few counts from these lists to better understand the sample that was generated. Use Counter from the collections module to explore what types of Pokémon are in your sample, what generations they come from, and how many Pokémon have a name that starts with a specific letter.

Counter has already been imported into your session for convenience.

In [None]:
from collections import Counter
# Collect the count of primary types
type_count = Counter(primary_types)
print(type_count, '\n')

# Collect the count of generations
gen_count = Counter(generations)
print(gen_count, '\n')

# Use list comprehension to get each Pokémon's starting letter
starting_letters = [name[0] for name in names]

# Collect the count of Pokémon for each starting_letter
starting_letters_count = Counter(starting_letters)
print(starting_letters_count)

#### Combinations of Pokémon
Ash, a Pokémon trainer, encounters a group of five Pokémon. These Pokémon have been loaded into a list within your session (called pokemon) and printed into the console for your convenience.

Ash would like to try to catch some of these Pokémon, but his Pokédex can only store two Pokémon at a time. Let's use combinations from the itertools module to see what the possible pairs of Pokémon are that Ash could catch.

In [None]:
# Import combinations from itertools
from itertools import combinations

# Create a combination object with pairs of Pokémon
combos_obj = combinations(pokemon, 2)
print(type(combos_obj), '\n')

# Convert combos_obj to a list by unpacking
combos_2 = [*combos_obj]
print(combos_2, '\n')

# Collect all possible combinations of 4 Pokémon directly into a list
combos_4 = [*combinations(pokemon, 4)]
print(combos_4)

### Set theory

1. Set theory

Often, we'd like to compare two objects to observe similarities and differences between their contents. When doing this type of comparison, it's best to leverage a branch of mathematics called set theory. As you know, Python comes with a built-in set data type. Sets come with some handy methods we can use for comparing. We'll explore each of these methods later on. The main takeaway is that when we'd like to compare objects multiple times and in different ways, we should consider storing our data in sets to leverage these elegant and efficient methods. Another nice feature of Python sets is their ability to quickly check if a value exists within its members. We call this membership testing. In this lesson, we'll show that using the in operator with a set is much faster than using it with a list or tuple. Let's explore a few examples.

    * intersection()
    * difference()
    * symmetric_difference()
    * union()
    * in
    
2. Comparing objects with loops

Suppose we had two lists of Pokémon: list_a and list_b.
We'd like to compare these lists to see which Pokémon appear in both lists.
We could use a nested for loop to compare each item in list_a to each item in list_b and collect only those items that appear in both lists. But, iterating over each item in both lists is extremely inefficient.
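
A sketch of the nested-loop comparison, using two small made-up lists (the contents of `list_a` and `list_b` and the name `in_common` are assumptions for illustration):

```python
# Hypothetical example lists
list_a = ['Bulbasaur', 'Charmander', 'Squirtle']
list_b = ['Caterpie', 'Pidgey', 'Squirtle']

in_common = []

# Compare every item in list_a against every item in list_b
for pokemon_a in list_a:
    for pokemon_b in list_b:
        if pokemon_a == pokemon_b:
            in_common.append(pokemon_a)

print(in_common)
```

The number of comparisons grows with the product of the two list lengths, which is what makes this approach slow for larger collections.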

3. Comparing objects with set theory

Instead, we should use Python's set data type to compare these lists. By converting each list into a set, we can use the dot-intersection method to collect the Pokémon shared between the two sets. One simple line of code and no need for a loop!

    > set_a = set(list_a)
    
    > set_b = set(list_b)
    
    > set_a.intersection(set_b)
    
When comparing runtimes, we see that using sets is a much faster approach.

4. Set method: difference

We can also use a set method to see Pokémon that exist in one set but not in another. To gather Pokémon that exist in set_a but not in set_b, use:

    > set_a.difference(set_b)
    
If we want the Pokémon in set_b, but not in set_a, we swap the two sets:

    > set_b.difference(set_a)

5. Set method: symmetric difference

To collect Pokémon that exist in exactly one of the sets (but not both), we can use a method called the symmetric difference.

    > set_a.symmetric_difference(set_b)
    
6. Set method: union

Finally, we can combine these sets using the dot-union method. This collects all of the unique Pokémon that appear in either or both sets.

    > set_a.union(set_b)
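
To see the set methods side by side, here is a minimal sketch on two small made-up lists (the list contents are assumptions, chosen so that exactly one Pokémon is shared):

```python
# Hypothetical example lists
list_a = ['Bulbasaur', 'Charmander', 'Squirtle']
list_b = ['Caterpie', 'Pidgey', 'Squirtle']

set_a = set(list_a)
set_b = set(list_b)

shared = set_a.intersection(set_b)             # in both sets
only_a = set_a.difference(set_b)               # in set_a but not set_b
only_b = set_b.difference(set_a)               # in set_b but not set_a
exactly_one = set_a.symmetric_difference(set_b)  # in one set, not both
everything = set_a.union(set_b)                # in either or both sets

print(shared, only_a, only_b, exactly_one, everything, sep='\n')
```

Note that sets are unordered, so the printed order of elements may vary between runs.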
    
7. Membership testing with sets

Another nice efficiency gain when using sets is the ability to quickly check if a specific item is a member of a set's elements. Consider our collection of 720 Pokémon names stored as a list, tuple, and set.
We want to check whether or not the character, Zubat, is in each of these data structures.

    > names_set = {'Bla', 'BlaBLa', 'BBBA',...}
    
When comparing runtimes, it's clear that membership testing with a set is significantly faster than a list or a tuple.

    > %timeit 'Bla' in names_set
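
Outside IPython, a similar comparison can be sketched with the standard-library timeit module. The 720 names below are placeholders (only 'Zubat' comes from the text), and the exact timings will differ from machine to machine:

```python
import timeit

# Placeholder names standing in for the real dataset; 'Zubat' is placed last
names_list = [f'Pokemon{i}' for i in range(719)] + ['Zubat']
names_tuple = tuple(names_list)
names_set = set(names_list)

containers = [('list', names_list), ('tuple', names_tuple), ('set', names_set)]

timings = {}
for label, container in containers:
    # Time 2,000 membership checks against each data structure
    timings[label] = timeit.timeit(lambda: 'Zubat' in container, number=2_000)
    print(f'{label:5}: {timings[label]:.6f} seconds')
```

A list or tuple is scanned element by element, while a set uses hashing, which is why the set lookup is expected to be the fastest of the three.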
    
8. Uniques with sets

One final efficiency gain when using sets comes from the definition of set itself. A set is defined as a collection of distinct elements. Thus, we can use a set to collect unique items from an existing object. Let's revisit the primary_types list, which contains the primary types of each Pokémon. If we wanted to collect the unique Pokémon types within this list, we could write a for loop to iterate over the list, and only append the Pokémon types that haven't already been added to the unique_types list.

Using a set makes this much easier. All we have to do is convert the primary_types list into a set, and we have our solution: a set of distinct Pokémon types.

    > unique_types_set = set(primary_types)
    
9. Let's practice

### Exercise
#### Comparing Pokédexes
Two Pokémon trainers, Ash and Misty, would like to compare their individual collections of Pokémon. Let's see what Pokémon they have in common and what Pokémon Ash has that Misty does not.

Both Ash and Misty's Pokédex (their collection of Pokémon) have been loaded into your session as lists called ash_pokedex and misty_pokedex. They have been printed into the console for your convenience.

In [None]:
# Convert both lists to sets
ash_set = set(ash_pokedex)
misty_set = set(misty_pokedex)

# Find the Pokémon that exist in both sets
both = ash_set.intersection(misty_set)
print(both)

# Find the Pokémon that Ash has and Misty does not have
ash_only = ash_set.difference(misty_set)
print(ash_only)

# Find the Pokémon that are in only one set (not both)
unique_to_set = ash_set.symmetric_difference(misty_set)
print(unique_to_set)

#### Searching for Pokémon
Two Pokémon trainers, Ash and Brock, have a collection of ten Pokémon each. Each trainer's Pokédex (their collection of Pokémon) has been loaded into your session as lists called ash_pokedex and brock_pokedex respectively.

You'd like to see if certain Pokémon are members of either Ash or Brock's Pokédex.

Let's compare using a set versus using a list when performing this membership testing. 

In [None]:
# Convert Brock's Pokédex to a set
brock_pokedex_set = set(brock_pokedex)
print(brock_pokedex_set)

# Check if Psyduck is in Ash's list and Brock's set
print('Psyduck' in ash_pokedex)
print('Psyduck' in brock_pokedex_set)

# Check if Machop is in Ash's list and Brock's set
print('Machop' in ash_pokedex)
print('Machop' in brock_pokedex_set)

Within your IPython console, use %timeit to compare membership testing for 'Psyduck' in ash_pokedex, 'Psyduck' in brock_pokedex_set, 'Machop' in ash_pokedex, and 'Machop' in brock_pokedex_set (a total of four different timings).

Don't include the print() function. Only time the commands that you wrote inside the print() function in the previous steps.

In [None]:
%timeit 'Psyduck' in ash_pokedex
%timeit 'Psyduck' in brock_pokedex_set
%timeit 'Machop' in ash_pokedex
%timeit 'Machop' in brock_pokedex_set

Awesome! Membership testing is much faster when you use sets. Did you notice that using a set for membership testing is faster than using a list regardless of whether the item you are checking is in the set? Checking for 'Psyduck' (which was not in Brock's set) is still faster than checking for 'Psyduck' in Ash's list!

#### Gathering unique Pokémon
A sample of 500 Pokémon has been created with replacement (meaning a Pokémon could be selected more than once and duplicates exist within the sample).

Three lists have been loaded into your session:

The names list contains the names of each Pokémon in the sample.
The primary_types list contains the corresponding primary type of each Pokémon in the sample.
The generations list contains the corresponding generation of each Pokémon in the sample.
The below function was written to gather unique values from each list:

In [20]:
def find_unique_items(data):
    uniques = []

    for item in data:
        if item not in uniques:
            uniques.append(item)

    return uniques

Let's compare the above function to using the set data type for collecting unique items.

In [None]:
# Use find_unique_items() to collect unique Pokémon names
uniq_names_func = find_unique_items(names)
print(len(uniq_names_func))

# Convert the names list to a set to collect unique Pokémon names
uniq_names_set = set(names)
print(len(uniq_names_set))

# Check that both unique collections are equivalent
print(sorted(uniq_names_func) == sorted(uniq_names_set))

Within your IPython console, use %timeit to compare the find_unique_items() function with using a set data type to collect unique Pokémon character names in names.

Which approach was faster?

In [None]:
%timeit uniq_names_func = find_unique_items(names)
%timeit uniq_names_set = set(names)

> 1.38 ms +- 52 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)

> 12.4 us +- 333 ns per loop (mean +- std. dev. of 7 runs, 100000 loops each)

Nice work! Using a set data type to collect unique values is much faster than using a for loop (like in the find_unique_items() function). Since a set is defined as a collection of distinct elements, it is an efficient way to collect unique items from an existing object. Here you took advantage of a set to find the distinct Pokémon from the sample (eliminating duplicate Pokémon) and saw what unique Pokémon types and generations were included in the sample.

### Eliminating loops
1. Looping in Python

Python comes with a few looping patterns that can be used when we want to iterate over an object's contents. For loops iterate over elements of a sequence piece-by-piece. While loops execute a loop repeatedly as long as some Boolean condition is met. Nested loops use multiple loops inside one another. Although all of these looping patterns are supported by Python, we should be careful when using them. Because most loops are evaluated in a piece-by-piece manner, they are often an inefficient solution.

    * for
    * while
    * nested

2. Benefits of eliminating loops

We should try to avoid looping as much as possible when writing efficient code. Eliminating loops usually results in fewer lines of code that are easier to interpret. Recall in a previous lesson we referred to the Zen of Python - a collection of idioms for writing Pythonic code. One of these idioms was "flat is better than nested." Striving to eliminate loops in our code will help us follow this idiom. In the following examples, we'll see that there are often more efficient approaches that can be used instead of using a loop.

    > "Flat is better than nested"
    
3. Eliminating loops with built-ins

Suppose we have a list of lists, called poke_stats, that contains statistical values for each Pokémon. Each row corresponds to a Pokémon, and each column corresponds to a Pokémon's specific statistical value. Here, the columns represent a Pokémon's Health Points, Attack, Defense, and Speed.

We want to do a simple sum of each of these rows in order to collect the total stats for each Pokémon. If we were to use a loop to calculate row sums, we would have to iterate over each row and append the row's sum to the totals list. We can accomplish the same task, in fewer lines of code, with a list comprehension. Or, we could use the built-in map function that we've discussed previously.

> totals = []

> for row in poke_stats:
    totals.append(sum(row))

> totals_comp = [sum(row) for row in poke_stats]

> totals_map = [*map(sum, poke_stats)]

Each of these approaches will return the same list, but using a list comprehension or the map function takes one line of code, and has a faster runtime.
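
A self-contained sketch that runs all three approaches on a tiny made-up `poke_stats` list (the stat values are illustrative, not taken from the dataset) and checks that they agree:

```python
# Hypothetical rows of [HP, Attack, Defense, Speed]
poke_stats = [
    [90, 92, 75, 60],
    [25, 20, 15, 90],
    [65, 130, 60, 75],
]

# 1. For loop
totals = []
for row in poke_stats:
    totals.append(sum(row))

# 2. List comprehension
totals_comp = [sum(row) for row in poke_stats]

# 3. Built-in map, unpacked into a list
totals_map = [*map(sum, poke_stats)]

print(totals, totals_comp, totals_map, sep='\n')
```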

4. Eliminating loops with built-in modules

We've also covered a few built-in modules that can help us eliminate loops. Remember when we collected the different combinations of Pokémon types? Instead of using the nested for loop, we used combinations from the itertools module for a cleaner, more efficient solution.

> from itertools import combinations

> combos2 = [*combinations(poke_types, 2)]

5. Eliminate loops with NumPy

Another powerful technique for eliminating loops is to use the NumPy package. Suppose we had the same collection of statistics we used in a previous example but stored in a NumPy array instead of a list of lists.

We'd like to collect the average stat value for each Pokémon (or row) in our array. We could use a loop to iterate over the array and collect the row averages. But NumPy arrays allow us to perform calculations on entire arrays all at once. Here, we use the dot-mean method and specify an axis equal to 1 to calculate the mean for each row (meaning we calculate an average across the column values). This eliminates the need for a loop and is much more efficient.

> avgs_np = poke_stats.mean(axis=1)

When comparing runtimes, we see that using the dot-mean method on the entire array and specifying an axis is significantly faster than using a loop.
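
To make the comparison concrete, here is a sketch of both versions on a small made-up array (the values and the name `avgs` are assumptions for illustration):

```python
import numpy as np

# Hypothetical rows of [HP, Attack, Defense, Speed]
poke_stats = np.array([
    [90, 92, 75, 60],
    [25, 20, 15, 90],
    [65, 130, 60, 75],
])

# Loop version: compute one row mean per iteration
avgs = []
for row in poke_stats:
    avgs.append(row.mean())

# NumPy version: axis=1 averages across the columns of every row at once
avgs_np = poke_stats.mean(axis=1)

print(avgs)
print(avgs_np)
```

Both versions produce one average per row; only the NumPy version avoids the Python-level loop.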

6. Let's practice!

### Exercise
#### Gathering Pokémon without a loop
A list containing 720 Pokémon has been loaded into your session as poke_names. Another list containing each Pokémon's corresponding generation has been loaded as poke_gens.

A for loop has been created to filter the Pokémon that belong to generation one or two, and collect the number of letters in each Pokémon's name:

In [None]:
gen1_gen2_name_lengths_loop = []

for name,gen in zip(poke_names, poke_gens):
    if gen < 3:
        name_length = len(name)
        poke_tuple = (name, name_length)
        gen1_gen2_name_lengths_loop.append(poke_tuple)

In [None]:
# Collect Pokémon that belong to generation 1 or generation 2
gen1_gen2_pokemon = [name for name,gen in zip(poke_names, poke_gens) if gen < 3]

# Create a map object that stores the name lengths
name_lengths_map = map(len, gen1_gen2_pokemon)

# Combine gen1_gen2_pokemon and name_lengths_map into a list
gen1_gen2_name_lengths = [*zip(gen1_gen2_pokemon, name_lengths_map)]

print(gen1_gen2_name_lengths_loop[:5])
print(gen1_gen2_name_lengths[:5])

If you're an experienced Pythonista, you may have noticed that you could replace the entire for loop with one list comprehension:
> [(name, len(name)) for name,gen in zip(poke_names, poke_gens) if gen < 3]

#### Pokémon totals and averages without a loop
A list of 720 Pokémon has been loaded into your session called names. Each Pokémon's corresponding statistics have been loaded as a NumPy array called stats. Each row of stats corresponds to a Pokémon in names and each column represents an individual Pokémon stat (HP, Attack, Defense, Special Attack, Special Defense, and Speed, respectively).

You want to gather each Pokémon's total stat value (i.e., the sum of each row in stats) and each Pokémon's average stat value (i.e., the mean of each row in stats) so that you can find the strongest Pokémon.

The below for loop was written to collect these values:

In [None]:
poke_list = []

for pokemon,row in zip(names, stats):
    total_stats = np.sum(row)
    avg_stats = np.mean(row)
    poke_list.append((pokemon, total_stats, avg_stats))

In [None]:
# Create a total stats array
total_stats_np = stats.sum(axis=1)

# Create an average stats array
avg_stats_np = stats.mean(axis=1)

# Combine names, total_stats_np, and avg_stats_np into a list
poke_list_np = [*zip(names, total_stats_np, avg_stats_np)]

print(poke_list_np == poke_list, '\n')
print(poke_list_np[:3])
print(poke_list[:3], '\n')
top_3 = sorted(poke_list_np, key=lambda x: x[1], reverse=True)[:3]
print('3 strongest Pokémon:\n{}'.format(top_3))

Great work! You used NumPy's .sum() and .mean() methods with a specific axis to eliminate a for loop. With this approach, you were able to quickly see that 'GroudonPrimal Groudon', 'KyogrePrimal Kyogre', and 'Arceus' were the strongest Pokémon in your list based on total stats.

If you were to gather run times, the for loop would have taken _milliseconds_ to execute while the NumPy approach would have taken _microseconds_ to execute. This is quite an improvement!

### Writing better loops
1. Writing better loops

We've discussed how loops can be costly and inefficient. But, sometimes you can't eliminate a loop. In this lesson, we'll explore how to make loops more efficient when looping is unavoidable.

2. Lesson caveat

Before diving in, some of the loops we'll discuss can be eliminated using techniques covered in previous lessons. For demonstrative purposes, we'll assume the use cases shown here are instances where a loop is unavoidable.

3. Writing better loops

The best way to make a loop more efficient is to analyze what's being done within the loop. We want to make sure that we aren't doing unnecessary work in each iteration. 
   * If a calculation is performed for each iteration of a loop, but its value doesn't change with each iteration, it's best to move this calculation outside (or above) the loop. 
   * If a loop is converting data types with each iteration, it's possible that this conversion can be done outside (or below) the loop using a map function. Anything that can be done once should be moved outside of a loop. 

Let's explore a few examples.

4. Moving calculations above a loop

We have a list of Pokémon names and an array of each Pokémon's corresponding attack value. We'd like to print the names of each Pokémon with an attack value greater than the average of all attack values. To do this, we'll use a loop that iterates over each Pokémon and their attack value. For each iteration, the total attack average is calculated by finding the mean value of all attacks. Then, each Pokémon's attack value is evaluated to see if it exceeds the total average. Can you spot the inefficiency? The total_attack_avg variable is being created with each iteration of the loop. But, this calculation doesn't change between iterations since it is an overall average. We only need to calculate this value once.

5. Moving calculations above a loop

By moving this calculation outside (or above) the loop, we calculate the total attack average only once. We get the same output, but this is a more efficient approach.

Comparing runtimes, we see that keeping the total_attack_avg calculation within the loop takes roughly 75 microseconds.
Moving the calculation takes about half the time.
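
The two versions of this loop are not shown in these notes. A minimal sketch of both, assuming a list called `names` and a NumPy array called `attacks` with made-up values, and collecting the names in a list before printing:

```python
import numpy as np

# Hypothetical data; attack values are illustrative only
names = ['Absol', 'Aron', 'Jynx', 'Natu', 'Onix']
attacks = np.array([130, 70, 50, 50, 45])

# Inefficient: the overall average is recalculated on every iteration
above_avg_slow = []
for pokemon, attack in zip(names, attacks):
    total_attack_avg = attacks.mean()
    if attack > total_attack_avg:
        above_avg_slow.append(pokemon)

# Better: calculate the overall average once, above the loop
total_attack_avg = attacks.mean()
above_avg = []
for pokemon, attack in zip(names, attacks):
    if attack > total_attack_avg:
        above_avg.append(pokemon)

print(above_avg_slow)
print(above_avg)
```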

6. Using holistic conversions

Another way to make loops more efficient is to use holistic conversions outside (or below) the loop. 

We have three lists from our dataset of 720 Pokémon: a list of each Pokémon's name, a list corresponding to whether or not a Pokémon has a legendary status, and a list of each Pokémon's generation. We want to combine these objects so that each name, status, and generation is stored in an individual list. 

To do this, we'll use a loop that iterates over the output of the zip function. Remember, zip returns a collection of tuples, so we need to convert each tuple into a list since we want to create a list of lists as our output. Now, we append each individual poke_list to our poke_data output variable. By printing the result, we see our desired list of lists. However, converting each tuple to a list within the loop is not very efficient.

Instead, we should collect all of our poke_tuples together, and use the map function to convert each tuple to a list. The loop no longer converts tuples to lists with each iteration. Instead, we moved this tuple to list conversion outside (or below) the loop. That way, we convert data types all at once (or holistically) rather than converting in each iteration.

> poke_data_tuples = []

> for poke_tuple in zip(names, legend_status, generations):
    poke_data_tuples.append(poke_tuple)

> poke_data = [*map(list, poke_data_tuples)]

Runtimes show that converting each tuple to a list outside of the loop is more efficient.
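
For comparison, the in-loop conversion described earlier might look like the first loop below. This sketch uses three short made-up lists and the assumed name `poke_data_loop`, then confirms that the holistic version gives the same result:

```python
# Hypothetical short lists standing in for the full dataset
names = ['Pikachu', 'Mewtwo', 'Chikorita']
legend_status = [False, True, False]
generations = [1, 1, 2]

# In-loop conversion: each tuple is converted to a list on every iteration
poke_data_loop = []
for poke_tuple in zip(names, legend_status, generations):
    poke_list = list(poke_tuple)
    poke_data_loop.append(poke_list)

# Holistic conversion: collect the tuples first, then convert them all at once
poke_data_tuples = []
for poke_tuple in zip(names, legend_status, generations):
    poke_data_tuples.append(poke_tuple)

poke_data = [*map(list, poke_data_tuples)]

print(poke_data)
```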

11. Time for some practice!

### Exercise
#### One-time calculation loop
A list of integers that represents each Pokémon's generation has been loaded into your session called generations. You'd like to gather the counts of each generation and determine what percentage each generation accounts for out of the total count of integers.

The below loop was written to accomplish this task:

In [None]:
for gen,count in gen_counts.items():
    total_count = len(generations)
    gen_percent = round(count / total_count * 100, 2)
    print(
      'generation {}: count = {:3} percentage = {}'
      .format(gen, count, gen_percent)
    )

Let's make this loop more efficient by moving a one-time calculation outside the loop.

In [None]:
# Import Counter
from collections import Counter

# Collect the count of each generation
gen_counts = Counter(generations)

# Improve for loop by moving one calculation above the loop
total_count = len(generations)

for gen,count in gen_counts.items():
    gen_percent = round(count / total_count * 100, 2)
    print('generation {}: count = {:3} percentage = {}'
          .format(gen, count, gen_percent))

#### Holistic conversion loop
A list of all possible Pokémon types has been loaded into your session as pokemon_types. It's been printed in the console for convenience.

You'd like to gather all the possible pairs of Pokémon types. You want to store each of these pairs in an individual list with an enumerated index as the first element of each list. This allows you to see the total number of possible pairs and provides an indexed label for each pair.

The below loop was written to accomplish this task:

In [None]:
enumerated_pairs = []

for i,pair in enumerate(possible_pairs, 1):
    enumerated_pair_tuple = (i,) + pair
    enumerated_pair_list = list(enumerated_pair_tuple)
    enumerated_pairs.append(enumerated_pair_list)

Let's make this loop more efficient using a holistic conversion.

In [None]:
from itertools import combinations
# Collect all possible pairs using combinations()
possible_pairs = [*combinations(pokemon_types, 2)]

# Create an empty list called enumerated_tuples
enumerated_tuples = []

# Append each enumerated_pair_tuple to the empty list above
for i,pair in enumerate(possible_pairs, 1):
    enumerated_pair_tuple = (i,) + pair
    enumerated_tuples.append(enumerated_pair_tuple)

# Convert all tuples in enumerated_tuples to a list
enumerated_pairs = [*map(list, enumerated_tuples)]
print(enumerated_pairs)

### Bringing it all together: Pokémon z-scores
A list of 720 Pokémon has been loaded into your session as names. Each Pokémon's corresponding Health Points is stored in a NumPy array called hps. You want to analyze the Health Points using the z-score to see how many standard deviations each Pokémon's HP is from the mean of all HPs.
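
For reference, the z-score used in this exercise follows the standard definition. Here $\mu$ is the mean of all HP values and $\sigma$ is their standard deviation. The symbols are introduced only for this note; hps.std() with NumPy's default settings computes the population standard deviation shown here.

$$
z_i = \frac{hp_i - \mu}{\sigma},
\qquad
\mu = \frac{1}{N}\sum_{j=1}^{N} hp_j,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(hp_j - \mu\right)^2}
$$

A z-score greater than 2 therefore means the Pokémon's HP is more than two standard deviations above the mean.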

The below code was written to calculate the HP z-score for each Pokémon and gather the Pokémon with the highest HPs based on their z-scores:

In [None]:
poke_zscores = []

for name,hp in zip(names, hps):
    hp_avg = hps.mean()
    hp_std = hps.std()
    z_score = (hp - hp_avg)/hp_std
    poke_zscores.append((name, hp, z_score))
highest_hp_pokemon = []

for name,hp,zscore in poke_zscores:
    if zscore > 2:
        highest_hp_pokemon.append((name, hp, zscore))

Use NumPy to eliminate the for loop used to create the z-scores.
Then, combine the names, hps, and z_scores objects together into a list called poke_zscores2.

In [None]:
# Calculate the total HP avg and total HP standard deviation
hp_avg = hps.mean()
hp_std = hps.std()

# Use NumPy to eliminate the previous for loop
z_scores = (hps - hp_avg)/hp_std

# Combine names, hps, and z_scores
poke_zscores2 = [*zip(names, hps, z_scores)]
print(*poke_zscores2[:3], sep='\n')

In [None]:
# Use list comprehension with the same logic as the highest_hp_pokemon code block
highest_hp_pokemon2 = [(name, hp, zscore) for name,hp,zscore in poke_zscores2 if zscore > 2]
print(*highest_hp_pokemon2, sep='\n')

Use %%timeit (cell magic mode) within your IPython console to compare the runtimes between the original code blocks and the new code you developed using NumPy and list comprehension.

In [None]:
%timeit highest_hp_pokemon2 = [(name, hp, zscore) for name,hp,zscore in poke_zscores2 if zscore > 2]

Great job! You're Catching 'Em All (efficiencies that is). You eliminated two loops using NumPy broadcasting and list comprehension. Did you notice how much faster the approach you developed was compared to the original loops? What a great improvement!

Remember the techniques you've learned throughout this chapter as you continue writing Python code outside this course. Keep in mind the built-in functions and modules you covered to eliminate loops and remember to check your unavoidable loops for things that can be moved outside.