# Lesson 4. Introduction to List Comprehensions in Python: Write More Efficient Loops

After completing this chapter, you will be able to:

- Modify values in a list using a list comprehension
- Apply a function to values in a list using a list comprehension
- Use conditional statements within a list comprehension to control list outputs

## List Comprehension Basics
Loops, as you’ve seen, can be a very powerful tool to manipulate and create data. However, they’re not the only option when it comes to these types of operations. Another popular method is list comprehension. It’s a concise and quick way to modify values in a list and create a new list from the output. It works in a similar way to a for loop, but has slightly different syntax. One can be translated to the other fairly easily!

To write a list comprehension, you put the desired expression, followed by the for clause, inside square brackets. So this:

In [2]:
# Avoid naming a variable "list", which would shadow the built-in type
numbers = [1, 2, 3, 4]
new_list = []
for i in numbers:
    new_list.append(i*i)

becomes this:

In [4]:
new_list = [i*i for i in numbers]

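You can see that the code takes up less space and uses the same keywords as the for loop. The execution differs, though: Python builds the new list directly from the comprehension instead of calling append() on every iteration.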
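To check that the two forms produce the same result, the sketch below runs both on a small list. The name `numbers` and its values are made up for illustration.

```python
# Hypothetical input list, chosen only for illustration
numbers = [1, 2, 3, 4]

# for loop version: start with an empty list and append each result
squares_loop = []
for i in numbers:
    squares_loop.append(i * i)

# List comprehension version: expression first, then the for clause
squares_comp = [i * i for i in numbers]

print(squares_loop)  # [1, 4, 9, 16]
print(squares_comp)  # [1, 4, 9, 16]
```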

#### Benefits and Downsides of List Comprehension
There are a few pros and cons to weigh when using list comprehension.

Pros:

- Generally faster than an equivalent for loop that appends to a list, especially for large datasets.
- Takes less code to write and fits in a smaller space than a for loop.

Cons:

- Can be less legible in certain situations.
- Can be harder to write when the loop body involves complicated, multi-step operations.

### Time Saved with List Comprehension

In [5]:
%%time
# Time a cell using a for loop
for_list = []
for i in range(50000):
    for_list.append(i*i)

Wall time: 9.97 ms


In [6]:
%%time
# Time a cell using list comprehension
comp_list = [i*i for i in range(50000)]

Wall time: 3.99 ms
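A single %%time measurement can vary from run to run. Outside of a notebook, or when you want a steadier comparison, the standard library's timeit module can repeat each version many times. The sketch below is one way to set that up; the function names and the repeat count of 20 are arbitrary choices, and the timings you get will depend on your machine.

```python
import timeit


def build_with_loop(n):
    """Build a list of squares with a for loop and append()."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def build_with_comprehension(n):
    """Build the same list of squares with a list comprehension."""
    return [i * i for i in range(n)]


# Run each function 20 times and report the total elapsed seconds.
# No expected values are shown because they differ between machines.
loop_seconds = timeit.timeit(lambda: build_with_loop(50000), number=20)
comp_seconds = timeit.timeit(lambda: build_with_comprehension(50000), number=20)

print(f"for loop:      {loop_seconds:.4f} s")
print(f"comprehension: {comp_seconds:.4f} s")
```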


### Modify Values with List Comprehension

In [7]:
# Create list of average monthly precip (inches) in Boulder, CO
avg_monthly_precip_in = [0.70,  0.75, 1.85, 2.93, 3.05, 2.02, 
                         1.93, 1.62, 1.84, 1.31, 1.39, 0.84]

# Convert each item in list from in to mm
[month * 25.4 for month in avg_monthly_precip_in]

[17.779999999999998,
 19.049999999999997,
 46.99,
 74.422,
 77.46999999999998,
 51.308,
 49.022,
 41.148,
 46.736,
 33.274,
 35.306,
 21.336]
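Values such as 17.779999999999998 appear because most decimal fractions cannot be stored exactly as binary floating-point numbers. If the long tails are distracting, one option is to round inside the comprehension. This is a minimal sketch that reuses the list from above; rounding to two decimal places and the name `avg_monthly_precip_mm` are arbitrary choices for illustration.

```python
avg_monthly_precip_in = [0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                         1.93, 1.62, 1.84, 1.31, 1.39, 0.84]

# Convert each value from inches to mm and round it to 2 decimal places
avg_monthly_precip_mm = [round(month * 25.4, 2) for month in avg_monthly_precip_in]

print(avg_monthly_precip_mm[:3])  # [17.78, 19.05, 46.99]
```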

## Apply a Function to a List
Similar to modifying a value in a list, it’s possible to use list comprehension to apply a function to every value in a list. This can be useful for more complicated operations that need to be performed. This can also be done with the map function. More info on mapping can be found in the Data Tip below.

##### Data Tip: map in Python

While a list comprehension is one way to apply a function to every value in a list, Python also has a built-in function designed for this type of operation: map(). Although it can be less intuitive at first, it is very useful when you need to apply a complicated function to every value in a list, pandas DataFrame, or other data storage object. For further reading on map(), see this website explaining the fundamentals.

In [8]:
# Function written to convert from inches to mm
def convert_in_to_mm(num):
    return num * 25.4

# Using list comprehension to convert all the values in the list
[convert_in_to_mm(month) for month in avg_monthly_precip_in]

[17.779999999999998,
 19.049999999999997,
 46.99,
 74.422,
 77.46999999999998,
 51.308,
 49.022,
 41.148,
 46.736,
 33.274,
 35.306,
 21.336]
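The map() approach mentioned in the Data Tip can be sketched as follows. It reuses the function and list from above; the names `mapped` and `comprehended` are introduced here only for the comparison.

```python
avg_monthly_precip_in = [0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                         1.93, 1.62, 1.84, 1.31, 1.39, 0.84]


def convert_in_to_mm(num):
    return num * 25.4


# map() returns a lazy iterator, so wrap it in list() to see the values
mapped = list(map(convert_in_to_mm, avg_monthly_precip_in))

# The list comprehension version for comparison
comprehended = [convert_in_to_mm(month) for month in avg_monthly_precip_in]

print(mapped == comprehended)  # True
```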

#### If Condition Only
Conditionals can be implemented in list comprehension. This can be an easy way to filter out unwanted values from a list. If the conditional doesn't have an else clause, the if condition is put after the for clause.

In [9]:
# Keep only the monthly values that are greater than 1.5 inches
[month for month in avg_monthly_precip_in if month > 1.5]

[1.85, 2.93, 3.05, 2.02, 1.93, 1.62, 1.84]
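For comparison, here is a sketch of the same filter written as a for loop, followed by a comprehension that filters and converts in one step. The names `wet_months_loop` and `wet_months_mm` are made up for this example.

```python
avg_monthly_precip_in = [0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                         1.93, 1.62, 1.84, 1.31, 1.39, 0.84]

# for loop version of the filter: append only when the condition is true
wet_months_loop = []
for month in avg_monthly_precip_in:
    if month > 1.5:
        wet_months_loop.append(month)

# A comprehension can filter and transform together: the expression
# (month * 25.4) is applied only to values that pass the condition
wet_months_mm = [month * 25.4 for month in avg_monthly_precip_in if month > 1.5]

print(wet_months_loop)     # [1.85, 2.93, 3.05, 2.02, 1.93, 1.62, 1.84]
print(len(wet_months_mm))  # 7
```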

#### If Else Conditionals
If your conditional has an else clause, the syntax changes. The whole conditional expression goes before the for clause, with the value for the if case written before if, and the value for the else case written after else.

In [10]:
# Perform one of two operations on each value, depending on whether it is greater than 1.5.
# Values greater than 1.5 are multiplied by negative 2. All other values are multiplied by positive 2.
[month * -2 if month > 1.5 else month * 2 for month in avg_monthly_precip_in]

[1.4, 1.5, -3.7, -5.86, -6.1, -4.04, -3.86, -3.24, -3.68, 2.62, 2.78, 1.68]
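Written out as a for loop, the if/else comprehension above corresponds to the sketch below. The names `adjusted_loop` and `adjusted_comp` are introduced only to compare the two results.

```python
avg_monthly_precip_in = [0.70, 0.75, 1.85, 2.93, 3.05, 2.02,
                         1.93, 1.62, 1.84, 1.31, 1.39, 0.84]

# for loop version: every value is kept, but the operation depends on the test
adjusted_loop = []
for month in avg_monthly_precip_in:
    if month > 1.5:
        adjusted_loop.append(month * -2)
    else:
        adjusted_loop.append(month * 2)

# Comprehension version: the if/else expression comes before the for clause
adjusted_comp = [month * -2 if month > 1.5 else month * 2
                 for month in avg_monthly_precip_in]

print(adjusted_loop == adjusted_comp)  # True
```

Unlike the if-only form, this version never drops a value, so the output list always has the same length as the input list.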

# Lesson 5. Loops in Python Exercise

## Challenge 1: Print Numbers in a list
The list below contains temperature values for a location in Boulder, Colorado. Create a for loop that loops through each value in the list and prints the value like this:

temp: 47

In [11]:
# Data to convert to celsius

boulder_avg_high_temp_f = [
    47,
    49,
    57,
    64,
    72,
    83,
    89,
    87,
    79,
    67,
    55,
    47
]

boulder_avg_high_temp_f

[47, 49, 57, 64, 72, 83, 89, 87, 79, 67, 55, 47]

In [20]:
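# List comprehension version: prints each value, but also builds a list of
# print()'s return values (None), so the plain for loop is the better fit here
[print("temp:", month) for month in boulder_avg_high_temp_f]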

temp: 47
temp: 49
temp: 57
temp: 64
temp: 72
temp: 83
temp: 89
temp: 87
temp: 79
temp: 67
temp: 55
temp: 47


[None, None, None, None, None, None, None, None, None, None, None, None]

In [21]:
#for loop
for x in boulder_avg_high_temp_f:
    print("temp:", x)

temp: 47
temp: 49
temp: 57
temp: 64
temp: 72
temp: 83
temp: 89
temp: 87
temp: 79
temp: 67
temp: 55
temp: 47
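The list of None values shown after the comprehension output appears because print() returns None, and a comprehension always collects the value of its expression. A small sketch with a shortened, made-up list (`temps`) shows this:

```python
temps = [47, 49, 57]  # shortened sample list, for illustration only

# The comprehension prints each value, then stores what print() returned
results = [print("temp:", temp) for temp in temps]

print(results)  # [None, None, None]
```

When you only need the side effect (printing), a plain for loop avoids building a list that is thrown away.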


## Challenge 2: Modify Numeric Values in a List
Below is a list of values that represents the average monthly high temperature in Boulder, CO, collected by NOAA. The values are currently in Fahrenheit, but can be converted to Celsius by subtracting 32 and then multiplying by 5/9.

#### celsius = (fahrenheit - 32) * 5/9

Create a new list with these same temperatures converted to Celsius using a for loop. Call your new list: **boulder_avg_high_temp_c**
    
HINT: to complete this challenge you may want to create a new empty list first. Then you can use list_name.append() in each loop iteration to add a new value to your list.

In [23]:
boulder_avg_high_temp_c = []
for fahrenheit in boulder_avg_high_temp_f:
    celsius = (fahrenheit - 32) * 5/9
    boulder_avg_high_temp_c.append(celsius)

print(boulder_avg_high_temp_c)

[8.333333333333334, 9.444444444444445, 13.88888888888889, 17.77777777777778, 22.22222222222222, 28.333333333333332, 31.666666666666668, 30.555555555555557, 26.11111111111111, 19.444444444444443, 12.777777777777779, 8.333333333333334]
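The challenge asks for a for loop, but the same conversion can also be sketched as a list comprehension that applies a function, tying it back to Lesson 4. The helper name `fahr_to_celsius` is hypothetical and introduced only for this example.

```python
boulder_avg_high_temp_f = [47, 49, 57, 64, 72, 83, 89, 87, 79, 67, 55, 47]


def fahr_to_celsius(fahrenheit):
    """Hypothetical helper: convert one Fahrenheit value to Celsius."""
    return (fahrenheit - 32) * 5 / 9


# Apply the conversion to every value in the list
boulder_avg_high_temp_c = [fahr_to_celsius(temp) for temp in boulder_avg_high_temp_f]

print(boulder_avg_high_temp_c[0])  # 8.333333333333334
```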


## Challenge 3: Round Values In a List
Create a loop that rounds the values in the list that you created above, boulder_avg_high_temp_c, to two decimal places.

To round your data, you can use the Python function round(). The first argument in the round() function is the number to round, and the second argument is the number of decimals you want after it’s been rounded. See how this works below.

In [24]:
# Example
c = 7.3848234
round(c, 2)

7.38

In [25]:
[round(month, 2) for month in boulder_avg_high_temp_c]

[8.33,
 9.44,
 13.89,
 17.78,
 22.22,
 28.33,
 31.67,
 30.56,
 26.11,
 19.44,
 12.78,
 8.33]
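The cell above rounds the values with a list comprehension. Since the challenge asks for a loop, the sketch below shows an equivalent for loop. The name `rounded_temps_c` is made up for this example, and the Celsius list is rebuilt here so the block runs on its own.

```python
boulder_avg_high_temp_f = [47, 49, 57, 64, 72, 83, 89, 87, 79, 67, 55, 47]
boulder_avg_high_temp_c = [(temp - 32) * 5 / 9 for temp in boulder_avg_high_temp_f]

# for loop version: round each value and append it to a new list
rounded_temps_c = []
for temp in boulder_avg_high_temp_c:
    rounded_temps_c.append(round(temp, 2))

print(rounded_temps_c[:3])  # [8.33, 9.44, 13.89]
```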

## Challenge 4: Print A List of Directories
The code below creates a list of directories called all_dirs. Create a for loop that prints each directory name.

In [31]:
import os 
import pandas as pd
from glob import glob
import earthpy as et 

# Download data on average monthly temp for two California sites
file_url = "https://ndownloader.figshare.com/files/21894528"
out_path = et.data.get_data(url = file_url)


# Set working directory to earth-analytics/data/earthpy-downloads
os.chdir(os.path.join(et.io.HOME, 
                      "earth-analytics", 
                      "data",
                      "earthpy-downloads"))

# Creating all_dirs list of directories to loop through

data_dirs = os.path.join(out_path, "*")
all_dirs = glob(data_dirs)


In [33]:
# files
# Loop through each directory, then through each file inside it
for a_dir in all_dirs:
    dir_path = os.path.join(a_dir, "*")
    all_file_paths = glob(dir_path)
    # Nested loop: visit every file in the current directory
    for a_file_path in all_file_paths:
        print(a_file_path)
        # Read the file into a pandas DataFrame. Each read overwrites the
        # previous one, so only the last file's data remains after the loops
        temp_data_df = pd.read_csv(a_file_path)
temp_data_df

C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\San-Diego\San-Diego-1999-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\San-Diego\San-Diego-2000-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\San-Diego\San-Diego-2001-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\San-Diego\San-Diego-2002-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\San-Diego\San-Diego-2003-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\Sonoma\Sonoma-1999-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\Sonoma\Sonoma-2000-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\Sonoma\Sonoma-2001-temp.csv
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\Sonoma\Sonoma-2002-temp.csv
C:\Users\34639\earth-analytics\dat

Unnamed: 0,Year,January,February,March,April,May,June,July,August,September,October,November,December
0,2003,58.9,61.8,66.4,61.5,74.2,81.1,87,83.5,85,82.7,61,56.4


In [41]:
# dirs
for a_dir in all_dirs:
    print(a_dir)

C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\San-Diego
C:\Users\34639\earth-analytics\data\earthpy-downloads\avg-monthly-temp-fahr\Sonoma
