# 02 Exercise - Temperature Data in New York

In the time period 1981 - 2010, temperature data was recorded at Albany International Airport in New York. Below, you are given the record low temperatures (in Fahrenheit) for each month during this time period. You will apply what you have learned to draw conclusions about this small dataset.

In [1]:
# DON'T MODIFY THIS CELL, ONLY RUN IT.
record_low_list = [-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22]

The number 25 at index 4 in the list <b>record_low_list</b> indicates that the record low temperature in the fifth month (May) in the time period 1981 - 2010 was 25 degrees Fahrenheit (Brrr!).

## Exercise 1: Importing NumPy

Import NumPy with the alias <b>np</b> so that we can start working with the data:

In [2]:
# Import NumPy
import numpy as np

## Exercise 2: Convert List to a NumPy Array

Convert the list called <b>record_low_list</b> into a NumPy array called <b>record_low_array</b>:

In [3]:
# Convert the list into a NumPy array
record_low_array = np.array(record_low_list)

record_low_array

array([-28, -22, -21,   9,  25,  35,  40,  34,  24,  16, -11, -22])
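To see why the conversion is worth doing, the sketch below contrasts arithmetic on the array with the same expression on the plain list. It is a minimal illustration rather than part of the exercise; the names `shifted` and `list_error` are made up for this example.

```python
import numpy as np

record_low_list = [-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22]
record_low_array = np.array(record_low_list)

# Arithmetic on a NumPy array is applied element by element.
shifted = record_low_array - 32
print(shifted[:3])

# The same expression on a plain Python list raises a TypeError.
try:
    record_low_list - 32
    list_error = None
except TypeError as error:
    list_error = error
print(type(list_error).__name__)
```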

## Exercise 3: Working with Celsius

We would like to convert the data into Celsius (°C) instead of Fahrenheit (°F). The conversion formula is $$C = \frac{5}{9}(F - 32),$$
where $C$ denotes degrees Celsius and $F$ denotes degrees Fahrenheit.
Convert <b>record_low_array</b> into a new array that measures degrees in Celsius. Call the new array <b>record_low_celsius</b>:

In [22]:
# Convert the data to Celsius (°C)
record_low_celsius = (5 / 9) * (record_low_array - 32)

for f, c in zip(record_low_array, record_low_celsius):
    print(f'Fahrenheit {f:3} => Celsius {c:.2f}')

Fahrenheit -28 => Celsius -33.33
Fahrenheit -22 => Celsius -30.00
Fahrenheit -21 => Celsius -29.44
Fahrenheit   9 => Celsius -12.78
Fahrenheit  25 => Celsius -3.89
Fahrenheit  35 => Celsius 1.67
Fahrenheit  40 => Celsius 4.44
Fahrenheit  34 => Celsius 1.11
Fahrenheit  24 => Celsius -4.44
Fahrenheit  16 => Celsius -8.89
Fahrenheit -11 => Celsius -23.89
Fahrenheit -22 => Celsius -30.00
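One way to sanity-check the conversion is to apply the inverse formula, $F = \frac{9}{5}C + 32$, and confirm that it recovers the original values. The sketch below does this; the names `back_to_fahrenheit` and `round_trip_ok` are illustrative and not part of the exercise.

```python
import numpy as np

record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)

# Inverse formula: F = (9 / 5) * C + 32
back_to_fahrenheit = (9 / 5) * record_low_celsius + 32

# Compare with a tolerance, since floating-point arithmetic is not exact.
round_trip_ok = np.allclose(back_to_fahrenheit, record_low_array)
print(round_trip_ok)
```

Using `np.allclose` instead of `==` avoids false alarms from tiny rounding differences.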

## Exercise 4: Extract Values

It is important to be able to access individual values. Print out the record low temperatures in Celsius for May and for September. You should now be working with the array <b>record_low_celsius</b>:

In [29]:
# Print out the record low degrees in Celsius in May and in September
print(f'Temperature in May: {record_low_celsius[4]:.2f} C')
print(f'Temperature in Sep: {record_low_celsius[8]:.2f} C')

Temperature in May: -3.89 C
Temperature in Sep: -4.44 C


## Exercise 5: Extract Multiple Values

We would like to save the three values corresponding to the spring months (March, April, and May) into a new array variable called <b>record_low_celsius_spring</b>. Do this by using slicing:

<i>Tip:</i> Remember that Python (and hence NumPy) counts from zero when indexing, so the value for March has index 2, not 3.

In [33]:
# Extract the values corresponding to the spring months.
record_low_celsius_spring = record_low_celsius[2:5]

for month, temp in zip(['Mar', 'Apr', 'May'], record_low_celsius_spring):
    print(f'Temperature in {month}: {temp:.2f}')

Temperature in Mar: -29.44
Temperature in Apr: -12.78
Temperature in May: -3.89
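Two details of slicing are easy to miss: the stop index is excluded, and a basic NumPy slice is a view of the original array rather than a copy. The sketch below illustrates both; the names `start`, `stop`, `spring`, and `spring_copy` are made up for this example.

```python
import numpy as np

record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)

start, stop = 2, 5  # March is index 2; the stop index 5 (June) is excluded
spring = record_low_celsius[start:stop]

# A slice contains stop - start elements.
print(len(spring))

# Basic slicing returns a view that shares memory with the original array.
print(np.shares_memory(spring, record_low_celsius))

# Call .copy() if the slice should be modified independently.
spring_copy = spring.copy()
print(np.shares_memory(spring_copy, record_low_celsius))
```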

## Exercise 6: Sorting

Sometimes it is useful to have the values sorted by size. Make a new variable called <b>sorted_celsius</b> that holds the values in <b>record_low_celsius</b> in ascending order (starting with the smallest value and ending with the largest).

In [34]:
# Set up a new variable that is sorted
sorted_celsius = np.sort(record_low_celsius)

sorted_celsius

array([-33.33333333, -30.        , -30.        , -29.44444444,
       -23.88888889, -12.77777778,  -8.88888889,  -4.44444444,
        -3.88888889,   1.11111111,   1.66666667,   4.44444444])
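As a side note, `np.sort` returns a sorted copy, while the array method `sort` reorders the array in place. The sketch below shows the difference, along with one way to get descending order. The names `original_order`, `descending_celsius`, `in_place`, and `result` are illustrative only.

```python
import numpy as np

record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)
original_order = record_low_celsius.copy()

# np.sort returns a sorted copy and leaves the input unchanged.
sorted_celsius = np.sort(record_low_celsius)
unchanged = np.array_equal(record_low_celsius, original_order)
print(unchanged)

# Reversing the sorted array gives descending order.
descending_celsius = sorted_celsius[::-1]

# The ndarray.sort method sorts in place and returns None.
in_place = record_low_celsius.copy()
result = in_place.sort()
print(result)
```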

## Exercise 7: Finding Maximum and Minimum Values

We often want to find maximum and minimum values in our data. Use the <b>max</b> and <b>min</b> methods to find the largest and smallest temperatures in <b>record_low_celsius</b>:

In [38]:
# Find the largest temperature
print(f'Largest temperature: {np.max(record_low_celsius):.2f} C')

# Find the smallest temperature
print(f'Smallest temperature: {np.min(record_low_celsius):.2f} C')

Largest temperature: 4.44 C
Smallest temperature: -33.33 C
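The functions `np.max` and `np.min` also exist as array methods, which is probably what the word "methods" in the task refers to. A minimal sketch showing that both spellings give the same result (the variable names here are illustrative):

```python
import numpy as np

record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)

# Function form and method form compute the same value.
largest_function = np.max(record_low_celsius)
largest_method = record_low_celsius.max()
smallest_function = np.min(record_low_celsius)
smallest_method = record_low_celsius.min()

print(largest_function == largest_method)
print(smallest_function == smallest_method)
```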


## Exercise 8: Which Months do They Correspond to?

In the last exercise, you found the largest and smallest temperatures in <b>record_low_celsius</b>. But which months do they correspond to? Use the <b>argmax</b> and <b>argmin</b> methods to determine this:

In [37]:
# Find the month with the largest temperature
idx_largest_celsius = np.argmax(record_low_celsius)
print(f'Celsius largest temperature in index {idx_largest_celsius} with {record_low_celsius[idx_largest_celsius]:.2f} C')

# Find the month with the smallest temperature
idx_smallest_celsius = np.argmin(record_low_celsius)
print(f'Celsius smallest temperature in index {idx_smallest_celsius} with {record_low_celsius[idx_smallest_celsius]:.2f} C')

Celsius largest temperature in index 6 with 4.44 C
Celsius smallest temperature in index 0 with -33.33 C
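An index is easier to read once it is translated into a month name. The sketch below assumes a helper list `month_names`, which is not part of the original exercise, and uses the indices from `argmax` and `argmin` to look up the month.

```python
import numpy as np

# Hypothetical helper list, introduced only for this illustration.
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)

# argmax/argmin return positions, which index directly into the list.
warmest_month = month_names[np.argmax(record_low_celsius)]
coldest_month = month_names[np.argmin(record_low_celsius)]

print(f'Highest record low: {warmest_month}')
print(f'Lowest record low: {coldest_month}')
```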


## Exercise 9: Calculating Means

An important summary value is the mean (or average). Calculate the average record low temperature over the whole year (in Celsius). You will need the <b>record_low_celsius</b> array and the <b>mean</b> method:

In [42]:
# Calculate the average lowest temperature
print(f'Average temperatures: {np.mean(record_low_celsius):.2f} C')

Average temperatures: -14.12 C


What is the average record low temperature in the spring?

<i>Hint:</i> Use the variable <b>record_low_celsius_spring</b> that you have defined earlier.

In [43]:
# Calculate the average lowest temperature in the spring
print(f'Average temperatures in spring: {np.mean(record_low_celsius_spring):.2f} C')

Average temperatures in spring: -15.37 C
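Because the Celsius conversion is a linear formula, the mean of the converted values equals the conversion of the Fahrenheit mean: $\overline{C} = \frac{5}{9}(\overline{F} - 32)$. The sketch below uses this as a cross-check on the yearly average; the names `mean_fahrenheit`, `mean_via_conversion`, and `mean_celsius` are illustrative.

```python
import numpy as np

record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)

# Route 1: average the Fahrenheit values, then convert the single result.
mean_fahrenheit = np.mean(record_low_array)
mean_via_conversion = (5 / 9) * (mean_fahrenheit - 32)

# Route 2: average the already converted Celsius values.
mean_celsius = np.mean(record_low_celsius)

# Both routes agree up to floating-point rounding.
print(np.isclose(mean_via_conversion, mean_celsius))
```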


## Exercise 10: Challenge

Find the month whose record low temperature is closest to -10 degrees Celsius:

In [60]:
# Find the value that is closest to -10
# Adding 10 to every element turns this into finding the value closest to zero
record_low_celsius_plus_ten = record_low_celsius + 10

idx_closest = np.argmin(np.abs(record_low_celsius_plus_ten))
print(f'Closest to -10 C: index {idx_closest} with {record_low_celsius[idx_closest]:.2f} C')

Closest to -10 C: index 9 with -8.89 C

<details>
<summary><i>Hints</i> (try to think about the problem before looking at the hints):</summary>

1) Add 10 to the array <b>record_low_celsius</b>. Now the problem has been changed to finding the value in the new array that is closest to zero.

2) We are only interested in the distance from zero, not the sign of the number (whether it is positive or negative). Use the <b>abs</b> function to get the absolute values.

3) Use <b>argmin</b> to find the index of the smallest absolute value.
</details>
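The three hints generalize to any target temperature. The sketch below wraps them in a small helper, `index_closest_to`, which is a hypothetical name introduced here and not a NumPy function.

```python
import numpy as np


def index_closest_to(values, target):
    """Return the index of the element of `values` closest to `target`."""
    values = np.asarray(values)
    # Subtracting the target turns "closest to target" into "closest to zero";
    # for target = -10 this is the same as adding 10 (hint 1).
    distances = np.abs(values - target)  # hint 2: ignore the sign
    return int(np.argmin(distances))     # hint 3: position of the smallest distance


record_low_array = np.array([-28, -22, -21, 9, 25, 35, 40, 34, 24, 16, -11, -22])
record_low_celsius = (5 / 9) * (record_low_array - 32)

idx = index_closest_to(record_low_celsius, -10)
print(f'Index {idx}: {record_low_celsius[idx]:.2f} C')
```

Note that `argmin` returns the first position when several values are equally close to the target.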

## Moral of the Story

We've seen how to work with simple temperature data in NumPy. In the beginning, it is tempting to shrug off our achievements by thinking that we can just look at the data and then answer the questions. But this is only possible when the dataset is small.

Imagine that you were asked similar questions, but the temperatures were collected for each day over a period of 10 years. That is over 3500 values! The simple technique of "looking through the data with your eyes" to find max/min/argmax/argmin/mean is not possible anymore (unless you want to spend your whole day scrolling). However, what we have learned in this section can easily handle these questions.
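To make this concrete, the sketch below builds a synthetic array of 3650 daily values and answers the same questions with the same calls. The data is randomly generated for illustration only (a seasonal curve plus noise, with made-up parameters) and is not real Albany data.

```python
import numpy as np

# Synthetic data: 10 years of daily lows, NOT real measurements.
rng = np.random.default_rng(seed=0)
n_days = 3650
day_of_year = np.arange(n_days) % 365

# Assumed shape: coldest around the start of the year, plus random noise.
seasonal = -5 - 15 * np.cos(2 * np.pi * day_of_year / 365)
daily_lows_celsius = seasonal + rng.normal(0, 3, size=n_days)

# The same tools work unchanged on thousands of values.
print(f'Number of values: {daily_lows_celsius.size}')
print(f'Largest:  {daily_lows_celsius.max():.2f} C on day {daily_lows_celsius.argmax()}')
print(f'Smallest: {daily_lows_celsius.min():.2f} C on day {daily_lows_celsius.argmin()}')
print(f'Mean:     {daily_lows_celsius.mean():.2f} C')
```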

In the next section, Stine will teach you plotting and how it can also help us find structure in data!