# Python NumPy

The `numpy` library (pronounced "num-pie") is a library of numerical functions and tools that are useful in scientific computing, data science and machine learning. In particular, the library provides an array data structure (`ndarray`) that stores values of a single type and supports fast element-wise arithmetic.

To import the library, you use the following code:
```
import numpy as np
```
The `as` keyword allows us to give the library an alias (in this case `np`), which just means we don't have to type as many letters when we refer to it.
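
As a quick illustration of why arrays are useful, the sketch below compares a plain Python list with a NumPy array created through the `np` alias. The values are made up for illustration only.

```python
import numpy as np

# Made-up sample values, for illustration only
values_list = [1.0, 2.0, 3.0]
values_array = np.array(values_list)

# Multiplying a list repeats it; multiplying an array scales every element
doubled_list = values_list * 2
doubled_array = values_array * 2

print(doubled_list)
print(doubled_array)
```

The list ends up with six items, while the array keeps three items that have each been doubled.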

Let's import it first. We will then use some of its features to look at some local historical weather data.

In [None]:
# Import numpy library
import numpy as np

## Looking at Weather Data

The file "Denver_Daily_Temps_2025.csv" contains the daily high, low and average temperatures for Denver for 2025. The units are degrees Celsius. These data are taken from the [Meteostat](https://meteostat.net/en/) website.

First, copy the file over from GitHub if you haven't already:

- Copy this URL: https://raw.githubusercontent.com/guyfrancis/dat1001/refs/heads/main/Denver_Daily_Temps_2025.csv
- Use **File > Open from URL...** and paste in the URL above.
- Before you run any Python code, take a look at the CSV file. You'll see that there are four columns: day, average temperature (tavg), minimum temperature (tmin) and maximum temperature (tmax).

We are going to use numpy arrays to help us answer some basic questions about the temperature last year:
1. What was the highest temperature in deg F?
2. What was the lowest temperature in deg F?
3. What was the average temperature in deg F over the whole year?
4. What was the daily temperature range (max - min) in deg F for each day of the year?
5. What was the maximum temperature range on any day and on what day did this occur?

We want answers in deg F, so we will need to convert the units for our data from deg C to deg F.


In [None]:
# Load Denver temperature data and set print options
file = "Denver_Daily_Temps_2025.csv"
den_temps_25 = np.loadtxt(file, usecols=(1, 2, 3), skiprows=1, delimiter=",")
np.set_printoptions(floatmode='fixed', precision=1)
print(den_temps_25)
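
To see what the `np.loadtxt` arguments used above do, here is a minimal sketch that reads a tiny CSV held in memory instead of the real file. The header names and the values are made up for illustration; only the argument names come from the cell above.

```python
import io
import numpy as np

# Made-up CSV text standing in for the real file (hypothetical values)
csv_text = """day,tavg,tmin,tmax
2025-01-01,1.0,-4.0,6.0
2025-01-02,2.5,-1.5,7.0
2025-01-03,0.0,-6.0,5.5
"""

sample = np.loadtxt(
    io.StringIO(csv_text),
    delimiter=",",       # values are separated by commas
    skiprows=1,          # skip the header row
    usecols=(1, 2, 3),   # leave out column 0 (the day), which is not numeric
)
print(sample.shape)
print(sample)
```

Each row of the result is one day and each column is one measurement, which matches the layout printed by the cell above.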

### Reshape data
The data are currently organized as 365 rows of 3 values, with each row corresponding to one day. We are going to rearrange them into 3 rows of 365 values, with each row holding one type of temperature measurement: average, min and max. To do this we **transpose** the array, which swaps its rows and columns.

In [None]:
print("Original shape of data:", den_temps_25.shape)
den_temps_25 = den_temps_25.T
# den_temps_25 = np.reshape(den_temps_25, (365,3))
print("New shape of data:", den_temps_25.shape)
print(den_temps_25)
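
Transposing is not the same as calling `np.reshape` with the target shape. The sketch below, using made-up values, shows the difference on a small array: the transpose keeps each measurement together, while a reshape simply re-chunks the values in their stored order.

```python
import numpy as np

# Made-up data: 2 days (rows) x 3 measurements (columns)
days = np.array([[1.0, -4.0, 6.0],
                 [2.5, -1.5, 7.0]])

transposed = days.T                   # rows become columns
reshaped = np.reshape(days, (3, 2))   # same shape, but values are re-chunked in order

print(transposed)
print(reshaped)
```

In `transposed`, the first row holds the first measurement for both days. In `reshaped`, the first row mixes two different measurements from the same day.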

### Convert units to deg F

To convert degrees Celsius to degrees Fahrenheit, we need to multiply every value by `9/5` and add `32`. With numpy arrays, this can be done in one line.

In [None]:
# Change units to deg F from deg C
den_temps_25 = 9/5*den_temps_25 + 32
print(den_temps_25)
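
One way to check the conversion is to apply it to a few temperatures whose Fahrenheit equivalents are well known. This is a minimal sketch; the array name `temps_c` is chosen for illustration.

```python
import numpy as np

# Reference temperatures in deg C with well-known deg F equivalents
temps_c = np.array([-40.0, 0.0, 100.0])

# The arithmetic is applied to every element at once
temps_f = 9 / 5 * temps_c + 32
print(temps_f)
```

The results should be close to -40, 32 and 212 deg F.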

### Create separate arrays for average, min and max
To make answering the questions easy and to make the code readable, we will create separate one-dimensional arrays for each temperature measurement.

In [None]:
tavg = den_temps_25[0]
tmin = den_temps_25[1]
tmax = den_temps_25[2]
print(tavg.shape, tmin.shape, tmax.shape)
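
As an alternative sketch, the three rows can be unpacked in a single statement, because iterating over a two-dimensional array yields its rows. The values below are made up for illustration.

```python
import numpy as np

# Made-up 3 x 4 array: rows are average, min and max for four days
temps = np.array([[50.0, 52.0, 48.0, 55.0],
                  [40.0, 41.0, 39.0, 44.0],
                  [60.0, 63.0, 57.0, 66.0]])

# Unpack the three rows into three one-dimensional arrays
tavg, tmin, tmax = temps
print(tavg.shape, tmin.shape, tmax.shape)
```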

Now we can answer the first three questions with a couple of lines of code.

In [None]:
# Find highest temperature for year
print(f"Highest temp in 2025 was {tmax.max():.1f} deg F.")

# Find the lowest temperature of the year
print(f"Lowest temp in 2025 was {tmin.min():.1f} deg F.")

# Find average temperature for the year by averaging the daily average temperatures
print(f"Average temp in 2025 was {tavg.mean():.1f} deg F.")
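
The same summaries can also be computed on the two-dimensional array directly by passing an `axis` argument. The sketch below uses made-up values; `axis=1` means "summarize along each row".

```python
import numpy as np

# Made-up 3 x 4 array: rows are average, min and max for four days
temps = np.array([[50.0, 52.0, 48.0, 55.0],
                  [40.0, 41.0, 39.0, 44.0],
                  [60.0, 63.0, 57.0, 66.0]])

row_max = temps.max(axis=1)     # largest value in each row
row_min = temps.min(axis=1)     # smallest value in each row
row_mean = temps.mean(axis=1)   # mean of each row

print(row_max, row_min, row_mean)
```

Each result has one entry per row, so `row_max[2]` is the highest maximum temperature and `row_min[1]` is the lowest minimum temperature.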

### Daily Temperature Ranges
We can create a new numpy array giving the daily temperature range by calculating `tmax - tmin` for each day in the year. We can then find the maximum temperature range and use the `numpy.where()` function to find which day this was.

In [None]:
# Create numpy array of daily temperature ranges
trange = tmax - tmin

# Find the greatest temperature range
print(f"Max daily temp range in 2025 was {trange.max():.1f} deg F.")

# Find what day this was on. np.where returns a tuple containing one index array per
# dimension, so [0][0] picks the first index at which the maximum range occurs.
max_range_day = np.where(trange == trange.max())[0][0]
# Indices start at 0, so add 1 to get the day of the year
print(f"This happened on day {max_range_day + 1} of the year.")
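
A possible alternative to `np.where` is `np.argmax`, which returns the index of the first maximum directly. The sketch below also turns that index into a calendar date with the standard `datetime` module. The range values are made up, and the code assumes index 0 corresponds to January 1, 2025.

```python
import datetime
import numpy as np

# Made-up daily ranges for the first five days of a year (hypothetical values)
trange = np.array([20.0, 35.5, 18.0, 35.5, 22.0])

# argmax returns the zero-based index of the first occurrence of the maximum
idx = int(np.argmax(trange))

# Assumption: index 0 is January 1, so add the index as a number of days
date = datetime.date(2025, 1, 1) + datetime.timedelta(days=idx)
print(idx, date)
```

Here the maximum appears twice, and `np.argmax` reports only the first position.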

## Exercise 1

Let's use what we have learned to answer some more questions using a larger dataset.

The file "Denver_Daily_Temps_2021-25.csv" contains the daily high, low and average temperatures for Denver for the years 2021-2025, so from Jan 01 2021 to Dec 31 2025. The units are degrees Celsius. These data are taken from the [Meteostat](https://meteostat.net/en/) website.

First, copy the file over from GitHub:

- Copy this URL: https://raw.githubusercontent.com/guyfrancis/dat1001/refs/heads/main/Denver_Daily_Temps_2021-25.csv
- Use **File > Open from URL...** and paste in the URL above.
- Before you run any Python code, take a look at the CSV file. You'll see that there are 11 columns. We will use columns 1, 2 and 3, which hold the average temperature (tavg), minimum temperature (tmin) and maximum temperature (tmax), as in the example above. Remember that you will need to convert the temperature units to degrees Fahrenheit.

See if you can answer the questions below. 
1. What was the highest temperature during the 5-year period in deg F and when did it occur?
2. What was the lowest temperature in deg F and when did it occur?
3. What was the average temperature over the 5-year period?
4. What day in the 5-year period had the smallest temperature range, and what was it, in deg F?
5. (Challenge) Find the maximum temperature range in each year 2021, 2022, 2023, 2024 and 2025. Do these all occur roughly around the same time of year?

In [None]:
# Type your code here - add more code cells as needed


## Exercise 2

For this exercise, go straight to the [Meteostat](https://meteostat.net/en/) website to find weather data for a time period and location of your choosing. Download the CSV and use Python code to analyze and describe your data along the lines of what we have done above. You don't have to restrict yourself to temperature data: you can also use the wind speed and pressure data.

In [None]:
# Type your code here


## Extension

If you complete all of these tasks, there is a separate NumPy extension activity on D2L.

In [13]:
# Type your code here
# Load Denver temperature data and set print options
import numpy as np
file = "Denver_Daily_Temps_2021-25.csv"
den_temps = np.loadtxt(file, usecols=(1, 2, 3, 7), skiprows=1, delimiter=",")
np.set_printoptions(floatmode='fixed', precision=1)
print(den_temps)

[[-1.5 -5.2  3.5  5.6]
 [-2.0 -7.7  4.9 10.0]
 [ 1.1 -4.6  8.2 15.1]
 ...
 [-3.0 -9.5  4.4  8.2]
 [ 0.7 -6.6  8.8  8.3]
 [ 4.6 -4.8 15.2  6.9]]


In [14]:
den_temps = den_temps.T

In [15]:
print(den_temps)

[[-1.5 -2.0  1.1 ... -3.0  0.7  4.6]
 [-5.2 -7.7 -4.6 ... -9.5 -6.6 -4.8]
 [ 3.5  4.9  8.2 ...  4.4  8.8 15.2]
 [ 5.6 10.0 15.1 ...  8.2  8.3  6.9]]


In [16]:
den_temps[0:3] = den_temps[0:3]*9/5 + 32

In [17]:
print(den_temps)

[[29.3 28.4 34.0 ... 26.6 33.3 40.3]
 [22.6 18.1 23.7 ... 14.9 20.1 23.4]
 [38.3 40.8 46.8 ... 39.9 47.8 59.4]
 [ 5.6 10.0 15.1 ...  8.2  8.3  6.9]]


In [18]:
tavg = den_temps[0]
tmin = den_temps[1]
tmax = den_temps[2]

In [21]:
print(tmax.max())
# Compare against the computed maximum rather than a value typed in from the printed output
max_day = np.where(tmax == tmax.max())

99.14


In [22]:
print(max_day)

(array([165]),)
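
Matching a floating-point value against a literal typed in from printed output is fragile, because the stored value can differ from the printed one in its last digits. The sketch below shows the effect with made-up values and uses `np.isclose` as one possible tolerance-based comparison.

```python
import numpy as np

# 0.1 + 0.2 is stored as a value very slightly different from 0.3
values = np.array([0.1 + 0.2, 0.5])

exact = values == 0.3             # exact comparison misses the first value
close = np.isclose(values, 0.3)   # comparison within a small tolerance

print(exact, close)
```

Comparing against a value computed from the array itself, such as its own maximum, avoids the problem because both sides hold the identical stored number.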


In [24]:
print(tmin.min(), np.where(tmin == tmin.min()))

-5.440000000000005 (array([1110]),)


In [26]:
1110-3*365

15

In [28]:
print(tavg.mean())

52.148368017524646


In [29]:
trange = tmax - tmin
print(trange.max())

41.22


In [31]:
print(np.where(trange == trange.max()))

(array([1481]),)


In [32]:
# 2024 is a leap year, so 2025 starts at index 3*365 + 366 = 1461
print(1481 - 1461)

20
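
Rather than working out year offsets by hand, an index into the five-year arrays can be converted to a calendar date with `np.datetime64`, which accounts for the leap year automatically. This is a sketch that assumes index 0 corresponds to Jan 01 2021 and that no days are missing from the file.

```python
import numpy as np

# Assumption: index 0 of the five-year arrays corresponds to Jan 1, 2021
start = np.datetime64("2021-01-01")

# Indices reported by np.where in the cells above
dates = {idx: start + np.timedelta64(idx, "D") for idx in (165, 1110, 1481)}

for idx, date in dates.items():
    print(idx, date)
```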


In [34]:
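# Scale column 7 of the CSV by 0.6214, the km-to-miles factor
# (assumes this column holds wind speed in km/h)
windspeed = den_temps[3]*0.6214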

In [37]:
print(windspeed[0:365].mean())

8.066793479452054


In [43]:
# Note: each np.where call below searches the full five-year array, so a matching
# value from a different year can also be reported (see the 2023 entry in the output)
max_ranges = []
max_ranges.append((2021, trange[0:365].max(), np.where(trange==trange[0:365].max())))
max_ranges.append((2022, trange[365:730].max(), np.where(trange==trange[365:730].max())))
max_ranges.append((2023, trange[730:1095].max(), np.where(trange==trange[730:1095].max())))
max_ranges.append((2024, trange[1095:1461].max(), np.where(trange==trange[1095:1461].max())))
max_ranges.append((2025, trange[1461:1826].max(), np.where(trange==trange[1461:1826].max())))

In [44]:
print(max_ranges)

[(2021, np.float64(34.920000000000016), (array([172]),)), (2022, np.float64(39.06), (array([422]),)), (2023, np.float64(37.260000000000005), (array([699, 777]),)), (2024, np.float64(34.02), (array([1197]),)), (2025, np.float64(41.22), (array([1481]),))]


In [52]:
yr = 2021
yr_start = 0
while yr <= 2025:
    print("Year:", yr)
    yr_end = yr_start+365
    if yr==2024:
        yr_end+=1
    print(f"Max temperature range in {yr} was {trange[yr_start:yr_end].max():.2f} deg F")
    when = np.where(trange[yr_start:yr_end]==trange[yr_start:yr_end].max())[0][0]
    print("This occurred on day:", when)
    yr+=1
    yr_start = yr_end

Year: 2021
Max temperature range in 2021 was 34.92 deg F
This occurred on day: 172
Year: 2022
Max temperature range in 2022 was 39.06 deg F
This occurred on day: 57
Year: 2023
Max temperature range in 2023 was 37.26 deg F
This occurred on day: 47
Year: 2024
Max temperature range in 2024 was 34.02 deg F
This occurred on day: 102
Year: 2025
Max temperature range in 2025 was 41.22 deg F
This occurred on day: 20
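
The year boundaries used for slicing can also be computed instead of being handled as a special case. The sketch below is one possible approach, using `calendar.isleap` from the standard library and `np.cumsum`; the variable names are chosen for illustration.

```python
import calendar
import numpy as np

years = range(2021, 2026)

# Number of days in each year (366 for leap years such as 2024)
days_per_year = [366 if calendar.isleap(y) else 365 for y in years]

# Starting index of each year, followed by the final end index
bounds = np.concatenate(([0], np.cumsum(days_per_year)))

for yr, start, end in zip(years, bounds[:-1], bounds[1:]):
    print(yr, start, end)
```

Each `start` and `end` pair could then be used as a slice such as `trange[start:end]` to select one year of data.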


In [None]:
for 