# Lab exercise, week 3 - numpy & matplotlib basics

Contents:
- Overview of numpy arrays
- Exercises for numpy
- Exercises for plotting

## NumPy - overview

Python lists are somewhat unusual compared to the basic array types in languages like C# and Java: they can hold objects of different types, and new elements can be inserted in the middle. NumPy arrays are much more like C# arrays: all elements have the same type, and the size is fixed once the array is created.
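
As a small illustration of that difference (the variable names `mixed` and `arr` are made up for this sketch), a list happily stores mixed types, while NumPy picks one common type for the whole array:

```python
import numpy as np

# A Python list can mix types and accept insertions in the middle
mixed = [1, "two", 3.0]
mixed.insert(1, None)            # now [1, None, "two", 3.0]

# A NumPy array has a single element type (dtype);
# mixing ints and floats upcasts every element to float64
arr = np.array([1, 2, 3.5])

print(mixed)
print(arr, arr.dtype)
```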

In [1]:
# by convention numpy is always imported as np
import numpy as np

### Common ways of creating numpy arrays

In [3]:
a = np.array([5, 2, 17])  # Convert a Python list into a numpy array
a

array([ 5,  2, 17])

In [4]:
# List of lists gets converted into a 2D array
np.array([[5, 7, 2],
          [9, 4, 1]])


array([[5, 7, 2],
       [9, 4, 1]])

In [6]:
np.arange(5)  # Same as Python's range() but creates a numpy array

array([0, 1, 2, 3, 4])

In [7]:
np.zeros(5)  # Create a numpy array with five elements, all set to zero

array([0., 0., 0., 0., 0.])

In [8]:
c = np.ones(7)  # Create a numpy array with seven elements, all set to 1
c


array([1., 1., 1., 1., 1., 1., 1.])

In [9]:
# Array of 6 random numbers between 0 and 1
np.random.rand(6)

array([0.14654318, 0.24300736, 0.74564537, 0.58652819, 0.39433085,
       0.94389319])

In [10]:
# array with 6 random integers between 0 and 100 (not including 100 as usual)
np.random.randint(100, size=6)

array([64, 97,  1, 69, 74, 72])

### Array properties

In [11]:
b = np.array([[5, 7, 2],
              [9, 4, 1]])
b

array([[5, 7, 2],
       [9, 4, 1]])

In [12]:
# Data type of the array
a.dtype

dtype('int64')

In [13]:
# number of dimensions
b.ndim 

2

In [14]:
# array shape is a Python tuple, in this case it's (2, 3) because b is a 2 by 3 array.
b.shape

(2, 3)

In [15]:
# The total number of elements
b.size 

6

In [16]:
# Note that zeros by default uses the float64 data type
z = np.zeros(7)
z.dtype

dtype('float64')

In [17]:
# The data type can also be set explicitly: almost all numpy functions that create arrays take an optional dtype parameter.
# Let's set it to an 8 bit integer
z = np.zeros(7, dtype=np.int8)  
z.dtype

dtype('int8')
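
Why choose a smaller dtype? A rough sketch of the trade-off, using the illustrative names `z8` and `z64`: narrower integers take less memory but can represent a smaller range of values.

```python
import numpy as np

z8 = np.zeros(7, dtype=np.int8)
z64 = np.zeros(7, dtype=np.int64)

# itemsize is the number of bytes per element, nbytes the total for the array
print(z8.itemsize, z8.nbytes)     # 1 byte each, 7 bytes in total
print(z64.itemsize, z64.nbytes)   # 8 bytes each, 56 bytes in total

# np.iinfo reports the range an integer dtype can represent
info = np.iinfo(np.int8)
print(info.min, info.max)         # -128 127
```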

## Exercises

Read section 2.2 of the book (The Basics of NumPy Arrays) and complete the tasks below.


### Numpy Array Creation

#### Convert a list into a numpy array

In [113]:
lst = [5, 3, 8, 4]
lst

[5, 3, 8, 4]

In [None]:
import numpy as np
x = np.array(lst)
x, type(x), x.dtype

#### What happens when you multiply a list by 3? What about an array multiplied by 3?

In [109]:
# List * 3 repeats the list; array * 3 multiplies each element by 3
lst_times_3 = lst * 3
arr_times_3 = np.array(lst) * 3
lst_times_3, arr_times_3

#### Create an array of 10 ones [1, 1, 1, ... ]

In [None]:
np.ones(10, dtype=int)

#### Create an array of 10 fives [5, 5, 5, .... ]

In [None]:
np.full(10, 5, dtype=int)

#### Create an array of the integers from 10 to 50 (including 50)

In [None]:
np.arange(10, 51)

#### Create an array of 10 random numbers between 0 and 5

In [None]:
np.random.uniform(0, 5, size=10)

#### Read the help for the np.linspace function and create an array of 11 evenly spaced elements between 0 and 2
Use `np.linspace?` or `?np.linspace` to show the help.

In [None]:
# Help for linspace (uncomment in Jupyter to show doc)
# np.linspace?
# Create 11 evenly spaced points between 0 and 2 inclusive (spacing 0.2)
np.linspace(0, 2, 11)
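
For comparison, here is a short sketch on a different interval showing how `np.linspace` and `np.arange` describe the same grid in two ways: `linspace` takes the number of points and includes the endpoint by default, while `arange` takes a step and excludes the stop value. The names `by_count` and `by_step` are illustrative.

```python
import numpy as np

by_count = np.linspace(0, 1, 5)       # 5 points, both endpoints included
by_step = np.arange(0, 1.25, 0.25)    # step of 0.25, stop value excluded

print(by_count)
print(by_step)
```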

#### Create a 3 by 4 array of ones

In [None]:
np.ones((3,4), dtype=int)

#### Create a 3 by 4 array of fives

In [None]:
np.full((3,4), 5, dtype=int)

### Array Indexing
Use the following arrays `a` and `m` for the questions below.

In [19]:
a = np.arange(10,21)
a

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

In [20]:
m = np.arange(1,22).reshape((3,7))
m

array([[ 1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21]])
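
As a reminder of the syntax, a 2D array is indexed as `arr[rows, columns]`, and each position accepts an integer or a slice. The sketch below uses a separate made-up array called `grid` so that it does not touch `a` or `m`.

```python
import numpy as np

grid = np.arange(12).reshape((3, 4))   # rows: [0..3], [4..7], [8..11]

row = grid[1]            # a single index selects a whole row
col = grid[:, 2]         # ':' keeps every row, 2 picks the third column
block = grid[:2, 1:3]    # first two rows, columns 1 and 2

print(row)
print(col)
print(block)
```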

### Create an array containing... 
#### the first 4 elements of a

In [125]:
a[:4]

array([10, 11, 12, 13])

#### the last 3 elements of a

In [None]:
a[-3:]

#### The middle elements of a from 15 to 18 inclusive

In [None]:
a[(a>=15) & (a<=18)]  # or a[5:9]

#### The first column of m

In [None]:
m[:, 0]

#### The middle row of m

In [None]:
m[1, :]  # middle row (row index 1)

#### The left 3 columns of m

In [130]:
m[:, :3]  # all rows, first three columns

array([[ 1,  2,  3],
       [ 8,  9, 10],
       [15, 16, 17]])

#### The bottom-right 2 by 2 square

In [131]:
m[-2:, -2:]  # last two rows, last two columns

array([[13, 14],
       [20, 21]])

#### (bonus) every other element of a  

In [132]:
a[::2]  # step of 2 picks every other element

array([10, 12, 14, 16, 18, 20])

### Array Math

In [None]:
a - 5  # arithmetic on an array is applied element-wise

array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

#### Create an array containing squares of all numbers from 1 to 10 (inclusive)

In [None]:
np.arange(1, 11) ** 2

#### Create an array containing all powers of 2 from $2^0$ to $2^{10}$ (inclusive)

In [None]:
2 ** np.arange(0, 11)

#### Same as above (powers of two), but subtract one from each element, that is $a_k = 2^k - 1$

In [None]:
2 ** np.arange(0, 11) - 1

### Bonus task
Write code that lists all available dtypes with specified number of bits 

In [104]:
# List common integer and float dtypes with their itemsize (in bits)
dtypes = [np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16, np.uint32, np.uint64,
          np.float16, np.float32, np.float64]
[(dt.__name__, np.dtype(dt).itemsize * 8) for dt in dtypes]

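[('int8', 8),
 ('int16', 16),
 ('int32', 32),
 ('int64', 64),
 ('uint8', 8),
 ('uint16', 16),
 ('uint32', 32),
 ('uint64', 64),
 ('float16', 16),
 ('float32', 32),
 ('float64', 64)]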

## Plotting Basics
 - Use [this tutorial](https://matplotlib.org/users/pyplot_tutorial.html) as reference when (if) you get stuck
 - Execute the cells with imports, otherwise you won't have numpy and matplotlib imported and Python will complain

In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [4]:
# Load weather data as float array; columns: Day, MaxC, MinC, MeanC, Precip
weather = np.loadtxt("OshawaWeather2016.csv", delimiter=',', skiprows=1)
weather[:5]

In [6]:
weather.shape

(349, 5)

In [7]:
weather.dtype

dtype('float64')
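
If the CSV file is not at hand, the same `np.loadtxt` call can be tried on an in-memory string. This is only a sketch: the three data rows are invented for illustration and are not taken from `OshawaWeather2016.csv`; only the column layout follows the comment above.

```python
import io
import numpy as np

# Made-up sample in the same five-column layout; NOT real weather data
csv_text = """Day,MaxC,MinC,MeanC,Precip
1,2.5,-3.0,-0.25,0.0
2,4.0,-1.0,1.5,1.2
3,1.0,-6.0,-2.5,0.4
"""

# skiprows=1 skips the header line, which cannot be parsed as numbers
sample = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

print(sample.shape)   # (3, 5)
print(sample.dtype)   # float64
```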

#### Create separate 1D arrays for each column (e.g. day, maxt, mint, and so on)

In [None]:
day = weather[:, 0].astype(int)
max_c = weather[:, 1]
min_c = weather[:, 2]
mean_c = weather[:, 3]
precip = weather[:, 4]
(day[:5], max_c[:5], min_c[:5], mean_c[:5], precip[:5])

#### Plot the minimum temperature as a function of day number

In [None]:
import matplotlib.pyplot as plt
plt.figure()
plt.plot(day, min_c)
plt.xlabel('Day of Year (2016)')
plt.ylabel('Min Temp (°C)')
plt.title('Daily Minimum Temperature')
plt.show()

#### Plot the max temperature in degrees Fahrenheit

In [None]:
max_f = max_c * 9/5 + 32
plt.figure()
plt.plot(day, max_f)
plt.xlabel('Day of Year (2016)')
plt.ylabel('Max Temp (°F)')
plt.title('Daily Maximum Temperature (°F)')
plt.show()

#### Plot the difference between the maximum and the minimum day temperature as a function of day number

In [None]:
diff_c = max_c - min_c
plt.figure()
plt.plot(day, diff_c)
plt.xlabel('Day of Year (2016)')
plt.ylabel('Diurnal Range (°C)')
plt.title('Max - Min Temperature by Day')
plt.show()

#### Plot both the minimum and the maximum temperature on the same figure

In [None]:
plt.figure()
plt.plot(day, min_c, label='Min °C')
plt.plot(day, max_c, label='Max °C')
plt.xlabel('Day of Year (2016)')
plt.ylabel('Temperature (°C)')
plt.title('Daily Min and Max Temperatures')
plt.legend()
plt.show()

#### Add axis labels and a legend to the plot above
Use the following functions:
 - plt.xlabel
 - plt.ylabel
 - plt.title
 - plt.legend

#### Plot the precipitation as a function of max temperature
You probably don't want a line connecting the dots.
Are there any patterns you can spot?

In [None]:
# Scatter plot: one dot per day, no connecting line
plt.figure()
plt.scatter(max_c, precip)
plt.xlabel('Max Temp (°C)')
plt.ylabel('Precipitation (mm)')
plt.title('Precipitation vs. Daily Maximum Temperature')
plt.show()

# To look for seasonal patterns, aggregate precipitation by month (2016 is a leap year)
month_lengths = np.array([31,29,31,30,31,30,31,31,30,31,30,31])
month_edges = np.concatenate([[0], np.cumsum(month_lengths)])
# Map each day index to month number 1..12 (assuming Day 0 = Jan 1)
month = np.zeros_like(day)
for i in range(12):
    start, end = month_edges[i], month_edges[i+1]
    mask = (day >= start) & (day < end)
    month[mask] = i+1
# Total precip per month
monthly_precip = np.array([precip[month==m].sum() for m in range(1,13)])
plt.figure()
plt.plot(np.arange(1,13), monthly_precip, marker='o')
plt.xlabel('Month (1=Jan ... 12=Dec)')
plt.ylabel('Total Precipitation (mm)')
plt.title('Monthly Precipitation (2016)')
plt.show()
monthly_precip

#### Plot the precipitation as a function of (t_max - t_min)

In [None]:
plt.figure()
plt.scatter(diff_c, precip)
plt.xlabel('Max - Min (°C)')
plt.ylabel('Precipitation (mm)')
plt.title('Precipitation vs. Daily Temperature Range')
plt.show()

#### Read about plt.hist() function and plot a histogram of the maximum temperatures

In [None]:
# plt.hist?  # help
plt.figure()
plt.hist(max_c, bins=20)
plt.xlabel('Max Temp (°C)')
plt.ylabel('Frequency (days)')
plt.title('Histogram of Daily Max Temperatures')
plt.show()

#### Plot a histogram of the differences between the min and max temperature

In [None]:
plt.figure()
plt.hist(diff_c, bins=20)
plt.xlabel('Max - Min (°C)')
plt.ylabel('Frequency (days)')
plt.title('Histogram of Daily Temperature Range')
plt.show()

#### For each day calculate the average of the 3 temperatures in the data (min, max, and mean)


In [None]:
avg_of_three = (max_c + min_c + mean_c) / 3.0
avg_of_three[:10]
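
An alternative way to get the same kind of per-day average is to slice the three temperature columns out of the 2D array and take the mean along `axis=1`. The sketch below uses a tiny made-up array named `sample` rather than the real `weather` data.

```python
import numpy as np

# Made-up rows in the layout Day, MaxC, MinC, MeanC, Precip
sample = np.array([[1, 6.0, 0.0, 3.0, 0.0],
                   [2, 9.0, 3.0, 6.0, 1.5]])

# Columns 1..3 hold the three temperatures; axis=1 averages across each row
avg = sample[:, 1:4].mean(axis=1)

print(avg)
```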

#### Calculate the total amount of precipitation for the whole year
Does the number seem reasonable? The annual average precipitation in Toronto is 831 mm according to https://en.wikipedia.org/wiki/Geography_of_Toronto

In [None]:
# Total precip overall and in winter months (Dec, Jan, Feb)
total_precip = precip.sum()
# Winter months: Dec(12), Jan(1), Feb(2)
winter_mask = (month == 12) | (month == 1) | (month == 2)
total_winter_precip = precip[winter_mask].sum()
(total_precip, total_winter_precip)

#### Calculate the total precipitation on all odd numbered days (day 1, 3, 5 and so on)

In [None]:
odd_precip_total = precip[day % 2 == 1].sum()
odd_precip_total
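
The expression inside the brackets is a boolean mask: the comparison produces one `True`/`False` value per day, and indexing with it keeps only the `True` positions. A small sketch with made-up numbers (the names `sample_day` and `sample_precip` are illustrative):

```python
import numpy as np

sample_day = np.array([1, 2, 3, 4, 5])
sample_precip = np.array([0.0, 2.5, 1.0, 0.0, 4.0])

mask = sample_day % 2 == 1                 # True for days 1, 3 and 5
odd_total = sample_precip[mask].sum()      # 0.0 + 1.0 + 4.0

print(mask)
print(odd_total)
```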