# Lab exercise, week 3 - numpy & matplotlib basics

Contents:
- Overview of numpy arrays
- Exercises for numpy
- Exercises for plotting

## NumPy - overview

Python lists are somewhat weird creatures. In contrast to basic array types in other languages like C# and Java, they can hold objects of different types and new elements can be inserted in the middle. NumPy arrays are much more like C# arrays - all elements have the same type.
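
The difference can be seen in a small sketch. The variable names `mixed` and `arr` are made up for illustration and are not used elsewhere in this lab.

```python
import numpy as np

# A Python list can mix types, and elements can be inserted in the middle
mixed = [1, "two", 3.0]
mixed.insert(1, None)        # the list is now [1, None, 'two', 3.0]

# A NumPy array has a single dtype shared by all elements;
# mixing ints and floats upcasts every element to float64
arr = np.array([1, 2, 3.5])
print(mixed)
print(arr.dtype)             # float64
```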

In [None]:
# by convention numpy is always imported as np
import numpy as np

### Common ways of creating numpy arrays

In [None]:
a = np.array([5, 2, 17])  # Convert a Python list into a numpy array
a

array([ 5,  2, 17])

In [None]:
# List of lists gets converted into a 2D array
np.array([[5, 7, 2],
          [9, 4, 1]])


array([[5, 7, 2],
       [9, 4, 1]])

In [None]:
np.arange(5)  # Same as Python's range() but creates a numpy array

array([0, 1, 2, 3, 4])

In [None]:
np.zeros(5)  # Create a numpy array with five elements, all set to zero

array([0., 0., 0., 0., 0.])

In [None]:
c = np.ones(7)  # Create a numpy array with seven elements, all set to 1
c


array([1., 1., 1., 1., 1., 1., 1.])

In [None]:
# Array of 6 random numbers between 0 and 1
np.random.rand(6)

array([0.14654318, 0.24300736, 0.74564537, 0.58652819, 0.39433085,
       0.94389319])

In [None]:
# Array of 6 random integers between 0 and 100 (not including 100, as usual)
np.random.randint(100, size=6)

array([64, 97,  1, 69, 74, 72])

### Array properties

In [None]:
b = np.array([[5, 7, 2],
              [9, 4, 1]])
b

array([[5, 7, 2],
       [9, 4, 1]])

In [None]:
# Data type of the array
a.dtype

dtype('int64')

In [None]:
# number of dimensions
b.ndim

2

In [None]:
# array shape is a Python tuple, in this case it's (2, 3) because b is a 2 by 3 array.
b.shape

(2, 3)

In [None]:
# The total number of elements
b.size

6

In [None]:
# Note that zeros by default uses the float64 data type
z = np.zeros(7)
z.dtype

dtype('float64')

In [None]:
# The data type can also be set explicitly: almost all numpy functions
# that create arrays take an optional dtype parameter.
# Let's set it to an 8-bit integer
z = np.zeros(7, dtype=np.int8)
z.dtype

dtype('int8')
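
As a rough sketch of why the dtype matters, the cell below passes `dtype` to two other creation functions and compares memory use with `nbytes`. The names `ones_f32`, `steps_u8`, `z64` and `z8` are illustrative only.

```python
import numpy as np

# dtype is accepted by other creation functions too
ones_f32 = np.ones(4, dtype=np.float32)
steps_u8 = np.arange(5, dtype=np.uint8)

# astype returns a converted copy; the original array is left unchanged
z64 = np.zeros(7)            # float64: 8 bytes per element
z8 = z64.astype(np.int8)     # int8: 1 byte per element
print(z64.nbytes, z8.nbytes)  # 56 7
```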

## Exercises

Read section 2.2 of the book (The Basics of NumPy Arrays) and complete the tasks below.


### Numpy Array Creation

#### Convert a list into a numpy array

In [None]:
lst = [5, 3, 8, 4]
lst

[5, 3, 8, 4]

In [None]:
# Convert the list to a numpy array
import numpy as np
x = np.array(lst)
x

#### What happens when you multiply a list by 3? What about an array multiplied by 3?

In [None]:
# Demonstrate list*3 vs array*3
lst_times3 = lst * 3  # concatenates list 3 times
arr_times3 = np.array(lst) * 3  # element-wise multiply
lst_times3, arr_times3

#### Create an array of 10 ones [1, 1, 1, ... ]

In [None]:
ones10 = np.ones(10, dtype=int)
ones10

#### Create an array of 10 fives [5, 5, 5, .... ]

In [None]:
fives10 = np.full(10, 5, dtype=int)
fives10

#### Create an array of the integers from 10 to 50 (including 50)

In [None]:
ints_10_to_50 = np.arange(10, 51)
ints_10_to_50

#### Create an array of 10 random numbers between 0 and 5

In [None]:
np.random.seed(42)
rand_0_5 = np.random.uniform(0, 5, size=10)
rand_0_5

#### Read the help for the np.linspace function and create an array of 11 evenly spaced elements between 0 and 2
Use `np.linspace?` or `?np.linspace` to show the help

In [None]:
# np.linspace creates evenly spaced numbers over an interval
lin_0_2_11 = np.linspace(0, 2, 11)
lin_0_2_11
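
Two optional parameters from the `np.linspace` help are worth a quick look: `retstep` also returns the spacing, and `endpoint=False` leaves out the stop value. This is only a sketch, and the names `pts`, `step` and `no_end` are made up here.

```python
import numpy as np

# retstep=True returns the array together with the spacing between points
pts, step = np.linspace(0, 2, 11, retstep=True)
print(len(pts), pts[0], pts[-1])   # 11 points, both end values included

# endpoint=False excludes the stop value, similar to range() and np.arange()
no_end = np.linspace(0, 2, 10, endpoint=False)
print(no_end)
```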

#### Create a 3 by 4 array of ones

In [None]:
ones_3x4 = np.ones((3, 4), dtype=int)
ones_3x4

#### Create a 3 by 4 array of fives

In [None]:
fives_3x4 = np.full((3, 4), 5, dtype=int)
fives_3x4

### Array Indexing
Use the following arrays `a` and `m` for the questions below

In [None]:
a = np.arange(10,21)
a

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

In [None]:
m = np.arange(1,22).reshape((3,7))
m

array([[ 1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19, 20, 21]])
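
The `reshape` call used to build `m` is not covered in the overview above, so here is a minimal sketch of how it behaves. The names `flat` and `m_inferred` are illustrative.

```python
import numpy as np

flat = np.arange(1, 22)          # 21 elements: 1..21
m = flat.reshape((3, 7))         # 3 rows of 7, filled row by row

# -1 asks NumPy to infer that dimension from the total size
m_inferred = flat.reshape(3, -1)
print(m_inferred.shape)          # (3, 7)

# The new shape must account for exactly the same number of elements
try:
    flat.reshape(4, 5)           # 4 * 5 = 20, but there are 21 elements
except ValueError as err:
    print("reshape failed:", err)
```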

### Create an array containing...
#### the first 4 elements of a

In [None]:
a[:4]  # first 4 elements

#### the last 3 elements of a

In [None]:
a[-3:]  # last 3 elements

#### The middle elements of a from 15 to 18 inclusive

In [None]:
a[5:9]  # 15..18 inclusive since a = 10..20

#### The first column of m

In [None]:
m[:, 0]  # first column

#### The middle row of m

In [None]:
m[1, :]  # middle row (row index 1)

#### The left 3 columns of m

In [None]:
m[:, :3]  # left 3 columns

#### The bottom-right 2 by 2 square

In [None]:
m[-2:, -2:]  # bottom-right 2x2 square

#### (bonus) every other element of a  

In [None]:
a[::2]  # every other element

### Array Math

#### Subtract 5 from each element of a

In [None]:
(a - 5)  # subtract 5 from each element

#### Create an array containing squares of all numbers from 1 to 10 (inclusive)

In [None]:
np.arange(1, 11) ** 2  # squares from 1 to 10

#### Create an array containing all powers of 2 from $2^0$ to $2^{10}$ (inclusive)

In [None]:
2 ** np.arange(0, 11)  # powers of two 2^0..2^10

#### Same as above (powers of two), but subtract one from each element, that is $a_k = 2^k - 1$

In [None]:
2 ** np.arange(0, 11) - 1  # 2^k - 1

### Bonus task
Write code that lists the available dtypes together with the number of bits each one uses

In [None]:
# List available NumPy dtypes with bit width (common ones)
dtypes = [np.int8, np.int16, np.int32, np.int64,
          np.uint8, np.uint16, np.uint32, np.uint64,
          np.float16, np.float32, np.float64]
[(dt.__name__, np.dtype(dt).itemsize * 8) for dt in dtypes]

## Plotting Basics
 - Use [this tutorial](https://matplotlib.org/users/pyplot_tutorial.html) as reference when (if) you get stuck
 - Execute the cells with imports, otherwise you won't have numpy and matplotlib imported and Python will complain

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Load weather data from CSV
weather = pd.read_csv('/mnt/data/OshawaWeather2016.csv')
weather.head()

In [None]:
weather.shape

In [None]:
weather.dtypes

#### Create separate 1D arrays for each column (e.g. day, maxt, mint and so on)

In [None]:
# Create 1D arrays from each column for easy NumPy operations
_day = weather['Day'].to_numpy()
_tmax = weather['Max Temp (°C)'].to_numpy()
_tmin = weather['Min Temp (°C)'].to_numpy()
_tmean = weather['Mean Temp (°C)'].to_numpy()
_precip = weather['Total Precip (mm)'].to_numpy()
(_day[:5], _tmax[:5], _tmin[:5], _tmean[:5], _precip[:5])

#### Plot the minimum temperature as a function of day number

In [None]:
# Min temp vs day
plt.figure()
plt.plot(_day, _tmin, label='Min Temp (°C)')
plt.xlabel('Day of Year')
plt.ylabel('Temperature (°C)')
plt.title('Minimum Temperature vs Day')
plt.legend()
plt.grid(True)
plt.show()

#### Plot the max temperature in degrees Fahrenheit

In [None]:
# Max temperature in Fahrenheit vs day
_tmax_f = (_tmax * 9/5) + 32
plt.figure()
plt.plot(_day, _tmax_f, label='Max Temp (°F)')
plt.xlabel('Day of Year')
plt.ylabel('Temperature (°F)')
plt.title('Maximum Temperature (F) vs Day')
plt.legend()
plt.grid(True)
plt.show()
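
The conversion used above is F = C × 9/5 + 32. A quick sanity check against well-known reference points, using made-up input values rather than the weather data, might look like this:

```python
import numpy as np

# Reference temperatures in °C (illustrative values, not from the CSV)
celsius = np.array([-40.0, 0.0, 100.0])

# The whole array is converted at once, with no explicit loop
fahrenheit = celsius * 9 / 5 + 32

# -40 °C = -40 °F, 0 °C = 32 °F, 100 °C = 212 °F
print(fahrenheit)
```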

#### Plot the difference between the maximum and the minimum day temperature as a function of day number

In [None]:
# Difference between max and min vs day
_tdiff = _tmax - _tmin
plt.figure()
plt.plot(_day, _tdiff, label='Daily Temp Range (°C)')
plt.xlabel('Day of Year')
plt.ylabel('Δ Temp (°C)')
plt.title('Daily Temperature Range')
plt.legend()
plt.grid(True)
plt.show()

#### Plot both the minimum and maximum temperature on the same figure

In [None]:
# Plot min and max on same figure
plt.figure()
plt.plot(_day, _tmin, label='Min Temp (°C)')
plt.plot(_day, _tmax, label='Max Temp (°C)')
plt.xlabel('Day of Year')
plt.ylabel('Temperature (°C)')
plt.title('Min and Max Temperatures')
plt.legend()
plt.grid(True)
plt.show()

#### Add axis labels and a legend to the plot above
Use the following functions:
 - plt.xlabel
 - plt.ylabel
 - plt.title
 - plt.legend

#### Plot the precipitation as a function of max temperature
You probably don't want a line connecting the dots.
Are there any patterns you can spot?

In [None]:
# Precipitation vs max temperature (scatter)
plt.figure()
plt.scatter(_tmax, _precip)
plt.xlabel('Max Temp (°C)')
plt.ylabel('Total Precip (mm)')
plt.title('Precipitation vs Max Temperature')
plt.grid(True)
plt.show()

#### Plot the precipitation as a function of (t_max - t_min)

In [None]:
# Precipitation vs (t_max - t_min) (scatter)
plt.figure()
plt.scatter(_tdiff, _precip)
plt.xlabel('Temp Range (°C)')
plt.ylabel('Total Precip (mm)')
plt.title('Precipitation vs Temperature Range')
plt.grid(True)
plt.show()

#### Read about plt.hist() function and plot a histogram of the maximum temperatures

In [None]:
# Histogram help (left as reference)
# plt.hist?
# Now plot histogram of maximum temperatures
plt.figure()
plt.hist(_tmax, bins=20)
plt.xlabel('Max Temp (°C)')
plt.ylabel('Count')
plt.title('Histogram of Max Temperatures')
plt.show()

#### Plot a histogram of the differences between the min and max temperature

In [None]:
# Histogram of differences (t_max - t_min)
plt.figure()
plt.hist(_tdiff, bins=20)
plt.xlabel('Temp Range (°C)')
plt.ylabel('Count')
plt.title('Histogram of Daily Temperature Range')
plt.show()

#### For each day calculate the average of the 3 temperatures in the data (min, max and mean)


In [None]:
# Average of min, max and mean for each day
_avg3 = (_tmin + _tmax + _tmean) / 3.0
_avg3[:10]

#### Calculate the total amount of precipitation for the whole year
Does the number seem reasonable? The annual average precipitation in Toronto is 831 mm according to https://en.wikipedia.org/wiki/Geography_of_Toronto

In [None]:
# Total precipitation for the year
_total_precip = _precip.sum()
_total_precip
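
One caveat, in case the total comes out as `nan`: weather files sometimes have missing measurements, and a single `nan` makes `.sum()` return `nan`. Whether this CSV has gaps is not known here, so the sketch below uses made-up values in a hypothetical array `precip_with_gap`.

```python
import numpy as np

# Made-up precipitation values in mm, with one missing measurement
precip_with_gap = np.array([1.5, np.nan, 2.0, 0.5])

plain_total = precip_with_gap.sum()        # nan: the missing value propagates
safe_total = np.nansum(precip_with_gap)    # 4.0: nan entries are treated as zero
n_missing = np.isnan(precip_with_gap).sum()  # number of missing entries

print(plain_total, safe_total, n_missing)
```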

#### Calculate the total precipitation on all odd numbered days (day 1, 3, 5 and so on)

In [None]:
# Total precipitation on odd-numbered days (1,3,5,...)
odd_mask = (_day % 2 == 1)
_precip[odd_mask].sum()
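
The boolean mask approach can be checked on a small example. The arrays `day` and `precip` below hold made-up values, not the Oshawa data. When the days start at 1 and have no gaps, a step slice selects the same elements as the mask.

```python
import numpy as np

day = np.arange(1, 11)   # made-up day numbers 1..10
precip = np.array([0.0, 2.5, 0.0, 1.0, 4.0, 0.0, 0.0, 3.5, 0.5, 0.0])  # made-up mm

odd = day % 2 == 1                  # True for days 1, 3, 5, 7, 9
odd_total = precip[odd].sum()       # 0.0 + 0.0 + 4.0 + 0.0 + 0.5 = 4.5
even_total = precip[~odd].sum()     # ~ inverts the mask: 2.5 + 1.0 + 0.0 + 3.5 + 0.0 = 7.0

# Equivalent here only because day 1 sits at index 0 and no days are missing
slice_total = precip[::2].sum()

print(odd_total, even_total, slice_total)
```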