# Python Bootcamp Day 2:
## Arrays and Plotting
### Instructors: 

## Goals for the Day

- Review of Lists
- Arrays
    - Introduction to Numpy
    - Arrays
    - Numpy Functions
- Plotting
    - Introduction to Matplotlib
    - 1D arrays
    - 2D arrays
    - Histograms

## Notebook and Learning Structure

Any text in black will be instruction and guidance and will usually start with a section number.<br>
<span style="color:blue"> Any text in blue will be tasks to do and start with "Task".</span><br>
<span style="color:red"> Any text in red will be optional challenges and advanced concepts for anyone looking to try more and say "Challenge".</span>


## 1.0 How to use lists

Lists are used to store multiple values in a single variable. <br>
Empty lists can be initialised in either of the following ways: <br>
>x = list() <br>
>x=[ ]

Here is a good tutorial for lists if you want more: https://www.w3schools.com/python/python_lists.asp

In [None]:
x = [0,1,2,3,4] #create a list
print(type(x))

As we can see, x is a 'list'.<br>
We can get the length of our list with:
> len(x)

In [None]:
print('Our list has', len(x), 'elements') #print length of our list

You can easily access items in a list by 'indexing'<br>
__In python, indexing starts at 0.__ <br>
This means that the first item in our list is at index 0 and the second item is at index 1 and so on...

In [None]:
print('The first item in our list is: ', x[0]) #access the first element of our list (index = 0)

Lists are also nice because you can easily add items to them or remove items

In [None]:
x.append(5) #adds 5 to the end of the list
print(x)

What is the difference between these two methods to remove values from the list?

In [None]:
x.remove(1)
print(x)
del x[3]
print(x)
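
One way to see the difference is to try both on a list that contains a repeated value. The sketch below uses a hypothetical list called `letters` (it is not part of the notebook): `remove()` searches for a value, while `del` works on an index.

```python
letters = ['a', 'b', 'c', 'b', 'd'] #hypothetical example list

letters.remove('b') #removes the first item whose VALUE is 'b'
print(letters) #['a', 'c', 'b', 'd']

del letters[3] #deletes whatever item sits at INDEX 3
print(letters) #['a', 'c', 'b']
```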

Lists can contain data of different data types:

In [None]:
x2 = ["Hi", 0, True, "no"]
print(x2)

To combine lists you can simply add them together:

In [None]:
x3 = x + x2
print(x3)

## 2.0 Arrays and Numpy

One drawback of lists is that they can be slow to manipulate. <br>
NumPy arrays are stored in one contiguous block of memory, unlike lists, so they can be accessed and manipulated very efficiently.

We import the NumPy package with:<br>
>import numpy as np

A good NumPy tutorial can be found here: https://www.w3schools.com/python/numpy/numpy_intro.asp

### 2.1 Numpy installation and arrays

In [None]:
import numpy as np

In [None]:
arr = np.array([0,1,2,3,4]) #initialize numpy array
print(type(arr))
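
As a small illustration of why arrays are convenient for numerical work, the sketch below compares arithmetic on a list and on an array. The values and variable names are made up for this example, and it is not a speed benchmark.

```python
import numpy as np

values_list = [1, 2, 3] #hypothetical example values
values_arr = np.array(values_list)

repeated = values_list * 2 #for a list, * repeats the list
print(repeated) #[1, 2, 3, 1, 2, 3]

doubled_list = [v * 2 for v in values_list] #a loop is needed to double each item
print(doubled_list) #[2, 4, 6]

doubled_arr = values_arr * 2 #for an array, * acts on every element
print(doubled_arr) #values 2, 4, 6
```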

To add values to the end of an array you can use NumPy's 'append' function, but it works a bit differently from the list method: <br>
>np.append(arr, 6)

In [None]:
np.append(arr, 6)
print(arr)

What happened to our appended value?<br>
np.append does not change the original array; it returns a new one. We need to assign the result back to our variable like so:<br>
>arr = np.append(arr, 6)

In [None]:
arr = np.append(arr, 6)
print(arr)

Here are a few other ways to initialize 1D arrays:

What is the difference between these two methods? Can you figure out what the (-10,150,20) means for each?

In [None]:
a1 = np.linspace(-10,150, 20)
print(a1)
a2 = np.arange(-10,150, 20)
print(a2)
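
If the output above is hard to compare, a smaller case may help. In this sketch (the numbers are chosen only for illustration) the third argument of `np.linspace` is the number of points, while the third argument of `np.arange` is the step size.

```python
import numpy as np

points = np.linspace(0, 10, 5) #5 evenly spaced points, end point included
print(points) #values 0, 2.5, 5, 7.5, 10

steps = np.arange(0, 10, 5) #steps of 5, end point excluded
print(steps) #values 0, 5
```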

We can also initialize an array full of zeros or ones:<br>
> arr = np.zeros(6) <br>
> arr = np.ones(6)

*The value in parentheses (6) is the shape of the array that is being created

In [None]:
arr = np.zeros((2,3)) #What does the (2,3) mean?
print(arr)

In [None]:
print(arr.shape)

We can change the values in the array by indexing:

In [None]:
arr[0,1] = 3.14 #change the value in the first row, second column to 3.14
print(arr)

In [None]:
#arr(0,1) = 3.14 would raise a SyntaxError: indexing needs square brackets, not parentheses
arr[0,1] = 3.14 #change the value in the first row, second column to 3.14
print(arr)

<span style="color:blue"> Now change all the rest of the values to numbers of your choosing!

We can flip our array up/down or left/right using:<br>
>np.flipud(arr) #flips array vertically <br>
>np.fliplr(arr) #flips array horizontally

In [None]:
arr_flipud = np.flipud(arr)
print(arr_flipud)
arr_fliplr = np.fliplr(arr)
print(arr_fliplr)

We can also reverse the dimensions of our array ('transpose'):<br>
> np.transpose(arr) <br>
> arr.T

In [None]:
print(np.transpose(arr))
print(arr.T)

Let's do some math with our array! Here are a few mathematical functions we can use with NumPy arrays: <br>
>np.mean(arr) #take mean of array <br>
>np.sin(arr) #take the sine of all array elements <br>
>np.log(arr) #take the natural log of all array elements <br>
>arr-2 #subtract 2 from all array elements <br>
>arr**2 #square all array elements

<span style="color:blue"> How do you think you might calculate the sum of all array elements?</span>

<span style="color:blue"> What does the piece of code below do?</span>

In [None]:
print(np.mean(arr, axis = 0))
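
To check your answer, here is a sketch of how the `axis` argument behaves on a small made-up 2D array (named `demo` here) whose means are easy to work out by hand.

```python
import numpy as np

demo = np.array([[1, 2, 3],
                 [4, 5, 6]]) #hypothetical 2x3 array

print(np.mean(demo)) #mean of all six values: 3.5
col_means = np.mean(demo, axis = 0) #axis 0: average down each column
print(col_means) #values 2.5, 3.5, 4.5
row_means = np.mean(demo, axis = 1) #axis 1: average along each row
print(row_means) #values 2, 5
```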

### 2.2 Manipulating Arrays and Using Functions

In [None]:
#Some other calculations:
print(np.cos(arr)) #cosine of all array elements
print(np.std(arr)) #standard deviation of array
print(np.min(arr)) #minimum value of array
print(arr + 2) #all array values plus 2

We can also flatten our array into 1D:

In [None]:
arr_flat = arr.flatten()
print(arr_flat)

Converting between arrays and lists is easy!<br>
> arr.tolist() #convert numpy array --> list <br>
> np.array(list_name) #convert list --> numpy array 

In [None]:
arr_list = arr_flat.tolist()
print(arr_list)
print(type(arr_list))

In [None]:
arr_np = np.array(arr_list)
print(type(arr_np))

"Slicing" an array is a great way to only get part of the array!<br>
The __:__ operator slices an array between the indexes provided. The start index is included and the stop index is not.

In [None]:
print(arr_np)
print(arr_np[1:4]) #here we are slicing from index 1 up to (but not including) index 4

We can also use 'negative' indexing to get the last element!

In [None]:
print(arr_np[-1]) #get last value of array

Let's try slicing on a 2d array:

In [None]:
arr = np.array([[1,2,3,4,5],[6,7,8,9,2], [3,6,8,4,9]]) #initialize a 2d array
print(arr)
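
The cell above only creates the array. A minimal sketch of slicing it might look like the following, where the index before the comma selects rows and the index after it selects columns. The name `grid` is used here only to keep the example self-contained.

```python
import numpy as np

grid = np.array([[1,2,3,4,5],[6,7,8,9,2],[3,6,8,4,9]]) #same values as the 2d array above

first_row = grid[0, :] #first row, all columns
print(first_row) #values 1, 2, 3, 4, 5

second_col = grid[:, 1] #all rows, second column
print(second_col) #values 2, 7, 6

block = grid[0:2, 1:4] #first two rows, columns at index 1, 2 and 3
print(block) #rows 2, 3, 4 and 7, 8, 9
```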

You can also take every other value of an array like this:

In [None]:
arr = np.array([0,1,2,3,4,5,6,7,8,9,10])
print(arr[::2])

There are many ways we can create arrays filled with random values. One of them is with the np.random.rand() function

In [None]:
random_arr = np.random.rand(10) #initialize a random array (10 random numbers between 0 and 1)
print(random_arr)

Finally, let's talk about merging arrays! There are a number of ways that we could do this: <br>
>np.vstack() #combine arrays vertically <br>
>np.hstack() #combine arrays horizontally <br>
>np.append() #append one array to another <br>
>np.concatenate() #join arrays along an existing axis

In [None]:
a1 = np.array([0,1,2,3,4])
a2 = np.array([5,6,7,8,9])

a_vstack = np.vstack([a1,a2]) #notice that the arrays are within square brackets
print(a_vstack)

a_hstack = np.hstack([a1,a2]) #notice that the arrays are within square brackets
print(a_hstack)

a_append = np.append(a1, a2) #no square brackets here
print(a_append)

a_concat = np.concatenate([a1,a2])
print(a_concat)

One final important NumPy thing: __null values__

Often when working with atmospheric science data there are periods where data is not available (due to instrument error or other reasons). In these instances you might see a 'NaN' value. This stands for 'not a number'. It is essentially a placeholder for when data is not available.

In [None]:
x = np.array([0,1,6,3,np.nan, 8,5])
print(x)
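
NaN values propagate through most calculations, so it helps to know how to detect or skip them. The sketch below reuses the same values under the hypothetical name `obs` and relies on the standard NumPy functions `np.isnan` and `np.nanmean`.

```python
import numpy as np

obs = np.array([0,1,6,3,np.nan,8,5]) #same values as x above

print(np.mean(obs)) #nan, because one missing value makes the whole mean undefined
print(np.isnan(obs)) #True only where the value is NaN

clean_mean = np.nanmean(obs) #mean that ignores NaN values
print(clean_mean)

valid = obs[~np.isnan(obs)] #keep only the values that are not NaN
print(valid)
```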

## <span style="color:blue"> Challenge #1:

- <span style="color:blue"> x1: Create an evenly spaced array of integers from 1 to 25. ie: [1,2,3...25]

- <span style="color:blue"> x2: Create another array of 25 random values

- <span style="color:blue"> x3: Subtract x2 from x1

- <span style="color:blue"> x4: Reverse x3

- <span style="color:blue"> x5: vertically combine x3 and x4

- <span style="color:blue"> x6: slice the array from the 4th to the 18th column, including all rows

- <span style="color:blue"> x7: Flatten the array

- <span style="color:blue"> x8: Take every 3rd value from x7

- <span style="color:blue"> x9 Calculate the mean of x6
    
- <span style="color:red"> x10 Look at the [Numpy Documentation](https://numpy.org/doc/1.24/user/quickstart.html) and try out 3 new functions

## 3.0 Plotting using Matplotlib

### 3.1 Importing the package and making a basic plot
We import the plotting package (matplotlib) with the following command:<br>
> import matplotlib.pyplot as plt

Matplotlib has two basic interfaces for creating a plot: <br>
   > 1) Object-oriented <br>
   > 2) Pyplot

In [None]:
import matplotlib.pyplot as plt #package to plot data

In [None]:
fig, ax = plt.subplots() #Creating a plot via the object oriented method

In [None]:
plt.plot() #creating a plot via the pyplot module

__We will use the first method here!__

To plot data on the axis we created we do: <br>
> ax.plot(data)

### 3.2 Add data

In [None]:
random_arr = np.random.rand(10) #create random array

fig, ax = plt.subplots()
ax.plot(random_arr) #plot data on the axis we created!

You can see we have the annoying [<matplotlib.lines.Line2D at 0x7fc6849f8c50>] line above the plot. To get rid of this we can add a semi-colon to the last line of our code:

In [None]:
fig, ax = plt.subplots()
ax.plot(random_arr);

Let's explore the Matplotlib documentation a bit: <br>
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html

<span style="color:blue"> Try changing the color and width of your line!

### 3.3: Axes

We can change the axis properties by calling the following functions: <br>
> ax.set_xlabel('label') #set x-axis label <br>
> ax.set_xlim([0,5]) #set the x-axis to only go between 0 and 5 <br>
> ax.set_xticks([0,2,5]) #only have x-ticks at the designated numbers

In [None]:
fig, ax = plt.subplots()
ax.plot(random_arr)
ax.set_ylabel('Random number', fontsize = 15) #change y-axis label

ax.set_xlim([0,8]) #set x-axis to go between 0 and 8
ax.set_ylim([0,1]) #set y-axis to go between 0 and 1

ax.set_yticks([0,0.1,0.8]); #change y-axis ticks

### 3.4: Add multiple lines

You can also plot 2 lines on the same plot:

In [None]:
random_arr2 = np.random.rand(10)

fig, ax = plt.subplots()

ax.plot(random_arr, color = 'red', linestyle = 'dashed', label = 'line1')
ax.plot(random_arr2, color = 'blue', linewidth = 2, label = 'line2')
ax.legend();

Or we can give each line its own plot: <br>

> fig, ax = plt.subplots(2,3) #creates 2 rows of 3 subplots each

In [None]:
fig, ax = plt.subplots(1,2, figsize = (20,5))
ax[0].plot(random_arr)
ax[0].set_xlabel('x', fontsize = 16)
ax[0].set_ylabel('Random number', fontsize = 16)


ax[1].plot(random_arr2)
ax[1].set_xlabel('x', fontsize = 16)
ax[1].set_ylabel('Random number', fontsize = 16);
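
With more than one row and more than one column, `plt.subplots` returns a 2D array of axes that is indexed as `ax[row, column]`. A sketch with made-up random data (the array name `data` is hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(6, 10) #six hypothetical data sets of 10 points each

fig, ax = plt.subplots(2, 3, figsize = (15, 6)) #2 rows of 3 subplots
print(ax.shape) #(2, 3)

ax[0, 0].plot(data[0]) #top-left subplot
ax[1, 2].plot(data[5], color = 'red') #bottom-right subplot
ax[1, 2].set_title('bottom right');
```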

### 3.4 What if we don't want a line?

We can also plot the data sets against each other. <br>
'.' specifies that we want to plot the data with dots instead of lines:

In [None]:
fig, ax = plt.subplots()
ax.plot(random_arr, random_arr2, '.', markersize = 10) #use the axis we created, as in the rest of this section
ax.set_xlabel('line1')
ax.set_ylabel('line2');

You can also create scatter plots with the scatter() function:

In [None]:
fig, ax = plt.subplots()
ax.scatter(random_arr, random_arr2);

### 3.5: Plotting 2D arrays

In [None]:
random_arr2d = np.random.rand(10,10) #10x10 array
print(random_arr2d)

In [None]:
fig, ax = plt.subplots()
ax.imshow(random_arr2d);

You can find other possible colormaps here: https://matplotlib.org/3.5.0/tutorials/colors/colormaps.html

And we add a colorbar with: <br>
> plt.colorbar(im), __where im is the image object returned by ax.imshow()__

In [None]:
fig, ax = plt.subplots()
im = ax.imshow(random_arr2d, cmap = 'inferno')
plt.colorbar(im);
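
Since the rest of this section uses the object-oriented interface, the colorbar can also be attached through the figure. A sketch under that assumption, using a fresh random array (`demo2d` is a hypothetical name); `fig.colorbar` is the figure-level counterpart of `plt.colorbar`.

```python
import numpy as np
import matplotlib.pyplot as plt

demo2d = np.random.rand(10,10) #hypothetical 10x10 array

fig, ax = plt.subplots()
im = ax.imshow(demo2d, cmap = 'viridis', vmin = 0, vmax = 1) #fix the color range to 0-1
cbar = fig.colorbar(im, ax = ax) #attach the colorbar to this axis
cbar.set_label('Random number');
```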

### 3.6: Histograms

Histograms are another common plot type! Here is some sample code to create a histogram plot. You can explore histograms more later if you like!

In [None]:
rng = np.random.default_rng(12) #create random number generator with a fixed seed (12)
dist1 = rng.standard_normal(100000) #creates a normally distributed array of 100000 points

fig, axs = plt.subplots()
axs.hist(dist1, bins=20, color = 'red', alpha =0.5, edgecolor = 'maroon') ;
    #bins = 20 means we will have 20 bins for our data
    #color sets the fill color of the bars (red here)
    #alpha changes the transparency
    #edgecolor outlines our bins


We can save any figure we create with the following command: <br>
>plt.savefig('fig_name.png')
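
A slightly fuller sketch of saving a figure is shown below. The file name and the `dpi` value are placeholders; `fig.savefig` is the object-oriented counterpart of `plt.savefig`.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(np.random.rand(10))

#save to the current working directory; dpi controls the resolution of the image
fig.savefig('fig_name.png', dpi = 150, bbox_inches = 'tight')
```

Call `savefig` in the same cell that creates the figure, so the figure you save is the one you just drew.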

##  <span style="color:blue"> Challenge #2:

-  <span style="color:blue">1: Create an array x from 0 to 10 with a size of 100</span>

-  <span style="color:blue">2: Create a new array y that is the sine of x</span>

-  <span style="color:blue">3: Create a new array y2 that is equal to x^2</span>

-  <span style="color:blue">4: On two different subplots, plot x and y, and x and y2.</span>

-  <span style="color:red"> 5: Add axis labels, titles, and change the line properties for each subplot!