### This notebook goes over the fundamentals of Python arrays, specifically Numerical Python N-D Arrays (NumPy arrays).
#### Portions of this notebook were adapted from lessons and images in the [NumPy: the absolute basics for beginners tutorial](https://numpy.org/devdocs/user/absolute_beginners.html) on numpy.org. Users should refer to the tutorial linked above for more in-depth lessons. The bare minimum that is needed to understand course material is presented here.

__Author__: Dr. Beadling. For any questions regarding the contents of this notebook please contact rebecca.beadling@temple.edu.

### You should be __entering__ this lesson with ... 
* Knowledge of the fundamentals of the structure of a Jupyter notebook. 
* Knowledge of Python fundamentals including syntax, variable assignment, data types, indexing, arithmetic, control flow, and functions.
* Knowledge of how to use git status, git commit, git push.

### You should be __leaving__ this lesson with ...
* An understanding of what an array is in Python.
* An understanding of how to create arrays, index arrays, and apply operations across arrays (arithmetic with multiple arrays + broadcasting arrays using scalar values).
* An understanding of basic arithmetic and array operations.
* An understanding of how to write basic mathematical equations using NumPy.
* A newly created NumPy array of hourly Philadelphia temperatures, saved as a .csv file.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

In [None]:
%matplotlib inline                           
%config InlineBackend.figure_format='retina' 
plt.rcParams['figure.figsize'] = 12,6   

#### In previous lessons in this unit we encountered many different data types within the Python programming language. Some of these data types included __containers__, i.e., data structures that could hold a number of elements at the same time.

#### <span style="color:red"> What data types have we worked with that you could classify as being containers? Which of them are __ordered__ ? Put your answer in the cell below.

#### In all of the examples of data types that we would consider __containers__ so far, the elements inside of the containers are allowed to be of _any data type_ and can have __multiple data types__. For example:

In [None]:
colleges={"2016-2018":"Alabama",
          "2019":"Oklahoma"}

In [None]:
qb_stats = ['Jalen Hurts', 1, colleges, 1.85]

In [None]:
qb_stats

In [None]:
type(qb_stats)

In [None]:
type(qb_stats[0]), type(qb_stats[1]), type(qb_stats[2]), type(qb_stats[3])

In [None]:
for x in qb_stats:
    dtype = type(x)   # look up the data type of this element
    print(dtype)
    print(x)

#### An array is also an ordered container, but it is fundamentally different from `lists`, `dicts`, and `tuples` in that every element inside the array has to be of the __same data type__. You can have an array that contains only floats, only ints, or only some other data type, __but you cannot mix and match!__ 

#### This requirement is very powerful, because it means that we can perform operations across all the elements of the array without running into type errors (provided, of course, that the operation is appropriate for that data type). Arrays of this kind are not built-in data structures, so we must import a specific Python package to create and work with them.
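
To see the same-data-type rule in action, here is a minimal sketch (the variable names are made up for illustration). When NumPy is handed mixed values, it does not keep them as they are; it converts every element to one common type.

```python
import numpy as np

# A list happily keeps ints and a float side by side.
mixed_list = [1, 2.5, 3]

# np.array converts everything to one common type (here: float).
mixed_array = np.array(mixed_list)
print(mixed_array)
print(mixed_array.dtype)  # float64

# Mixing numbers and text turns every element into a string.
text_array = np.array([1, "two", 3])
print(text_array)
print(text_array.dtype)
```

Notice that the integers 1 and 3 are stored as floats in the first array, and as text in the second.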

#### One of the most common array structures you will work with in Python when performing numerical calculations or working with numerical data is a __NumPy array__, or an "N-dimensional array". We work with NumPy arrays by importing and calling the __Numerical Python (NumPy)__ package. Recall that we imported this package as `np` at the start of this notebook.

### From [the NumPy user guide](https://numpy.org/doc/stable/user/absolute_beginners.html):
"NumPy (Numerical Python) is an open source Python library that’s __used in almost every field of science and engineering__. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages."

![](https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/NumPy_logo_2020.svg/1200px-NumPy_logo_2020.svg.png)

### Is your head spinning with "n-dimensional array" ??? IT'S OKAY, STAY WITH ME!
### An N-dimensional array is simply an array with __any number__ of dimensions.

#### You can think about an ndarray (n-dimensional array) as a __grid__ of information. The information in this __grid__ can be accessed via indexing and slicing, just like the `lists` that we encountered earlier.

### The image below shows arrays visualized as __grids__: 
* #### A __1D array__: information in only 1 dimension. You will hear 1D arrays referred to as __vectors__. _(in more advanced courses)_
> #### Example: change in temperature over time (a "__timeseries__") where the only dimension (axis) of the array is time.
* #### A __2D array__: information in 2 dimensions. You will hear 2D arrays referred to as __matrices__. _(in more advanced courses)_
> #### Example: a map of sea surface temperature, where the dimensions (axes) of the array are latitude x longitude.
* #### A __3D array__: information in 3 dimensions. You will hear 3D arrays referred to as __tensors__. _(in more advanced courses)_
> #### Example: global ocean temperature dataset, where the dimensions (axes) of the array are latitude x longitude x ocean depth. 
Now if our global ocean temperature dataset was also changing with __time__, our dimensions (axes) would be latitude x longitude x ocean depth x time. This would be a __4D array__.

#### The analysis of observational datasets and climate model results requires the ability to work with numerical data stored in 1D to 4D arrays.


![](https://predictivehacks.com/wp-content/uploads/2020/08/numpy_arrays.png)

![](https://cac.cornell.edu/myers/teaching/ComputationalMethods/python/anatomyarray.png)

#### Let's start with a simple 1D array (vector). We can create any 1D np.ndarray using the following syntax:
* #### array_variable = np.array([element1,element2, .... element n])

In [None]:
array1d = np.array([1,2,3]) # create a variable called array1d holding a NumPy array built from the list [1, 2, 3]

In [None]:
array1d

#### Printing out the type of our new variable with `type()` shows us that it is a `numpy.ndarray`. The `.dtype` attribute tells us something different: the data type of the __elements__ stored inside the array (integers here):

In [None]:
### the \ symbol allows us to break the line and continue it below.
type(array1d),\
array1d.dtype

#### We can use the `.shape` and `.ndim` attributes to print out the shape of the array and the number of dimensions in the array:
* #### `.shape`: the number of elements along __each axis__. 
The results are returned as (# of elements along axis 0, # of elements along axis 1, ... # of elements along axis n). See the example in the "anatomy of an array" diagram above. The array has 8 elements along axis 0 and 3 along axis 1, returning a shape of (8, 3). 
* #### `.ndim`: the number of dimensions in the array. 
A 1D array will return a result of 1, a 2D array will return a result of 2, and an nD array will return a result of n.

#### Applying these attributes to our array1d below shows us that this is a 1D array with 3 elements along its first axis (axis 0).

In [None]:
array1d

In [None]:
array1d.shape

In [None]:
array1d.ndim
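
For comparison, the sketch below builds a 2D array with the same (8, 3) shape as the "anatomy of an array" diagram. The array name and its values are placeholders; `np.arange` and `.reshape` are only used here as a quick way to get 24 numbers arranged in 8 rows and 3 columns.

```python
import numpy as np

# 24 consecutive integers arranged as 8 rows x 3 columns
example_2d = np.arange(24).reshape(8, 3)

print(example_2d.shape)  # (8, 3)
print(example_2d.ndim)   # 2
print(example_2d.size)   # 24
```

The `.size` attribute (the total number of elements) is simply the product of the numbers in `.shape`.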

#### NumPy has the functions `np.ones()`, `np.zeros()`, and `np.random.rand()`, which create an array of a user-specified shape filled with ones, zeros, or random numbers (between 0 and 1). This functionality can be useful in certain applications; we will use it here to create ndarrays of various shapes in the examples below:

In [None]:
arrayones_1d = np.ones(5)  # a 1D array, with 5 elements over the first dimension
arrayones_1d

In [None]:
arrayzeros_1d = np.zeros(5)  # a 1D array, with 5 elements over the first dimension
arrayzeros_1d

In [None]:
arrayrandom_1d = np.random.rand(5)  # a 1D array of random numbers, with 5 elements over the first dimension
arrayrandom_1d

#### What if we wanted to create a 2D array of ones, zeros, or random numbers?

In [None]:
arrayones_2d = np.ones([5,1])  # a 2D array with 5 rows (axis 0) and 1 column (axis 1)
arrayones_2d

In [None]:
arrayzeros_2d = np.zeros([5,5]) ## this is a 5 X 5 array
arrayzeros_2d

In [None]:
arrayrandom_2d = np.random.rand(5,5) ## this is a 5 X 5 array
arrayrandom_2d
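
One detail in the cells above is easy to miss: `np.ones` and `np.zeros` take the shape as ONE argument (a list or tuple), while `np.random.rand` takes each dimension size as a SEPARATE argument. A short sketch with a made-up shape of 3 rows by 4 columns:

```python
import numpy as np

ones_3x4 = np.ones((3, 4))          # shape passed as a single tuple
random_3x4 = np.random.rand(3, 4)   # each dimension size passed separately

print(ones_3x4.shape, random_3x4.shape)

# Passing a tuple to np.random.rand raises a TypeError.
try:
    np.random.rand((3, 4))
except TypeError as err:
    print("TypeError:", err)
```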

#### <span style="color:red"> Play around with changing the dimension sizes in the examples above to get a handle on how the arrays change. In the cell below, using np.ones and np.zeros, create a 2D array with 2 rows and 3 columns.

#### <span style="color:red"> What if we wanted to create a 3D or 4D array of ones or zeros? Use the cells below to do so:

#### We can index and slice arrays just like we learned for `lists`. Let's return to the 1D array we created at the start: `array1d`

In [None]:
array1d

#### <span style="color:red"> How would you extract the first element in array1d? What about the last? What if we wanted to extract the number 2? For practice, show how you would do each one in the cells below using positive AND negative indexing.

In [None]:
array1d[0]

#### <span style="color:red"> How would you extract the first TWO elements in array1d? 

In [None]:
array1d[0:2]

### We can visualize this as shown below:

![](https://numpy.org/devdocs/_images/np_indexing.png)
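
As a further sketch of the slicing rule, using a separate made-up array so the practice questions above stay yours to solve: a slice `start:stop` includes the element at `start` and stops just BEFORE `stop`, exactly as with `lists`.

```python
import numpy as np

depths = np.array([0, 10, 20, 30, 40])  # hypothetical ocean depths in meters

print(depths[1:4])  # elements at index 1, 2, and 3: 10, 20, 30
print(depths[:2])   # from the start up to (not including) index 2: 0, 10
print(depths[3:])   # from index 3 to the end: 30, 40
```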

### Basic arithmetic and operations on arrays: 1D

#### Arithmetic operators on arrays apply __elementwise__ (i.e., they are applied __element by element__). A new array is created and filled with the result.

In [None]:
arr1 = np.array([10,20,30,40,50])
arr2 = np.array([30,30,30,30,30])

In [None]:
arr1 + arr2   # elementwise addition of arr1 and arr2: arr1[0] + arr2[0] ... and so on.

In [None]:
arr1 - arr2   # elementwise subtraction of arr1 and arr2: arr1[0] - arr2[0] ... and so on.

In [None]:
arr1 * arr2     # elementwise product of arr1 and arr2: arr1[0] * arr2[0] ... and so on.

In [None]:
arr1 / arr2     # elementwise (true) division of arr1 by arr2: arr1[0] / arr2[0] ... and so on. The result is an array of floats.

In [None]:
arr2 > arr1   # elementwise comparison of arr2 and arr1: arr2[0] > arr1[0] ... and so on. Returns an array of booleans (True/False).

In [None]:
arr1 ** arr2  # elementwise exponentiation: each element of arr1 raised to the power of the matching element of arr2: arr1[0] ** arr2[0] ... and so on.
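
A caution about the cell above, offered as a sketch rather than a rule for every setup: `arr1` and `arr2` hold integers, and NumPy integers have a fixed number of bits. A value such as 10 to the power 30 does not fit, so the integer result silently wraps around to a wrong number. Converting to floats first (here with `.astype(float)`) avoids this.

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50])
arr2 = np.array([30, 30, 30, 30, 30])

int_result = arr1 ** arr2                  # integer math: too large to fit, wrong values
float_result = arr1.astype(float) ** arr2  # float math: approximately correct values

print(int_result)
print(float_result)
```

If the numbers in the first printed array look strange (including zeros or negative values), that is the overflow at work.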

### Elementwise arithmetic is visualized in the graphics below:

![](https://numpy.org/doc/stable/_images/np_array_dataones.png)
![](https://numpy.org/doc/stable/_images/np_data_plus_ones.png)
![](https://numpy.org/doc/stable/_images/np_sub_mult_divide.png)

### You can perform basic operations to elements within a single array or over a single axis of a multi-dimensional array:

#### The following methods return a single (scalar) value:
* `.sum()` : find the sum of the elements in an array (i.e., arr[0] + arr[1] + .... arr[n])
* `.mean()` : find the arithmetic mean (average) of the elements in the array (i.e., (arr[0] + arr[1] + .... arr[n]) / number of elements)
* `.std()` : find the standard deviation of the elements in the array.
* `.min()` : find the element of the minimum (lowest) value in the array.
* `.max()` : find the element of the maximum (greatest) value in the array.

#### The following functions return an array with the resulting values:
* `np.square(array)`: square all elements within the array, return array with squared values.
* `np.sqrt(array)`: square root all elements within the array, return array with the square root of values.
* `np.cos(array)`: apply cosine to all elements within the array, return array with the computed values.
* `np.sin(array)`: apply sine to all elements within the array, return array with the computed values.

In [None]:
arr1.sum()

In [None]:
arr1.mean()

In [None]:
np.square(arr1)

In [None]:
np.sqrt(arr2)

![](https://numpy.org/doc/stable/_images/np_aggregation.png)
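
The heading of this section also mentions working over a single axis of a multi-dimensional array. A minimal sketch, using a made-up 2 x 3 array: passing `axis=0` collapses the rows (one result per column), and `axis=1` collapses the columns (one result per row).

```python
import numpy as np

grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(grid.sum())         # all elements added together: 21.0
print(grid.sum(axis=0))   # down each column: 5, 7, 9
print(grid.sum(axis=1))   # across each row: 6, 15
print(grid.mean(axis=0))  # mean of each column: 2.5, 3.5, 4.5
```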

#### <span style="color:red"> In the cells below, create a 1D array and practice using the `.sum()`, `.mean()`, `.std()`, `.min()`, `.max()`. Make sure you have a cell for each operation and display the output. Discuss any challenges in your understanding with those around you.

#### <span style="color:red"> In the cells below, create a 1D array and practice using the `np.square(array)`, `np.sqrt(array)`, `np.cos(array)`, `np.sin(array)`. Make sure you have a cell for each operation and display the output. Discuss any challenges in your understanding with those around you.

### __Broadcasting__: carrying out an operation between __an array and a single number__ (scalar). NumPy understands that the operation must be done __for each cell__.

#### Say you have an array that contains data regarding distance in miles, but you wish to convert this data into km for your calculation: 
* 1 mile ≈ 1.6 km.

In [None]:
data_in_miles = np.array([1.0, 2.0])
data_in_km = data_in_miles*1.6
data_in_km

![](https://numpy.org/doc/stable/_images/np_multiply_broadcasting.png)
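
To make "for each cell" concrete, this sketch compares broadcasting with an explicit loop over a made-up set of distances. Both give the same numbers; broadcasting just does it in one line.

```python
import numpy as np

data_in_miles = np.array([1.0, 2.0, 26.2])  # hypothetical distances

# Broadcasting: the scalar 1.6 is applied to every element.
km_broadcast = data_in_miles * 1.6

# The same calculation written as an explicit loop over the elements.
km_loop = np.array([miles * 1.6 for miles in data_in_miles])

print(km_broadcast)
print(np.allclose(km_broadcast, km_loop))  # True
```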

### <span style="color:red"> In the cells below: 
#### <span style="color:red">  1. Create a 1D array of today's hourly temperatures in Philadelphia which you can find using __this [link](https://weather.com/weather/hourbyhour/l/Philadelphia+PA?canonicalCityId=aa0f46aff5c7ee96eb5fdea10c53c77c9578eb071854d7f04ae0a7aa517772ab)__: 
#### <span style="color:red">  2. Print out the shape of your array. How many elements does your array contain?
#### <span style="color:red">  3.  Compute the max, min, average, and standard deviation of the temperatures.
#### <span style="color:red">  4.  Create a new array that contains all the values in the array converted to degrees Celsius.
#### <span style="color:red">  5.  Create a new array that contains all the values in the array converted to Kelvin.

### Working with 2D arrays (matrices) [We won't go into the creation of 3D arrays (tensors) for our purposes, but the same methods apply]

#### You were briefly introduced to the creation of arrays with more than one dimension earlier in this lesson, when you were challenged with creating them using `np.ones()`, `np.zeros()`, and `np.random.rand()`. How would we create one with user-defined values? We can pass Python "lists of lists":
* #### The first axis (axis 0) of our array (ROWS) runs over the lists inside the outermost brackets: each inner list becomes one row.
* #### The second axis (axis 1) of our array (COLUMNS) runs over the elements inside each of the innermost brackets.

In [None]:
array2d = np.array([[1, 2], [3, 4], [5, 6]])
array2d

![](https://numpy.org/devdocs/_images/np_create_matrix.png)

#### <span style="color:red"> How many elements does the first axis (axis 0) contain?

#### <span style="color:red"> How many elements does the second axis (axis 1) contain?

#### <span style="color:red"> Based on your assessment above, what will be the result of the `.shape` and `.ndim` attributes? Explain your results to the person next to you.

#### How do we __index__ if the array is greater than 1D ? The following array is the 2D array shown visually as blue shaded blocks in the image a few cells above.

In [None]:
array2d

In [None]:
array2d[0,1]

#### If we want to select a specific element of a 2D array, we use the following syntax, which identifies the location (index) of the element as [row location,column location]. Multidimensional arrays can have one index per axis.
* #### element = array[row_index,column_index]
* #### The numerical value of 1 in our array2d is located at row 0, column 0:

In [None]:
array2d[0,0]

#### The number 5 in our array2d is located at row 2, column 0:

In [None]:
array2d[2,0]

#### <span style="color:red"> Write code in the cell below that extracts the number 4 from our array2d. Explain your thought process to the person next to you.

#### <span style="color:red"> Write code in the cell below that extracts the number 2 from our array2d. Explain your thought process to the person next to you.

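#### In the code below, the `:` in the row position translates to __extract the elements from ALL of the rows__, so `array2d[:, 1]` returns column 1 of every row. Because array2d has 3 rows, this is equivalent to writing the row range out in full as `array2d[0:3, 1]`: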

In [None]:
array2d[:, 1]    # the `:` selects ALL rows; this is equivalent to array2d[0:3, 1]
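
A sketch of the same idea on a separate made-up array (so the practice questions above are left for you): a `:` in the row position selects all rows, and a `:` in the column position selects all columns.

```python
import numpy as np

grid = np.array([[10, 20, 30],
                 [40, 50, 60]])

print(grid[:, 1])    # column 1 from every row: 20, 50
print(grid[0:2, 1])  # same result with the row range written out
print(grid[1, :])    # every column of row 1: 40, 50, 60
```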

### Writing mathematical equations with NumPy. The ease of implementing mathematical formulas that work on arrays is one of the things that make NumPy so widely used in the scientific Python community. Let's work through some examples.

![](https://miro.medium.com/v2/resize:fit:640/format:webp/1*NoYRMhNKhmgC9fRossUXsA.png)

#### Translating the equation above into NumPy syntax would yield:
#### In words: "The __mean__ is equal to the sum of all of the elements in the array divided by the number of elements in the array"

## `mean = np.sum(arr) / arr.size`

#### Apply this to our array, arr1:

In [None]:
mean = arr1.sum() / arr1.size  ### .size returns the size, i.e., the number of elements in an array
mean

#### Confirm that our value returned matches that from just using the `.mean()` method itself:

In [None]:
arr1.mean()

#### <span style="color:red"> In the cells below, translate the equation (sigma) for standard deviation into NumPy syntax and use it to compute the standard deviation of arr1. Confirm that your equation returns the same value as using the `.std()` method itself. Be careful of your parentheses!!

![](https://www.gstatic.com/education/formulas2/553212783/en/population_standard_deviation.svg)

* ### σ (sigma) = population standard deviation
* ### N = the size of the population
* ### x_i = each value from the population (your elements)
* ### μ (mu) = the population mean

#### <span style="color:red"> Confirm that your value matches the one returned by the `.std()` method itself:

#### <span style="color:red"> Comprehension Check! Use the space below to write a function that takes in a 1D array as an argument and returns the mean and standard deviation. Use the two equations for mean and standard deviation that you just worked out above. Revisit your python102b notebook to refresh yourself on functions.

#### <span style="color:red"> Send your array of hourly temperatures in Philadelphia through your function to compute the mean and standard deviation.

#### There will likely come a time where you want to save your NumPy arrays and load them back in, so you do not have to re-run your code all the time.
#### You can save a NumPy array as a plain text file like a `.csv` or `.txt` file with np.savetxt:
* `csv_arr = your_numpy_array` : define a new variable called csv_arr set equal to the NumPy array you want to save.
* `np.savetxt('name_of_file_to_be_saved.csv', csv_arr)` : use np.savetxt to save the csv_arr variable defined above to a file with a user-designated name ending in .csv or .txt.
* `np.loadtxt('name_of_file_defined_above')` : use np.loadtxt to load the saved file back in as a NumPy array.

#### Example:
####  `csv_arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])` 
#### `np.savetxt('new_file.csv', csv_arr)`
#### `np.loadtxt('new_file.csv')`
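
For a 2D array, `np.savetxt` separates columns with a space by default, so a true comma-separated file needs `delimiter=','` when saving AND when loading. A minimal sketch; the file name and values are placeholders, and the file is written to a temporary folder so nothing is left behind:

```python
import os
import tempfile

import numpy as np

table = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example_table.csv")
    np.savetxt(path, table, delimiter=",")    # write comma-separated values
    loaded = np.loadtxt(path, delimiter=",")  # read them back the same way

print(loaded.shape)                   # (3, 2)
print(np.array_equal(table, loaded))  # True
```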


#### <span style="color:red"> Following the example above, save your array of Philadelphia hourly temperatures as a .csv file titled `Philly_Temps_[putdatehere].csv`, load it back into your Jupyter Notebook, and show the contents to confirm this was done correctly. You should see a .csv file appear in your directory on the left-hand side of your screen.

### Congrats, you have worked through the basics of working with Arrays! The foundations you have learned here will be critical to understanding the structure of global climate datasets and the results of climate simulations. If you are having trouble grasping these concepts, please review this notebook again and / or come in for office hours or find peers in the class to work with.

#### After completing all the exercises outlined in red throughout this notebook, add both your Jupyter Notebook (this file!) and your .csv to your git repository, commit your changes, and push these back to your GitHub. Confirm you see these in your GitHub BEFORE leaving class.