# Assignment 0: Introduction to Python and the Jupyter Notebook Environment

#### Skills
1. Work with Jupyter notebooks (cell types)
2. Python basics - data type
3. Some useful data structures
    - lists    
    - numpy arrays
    - dictionaries
4. Basic plotting functions
5. Further reading

#### Concepts
* Analyzing and visualizing data

## Welcome to ENV 330 / MAE 330

_You are looking at a [Jupyter Notebook](http://jupyter-notebook.readthedocs.io/en/latest/notebook.html)_.  A Jupyter notebook (formerly known as an IPython notebook) is a lightweight environment for describing and running blocks of code in a [variety of languages](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels).  Over the last few years, it has become a popular tool in the scientific community for data analysis and exploration.  Throughout this course we'll be using notebooks to look at data from both observations and numerical models.

## 1. Cell types
A key feature of Jupyter notebooks is that they support different cell types.  A "cell" in a Jupyter notebook is a field in which one can enter text.  To enter text in a cell, one simply needs to click inside it to make it active.  There are two types of cells:

1. Markdown cells (like this cell itself)
2. Code cells.

### Markdown cells
Markdown cells are meant for displaying text.  Markdown is a way of indicating basic formatting aspects in plain text form.  A full description of its syntax (with examples) can be found [here.](https://daringfireball.net/projects/markdown/syntax) You can also double click on any Markdown cell in this notebook to see the text that produced it.  

Things Markdown supports are multi-level headings, bulleted and numbered lists, links, images, and font styles (e.g. **bold** or *italic*).  Jupyter notebooks also include MathJax support, so you can type LaTeX-style equations.  E.g. 
$$y = mx + b$$
In practice, it can often be helpful to use Markdown cells to provide background for the data analysis you are doing (e.g. to provide links to scientific papers, or describe the math behind a certain kind of analysis).

### Code cells
Code cells are what make the Jupyter notebook a powerful data analysis tool.  They enable one to write multi-line snippets of code, which can be executed one at a time, and in any order.

After running a cell, the variables created in the cell will now be accessible to all other cells in the notebook.

### Cell operations
- To create a cell: In the menu bar, click on "Insert".
- Command mode and edit mode: Command mode binds the keyboard to notebook level actions, indicated by a blue left margin. Edit mode allows one to work on the text in a cell, indicated by a green border.
- Going from command mode to edit mode: Click anywhere in the cell or press "Enter".
- Going from edit mode to command mode: Click on cell border or press "ESC".
- To navigate among cells: Use Up and Down keys.
- To run a cell (i.e. the one you are typing in): press "Shift" + "Enter" to run it and move to the next cell, or "Control" + "Enter" to run it in place (both work on Windows and Mac), or simply click on the "Run" button in the toolbar above. 

In [None]:
print('Hello world!')

To see all the interactive variables, we can use the [magic command](https://ipython.readthedocs.io/en/stable/interactive/magics.html) `%whos`.

In [None]:
%whos

## 2.  Python basics - data type

Python is a dynamically typed language; therefore it is very straightforward to assign data to a variable (there is no need to specify the data type). That is to say, Python has no command for declaring a variable; a variable is created the moment you first assign a value to it.

Create the following three variables:

`a = 2` 

`b = 4.0`

`c = 'c'`

In [None]:
a = 
b = 
c = 

Note that after executing the cell above, all other cells now have access to the variables "a", "b" and "c".

You can check the value of a variable by executing it alone in a cell. You can also check the datatype using the `type()` command. 

What are the datatypes of `a`, `b`, and `c`? 

In [None]:
type(a)

In [None]:
type(b)

In [None]:
type(c)

Now create a new variable:

`d = a + 2`

then print the result

In [None]:
d = 

In [None]:
print(d)
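Data types also matter when values are combined. The short sketch below shows what happens when an integer and a float are added, and how to convert between types explicitly. The names `count`, `weight`, and `total` are made up for this illustration and are not used elsewhere in the notebook.

```python
# Hypothetical variables, used only for this illustration
count = 3        # an integer
weight = 1.5     # a float

total = count + weight       # int + float gives a float
print(total, type(total))    # 4.5 <class 'float'>

# Explicit conversions between types
as_number = int('42') + 1          # string -> integer, then add
as_text = str(count) + ' samples'  # integer -> string, then join
print(as_number)   # 43
print(as_text)     # 3 samples
```

Adding a string and a number directly (e.g. `'42' + 1`) raises a `TypeError`, which is why the explicit conversion is needed.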

## 3.  Python data structures
Some fundamental Python data structures that will be of use to us in this course are:
- Lists
- `Numpy` arrays
- Dictionaries

### 3.1 Lists
Lists store *ordered* sequences of objects. The objects need not be of the same type (but often they will be).  See the following two examples of lists below.

In [None]:
list_a = ['a', 'b', 'c', 'd'] # list of strings
list_b = ['a', 2, '🤪', list_a] # mixed list: strings, an integer, and a nested list

# Note that comments can be inserted into a codeblock using the '#' symbol. 
# This is a good way to document your work for future reference. 

One can iterate over items in a list using a `for` loop, but
watch out for indentations!

In [None]:
for item in list_b:
    print(item) # notice this line is indented
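To make the role of indentation concrete, here is a minimal sketch using a made-up list called `colors`. Lines indented under the `for` statement run once per item; the first unindented line runs once, after the loop has finished.

```python
# 'colors' and 'n_seen' are hypothetical names used only for this illustration
colors = ['red', 'green', 'blue']

n_seen = 0
for color in colors:
    print(color)      # indented: runs once for every item
    n_seen += 1       # indented: also part of the loop body

print('Loop finished after', n_seen, 'items')  # not indented: runs once, at the end
```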

One can also access specific elements of a list using integer indexes 

**Note:** Python is "zero-indexed", meaning that the first element of a list is at index 0 (this will be very important later on!). Negative indexes count from the end of the list, so index -1 is the last element.

Use indexing to print each element of `list_a`

In [None]:
list_a[0]

In [None]:
list_a[1]

In [None]:
list_a[-1]

In [None]:
list_a[-2]

`list_a[i:j]` is a slice: it returns the elements starting at index i and ending just before index j (the element at index j is not included).

Call a slice of `list_a`

In [None]:
list_a[1:3]
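A few more slicing patterns are sketched below on a made-up list called `letters` (not part of the assignment). Leaving out the end index means "to the end of the list", and a third number sets the step.

```python
# 'letters' is a hypothetical list used only for this illustration
letters = ['a', 'b', 'c', 'd', 'e', 'f']

middle = letters[2:5]        # indexes 2, 3, 4 (5 is excluded)
tail = letters[-3:]          # the last three elements
every_other = letters[::2]   # every second element, starting at index 0

print(middle)        # ['c', 'd', 'e']
print(tail)          # ['d', 'e', 'f']
print(every_other)   # ['a', 'c', 'e']
```

A slice always returns a new list, so the original `letters` is left unchanged.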

An alternative way of iterating over the list using a `for` loop would be using an integer index and `range`.

In [None]:
range(4)

In [None]:
for i in range(4):
    print(list_a[i])
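Note that displaying `range(4)` on its own shows `range(0, 4)` rather than the numbers themselves. The sketch below, which uses a made-up list called `sample`, converts a range to a list to see its contents and shows `enumerate` as a common alternative when both the index and the item are needed.

```python
# 'sample' is a hypothetical list used only for this illustration
sample = ['w', 'x', 'y', 'z']

first_four = list(range(4))     # convert the range to a list to see its values
stepped = list(range(2, 10, 3)) # start at 2, stop before 10, step by 3
print(first_four)   # [0, 1, 2, 3]
print(stepped)      # [2, 5, 8]

# len() avoids hard-coding the length of the list
for i in range(len(sample)):
    print(i, sample[i])

# enumerate() yields the index and the item together
pairs = []
for i, item in enumerate(sample):
    pairs.append((i, item))
print(pairs)
```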

<div class="alert alert-info"><h1>Exercise 1</h1></div>

1. Create a cell below 
2. Display the first 2 elements of list_a



Other operations one can do with lists can be found in the [Python documentation](https://docs.python.org/3/tutorial/datastructures.html).

### 3.2  `NumPy` and `NumPy` arrays

`NumPy` is a numerical array library for mathematical operations. It is a fundamental package for scientific computing with Python. It contains among other things:

- `NumPy` arrays that can be used to store N-dimensional tables.
- Useful mathematical and linear algebra functions

`NumPy` is not part of the standard library of Python; therefore we need to import it using an import statement.

In [None]:
import numpy as np

There are various ways to create an array. Some useful ones are listed below:
- `np.zeros`, `np.ones`
- `np.linspace`, `np.arange`
- Create from a python list

We can create an array filled with zeros by using the `np.zeros` command (say if we wanted to create a blank array to assign values to later). Similarly, `np.ones` creates an array of ones.

In [None]:
z = np.zeros(5)
z
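As a small sketch of the "create a blank array, then fill it in" pattern mentioned above, the example below assigns a value into an array of zeros. It also passes a tuple to `np.ones` to request a two-dimensional shape. The names `filled` and `ones_2d` are made up for this illustration.

```python
import numpy as np

# 'filled' and 'ones_2d' are hypothetical names used only for this illustration
filled = np.zeros(4)     # start with [0. 0. 0. 0.]
filled[2] = 7.5          # assign a value to the element at index 2
print(filled)            # [0.  0.  7.5 0. ]

ones_2d = np.ones((2, 3))  # a tuple gives the shape: 2 rows, 3 columns
print(ones_2d.shape)       # (2, 3)
```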

We can also use the function `np.linspace`, which returns a given number of evenly spaced values between the specified start and end points (both end points are included). The syntax is 
```
np.linspace(start, stop, number)
```

In [None]:
x = np.linspace(2, 3, 5)
x

We can also use `np.arange` to create an array of values from `start` up to, but not including, `stop`, separated by `step`. The syntax is 
```
np.arange(start, stop, step)
```

In [None]:
np.arange(1,10,1)
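The difference between the two functions is easiest to see side by side. In the sketch below (with made-up variable names), `np.linspace` is told how many points to return and includes the end point, while `np.arange` is told the step size and stops before the end point.

```python
import numpy as np

# Hypothetical names used only for this illustration
evenly_spaced = np.linspace(0, 1, 5)   # 5 points, end point included
stepped_values = np.arange(0, 1, 0.25) # step of 0.25, end point excluded

print(evenly_spaced)    # [0.   0.25 0.5  0.75 1.  ]
print(stepped_values)   # [0.   0.25 0.5  0.75]
```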

To create a simple 1D numpy array, we can use a list:

In [None]:
arr = np.array([1,2,3])
arr

To create a simple 2D numpy array, we can use a nested list:

In [None]:
arr = np.array([[1, 2, 5],
                [3, 4, 8]])
arr

This array consists of two rows and three columns:

In [None]:
arr.shape

We can index a 2D array using integer indexes:

In [None]:
arr[0,1]

You can use slices to index rows and columns separately; for instance if we wanted to select just the first (or last) two columns of the array we could write:

In [None]:
arr[:, 0:2]

In [None]:
arr[:, -2:]

Many `numpy` functions can be passed an "axis" argument; this specifies a particular axis of the array to execute the function along.  For example, using our 2D array, we can not only sum across the entire array but also along each axis:

In [None]:
np.sum(arr)

In [None]:
np.sum(arr, axis=0)

In [None]:
np.sum(arr, axis=1)
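The same `axis` idea applies to other functions, such as the mean or the maximum. The sketch below uses a small made-up array called `table`: `axis=0` collapses the rows (one result per column), and `axis=1` collapses the columns (one result per row).

```python
import numpy as np

# 'table' is a hypothetical 2 x 3 array used only for this illustration
table = np.array([[10, 20, 30],
                  [1, 2, 3]])

col_means = table.mean(axis=0)  # average down each column -> 3 values
row_max = table.max(axis=1)     # largest value in each row -> 2 values

print(col_means, col_means.shape)  # [ 5.5 11.  16.5] (3,)
print(row_max, row_max.shape)      # [30  3] (2,)
```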

The `numpy` library implements a huge number of functions that one can use on arrays.  We'll get a taste for a few of them later on, but more detail can be found in the [`numpy` documentation](https://docs.scipy.org/doc/numpy-1.12.0/reference/).

<div class="alert alert-info"><h1>Exercise 2</h1></div>

1. Create a cell below 
2. Create an array called `x` of 20 equally spaced values between 0 and 12. (hint: try `np.linspace` or `np.arange`)
3. Using a `for` loop and the `sum` function, create a new variable `y` that holds the cumulative sum of `x` (i.e. an array of the same dimension as `x` in which the nth element is the sum of the first n elements of `x`). 



### 3.3 Dictionaries

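In Python, a dictionary (or "dict") is a collection of items that maps each key to a value. Unlike a list, its items are looked up by key rather than by position (since Python 3.7, a dictionary does remember the order in which items were inserted). A common use case is to describe a collection of related objects with string-based keys.  For example: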

In [None]:
states = {'NJ': 'New Jersey', 'NY': 'New York', 'PA': 'Pennsylvania'}

We can use the `.keys()` method to display all the keys in a dictionary, and square brackets to look up the value stored under a particular key.

In [None]:
states.keys()

In [None]:
states['NJ']
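A few other everyday dictionary operations are sketched below on a separate, made-up dictionary called `capitals`, so that the `states` dictionary above is left untouched: adding an entry, testing whether a key exists, looking up a key with a fallback value, and looping over key-value pairs.

```python
# 'capitals' is a hypothetical dictionary used only for this illustration
capitals = {'NJ': 'Trenton', 'NY': 'Albany'}

capitals['PA'] = 'Harrisburg'       # assigning to a new key adds an entry
has_pa = 'PA' in capitals           # 'in' tests whether a key exists
fallback = capitals.get('CA', 'unknown')  # .get() returns a default if the key is missing

print(has_pa)     # True
print(fallback)   # unknown

# .items() yields each key together with its value
for abbrev, city in capitals.items():
    print(abbrev, '->', city)
```

Looking up a missing key with square brackets (e.g. `capitals['CA']`) raises a `KeyError`, which is when `.get()` is handy.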

Values in a dictionary can be of any data type, e.g., integer, string, or even a dictionary!

In [None]:
## Example
NJ = {'name': 'New Jersey', 'population': 9.006}  # population in millions
states['NJ'] = NJ

A nested dictionary is a dictionary inside a dictionary. 

Using the now-updated `states` dictionary, access the NJ population. 

In [None]:
states['NJ']['population']

Again one can find more information on Python dictionaries in the [documentation](https://docs.python.org/3/tutorial/datastructures.html?highlight=dictionaries#dictionaries).

<div class="alert alert-info"><h1>Exercise 3</h1></div>

1. Create a cell below 
2. Create an empty dictionary called "classmates"
3. Pick three or more categories (e.g. name, age, height, home state, class year, etc.)
4. Go around the room and meet your classmates. For each classmate, make a dictionary following the NJ example above.
5. Add each individual's dictionary to your "classmates" dictionary to create a nested dictionary 

**Note: Make sure you use consistent capitalization and punctuation for all entries!**

Once done, we can use the dictionary to easily answer questions about some of the people you interviewed. 

Below are examples: 

In [None]:
## Compare the heights of two classmates
name1 = 
name2 = 

classmates[name1]['height'] > classmates[name2]['height']

In [None]:
## Make a class roster list

all_classmates = classmates.keys()

namelist = []
for key in all_classmates:
    namelist.append(classmates[key]['name'])
    
print(namelist)

In [None]:
## Count the number of G5s using a list

class_years = []
for key in all_classmates:
    class_years.append(classmates[key]['class_year'])
    
class_years.count('G5')
    

In [None]:
## Find the mean age of the class using a numpy array

all_ages = np.zeros(len(classmates.keys())) 
i=0
for key in all_classmates:
    all_ages[i] = classmates[key]['age']
    i+=1
    
all_ages.mean()
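The loops above can also be written more compactly. The sketch below is one possible alternative, not the required approach: it uses a small made-up nested dictionary called `example_class` (so it runs even before you have built `classmates`) and list comprehensions over `.values()`.

```python
import numpy as np

# 'example_class' is hypothetical data invented for this illustration;
# it follows the same nested layout as the "classmates" dictionary.
example_class = {
    'Ada': {'name': 'Ada', 'age': 21, 'class_year': 'G1'},
    'Ben': {'name': 'Ben', 'age': 24, 'class_year': 'G5'},
    'Cy':  {'name': 'Cy',  'age': 27, 'class_year': 'G5'},
}

# .values() loops over the inner dictionaries directly
ages = np.array([person['age'] for person in example_class.values()])
mean_age = ages.mean()

# Count how many inner dictionaries have class_year equal to 'G5'
n_g5 = sum(person['class_year'] == 'G5' for person in example_class.values())

print(mean_age)   # 24.0
print(n_g5)       # 2
```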

## 4. Plotting a time series
Here we will introduce some more applications-driven examples.  Plotting in Python is typically done using the [`matplotlib`](https://matplotlib.org) library.  Again `matplotlib` is not part of the Python standard library so we will have to import it.

In [None]:
import matplotlib.pyplot as plt
import numpy as np # In principle we do not need to import numpy again since we have done it above,
                   # but in case you skipped the cells above we will do it again here

To plot a line we will use the [`plot`](https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.plot.html?highlight=plot#matplotlib.axes.Axes.plot) command.

In [None]:
plt.plot([1,2,3],[3,2,5])

As a basic example, we'll plot a family of sine curves using `matplotlib` to illustrate the basics of making a plot.  We'll plot:
$$ y = \sin(x - a) $$

1. First, create some values for $x$.  To do so, try using the `linspace` command.

2. Then, create the variable $y$ using the NumPy sine function, `np.sin()`

In [None]:
x = np.linspace(-10,10, 500)
y = np.sin(x)

To create an environment in which we can make a plot, we can use the [`plt.subplots`](https://matplotlib.org/api/pyplot_api.html?highlight=subplots#matplotlib.pyplot.subplots) command.  This returns a [`Figure`](https://matplotlib.org/api/figure_api.html?highlight=figure#module-matplotlib.figure) object and either a single [`Axes`](https://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes) object or an array of `Axes` objects, depending on the two input arguments.  The first input argument is the number of rows of `Axes` and the second is the number of columns (so 2, 2 would create four `Axes` objects).  For a quick single-panel plot, we can also call functions such as `plt.plot` directly, which draw on the current figure.

Using $x, y$ from above, the cell below makes a simple single-panel plot with axis labels and a title.

In [None]:
plt.plot(x,y)

plt.xlabel('x')
plt.ylabel('y')
plt.title('y=sin(x)')
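For comparison, here is a minimal sketch of a two-panel plot made with the `plt.subplots` command described above. The names `x_demo`, `fig_demo`, and `axes` are made up for this illustration, and the cosine curve in the second panel is just an example choice. Note that labels and titles are set with `set_xlabel`, `set_ylabel`, and `set_title` on each `Axes` object.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical names used only for this illustration
x_demo = np.linspace(-10, 10, 500)

# 1 row, 2 columns -> 'axes' is an array holding two Axes objects
fig_demo, axes = plt.subplots(1, 2, figsize=(10, 3))

# Left panel
axes[0].plot(x_demo, np.sin(x_demo))
axes[0].set_xlabel('x')
axes[0].set_ylabel('y')
axes[0].set_title('y=sin(x)')

# Right panel
axes[1].plot(x_demo, np.cos(x_demo))
axes[1].set_xlabel('x')
axes[1].set_ylabel('y')
axes[1].set_title('y=cos(x)')

fig_demo.tight_layout()  # adjust spacing so labels do not overlap
```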

Next we need to create the variable `a` that we can insert into our equation for $y$. However, if we want to test multiple `a` values at once, we can loop over a list of `a_values` and plot our function with each. See the example below. 

In [None]:
a_values = np.array([1, 2, 3, 4]) 

for a in a_values:
    y = np.sin(x - a)
    plt.plot(x, y, label='a = {:0.2f}'.format(a))

plt.xlim([-6,6])
plt.ylim([-1.75, 1.75])
plt.xlabel('x')
plt.ylabel('y')
plt.title(r'$y = \sin(x-a)$')  # Use LaTeX formatting to create an equation in the title

plt.legend(loc='upper right', ncol=2, frameon=False)
fig = plt.gcf() # This collects the actual figure after it's made

Note that adding a [legend](https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.legend.html?highlight=ax%20legend#matplotlib.axes.Axes.legend) requires setting a label in your plot command.

For writing up projects it will be helpful to be able to make panel plots and save graphics produced using matplotlib.  This can be done using the [`Figure.savefig`](https://matplotlib.org/api/figure_api.html#matplotlib.figure.Figure.savefig) command in ``matplotlib``, or by simply right-clicking on the figure and saving it.

In [None]:
fig.savefig('example-fig.pdf')  # Change the extension of the file for a different format (e.g. 'example-fig.png')

## 5. Further reading

There are many Python tutorials (e.g. [W3Schools](https://www.w3schools.com/python/python_syntax.asp)), although some of them go beyond what we need for the class. You are welcome to read as much as you want or come back to them later. The official documentation for [numpy](https://numpy.org/doc/stable/index.html) and [matplotlib](https://matplotlib.org/) is always a great reference as well.