<span style="font-family:Papyrus; font-size:3em;">Lab 1: Jupyter Notebooks and Python Basics</span>


# Jupyter Notebooks

Jupyter notebooks combine program code with text so that you can develop and present a computational result in a single document.

You can start a Jupyter notebook from Google Drive by choosing ``New>More>Google Colaboratory``.
You can also install Jupyter on your computer using ``pip install jupyterlab``. 

Using Jupyter notebooks:
1. Interrupting
1. Restarting

# Packages
Packages extend the base capabilities of Python.
We will use several packages in this class.
- ``numpy``: numerical data structures and algorithms
- ``pandas``: tabular manipulation of data
- ``tellurium``: simulation of reaction networks
- ``matplotlib``: plotting
- ``seaborn``: extended plotting

The following shows how to install packages from within Jupyter. If you are installing on your own computer, you can run the same commands from the command line (without the leading ``!``). However, I *strongly* recommend that you use [Python virtual environments](https://docs.python.org/3/library/venv.html) so that you avoid conflicts with your existing installed software.

In [1]:
# Install packages from within a Jupyter notebook. This only needs to be done once.
# Change False to True to run the installs.
if False:
    !pip install numpy
    !pip install pandas
    !pip install tellurium
    !pip install matplotlib
    !pip install seaborn


In [3]:
# Make the packages usable in this notebook
import numpy as np
import pandas as pd
import tellurium as te
import matplotlib.pyplot as plt
import seaborn as sns

# Python Calculator
Python can be used as a simple calculator. Below are some examples.

In [5]:
2+2*3 + 2**4 + np.sin(np.pi/3)

24.866025403784437

In [6]:
"this " + "is " + "a " "sentence."

'this is a sentence.'

In [7]:
# Boolean tests
a = 2 + 3
a == 5

True

# Basic Data Structures

In [2]:
# A list is an ordered collection of objects of any type. Elements are accessed by an integer index, where 0 is the first element.
aList = [1, 'a', True, [1, 2]]
aList[1] 

'a'
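
A minimal sketch of a few other common list operations. The list ``exampleList`` is hypothetical and is used only for illustration.

```python
# Hypothetical list used only for illustration.
exampleList = [1, 'a', True, [1, 2]]

lastItem = exampleList[-1]        # negative indices count from the end
firstTwo = exampleList[0:2]       # a slice; the stop index (2) is excluded
nestedValue = exampleList[3][0]   # index twice to reach inside a nested list

exampleList.append(3.5)           # lists can grow after they are created
print(lastItem, firstTwo, nestedValue, len(exampleList))
```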

In [3]:
# A dictionary maps keys to values. A value is looked up by its key.
aDict = {"key1": 2, 4: 8 }
aDict["key2"] = "another value"
aDict

{'key1': 2, 4: 8, 'key2': 'another value'}
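
A short sketch of how values are looked up in a dictionary. The name ``exampleDict`` is hypothetical and is used only for illustration.

```python
# Hypothetical dictionary used only for illustration.
exampleDict = {"key1": 2, 4: 8}

value = exampleDict["key1"]            # look up a value by its key
hasKey = "key2" in exampleDict         # test whether a key is present
default = exampleDict.get("key2", 0)   # .get returns a default instead of raising KeyError
keys = list(exampleDict.keys())        # keys are kept in insertion order
print(value, hasKey, default, keys)
```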

In [5]:
# range(n) produces the integers from 0 to n-1. Use list() to convert the result into a list.
list(range(4))

[0, 1, 2, 3]
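
As a small sketch, ``range`` also accepts a start, a stop, and a step. The variable names here are hypothetical.

```python
# range(start, stop, step): the stop value is always excluded.
evens = list(range(0, 10, 2))
countdown = list(range(3, 0, -1))   # a negative step counts down
print(evens, countdown)
```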

In [14]:
# An array is an ordered collection of values of the same type, with more flexible indexing than a list and the ability to do numerical operations.
values = np.array([10, 20, 30, 40])
values[[1, 3]]

array([20, 40])
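
A minimal sketch of the numerical operations and flexible indexing mentioned above, reusing the same values. The result names are hypothetical.

```python
import numpy as np

values = np.array([10, 20, 30, 40])

doubled = values * 2           # arithmetic is applied to every element
total = values.sum()           # sum of all elements
large = values[values > 15]    # a Boolean mask keeps the elements where the test is True
middle = values[1:3]           # slices work as they do for lists
print(doubled, total, large, middle)
```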

# DataFrames (Tables)
1. Series - index, values
2. DataFrames
3. Building tables
4. Selecting columns, rows, values
5. Searching, combining

In [1]:
!ls

Lab_1.ipynb  nst-est2019-01.csv


In [15]:
# A DataFrame is a table. It has names for rows (index) and columns.
# A DataFrame can be initialized from a comma-separated values (CSV) file.
df = pd.read_csv("nst-est2019-01.csv")
df.head()

,State,Base,2010,Revised
0,Alabama,4779736,4780125,4785437
1,Alaska,710231,710249,713910
2,Arizona,6392017,6392288,6407172
3,Arkansas,2915918,2916031,2921964
4,California,37253956,37254519,37319502


In [16]:
# Make State the index
df = df.set_index("State")
df.head()

State,Base,2010,Revised
Alabama,4779736,4780125,4785437
Alaska,710231,710249,713910
Arizona,6392017,6392288,6407172
Arkansas,2915918,2916031,2921964
California,37253956,37254519,37319502


In [18]:
# You can also initialize a DataFrame from a dictionary
dct = {"Base": [4.7, .7, 6.3, 2.9, 37], 2010: [4.7, .7, 6.3, 2.9, 37]}
df1 = pd.DataFrame(dct, index=["Alabama", "Alaska", "Arizona", "Arkansas", "California"])
df1

,Base,2010
Alabama,4.7,4.7
Alaska,0.7,0.7
Arizona,6.3,6.3
Arkansas,2.9,2.9
California,37.0,37.0


In [19]:
# Adding a column
df1["Calculated"] = df1["Base"] + df1[2010]
df1

,Base,2010,Calculated
Alabama,4.7,4.7,9.4
Alaska,0.7,0.7,1.4
Arizona,6.3,6.3,12.6
Arkansas,2.9,2.9,5.8
California,37.0,37.0,74.0


In [20]:
# Selecting columns
df1[["Base", "Calculated"]]

,Base,Calculated
Alabama,4.7,9.4
Alaska,0.7,1.4
Arizona,6.3,12.6
Arkansas,2.9,5.8
California,37.0,74.0


In [21]:
# Selecting rows
df1.loc[["Alaska", "California"], :]

,Base,2010,Calculated
Alaska,0.7,0.7,1.4
California,37.0,37.0,74.0


In [22]:
# Selecting a value
df1.loc["Alaska", "Base"]

0.7

In [24]:
# A Series is a single column or row of a table. It can be manipulated like a DataFrame, but it is indexed with one label (there is no column argument).
ser = df1["Base"]
ser

Alabama        4.7
Alaska         0.7
Arizona        6.3
Arkansas       2.9
California    37.0
Name: Base, dtype: float64
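
The outline for this section also lists searching and combining. A minimal sketch of both, assuming two small hypothetical tables (``dfA`` and ``dfB``) with values like those in ``df1``:

```python
import pandas as pd

# Hypothetical populations in millions, similar to the values in df1.
dfA = pd.DataFrame({"Base": [4.7, 0.7, 6.3]}, index=["Alabama", "Alaska", "Arizona"])
dfB = pd.DataFrame({"Base": [2.9, 37.0]}, index=["Arkansas", "California"])

# Combining: stack the rows of the two tables into one table.
combined = pd.concat([dfA, dfB])

# Searching: a Boolean test on a column selects the rows that match.
bigStates = combined[combined["Base"] > 4]

# A Series exposes its row labels (index) and its data (values) separately.
ser = combined["Base"]
labels = list(ser.index)
data = ser.values
print(bigStates)
```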

# Flow of Control

## if-statement
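
A minimal sketch of an if-statement. The variable ``population`` and its thresholds are hypothetical and are not taken from the lab's data.

```python
# Hypothetical value (in millions) used only for illustration.
population = 0.7

# Exactly one branch runs: the first one whose test is True.
if population > 10:
    size = "large"
elif population > 1:
    size = "medium"
else:
    size = "small"
print(size)
```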

## for-statement
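
A minimal sketch of a for-statement that loops over a dictionary and over a ``range``. The names and values are hypothetical.

```python
# Hypothetical values (in millions) used only for illustration.
populations = {"Alabama": 4.7, "Alaska": 0.7, "Arizona": 6.3}

# Loop over the key-value pairs of a dictionary.
total = 0
for state, population in populations.items():
    total += population

# Loop over the integers 0, 1, 2, 3.
squares = []
for n in range(4):
    squares.append(n**2)
print(total, squares)
```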

# Functions

## Definition

## Arguments

## Return values

## Documenting

## Testing

# Programming style

1. Whenever possible, use functions instead of scripts. This is because functions facilitate reuse, and functions are testable. Never use copy and paste for reuse.

1. Use meaningful names for functions and variables. Function names should be verbs. For example, a function that calculates a fast Fourier transform might be named ``calcFFT``. A bad name for this function would be the single letter ``f``.

1. Constants used in the notebook should have a name in all capital letters. For example, use PI, not pi. (By definition, a constant is a variable that is assigned a value only once.)

1. The following should be used for functions:

   1. Code cells should contain at most one function definition.

   1. Functions should contain documentation that specifies: (a) what the function does; (b) the data types and semantics of its input parameters; (c) the data type and semantics of what it returns.

   1. The code cell in which a function resides should contain an "assert" statement that runs the function and performs a test on its output.
   
   1. Variables in a function are either parameters of the function, local to the function, or global constants.
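
The guidelines above can be sketched as a single code cell. The function ``calcTotal`` and the constant ``SCALE`` are hypothetical names chosen for illustration. This is one possible layout, not a required template.

```python
# Global constant: assigned once and named in all capital letters.
SCALE = 1e6

def calcTotal(values):
    """
    Calculates the sum of the values, converted from millions to units.

    Parameters
    ----------
    values: list of float
        Quantities expressed in millions

    Returns
    -------
    float
        Sum of the quantities expressed in individual units
    """
    total = sum(values)   # local variable
    return total * SCALE

# Test in the same cell as the function definition.
assert calcTotal([1.5, 2.5]) == 4e6
```

If the assert fails, the cell raises an ``AssertionError``, so a broken function is noticed as soon as the cell runs.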

# Simulations