# Programming with Python

## Session Overview

1) Getting started with Python 
    a) Reviewing our conda installation and making our first conda environment

2) Introduction to Jupyter Notebooks
    a) Using your conda environment as a Jupyter Notebook kernel

3) Importing packages

4) Variable types

5) For loops, while loops, if statements, and logicals 

6) Creating Functions 

7) Basic Python Math

8) Numpy

9) Analyzing Data 

    a) Slicing arrays  

    b) Statistical operations on arrays

    c) Working with lists

    d) Calling data in .nc, .csv, and .bin formats

10) Errors and Exceptions 

11) Debugging 

12) Best Practices

## Getting started with Python - Installation and Development Environment

Here we describe the necessary tools to get up and running with Python. The listed tools should be installed prior to the workshop: 

**1) Miniforge/Miniconda**: The installation instructions provided for this workshop detail how to get conda through Miniforge. Conda will be our Python package manager. We will install all necessary Python packages through conda, and we will always activate a conda environment to run our Python scripts. 

**2) Jupyter Notebooks**: The easiest way to develop in Python is with Jupyter Notebooks, which allow you to run a few lines of code at a time instead of running entire Python scripts. This feature, along with the option for inline figures, makes Jupyter Notebooks ideal for data analysis and collaborative science. We will use our conda environments as kernels for our Jupyter Notebooks.

**3) VSCode**: Visual Studio Code (VSCode) is an easy-to-use development environment that has extensions for every major programming language. It seamlessly integrates Conda, Jupyter Notebooks, Python and Remote - SSH. We will use VSCode as the example for this workshop. The same code can be run in Jupyter Lab, so VSCode is not strictly required.

**4) IPython**: IPython is a command-line tool that allows you to run lines of Python code directly in the terminal. IPython should be automatically installed with Miniforge and JupyterLab. To use it, type `ipython` into the terminal. We will not use IPython directly in this workshop, but it can occasionally be useful to run a few commands in the terminal.

## Jupyter Notebook Intro and Shortcuts
To get started with notebooks, you only need to know how to create new cells, use markdown cells between code cells, run cells (shift + enter), restart the kernel, and clear outputs. A few useful keyboard shortcuts:

1) esc + m = convert a code cell to a markdown cell

2) esc + y = convert a markdown cell to a code cell 

3) shift + enter = run current cell 

4) esc + a = Insert a cell above the current cell 

5) esc + b = Insert a cell below the current cell

6) esc + x = Cut the current cell



## Importing Packages 
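
A minimal sketch of the three common import forms, using the standard-library `math` module so that it runs without installing anything. The same forms apply to packages installed in your conda environment, for example `import numpy as np`. The variable names below are made up for illustration.

```python
import math                 # import the whole module; access names as math.<name>
import math as m            # import the module under a shorter alias
from math import sqrt, pi   # import only specific names from the module

root_full = math.sqrt(16.0)   # module name, then function
root_alias = m.sqrt(16.0)     # alias, then function
root_direct = sqrt(16.0)      # function imported directly

print(root_full, root_alias, root_direct)
print(pi)
```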

## Variable Types 

The most common variable types are summarized below.

**1) Integers, floats, and complex**: integers are whole numbers, floats are numbers with a decimal point, and complex numbers have a real and an imaginary part (written like `1 + 2j`). Python integers have arbitrary precision and Python floats are 64-bit. NumPy arrays let you choose the precision explicitly, for example 16, 32, or 64 bits.

**2) Strings**: a sequence of characters. Strings are denoted with quotation marks, either "" or ''.

**3) Boolean**: True or False

**4) Lists**: a sequence of objects of any type, including strings, floats, integers, and booleans. The items in a list do not have to be the same type.

```python
my_list = [1, "hello", 3.14, True, [1, 2], {"a": 1}]
```

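**5) Arrays**: a NumPy array (`ndarray`) is a mutable, n-dimensional grid of values that all share the same data type.

**6) Dictionaries**: a collection of key:value pairs. Unlike the other types, which hold single values as elements, a dictionary maps each key to a value. Since Python 3.7, dictionaries preserve the order in which keys were inserted.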

```
monthly_data = {}
monthly_data = {'time': [], 'mean': [], 'median': [],'std': [],'N': [] }
```

**7) Tuples**: an immutable sequence of objects of any type, for example (float, float, list) or (dictionary, float, string). Tuples are denoted by () with items separated by commas. They are useful for returning multiple values from a function.
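
As a quick illustration of the built-in types listed above, the sketch below creates one value of each and looks up its type with `type()`. All variable names and values are made up for illustration.

```python
# One example value for each built-in type (names and values are illustrative)
n_files = 3                          # integer
wave_height = 2.5                    # float
z = 1.0 + 2.0j                       # complex
label = "hello"                      # string
is_valid = True                      # boolean
heights = [1.2, 3.5, "air"]          # list (mixed types are allowed)
record = {"time": [], "mean": []}    # dictionary of key:value pairs
point = (32.9, -117.3, heights)      # tuple

examples = [n_files, wave_height, z, label, is_valid, heights, record, point]

# type(value).__name__ gives the name of each value's type as a string
type_names = [type(value).__name__ for value in examples]

for value, name in zip(examples, type_names):
    print(name, ":", value)
```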



In [None]:
# Include some code that shows how to investigate attributes of variables (size, shape, type, attrs)
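
One possible way to investigate a variable is sketched below, assuming NumPy is installed in your conda environment. The array `data` and the list `names` are made-up examples.

```python
import numpy as np

# A small 2D array with made-up values (2 rows, 3 columns)
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(type(data))    # the class of the object
print(data.shape)    # number of elements along each dimension
print(data.size)     # total number of elements
print(data.ndim)     # number of dimensions
print(data.dtype)    # data type shared by every element

# Built-in containers have no shape attribute; use type() and len() instead
names = ["time", "mean", "median"]
print(type(names), len(names))
```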

## For Loops, While Loops, If Statements, and Logicals

**Conditional Statements**:

**Syntax**
```python
if condition:
    print(swh)
elif condition:
    print(wsp)
else:
    print(swh, wsp)
```

**Logicals**

- `a == b` : Test for equality

- `a != b` : Test for inequality

- `a and b`  
  Returns `True` if **both** operands are true.

- `a or b`  
  Returns `True` if **either** operand is true.

- `not a`  
  Returns `True` if operand is **False**.

> **Note:** `and` and `or` each combine two expressions, so long chains of them quickly become hard to read.  
> Avoid writing a long series of conditions in a single statement.  
> Instead, break complex logic into multiple conditional statements and store boolean results in variables for later use.
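
A minimal sketch of storing boolean results in variables before combining them. The values of `swh` and `wsp` and the thresholds are invented for illustration.

```python
# Made-up values for illustration
swh = 2.4    # hypothetical value
wsp = 11.0   # hypothetical value

# Store each comparison in a descriptively named boolean variable
is_rough = swh >= 2.0
is_windy = wsp > 10.0

# Combine the stored results instead of writing one long condition
if is_rough and is_windy:
    message = "rough and windy"
elif is_rough or is_windy:
    message = "rough or windy, but not both"
else:
    message = "calm"

print(message)
print(not is_rough)
```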

**Other Operators**

- `>=`, `<=`
  Greater than or equal to, less than or equal to

- `+=`, `-=`, `*=`, `/=`  
  Shortcut assignment operators for addition, subtraction, multiplication, and division, respectively.  
  These update the variable in-place without needing to redefine it.
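
The shortcut assignment operators can be traced step by step in a small sketch (the variable name and numbers are arbitrary):

```python
total = 10
total += 5    # same as total = total + 5  -> 15
total -= 3    # same as total = total - 3  -> 12
total *= 2    # same as total = total * 2  -> 24
total /= 4    # same as total = total / 4  -> 6.0 (division always returns a float)

print(total)
```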

**What will evaluate to True (False)?**

- `0`, empty strings `""`, and empty lists `[]` are considered **False**
- All other numbers, strings, and lists are considered **True**
- `True` and `False` are special boolean values representing truth
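
To check how a value will be treated in a condition, pass it to the built-in `bool()`. The list of test values below is arbitrary.

```python
# A mix of values that are considered False and values that are considered True
values = [0, 1, -2.5, "", "text", [], [0]]

truthiness = [bool(value) for value in values]

for value, result in zip(values, truthiness):
    print(repr(value), "->", result)
```

Note that `[0]` counts as True: the list is not empty, even though the item inside it is 0.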


## Creating Functions

**Syntax**: 
```python
    def function_name(parameters): 
        '''
        Documentation
        '''
        #Code for function
        
        return variable_1, variable_2 #Output variables from function
```
**Basics**: 
 
1) Variables defined within a function can only be seen and used within the body of the function.

2) Specify default values for parameters when defining a function using name=value in the parameter list

3) Parameters can be passed by matching based on name, by position, or by omitting them (in which case the default value is used)
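
The three points above can be sketched with a small, hypothetical function (`scale_values` is made up for this example and is not part of any library):

```python
def scale_values(values, factor=2.0, offset=0.0):
    """Return a new list with each value multiplied by factor, then shifted by offset."""
    scaled = [value * factor + offset for value in values]
    return scaled

# Matching by position: values=[1.0, 2.0], factor=3.0, offset uses its default
by_position = scale_values([1.0, 2.0], 3.0)

# Matching by name: factor is omitted, so its default (2.0) is used
by_name = scale_values(values=[1.0, 2.0], offset=1.0)

# Omitting both optional parameters: both defaults are used
defaults = scale_values([1.0, 2.0])

print(by_position, by_name, defaults)

# The variable scaled exists only inside the function body
try:
    print(scaled)
    scope_error_raised = False
except NameError as error:
    scope_error_raised = True
    print("NameError:", error)
```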

**Purpose**: Functions automate and streamline your code, making it more efficient, easier to debug, and easier to write. Functions can be extended to complete a wide variety of tasks by adding conditional statements.

**Importing functions** from .py files can be done with the following code:

import sys # import sys library 
sys.path.append('/zdata/home/lcolosi/python_functions/') #using sys, give the notebook the path to the directory containing the functions' .py files

#import functions 
from unweighted_least_square_fit import least_square_fit 
from char_LSF_curve import character_LSF
from monthly_mean import monthly_average

In [None]:
def wind_stress_mag(C_d, rho, U_10):
    
    '''
    Documentation Section: 
    
    wind_stress_mag(C_d, rho, U_10)
        
        Function for computing scalar wind stress magnitude 
        
        Parameters
        ----------
        C_d : wind drag coefficient (dimensionless)
        rho : air density (kg/m^3)
        U_10 : wind speed 10 meters above the surface of the water (m/s)
        
        Returns
        -------
        wsm : wind stress magnitude (N/m^2)
    '''
    
    #compute wind stress magnitude
    wsm = C_d*rho*U_10**2
    
    return wsm

In [None]:
wsm = wind_stress_mag(0.001, 1.2, 10)
print(wsm)

#Obtain documentation on function (the ? syntax works in IPython and Jupyter): 
wind_stress_mag?

## Basic Mathematical Operations 
The operators below work on plain Python numbers. Applied to NumPy arrays, they act elementwise. Operators 4, 8, and 9 apply only to NumPy arrays. Note that for Python lists, `+` concatenates rather than adds elementwise.

### **Python Math**
1. a + b = elementwise addition 

2. a - b = elementwise subtraction

3. a * b = elementwise multiplication 

4. a @ b or np.dot(a, b) = matrix multiplication (NumPy arrays)

5. a / b = elementwise division

6. a // b = floor division 

7. a ** b = elementwise exponentiation 

8. a.T = transpose of a (NumPy arrays)

9. a.conj().T = conjugate transpose of a (NumPy arrays)
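
A short sketch of the arithmetic operators applied to plain Python numbers, without NumPy. The values of `a` and `b` are arbitrary.

```python
a = 7
b = 2

total = a + b         # addition
difference = a - b    # subtraction
product = a * b       # multiplication
quotient = a / b      # division always returns a float
floored = a // b      # floor division rounds down to a whole number
power = a ** b        # exponentiation

print(total, difference, product, quotient, floored, power)

# On lists, + joins the two lists end to end instead of adding item by item
joined = [1, 2] + [3, 4]
print(joined)
```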

## Numpy

1) Numpy arrays (declaring them: np.array(), np.linspace(), range())
2) Slicing arrays
3) Common mathematical operations with numpy 

### **Numpy Arrays**
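
A few common ways to declare an array are sketched below, assuming NumPy is installed. Besides `np.array()`, `np.linspace()`, and `range()`, the sketch uses `np.arange()` and `np.zeros()`, which are additions for illustration. All values are arbitrary.

```python
import numpy as np

from_list = np.array([1.0, 2.5, 4.0])      # array built from a Python list
evenly_spaced = np.linspace(0.0, 1.0, 5)   # 5 points from 0 to 1, endpoints included
from_range = np.array(range(5))            # built-in range converted to an array
counting = np.arange(0, 10, 2)             # like range(0, 10, 2), but returns an array
grid = np.zeros((2, 3))                    # 2 rows and 3 columns filled with zeros

print(from_list)
print(evenly_spaced)
print(from_range)
print(counting)
print(grid)
```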

### **Slicing arrays** 

Basics: 

1) Array indices start at 0

2) *array[x, y]* selects a single element from a 2D array, where x = row and y = column

3) General slicing syntax: *array[start:stop:step]*

4) Use *low:high* to select a slice along one dimension of the array, including indices from *low* to *high-1*

5) One-dimensional indexing and slicing also works on strings and lists
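
The slicing rules above can be tried on a small made-up array, assuming NumPy is installed. The array holds the numbers 0 to 23, so each value shows which position it came from.

```python
import numpy as np

# A 4 x 6 array holding 0..23 (row 0 is 0-5, row 1 is 6-11, and so on)
arr = np.arange(24).reshape(4, 6)

single = arr[2, 3]        # row 2, column 3
block = arr[0:2, 4:6]     # rows 0-1 and columns 4-5 (the stop index is excluded)
first_rows = arr[:2]      # rows 0 and 1, all columns
last_row = arr[-1:]       # the last row, kept as a 2D array
every_other = arr[::2]    # every second row: rows 0 and 2

print(single)
print(block)
print(first_rows.shape, last_row.shape, every_other.shape)

# The same one-dimensional slicing works on strings
word = "python"
start_of_word = word[0:3]
every_other_letter = word[::2]
print(start_of_word, every_other_letter)
```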



### **Numpy Math**

In [None]:
import numpy as np #library for working with arrays 

# Example numpy operations
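
One possible set of example operations is sketched below, assuming NumPy is installed. The arrays `a`, `b`, and `c` are made up; the point is the difference between elementwise operators and matrix operations.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[10.0, 20.0],
              [30.0, 40.0]])

elementwise_sum = a + b        # adds matching elements
elementwise_product = a * b    # multiplies matching elements
floor_divided = b // a         # floor division of matching elements
squared = a ** 2               # raises every element to the power 2
matrix_product = a @ b         # matrix multiplication, same as np.dot(a, b)
transposed = a.T               # rows become columns

# Conjugate transpose of a complex-valued array
c = np.array([[1 + 2j, 3 - 1j]])
conjugate_transposed = c.conj().T

print(elementwise_product)
print(matrix_product)
print(transposed)
print(conjugate_transposed)
```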

## Analyzing Data 

In [None]:
import numpy as np #library for working with arrays 
import matplotlib.pyplot as plt #library for plotting
from netCDF4 import Dataset, num2date #library commands used for reading in netCDF files 
import glob # use to create a list of filenames 

In [None]:
# Slicing examples; swh is assumed to be a 2D NumPy array that has already been loaded
print(swh[2,3])
print(swh[0:3,4:8])
print(swh[4:])
print(swh[:3])
print(swh[-3:])
print(swh[0:5:2])
print(swh[::2])

### **Statistical operations on arrays**

In [None]:
#operation on the entire array
swh_mean = np.mean(swh)
swh_std = np.std(swh)
swh_max = np.max(swh)
swh_min = np.min(swh)
print(swh_mean, swh_std, swh_max, swh_min)
#operations along an axis 
swh_mean_column = np.mean(swh, axis=0) #calculate mean across row axis (down each column)
swh_mean_row = np.mean(swh, axis=1) #calculate mean across column axis (across each row)
print('')
print(swh_mean_row)
print('')
print(swh_mean_column)
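
To see exactly what the `axis` argument does, it can help to repeat the same operations on a tiny array whose answers are easy to check by hand. The array `swh_demo` below is made up for this purpose and assumes NumPy is installed.

```python
import numpy as np

# 2 rows and 3 columns of made-up values
swh_demo = np.array([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])

demo_mean = np.mean(swh_demo)   # mean of all six values
demo_std = np.std(swh_demo)     # population standard deviation (divides by N)
demo_max = np.max(swh_demo)
demo_min = np.min(swh_demo)

demo_mean_column = np.mean(swh_demo, axis=0)   # one mean per column (3 values)
demo_mean_row = np.mean(swh_demo, axis=1)      # one mean per row (2 values)

print(demo_mean, demo_std, demo_max, demo_min)
print(demo_mean_column)
print(demo_mean_row)
```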

In [None]:
#initialize list 
swh_list = []

for ilen in range(0,10,1):
    swh_list.append(ilen)
    
#look at the contents of swh_list
print(swh_list)
print(len(swh_list))
print(np.size(swh_list))

**Working with lists**

Basics:

1) Lists are mutable (can be manipulated such that elements can be swapped, deleted, etc.) 

2) Lists can contain multiple types of variables

3) Lists are indexed and sliced with square brackets (e.g., list[0] and list[2:9]), in the same way as strings and arrays.
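
A small sketch of what mutability means in practice. The list `wsp_demo` is made up, and the comments trace its contents after each step.

```python
wsp_demo = [1.2, 3.5, "air", 3.2]

wsp_demo[2] = 2.8         # replace an element    -> [1.2, 3.5, 2.8, 3.2]
wsp_demo.append(4.0)      # add to the end        -> [1.2, 3.5, 2.8, 3.2, 4.0]
del wsp_demo[0]           # delete an element     -> [3.5, 2.8, 3.2, 4.0]

# Swap the first two elements -> [2.8, 3.5, 3.2, 4.0]
wsp_demo[0], wsp_demo[1] = wsp_demo[1], wsp_demo[0]

print(wsp_demo)
```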

In [None]:
#create an empty list
wsp = []
#Create a populated list
wsp = [1.2, 3.5, 'air', 3.2, 3.1, [6.9, 3.8, 4.9]]
#Slicing List
print(wsp[2:4])
print(wsp[-1][1])

**List Comprehension**: a compact way to build a new list by applying an expression to each item of an iterable

In [None]:
#Example of list comprehension
days = [str(i).zfill(2) for i in range(1,31)]
print(days)
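
For comparison, the sketch below builds the same list with an ordinary for loop, and then shows a comprehension with a condition. The names `days_loop` and `even_days` are introduced here for illustration.

```python
# The for-loop version: start with an empty list and append one item at a time
days_loop = []
for i in range(1, 31):
    days_loop.append(str(i).zfill(2))

# The list comprehension version does the same in one line
days = [str(i).zfill(2) for i in range(1, 31)]

print(days == days_loop)

# A condition at the end keeps only the items that pass the test
even_days = [day for day in days if int(day) % 2 == 0]
print(even_days)
```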

In [None]:
#Logical operator example 
#Initialize variables 
swh_norcal = np.random.randint(0,8,size=(5,8))
swh_socal = np.random.randint(1,10, size=(5,8))

#Set conditional statement
if np.max(swh_norcal) > np.max(swh_socal) and np.min(swh_norcal) < np.min(swh_socal):
    print('Northern California has a larger spread of values than southern California')
elif np.max(swh_norcal) == np.max(swh_socal) or np.min(swh_socal) == np.min(swh_norcal):
    print('Northern California maybe has the same spread of values as southern California')
else: 
    print('Southern California has a larger spread of values than Northern California')

**Calling data in .nc and .txt formats**

**Basic steps for importing data into Jupyter notebooks**:

1) Import libraries for calling netCDF or text files 

    a) netCDF files: from netCDF4 import Dataset, num2date
    
    b) Text files: import numpy as np

2) Set the path to the data stored locally on your computer. Data can also be obtained from remote locations, but that is not covered here. Paths are usually strings. 

    ex: filename = '/zdata/downloads/colosi_data_bk/binned_data/ifremer_p1_daily_data/my_daily_binned_ifremer_data/ifremer_swh_daily_binned_data_93_16_bia.nc'
    
    If you need to import several files, use the glob library to compile a list of filenames that share a common structure or pattern. Use wildcards (* or ?) to grab multiple files with the same beginning or ending. Use * in a pattern to match zero or more characters, and ? to match any single character.
    
    ex: filenames = sorted(glob.glob('/zdata/downloads/colosi_data_bk/binned_data/ccmpv2_wind_data/daily_binned_ccmp_v2_data/ccmp_v2_wsp_daily_binned_data_*_high_res.nc'))
    
    where sorted (a Python built-in function) returns a new list containing all items from the iterable in ascending order


3) From here, we take two separate paths for netCDF and text files 

    a) netCDF files: 
    ```python
    #open the first file in the list in order to read its attributes and obtain data: 
    nc = Dataset(filenames[0], 'r')

    #print the names of the variables in the file:
    print(nc.variables.keys())

    #call data 
    swh = nc.variables['swh'][:]
    ```
    b) Text files: 
    ```python
    #call data 
    chlorophyll = np.loadtxt(fname='/Users/lukecolosi/Downloads/chlorophyll-01.csv', delimiter=',')
    ```

   In both cases, you end up with variables that hold the data as arrays.
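
The glob and text-file steps can be tried without any real data. The sketch below, which assumes NumPy is installed, writes three tiny CSV files to a temporary directory and then finds and reads them. The filenames (`demo_data_2000.csv` and so on) and their contents are made up for illustration.

```python
import glob
import os
import tempfile

import numpy as np

# Write three small CSV files into a temporary directory (made-up names and data)
tmp_dir = tempfile.mkdtemp()
for year in (2001, 2000, 2002):
    path = os.path.join(tmp_dir, f"demo_data_{year}.csv")
    with open(path, "w") as f:
        f.write(f"{year},1.0,2.0\n{year},3.0,4.0\n")

# * matches zero or more characters; sorted() puts the matches in ascending order
filenames = sorted(glob.glob(os.path.join(tmp_dir, "demo_data_*.csv")))

# ? matches exactly one character
one_char_matches = sorted(glob.glob(os.path.join(tmp_dir, "demo_data_200?.csv")))

# Read the first file into a 2D array (2 rows, 3 columns)
first = np.loadtxt(fname=filenames[0], delimiter=",")

print([os.path.basename(name) for name in filenames])
print(first.shape)
print(first)
```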


## Errors and Exceptions

Tracebacks (error output) can look intimidating, but they give us a lot of useful information about what went wrong in our program, including where the error occurred and what type of error it was.

An error having to do with the ‘grammar’ or syntax of the program is called a SyntaxError. If the issue has to do with how the code is indented, then it will be called an IndentationError.

A NameError will occur if you use a variable that has not been defined, either because you meant to use quotes around a string, you forgot to define the variable, or you just made a typo.

Containers like lists and strings will generate errors if you try to access items in them that do not exist. This type of error is called an IndexError.

Trying to read a file that does not exist will give you a FileNotFoundError. Trying to read a file that is open for writing, or writing to a file that is open for reading, will give you an io.UnsupportedOperation error, which is a kind of OSError (IOError is an alias of OSError in Python 3).
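
The error types described above can be triggered on purpose to see what each one is called. The helper `describe_error` and the small functions below are hypothetical and exist only for this sketch.

```python
import os
import tempfile

def describe_error(action):
    """Call action() and return the name of the exception it raises, or None if it succeeds."""
    try:
        action()
    except Exception as error:
        return type(error).__name__
    return None

def use_undefined_name():
    return wave_height_typo  # this name was never defined

def index_past_end():
    return [1, 2, 3][5]  # valid indices are 0, 1, and 2

def open_missing_file():
    # A path inside a new, empty temporary directory, so the file cannot exist
    missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")
    with open(missing) as f:
        return f.read()

def compile_bad_syntax():
    # The colon after the condition is missing
    compile("if True print('hi')", "<example>", "exec")

actions = (use_undefined_name, index_past_end, open_missing_file, compile_bad_syntax)
error_names = [describe_error(action) for action in actions]
print(error_names)
```

When reading a real traceback, the last line gives the error type and message, and the lines above it show where in the code the error occurred.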

## Best Practices

Program defensively, i.e., assume that errors are going to arise, and write code to detect them when they do.

Put assertions in programs to check their state as they run, and to help readers understand how those programs are supposed to work.

Use preconditions to check that the inputs to a function are safe to use.

Use postconditions to check that the output from a function is safe to use.

Test your program or function to make sure it performs the way you want it to. Give your function or program an input with a known output and check that it produces this output. 

Write documentation before writing code to help determine exactly what that code is supposed to do and to assist future readers.
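
A minimal sketch of these practices in one place: a docstring written first, assertions as preconditions and a postcondition, and a test against a known output. The function `normalize` is hypothetical and chosen only because its correct output is easy to work out by hand.

```python
def normalize(values):
    """Scale a list of numbers so that the largest value becomes 1.0."""
    # Preconditions: check that the inputs are safe to use
    assert len(values) > 0, "values must not be empty"
    largest = max(values)
    assert largest > 0, "the largest value must be positive"

    result = [value / largest for value in values]

    # Postcondition: check that the output is what we promised
    assert max(result) == 1.0, "the largest normalized value should be 1.0"
    return result

# Test with an input whose output is known in advance
known_output = normalize([1.0, 2.0, 4.0])
assert known_output == [0.25, 0.5, 1.0]
print(known_output)

# A bad input is caught immediately by the precondition
try:
    normalize([])
    empty_input_rejected = False
except AssertionError as error:
    empty_input_rejected = True
    print("AssertionError:", error)
```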

## Debugging 

- Know what code is supposed to do before trying to debug it.

- Make it fail fast.

- Change one thing at a time, and for a reason.

- Keep track of what you’ve done.

### Acknowledgements 

Some of the material in this lesson is derived from the Software Carpentry Lessons for Python Programming and Plotting https://swcarpentry.github.io/python-novice-inflammation/reference/