# Python Basics
This is adapted from Alex Robel's coding tutorial, found here: https://github.com/aarobel/PracticalCodingMath4EAS/blob/main/Python_basics.ipynb.

This is a Jupyter notebook that takes you through many basic commands and concepts in Python. To work through this notebook, use the up and down arrows to move between cells (the current cell is highlighted with a blue box) and press Ctrl-Enter to execute it. Output, if there is any, appears under each executed cell.

In [None]:
#import sys
#!{sys.executable} -m pip install matplotlib
#!{sys.executable} -m pip install scipy

## Loading Packages

While the basic Python environment has many built-in functions, much of the functionality you might use requires installing and loading packages. Anaconda, which is fairly easy to download and set up, includes many "standard" packages in its environment. Once you start to do fancier or less standard things, you may need to install new packages, which is generally straightforward using the "conda install" command on the command line (this can also be done through the Anaconda GUI). For scientific computing and basic plotting, the most useful packages are numpy, matplotlib and scipy. These are all part of the standard Anaconda environment.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import scipy
import math

## Executing Commands

In normal Python code (not a notebook like this), output is suppressed unless you print it. In a notebook, only the value of the last line of a cell is displayed. Otherwise, basic arithmetic syntax is the same as in MATLAB, except for exponents, which use ** instead of ^.

In [None]:
3*5 # Do basic arithmetic and see output

In [None]:
3*5
8**2 # Only the output of the last line of the cell is shown (this is the case in Jupyter notebooks)

In [None]:
# Example: Showing output

ice_velocity_yrs = 1500 # meters per year
seconds_per_year = 3.15e7
ice_velocity_seconds = ice_velocity_yrs/seconds_per_year # in meters per second (divide by the number of seconds in a year)
ice_velocity_seconds
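
To make the exponent difference concrete, here is a minimal sketch (the variable names are illustrative only). `**` raises to a power, while `^`, which MATLAB users may reach for, is bitwise XOR in Python. The sketch also uses print() to show more than one value from a single cell.

```python
# ** is exponentiation in Python
power = 8**2

# ^ is bitwise XOR, not a power operator (8 is 0b1000, 2 is 0b0010)
xor = 8^2

# print() displays values that are not on the last line of a cell
print(power)  # 64
print(xor)    # 10
```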

## Assign variables and data types

Python is stricter about data types (particularly numbers) than MATLAB. In MATLAB, integers and doubles will often be automatically converted by a function depending on what is needed. In Python, many functions require inputs to be one or the other and will give errors if you are not careful with the input type.

In [None]:
a1=3
a1

In [None]:
a2=3.0
a2

In [None]:
a3 = 'hello'
a3

In [None]:
a4 = True #be careful to capitalize logical types in Python
a4

In [None]:
a5 = [1,5,9]
a5
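
As a rough illustration of the type strictness described above (the names here are made up for the example), type() reports what kind of value a variable holds, and a float is rejected where an integer index is required until it is converted explicitly.

```python
count = 2        # int
position = 2.0   # float, even though it holds a whole number
print(type(count), type(position))

values = [1, 5, 9]
try:
    values[position]          # a float is not accepted as an index
except TypeError as err:
    print('TypeError:', err)

# convert explicitly when an integer is required
print(values[int(position)])
```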

To work with column vectors and do fancier array algebra, you need to define a numpy array, which has more capabilities than the basic Python list.

In [None]:
a6 = np.array([[1, 5, 9]]).T
a6

In [None]:
a7 = np.array([[2,4,6], [8,10,12]])
a7
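
One way to see what the numpy array adds is that the same `*` behaves differently on a list and on an array. The following is a small sketch with illustrative names, not part of the original tutorial.

```python
import numpy as np

plain_list = [1, 5, 9]
array = np.array(plain_list)

doubled_list = plain_list*2   # repeats the list: [1, 5, 9, 1, 5, 9]
doubled_array = array*2       # multiplies each element: [2, 10, 18]

# a 1-D array has no row/column orientation; a column vector is 2-D
column = np.array([plain_list]).T
print(array.shape, column.shape)   # (3,) (3, 1)
```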

## Creating useful vectors/matrices

arange is the Python equivalent of the colon in MATLAB. BUT: like Python indexing, arange does not include the end value (more on indexing in the next section).

In [None]:
b1 = np.arange(2,9)
b1

In [None]:
b2 = np.arange(2,8.5,0.5)
b2
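
A short sketch of the end-point behavior, for readers used to MATLAB's `2:9` (the variable names are illustrative):

```python
import numpy as np

half_open = np.arange(2, 9)        # 2, 3, ..., 8; the stop value 9 is excluded
inclusive = np.arange(2, 9 + 1)    # push the stop one step further to include 9

print(half_open)
print(inclusive)
```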

In [None]:
# Example: When might we want to generate these kinds of vectors?

river_length = 100 # 100 meter long river
x_pos = np.arange(0,101,10)

# How long is this array?
print(len(x_pos))   # print() is needed to see output from the middle of a cell
print(x_pos.shape)

# Define a velocity at each position along the river
velocity = 0.25*np.sin(x_pos/10)+10

plt.plot(x_pos,velocity,color='gray',linewidth=2)
plt.xlabel('Position Along River (meters)')
plt.ylabel('Flow Velocity (meters/second)')
plt.grid()
plt.title('Velocity Along REU River')
plt.show()

plt.scatter(x_pos,velocity,c = 'gray',s=40)
plt.xlabel('Position Along River (meters)')
plt.ylabel('Flow Velocity (meters/second)')
plt.grid()
plt.title('Velocity Along REU River')
plt.show()


In [None]:
b3 = np.linspace(0.4,8.2,10)
b3

b3b = np.logspace(1,4,10)
b3b
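
For comparison with arange, linspace and logspace take a number of points rather than a step, and both include the end points. logspace takes its limits as powers of 10. A minimal sketch with illustrative names:

```python
import numpy as np

lin = np.linspace(0.4, 8.2, 10)   # 10 evenly spaced values from 0.4 to 8.2
log = np.logspace(1, 4, 4)        # 10**1, 10**2, 10**3, 10**4

print(lin[0], lin[-1], len(lin))
print(log)
```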

In [None]:
b4 = np.zeros([10,11])
b4

In [None]:
b5 = np.ones([10,11])
b5

numpy has some functions that are not imported by default, like those contained in matlib. These need to be imported separately.

In [None]:
import numpy.matlib
b6 = np.matlib.repmat(b1, 2, 2)
b6
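
As an alternative sketch, np.tile, which lives in the main numpy namespace, can build the same kind of tiled array without a separate import. The example assumes a 1-D input like b1.

```python
import numpy as np

row = np.arange(2, 9)            # same values as b1 above
tiled = np.tile(row, (2, 2))     # 2 copies down, 2 copies across

print(tiled.shape)               # (2, 14)
print(tiled)
```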

In [None]:
[b7,b8] = np.meshgrid(b2, b3, sparse=False, indexing='ij') #meshgrid takes an indexing argument: 'ij' (matrix indexing) matches MATLAB's ndgrid, while the default 'xy' matches MATLAB's meshgrid
b7
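
The effect of the indexing argument is easiest to see in the shapes of the output. This is a small sketch with made-up inputs of different lengths:

```python
import numpy as np

x = np.arange(3)   # 3 points
y = np.arange(4)   # 4 points

X_ij, Y_ij = np.meshgrid(x, y, indexing='ij')   # first axis follows x
X_xy, Y_xy = np.meshgrid(x, y, indexing='xy')   # first axis follows y (the default)

print(X_ij.shape)   # (3, 4)
print(X_xy.shape)   # (4, 3)
```

The two results are transposes of each other, so the choice mainly matters when the grid is combined with other arrays or passed to plotting functions.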

## Indexing

Indexing is one of the biggest differences between Python and MATLAB. Python is zero-indexed, which means that the first element of a vector or array has index 0. Also, to reference an index in a Python array one uses square brackets '[]', whereas parentheses '()' are only used to call functions. There are some other key differences, which are explored in the examples here.

In [None]:
c1 = b6[1,2]
c1

In [None]:
c1 = b6[1,1]
c1

Another key difference is that when using vector indexing (i.e. with colons as you would in MATLAB), the indexing goes from the first index, and then up to BUT NOT INCLUDING the end index. So in the below example [0:2] yields a vector index including index 0 and 1, but not 2.

In [None]:
c2 = b6[0:2,0]
c2

In [None]:
c3 = b6[0:2,0:2]
c3

In [None]:
c4 = b6[0,-2] #negative indices count from the end: -1 is the last element along a dimension (instead of "end"), so -2 is the second-to-last
c4

In [None]:
c5 = b6[0,b1-1]
c5
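
The indexing rules above can be summarized on a small vector. This is a sketch with illustrative names:

```python
import numpy as np

v = np.arange(10, 20)      # 10, 11, ..., 19

first = v[0]               # zero-based: the first element
last = v[-1]               # -1 is the last element (MATLAB's "end")
head = v[0:2]              # indices 0 and 1; index 2 is excluded
picked = v[[0, 2, 4]]      # index with a list of positions

print(first, last, head, picked)
```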

In [None]:
# Example: When would you need to index?

velocity[0]

plt.plot(x_pos[4:11],velocity[4:11],color='gray',linewidth=2)
plt.xlabel('Position Along River (meters)')
plt.ylabel('Flow Velocity (meters/second)')
plt.grid()
plt.title('Velocity Along REU River')
plt.show()

plt.plot(x_pos,velocity,color='gray',linewidth=2)
plt.xlabel('Position Along River (meters)')
plt.ylabel('Flow Velocity (meters/second)')
plt.grid()
plt.title('Velocity Along REU River')
plt.xlim(x_pos[4],x_pos[-1])
plt.show()


## Boolean/Logical Operations
Mostly these work the same as in MATLAB, though booleans are written as "True" and "False" and logical indexing is a bit different.

In [None]:
2 < 5

In [None]:
l1 = b1 < 5
l1

In [None]:
l2 = b1[l1] #logical indexing mostly works the same as in MATLAB, but only works on numpy arrays
l2
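
The restriction to numpy arrays can be seen in a short sketch (names are illustrative). A boolean mask selects from an array, but a plain Python list rejects it until the list is converted.

```python
import numpy as np

values = np.arange(2, 9)
mask = values < 5                 # array of True/False, one per element

selected = values[mask]           # keeps 2, 3, 4
n_selected = mask.sum()           # True counts as 1

plain_list = [2, 3, 4, 5, 6, 7, 8]
try:
    plain_list[mask]              # lists do not accept a boolean mask
except TypeError as err:
    print('TypeError:', err)

# convert the list to an array first
from_list = np.array(plain_list)[mask]
print(selected, n_selected, from_list)
```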

In [None]:
# Example: When would you need these logical operations?

time = np.linspace(0,12,100)
velocity_time = 5*np.sin(time/2)+10+np.random.randn(1,100)*3
velocity_time[0,55] = 57.635
velocity_time[0,22] = 43.583

plt.scatter(time,velocity_time,s=40)
plt.xlabel('Time (months)')
plt.ylabel('River Velocity (meters/second)')
plt.grid()
plt.title('Velocity Over Time')
plt.show()

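velocity_fix = velocity_time.copy() # copy so that the original data are not overwritten (plain assignment would only create a second name for the same array)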
velocity_fix[velocity_time > 25] = np.nan

plt.scatter(time,velocity_fix,s=40)
plt.xlabel('Time (months)')
plt.ylabel('River Velocity (meters/second)')
plt.grid()
plt.title('Velocity Over Time')
plt.show()
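
A subtlety related to the cell above: assigning a numpy array to a new name does not copy the data, so both names refer to the same array. The sketch below, with made-up values, contrasts that with .copy(), which creates an independent array.

```python
import numpy as np

original = np.array([9.0, 11.0, 57.6, 10.5])

alias = original            # same array under a second name
independent = original.copy()

alias[alias > 25] = np.nan  # also changes original

print(original)             # the outlier is now nan here too
print(independent)          # still holds 57.6
```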


In [None]:
# Example: When would you need these logical operations?

time = np.linspace(0,12*10,500)
velocity_time = 5*np.sin(time/2)+10 + np.random.randn(1,500)*2

plt.scatter(time,velocity_time,s=40)
plt.xlabel('Time (months)')
plt.ylabel('River Velocity (meters/second)')
plt.grid()
plt.title('Velocity Over Time')
plt.show()

threshold = 12*np.ones(time.shape)

plt.scatter(time,velocity_time,s=40,label='Data')
plt.plot(time,threshold,color='red',linewidth=2,label='Threshold')
plt.xlabel('Time (months)')
plt.ylabel('River Velocity (meters/second)')
plt.grid()
plt.title('Velocity Over Time')
plt.legend() # uses the label given to each plotted object
plt.show()

idx = velocity_time > 12
time[np.squeeze(idx)]
time[np.squeeze(idx)]%12

plt.scatter(time[np.squeeze(idx)],velocity_time[0,np.squeeze(idx)],s=40,label='Data')
plt.plot(time,threshold,color='red',linewidth=2,label='Threshold')
plt.ylim(0,25)
plt.xlabel('Time (months)')
plt.ylabel('River Velocity (meters/second)')
plt.grid()
plt.title('Velocity Over Time')
plt.legend() # uses the label given to each plotted object
plt.show()

plt.hist(time[np.squeeze(idx)]%12)
plt.grid()
plt.xlabel('Month in a Year')
plt.xticks([0,1,2,3,4,5,6,7,8,9,10,11],['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])
plt.show()
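
The np.squeeze calls in the cell above are there because velocity_time has shape (1, N) while time has shape (N,). The following sketch, using small made-up arrays, shows the shapes involved and the modulo step that folds times onto a single year.

```python
import numpy as np

t = np.array([0.0, 5.0, 13.0, 20.0, 27.0])        # shape (5,)
v = np.array([[10.0, 14.0, 9.0, 13.0, 15.0]])     # shape (1, 5)

mask = v > 12                # shape (1, 5), same as v
flat_mask = np.squeeze(mask) # shape (5,), matches t

fast_times = t[flat_mask]    # times at which v exceeds 12
month_of_year = fast_times % 12

print(fast_times)
print(month_of_year)
```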

As we will also see with loops below, conditional statements (if/else) do not finish with "end" like in MATLAB, BUT INDENTATION IS IMPORTANT AND ENFORCED IN PYTHON (compared to MATLAB, where indentation does not play a role in code execution).

In [None]:
if c1>5 : #use a colon to indicate the end of the logical condition
    print('hurray!')
elif c1<0 : #elif instead of else if
    print('sure...')
else :
    print('oh no!')

In [None]:
l3 = (b1 < 5) & (b1 > 0)
l3

In [None]:
l4 = (b1 > 5) | (b1 < 0)
l4
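
Two details of the cells above are worth a sketch (names are illustrative). The parentheses are needed because `&` and `|` bind more tightly than comparisons, and the keywords `and`/`or` do not work elementwise on arrays.

```python
import numpy as np

values = np.arange(2, 9)

inside = (values > 3) & (values < 7)    # elementwise AND: 4, 5, 6
outside = (values <= 3) | (values >= 7) # elementwise OR
negated = ~inside                       # elementwise NOT

try:
    (values > 3) and (values < 7)       # "and" needs a single True/False
except ValueError as err:
    print('ValueError:', err)

print(values[inside])
```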

## Vector and matrix manipulation and arithmetic

In [None]:
d1 = b6.transpose()
d1

In [None]:
d2 = np.sum(b6,axis=None)
d2

In [None]:
d3 = np.sum(b6,axis=1)
d3

In [None]:
d4a = np.random.rand(2,14)
d4 = b6*d4a #note the default in Python is elementwise multiplication with *
d4

In [None]:
d5a = np.random.rand(14,2)
d5 = b6 @ d5a #for matrix multiplication use @
d5
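
The difference between `*` and `@` is easiest to check on small matrices with known products. This is a sketch with made-up values:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A*B    # like MATLAB's A.*B
matrix = A @ B       # like MATLAB's A*B

print(elementwise)   # [[ 5 12] [21 32]]
print(matrix)        # [[19 22] [43 50]]
```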

In [None]:
d6a = np.random.rand(3,3)
d6b = np.random.rand(3,1)
x, residuals, rank, sv = np.linalg.lstsq(d6a,d6b,rcond=None) #lstsq returns a tuple; the first element is the solution
x
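
np.linalg.lstsq returns a tuple of four items, of which the first is the least-squares solution. The sketch below uses a small system with a known answer and, as a cross-check, compares it with np.linalg.solve, which applies to square, full-rank systems.

```python
import numpy as np

A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([[2.0], [8.0]])

# lstsq returns (solution, residuals, rank, singular values)
solution, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# for a square, full-rank system np.linalg.solve gives the same answer
direct = np.linalg.solve(A, b)

print(solution)
print(direct)
```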

## Loops

In [None]:
e1=0
for i in np.arange(0,10) :   #for loop steps i through the vector np.arange(0,10), i.e. from 0 to 9
    e1 = e1 + (b3[i]**2) 
e1

In [None]:
e2 = np.sum(b3**2)
e2

In [None]:
e3=0
i=0
while e3 < 10 :   #while loop iterates indefinitely while a logical condition is still true
    e3=e3 + (b3[i]**2) 
    i=i+1
e3
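
Besides np.arange, Python's built-in range is commonly used for loop counters, and a loop can also step over the elements of an array directly. A minimal sketch, with illustrative names, that checks both against the vectorized sum:

```python
import numpy as np

values = np.linspace(0.4, 8.2, 10)

# range(n) counts 0, 1, ..., n-1, so it pairs naturally with zero-based indexing
total_by_index = 0.0
for i in range(len(values)):
    total_by_index += values[i]**2

# looping over the elements directly avoids the index altogether
total_by_element = 0.0
for value in values:
    total_by_element += value**2

vectorized = np.sum(values**2)
print(total_by_index, total_by_element, vectorized)
```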

## Functions
Lambda functions define an anonymous function inline in Python. Otherwise, functions must be defined using def, a colon and indentation, with return to hand back a value.

In [None]:
fcn = lambda x: np.exp(np.sin(x)**2) #lambda anonymous function

In [None]:
def fcn2(x):
    return np.exp(np.sin(x)**2) #without return, the function would give back None

In [None]:
fcn(np.linspace(0,2*np.pi,10))
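
A function defined with def hands back a value only through return. The sketch below (function names are made up for the example) shows that a body without return computes its expression and then gives back None, unlike a lambda, which returns its expression automatically.

```python
import numpy as np

def with_return(x):
    return np.exp(np.sin(x)**2)

def without_return(x):
    np.exp(np.sin(x)**2)   # computed, then discarded

as_lambda = lambda x: np.exp(np.sin(x)**2)

print(with_return(0.5))
print(without_return(0.5))   # None
```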

## Plotting

In [None]:
f1=b2**2
plt.plot(b2,f1)

In [None]:
plt.plot(b2, b2**2, 'o', color='green',label = "line 1")
# plotting the line 2 points 
plt.plot(b2, b2**3, 'o', color='red',label = "line 2")
plt.xlabel('x - axis',fontsize=18)
# Set the y axis label of the current axis.
plt.ylabel('y - axis',fontsize=18)
# Set a title of the current axes.
plt.title('Two or more lines on same plot with suitable legends ')
# show a legend on the plot
plt.legend()
# Display a figure.
plt.show()

In [None]:
[B2,B3] = np.meshgrid(b2,b3)
F3 = np.sin(B2) + np.exp(-(B3-4)**2)
plt.contour(B2,B3,F3)
plt.colorbar() # the parentheses are needed to actually call the function
plt.xlabel('B2',fontsize=20)
plt.ylabel('B3',fontsize=20)
plt.title('F3(B2,B3)',fontsize=20)

In [None]:
plt.contourf(B2,B3,F3)
plt.xlabel('B2',fontsize=20)
plt.ylabel('B3',fontsize=20)
plt.title('F3(B2,B3)',fontsize=20)
plt.colorbar()

In [None]:
plt.pcolor(B2,B3,F3)
plt.xlabel('B2',fontsize=20)
plt.ylabel('B3',fontsize=20)
plt.title('F3(B2,B3)',fontsize=20)
plt.colorbar()

In [None]:
from mpl_toolkits import mplot3d
ax = plt.axes(projection='3d')
surf = ax.plot_surface(B2,B3,F3, rstride=1, cstride=1,cmap='viridis', edgecolor='none')
ax.set_title('surface')
plt.colorbar(surf) # pass the surface so the colorbar knows which colors to show
plt.xlabel('B2',fontsize=20)
plt.ylabel('B3',fontsize=20)
plt.title('F3(B2,B3)',fontsize=20)

## Saving/Loading
Python does not have a built-in compressed format like MATLAB, but it works with a much wider range of data types (saving/loading) and has certain commonly used data formats like pickles (find a tutorial here: https://www.datacamp.com/community/tutorials/pickle-python-tutorial). While we will not cover the use of certain specialized packages here, a commonly used I/O package in the geosciences is geopandas: https://geopandas.org/.
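
As a hedged sketch of two common options (the file names and the `record` dictionary are hypothetical), pickle from the standard library stores general Python objects, and numpy's savez_compressed stores named arrays in a compressed .npz file. The example writes to a temporary folder so it leaves nothing behind.

```python
import os
import pickle
import tempfile

import numpy as np

x_pos = np.arange(0, 101, 10)
record = {'site': 'REU River', 'x_pos': x_pos}

with tempfile.TemporaryDirectory() as folder:
    # pickle: saves (almost) any Python object
    pickle_path = os.path.join(folder, 'record.pkl')
    with open(pickle_path, 'wb') as f:
        pickle.dump(record, f)
    with open(pickle_path, 'rb') as f:
        loaded_record = pickle.load(f)

    # npz: saves one or more named numpy arrays, compressed
    npz_path = os.path.join(folder, 'arrays.npz')
    np.savez_compressed(npz_path, x_pos=x_pos)
    with np.load(npz_path) as data:
        loaded_x = data['x_pos']

print(loaded_record['site'], loaded_x)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.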