<h1><center>Python tutorial</center></h1>
<center> Wandi Yu, Hampton University, 03.14.2022 </center>

### outline of this tutorial:
- introduction to Jupyter notebook 
- Python object
    - integer
    - float
    - functions
    - class
    - string
    - list
    - tuple
    - set 
    - dict
- loops and conditions
    - conditions
    - loop
    - iterable object 
- Python packages 
    - scientific calculations 
        - NumPy
        - SciPy
        - Pandas 
        - Xarray 
        - MetPy
        - Scikit-Learn
    - other important packages
        - glob
        - datetime
        - os
- read and write data 
    - basic Python I/O
    - .nc
    - .sav
    - .mat
- plot 
    - matplotlib 
    - Cartopy
    - seaborn 
- more resources 


# Jupyter Notebook

## markdown

In the 'Markdown' mode of the Jupyter notebook, you can add text in many formats: 


# this is a heading
## this is a subheading


*italic* and **bold**

lists:

1. list 1
2. list 2

- bullet 1
- bullet 2

or even LaTeX equations: 

$p = \rho R T$

for more on Markdown in Jupyter notebooks, see: https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html

## kernel

The Jupyter notebook has a Python interpreter kernel that will evaluate our code line-by-line. The format of the notebook also allows us to segregate code into blocks (called cells) that can be executed separately.

In [None]:
1;

In [None]:
print (1+2)
print (2+3)

two ways to annotate code in a cell: a `#` comment, or a bare triple-quoted string (which Python evaluates and discards):

In [None]:
# annotation 1
'''
annotation 2
'''
print (1+3)

In [None]:
a = 1

In [None]:
print (a)

In [None]:
a = 2


# object

**In Python everything is an object**, which means every entity has some metadata (called attributes) and associated functionality (called methods). These attributes and methods are accessed via the dot syntax.
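
As a small illustration of the dot syntax described above, the sketch below reads an attribute and calls methods on built-in objects. The variable names are chosen only for this example.

```python
# every value is an object with attributes and methods
z = 3 + 4j                    # a complex number object
real_part = z.real            # attribute access: no parentheses
conj = z.conjugate()          # method call: parentheses required

n = 255
bits = n.bit_length()         # even plain integers have methods

print(real_part, conj, bits)
```

In a notebook, typing `z.` and pressing Tab lists the available attributes and methods.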

## integer

In [None]:
type(1)

## float

In [None]:
type(1e64)

## functions

In [None]:
def add_up(a,b):
    return a+b

In [None]:
print (add_up(1,2),type(add_up),add_up)

In [None]:
def subtract(a,b):
    return a-b

In [None]:
subtract(3,5)

In [None]:
subtract(a=3,b=5)

with keyword arguments, the order does not matter

In [None]:
subtract(b=5,a=3)

you can give an argument a default value

In [None]:
def subtract(a,b=0):
    return a-b

In [None]:
subtract(3,5)
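
To see the default value actually being used, the function can be called with only one argument. This is a minimal sketch that repeats the `subtract` definition from above so it runs on its own; `only_a` and `both` are names introduced for the example.

```python
def subtract(a, b=0):
    """Return a - b; b defaults to 0 when it is omitted."""
    return a - b

only_a = subtract(3)        # b falls back to the default 0
both = subtract(3, 5)       # an explicit b overrides the default
print(only_a, both)
```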

### variable argument and keywords

\*args: packs any extra positional arguments into a tuple

In [None]:
def many_args(a, b, *args):
    print(a)
    print(b)
    print(args)

In [None]:
many_args(1, 2, 3, 'spam', None)

\*\*kw: packs any keyword arguments into a dictionary

In [None]:
def kw_args(**kw):
    print(kw)

In [None]:
kw_args(a=1, b=2, c=3)

In [None]:
def all_args(*args, **kw):
    print(args)
    print(kw)
    if 'a' in kw:
        print (1)

In [None]:
all_args(1, 2, a=2, b=4, c=5)

In [None]:
import matplotlib.pyplot as plt

In [None]:
?plt.plot

\*args is a tuple and \*\*kw is a dictionary; these are two basic Python data types that we will cover later.
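
The same `*` and `**` symbols also work in the other direction, when calling a function. The sketch below uses a hypothetical helper, `describe`, to show a tuple and a dictionary being unpacked into arguments; the names and values are illustrative only.

```python
def describe(a, b, *args, **kw):
    """Return the arguments exactly as the function received them."""
    return a, b, args, kw

values = (1, 2, 3)
options = {'sep': ',', 'end': ''}

# * unpacks a sequence into positional arguments,
# ** unpacks a dict into keyword arguments
a, b, args, kw = describe(*values, **options)
print(type(args), type(kw))
```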

## class 

Creating a new class creates **a new type of object**, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by its class) for modifying its state. (https://docs.python.org/3/tutorial/classes.html)

In [None]:
class number(object):
    """a number"""

    def __init__(self, x):
        self.n = x

    def add(self, other):
        return number(
            self.n + other)

In [None]:
x = number(1)
x

In [None]:
x.add(2)

In [None]:
class number(object):
    """a number"""

    def __init__(self, x):
        self.n = x
        self.next = x+1
    def __repr__(self):
        return str(self.n)
    
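    def add(self, other):
        # add a plain number and return a new `number` instance
        return number(self.n + other)

    def __add__(self, other):
        # support `number + number` through the + operator
        return number(self.n + other.n)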


In [None]:
(1+2,)+(2,4)

In [None]:
x = number(1)
x.next

In [None]:
x+number(2)

## string

In [None]:
type('1')

some methods of string

In [None]:
'1'+'2'

In [None]:
','.join('join')

In [None]:
'hello world'.find('wor')

In [None]:
'hello world'.upper(),'hello world'.lower()

In [None]:
'hello world'.isdigit(),'hello world'.isalpha(),'hello world'.isupper(),'hello world'.islower()

### formatting output 

1. use %

In [None]:
print ('Hello, %s' % 'world')

2. use format 

documentation: 
https://docs.python.org/3/tutorial/inputoutput.html

In [None]:
print ('Hello {:10.2f}% {}'.format(100.002323,'world'))
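
A third option, available since Python 3.6, is the f-string, which places the expression directly inside the braces. The sketch below reuses the values from the `format` example; the variable names are assumptions made for illustration.

```python
value = 100.002323
name = 'world'

# the format spec after the colon is the same as in str.format
formatted = f'Hello {value:10.2f}% {name}'
print(formatted)
```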

next we will introduce 4 built-in data types for storing collections of data 

## list 

a list is an ordered collection of data, so its items can be accessed by index (it is subscriptable)

In [None]:
a = [1,2,3,4]
a

In [None]:
a[2]

a list in Python is not just a list of numbers, **you can put anything into a list!**

In [None]:
a = ['hello world', 3, 2.3,add_up,number,number(1),[1,2]]
a

some methods of list

In [None]:
a = [1,2,4]
b = [5,6,7]
[i+j for (i,j) in zip(a,b)]

In [None]:
a = [1,2,3]
print (a.pop(),a)

In [None]:
a = [1,2,3]
print (a.pop(0),a)

In [None]:
a = [1,2,3]
a.index(2)

in-place methods: 
instead of returning a new object, they modify the list itself and return `None`

In [None]:
a = [1,2,4]
b = [5,6,7]
print (a.append(b))
print (a)

In [None]:
a = [2,4,1]
print (a.sort())
print (a)

print (a.reverse())
print (a)

print (a.insert(1,1.5))
print (a)

## tuple 

In [None]:
a = (1,2,4)

you can also put anything into a tuple

In [None]:
a = ('hello world', 3, 2.3,add_up,number,number(1))
a

In [None]:
a+(2,3)

unlike a list, a tuple is **immutable** (it cannot be changed after creation)

In [None]:
a.pop(2)  # raises AttributeError: tuples have no pop method
a
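
Item assignment fails on a tuple as well. The sketch below catches the resulting error and shows the usual workaround of building a new tuple; `t`, `t2`, and `message` are names introduced for the example.

```python
t = (1, 2, 4)
try:
    t[0] = 10                 # item assignment is not allowed on a tuple
except TypeError as err:
    message = str(err)

# "changing" a tuple really means building a new one
t2 = t[:2] + (3,) + t[2:]
print(message)
print(t2)
```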

## set

1. a set cannot have duplicate elements
2. unlike a tuple or a list, a set is unordered and therefore not subscriptable

In [None]:
a = [1,2,3,4,5,6,1,2]
b = set(a)
b

In [None]:
a = 'adgddagafwe'
print (list(set(a)))
print (''.join(list(set(a))))

In [None]:
b[0]  # raises TypeError: a set is not subscriptable

some methods of set:

In [None]:
a = [1,2,3,4,5,6,1,2]
b = set(a)
b

In [None]:
b.add(12)
b

In [None]:
b.remove(6)
b

set operations (intersection, union, symmetric difference, difference):

In [None]:
set1 = set([1,2,3,4])
set2 = set([4,5,6,7])

In [None]:
set1 & set2

In [None]:
set1 | set2

In [None]:
set1 ^ set2

In [None]:
set1 - set([3])

## dict

you can think of a dict as a set of keys, each with a corresponding value

In [None]:
dict1 = {'a':1, 'b':2, 'c':3}

In [None]:
dict1['a']

In [None]:
print (dict1.items())
print (dict1.keys())
print (dict1.values())

what can be keys? 

dictionaries are indexed by keys, which can be any **immutable type**; strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. You can’t use lists as keys, since lists can be modified in place using index assignments, slice assignments, or methods like append() and extend().
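
The rule above can be checked directly. In this sketch a tuple of numbers works as a key, while a list is rejected; the coordinates and city names are illustrative values, not data used elsewhere in this tutorial.

```python
# a tuple of numbers is hashable, so it can be a key
grid = {(30.0, -90.0): 'New Orleans', (37.0, -76.3): 'Hampton'}
city = grid[(37.0, -76.3)]

try:
    bad = {[30.0, -90.0]: 'New Orleans'}   # a list is mutable, hence unhashable
except TypeError as err:
    reason = str(err)

print(city)
print(reason)
```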

update the dictionary 

In [None]:
dict1['d'] = 4
dict1

In [None]:
dict1['a'] = 0
dict1

In [None]:
dict1.pop('a')
dict1

# loops and conditions

## conditions

Boolean data type: 

True and False

In [None]:
print (1 != 2)
print (type(1 != 2))

Boolean operations

In [None]:
a = True
b = False

In [None]:
a & b

In [None]:
a | b
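
Besides `&` and `|`, Python has the keywords `and`, `or`, and `not`, which are the usual choice for plain True/False values. The sketch below compares the two styles; note that `&` and `|` bind more tightly than comparisons, so comparisons need parentheses when combined with them. The variable names are chosen for this example.

```python
a = True
b = False

# and / or / not are the logical operators
print(a and b, a or b, not a)

# & and | are bitwise operators; parenthesize comparisons around them
x = 3
in_range_logical = x > 1 and x < 5
in_range_bitwise = (x > 1) & (x < 5)
print(in_range_logical, in_range_bitwise)
```

With NumPy arrays, only the `&` / `|` form works elementwise.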

a Boolean expression follows each 'if' and 'elif' statement

In [None]:
a = 2
if a == 2:
    print ('a==2')
elif a == 1:
    print ('a==1')
else:
    print ('else')

## loops

for loop

In [None]:
list(range(3))

In [None]:
for i in '123':
    print (i)

while loop

In [None]:
i = 0
while i<3:
    print (i)
    i += 1

## iterable object

an iterable is an object that can be looped over, for example with a for loop

**list, tuple, set, dict, and string are all iterable**

In [None]:
a = 'spring'
for i in a:
    print (i)

### iterator 

In [None]:
a = range(3)
print (a)
print (type(a))

a `range` object is lazy: it generates the numbers **only when needed**, for example when you loop over it or convert it to a list

In [None]:
for i in a:
    print (i)

In [None]:
list(a)
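
The "only when needed" behaviour can be seen by stepping through the values by hand with the built-in `iter` and `next` functions. This is a minimal sketch; the variable names are introduced for the example.

```python
r = range(3)          # a lazy, reusable iterable
it = iter(r)          # an iterator over r

first = next(it)      # 0
second = next(it)     # 1
rest = list(it)       # the iterator remembers its position: [2]
again = list(it)      # an exhausted iterator yields nothing: []
fresh = list(r)       # the range itself can be iterated again: [0, 1, 2]
print(first, second, rest, again, fresh)
```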

a related and very common construct is the **list comprehension**

a list comprehension looks like a compact for loop and builds a list

for example, if you want to pick up all digits in a string: 

In [None]:
x = [i for i in 'spring2022' if i.isdigit()]
print (x)

then use the 'join' method of string:


In [None]:
''.join(x)
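
For comparison, the sketch below writes the same digit filter once as an explicit for loop and once as a list comprehension, then converts the joined digits to an integer. The variable names are chosen for this example.

```python
text = 'spring2022'

# explicit for loop
digits_loop = []
for ch in text:
    if ch.isdigit():
        digits_loop.append(ch)

# equivalent list comprehension
digits_comp = [ch for ch in text if ch.isdigit()]

year = int(''.join(digits_comp))
print(digits_loop == digits_comp, year)
```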

## avoid loops

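**explicit Python loops are slow on large datasets, so avoid them wherever a built-in or vectorized alternative exists!**


many packages (NumPy, for example) provide such alternatives 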

you may also want to read this: https://medium.com/python-pandemonium/never-write-for-loops-again-91a5a4c84baf
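
As a rough illustration of replacing a loop, the sketch below computes a sum of squares twice: once with an explicit loop and once with a NumPy expression. It assumes NumPy is installed; the array size is arbitrary.

```python
import numpy as np

x = np.arange(1000)

# explicit Python loop
total_loop = 0
for v in x:
    total_loop += v * v

# vectorized: the loop runs inside NumPy's compiled code
total_vec = int(np.sum(x * x))
print(total_loop == total_vec)
```

Both versions give the same result; the vectorized one is shorter and usually much faster for large arrays.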

# Python packages

import packages

In [None]:
import numpy as np
from itertools import combinations

check description:

In [None]:
?np.round

**for example, itertools**

The itertools module standardizes a core set of fast, memory-efficient tools that are useful by themselves or in combination: https://docs.python.org/3/library/itertools.html

In [None]:
from itertools import combinations
list(combinations([1,2,3],2))

## packages for scientific calculation

## NumPy 

the **most important package** for scientific computing in Python

It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. (user guide: https://numpy.org/doc/stable/user/index.html)

a NumPy array is similar to a MATLAB matrix

numpy cheat sheet: 

http://datacamp-community-prod.s3.amazonaws.com/ba1fe95a-8b70-4d2f-95b0-bc954e9071b0

data type: 

- numpy ndarray

In [None]:
import numpy as np
a = np.zeros((2,3)) 
print (a)
print (type(a))

operations:

+, -, \*, /, \*\*

//, % 

dot, sqrt, mean, std, median, max, min ...
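
One difference from MATLAB is worth seeing in code: on NumPy arrays `*` multiplies element by element, and the matrix product is `dot` (or the `@` operator). The sketch below uses two small arrays made up for this example.

```python
import numpy as np

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[0., 1.], [1., 0.]])

elementwise = a * b          # multiplies matching entries
matrix_prod = a.dot(b)       # matrix product (same as a @ b)
col_mean = a.mean(axis=0)    # mean down each column
print(elementwise, matrix_prod, col_mean, sep='\n')
```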

In [None]:
5%2

In [None]:
5//2

In [None]:
a = np.random.rand(2)
b = np.random.rand(2)
a+b

**NumPy arrays are much faster than Python lists for numerical operations**

In [None]:
xl = list(range(10000))  # real lists, so the comparison is list vs. array
yl = list(range(10000))
xa = np.arange(10000)
ya = np.arange(10000)

In [None]:
%%timeit -n3

[i + j for i, j in zip(xl, yl)]

In [None]:
%%timeit -n3

xa + ya

#### array indexing

In [None]:
x = np.arange(10)
y = x[2:5]  # like python list indexing
len(y),y

In [None]:
x[x%2 == 1]

In [None]:
y = [1,2,3,4,5]
y>5  # raises TypeError: a list cannot be compared elementwise with a number

In [None]:
x > 5

In [None]:
np.where(x > 5)

In [None]:
x[np.where(x > 5)]

In [None]:
np.where(x > 5, x, 0)  # keep x where the condition holds, otherwise 0
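
The cells above mix several ways of selecting with a condition. The sketch below puts them side by side so the difference between indices, selected values, and replaced values is visible; the variable names are chosen for this example.

```python
import numpy as np

x = np.arange(10)

mask = x > 5                         # boolean array
idx = np.where(mask)[0]              # one-argument form: indices where True
picked = x[mask]                     # boolean indexing: the values themselves
replaced = np.where(mask, x, 0)      # three-argument form: x where True, else 0
print(idx, picked, replaced)
```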

## SciPy

another very important package for scientific calculations

SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python. It adds significant power to the interactive Python session by providing the user with high-level commands and classes for manipulating and visualizing data: https://scipy.github.io/devdocs/tutorial/index.html#user-guide

including: 

- numerical integration 
- numerical differentiation 
- optimization and root finding
- distributions and special functions
- sparse linear algebra  <small>(NumPy is for _dense_ linear algebra)</small>
- signal processing
- and more
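
Two items from the list above, numerical integration and root finding, can be sketched in a few lines. The functions being integrated and solved are arbitrary choices for illustration, and the example assumes SciPy is installed.

```python
import numpy as np
from scipy import integrate, optimize

# numerical integration: the integral of sin(x) from 0 to pi is exactly 2
area, abs_err = integrate.quad(np.sin, 0, np.pi)

# root finding: solve cos(x) = x on the bracket [0, 1]
root = optimize.brentq(lambda t: np.cos(t) - t, 0, 1)
print(area, root)
```

`quad` also returns an estimate of the absolute error, stored here in `abs_err`.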


In [None]:
x = np.random.rand(100)

In [None]:
from scipy import stats
import matplotlib.pyplot as plt

In [None]:
cauchies = [(loc, stats.cauchy(loc=loc, scale=1.)) for loc in (-1, 0, 1)]
x = np.arange(-3., 3., .05-1e-9)
for loc, cauchy in cauchies:
    plt.plot(x, cauchy.pdf(x), label="loc=%+d" % loc)
plt.legend()
plt.title("Cauchy PDFs")

## Pandas 

Pandas is a Python package for fast and easy data analysis and manipulation.

10 minutes to pandas: https://pandas.pydata.org/docs/user_guide/10min.html

In [None]:
import pandas as pd
df1 = pd.DataFrame({'customer': ['Luisa', 'Justine', 'Titus'],
                    'balance': [1583.31, 207.13, 820.89],
                    'age': [51, 64, 43]})
df1

## xarray

a pandas-like package for labeled multidimensional arrays, originally developed at The Climate Corporation; it is widely used for scientific calculations in the atmospheric sciences

user guide: https://docs.xarray.dev/en/stable/

example: 
    https://docs.xarray.dev/en/stable/examples/weather-data.html

## MetPy

a package for meteorological calculations and plots, developed by Unidata (part of UCAR): https://unidata.github.io/MetPy/latest/examples/index.html#

## Scikit-Learn 

a powerful machine learning package in Python:
    https://scikit-learn.org/stable/

## other useful packages:

### glob 

unix style pathname expansion: https://docs.python.org/3/library/glob.html

example: 

In [None]:
import os
from glob import glob
# glob does not expand '~', so expand it explicitly first
glob(os.path.expanduser('~/Desktop/*.nc'))

### datetime

the basic date and time module in Python:
https://docs.python.org/3/library/datetime.html#module-datetime

example:

In [None]:
import datetime
now = datetime.datetime.now()
now

In [None]:
now.strftime('%Y-%m-%d, %H:%M:%S')

In [None]:
now+datetime.timedelta(days=1)
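
Going the other way, from text to a datetime object, uses `strptime` with the same format codes. The sketch below parses a made-up timestamp and computes an elapsed time; the date and the 36-hour offset are illustrative values only.

```python
import datetime

# parse a string into a datetime with strptime (the inverse of strftime)
start = datetime.datetime.strptime('2022-03-14 09:30:00', '%Y-%m-%d %H:%M:%S')
end = start + datetime.timedelta(hours=36)

elapsed = end - start                  # subtracting datetimes gives a timedelta
print(end.strftime('%Y-%m-%d, %H:%M:%S'))
print(elapsed.total_seconds() / 3600)
```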

### os 

operating system interface: https://docs.python.org/3/library/os.html

example:

In [None]:
import os

if os.path.exists('example.txt'):
    os.rename('example.txt','example1.txt')
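
The `os.path` submodule is also handy for building and splitting file paths without hard-coding the separator. This is a minimal sketch; the directory and file names are hypothetical and nothing is read from or written to disk.

```python
import os

path = os.path.join('data', '2022', 'example.txt')   # OS-appropriate separator
folder, filename = os.path.split(path)
stem, ext = os.path.splitext(filename)
print(folder, filename, stem, ext)
```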

# read and write data

## basic Python I/O

In [None]:
f1 = open('example.txt','w')
f1.close()

In [None]:
x = list(np.random.randn(10))
with open('example.txt','w') as f1:
    for i in x:
        f1.write(str(i))
        f1.write(',')

In [None]:
with open('example.txt','r') as f2:
    text = f2.readlines()

In [None]:
text

In [None]:
text[0].strip().split(',')[:5]

In [None]:
[float(t) for t in text[0].split(',')[:5]]

## .nc

### use package netCDF4

### use package xarray

more powerful, but it adds overhead, so it can be slower than netCDF4 for simple reads

## .sav

## .mat
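
One possible way to handle MATLAB files is `scipy.io`, sketched below as a round trip: an array is written with `savemat` and read back with `loadmat`. The file name and variable name are hypothetical, a temporary directory is used so nothing is left on disk, and the example assumes SciPy is installed.

```python
import os
import tempfile

import numpy as np
from scipy import io

data = {'temperature': np.array([280.0, 281.5, 283.0])}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.mat')   # hypothetical file name
    io.savemat(path, data)                    # write a dict of arrays
    loaded = io.loadmat(path)                 # read back as a dict

# loadmat returns 2-D arrays by default, so flatten before use
temperature = loaded['temperature'].ravel()
print(temperature)
```

IDL `.sav` files can be read in a similar way with `scipy.io.readsav`.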

# plot

## matplotlib

Matplotlib is the standard Python plotting library, and matplotlib.pyplot is a layer on top of it that provides a MATLAB-like syntax

### line, contour, scatter, and histogram plots

In [None]:
?plt.contour

In [None]:
?np.random.randn

In [None]:
import numpy as np
x = np.random.rand(100)
y = np.random.randn(100)

In [None]:
import matplotlib.pyplot as plt
plt.figure(figsize=[8,6])
plt.subplot(2,2,1)
plt.plot(x,linewidth=2,label='x')
plt.plot(y,'r--',label='y')
plt.legend()
plt.title('line')


plt.subplot(2,2,2)
plt.contour(np.arange(10),np.arange(10),y.reshape([10,10]),colors='k')
plt.contourf(np.arange(10),np.arange(10),x.reshape([10,10]),cmap='YlOrBr')
plt.colorbar()
plt.title('contour')

plt.subplot(2,2,3)
plt.scatter(x,y,alpha=0.5)
plt.xlabel('x')
plt.ylabel('y')
plt.annotate('This is \nthe first point',(x[0],y[0]),(x[0]+0.1,y[0]+0.1),
            arrowprops=dict(facecolor='black', shrink=0.05),)
plt.title('scatter')

plt.subplot(2,2,4)
plt.hist(x,alpha=0.4,label='x')
plt.hist(y,alpha=0.4,label='y')
plt.legend()  # show the labels given above
plt.title('histogram')

plt.tight_layout()

resources:

matplotlib official tutorial:

https://matplotlib.org/stable/tutorials/index.html

matplotlib gallery: 

https://matplotlib.org/stable/gallery/index.html

named color: 

https://matplotlib.org/stable/gallery/color/named_colors.html

colormaps: 

https://matplotlib.org/stable/tutorials/colors/colormaps.html#sphx-glr-tutorials-colors-colormaps-py

## cartopy 

a Python package for drawing maps and plotting geospatial data: https://scitools.org.uk/cartopy/docs/latest/

In [None]:
import matplotlib.pyplot as plt
import cartopy.feature as cfeature
import cartopy.crs as ccrs
from cartopy.mpl.ticker import LongitudeFormatter, LatitudeFormatter


In [None]:
lon = np.arange(0,368,8)
lat = np.arange(-90,94,4)

values = np.ones([len(lat),len(lon)])
values *= np.sin(lon*np.pi/180)[np.newaxis,:]

In [None]:
plt.figure(figsize=[8,6])
ax1 = plt.axes(projection=ccrs.PlateCarree(central_longitude=180)) 
ax1.add_feature(cfeature.COASTLINE, edgecolor='black')
lon_formatter = LongitudeFormatter(zero_direction_label=True)
lat_formatter = LatitudeFormatter()
ax1.xaxis.set_major_formatter(lon_formatter)
ax1.yaxis.set_major_formatter(lat_formatter)
ax1.set_global()
ax1.set_yticks([t*30-90 for t in range(7)], crs=ccrs.PlateCarree())
ax1.set_xticks([t*60 for t in range(6)], crs=ccrs.PlateCarree())

plt.contourf(lon,lat,values,transform=ccrs.PlateCarree()
             ,cmap='Reds',levels=np.linspace(-1,1,101))
cbar = plt.colorbar(shrink=0.6)
cbar.set_label('values')
plt.title('Example',fontsize=16)

## seaborn 

a statistical plotting package built on matplotlib that works closely with pandas data structures: 

https://seaborn.pydata.org/examples/index.html

# More resources

a very good online interactive Python tutorial: 
    https://www.w3schools.com/python/
    
hands-on machine learning online tutorial: 
    https://www.kaggle.com/learn/intro-to-machine-learning
    
Jupyter notebook cheat sheet: 
    https://www.edureka.co/blog/wp-content/uploads/2018/10/Jupyter_Notebook_CheatSheet_Edureka.pdf
    
Python for IDL users:
    http://www.met.reading.ac.uk/~swsheaps/python.html

Numpy for IDL users:
    http://mathesaurus.sourceforge.net/idl-numpy.html
    
Numpy for MATLAB users:
    http://mathesaurus.sourceforge.net/matlab-numpy.html
