# Python Basics for Matlab Wizards
*Compiled by Sage Lichtenwalner, Rutgers University, June 5, 2018*

Welcome to Python.  

In this example, we will highlight some of the basics of programming in Python, along with some tips and tricks for those more familiar with Matlab.  There are a number of additional resources at the bottom I encourage you to check out.

This example was written in Google's Colaboratory environment, but it could be run in any [Jupyter Notebook](http://jupyter.org) environment.  You can also try these commands directly on the Python command line, though your results may vary.  For more information about Google Colab, check out their [Hello, Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb) introduction and the [Overview](https://colab.research.google.com/notebooks/basic_features_overview.ipynb).

## Some Basics

To execute the code in a cell, click the **Play** icon on the left, press **Cmd/Ctrl+Enter** to run the cell in place, or press **Shift+Enter** to run the cell and move focus to the next cell.

In [None]:
2+2

In [None]:
# Variable assignment
x = 4

# What kind of object is this?
type(x)

In [None]:
# Float 
y = 1.25
type(y)

In [None]:
# String
z = 'Python is great'
type(z)

In [None]:
# We can do some basic math
print( x + y )
print( x/y )

**By default the Colab/Jupyter environment will print out the output from the last line** without having to specify the `print` command.  

However, if we want to output the result from additional lines (as we did above), we need to use `print` on each line.  Note that in Python 3 `print` is a function, so the parentheses are required; in Python 2 `print` was a statement and the parentheses were optional.

Sometimes, we will want to suppress the output from the last line.  To do this, simply add a semicolon "`;`" at the end.

In [None]:
# Find out what variables you already have in memory
# (whos is an IPython magic command, so it only works in IPython, Jupyter or Colab)
%whos

## Beware of type conflicts

Sometimes the result you expect is not what you get.  That's especially the case when mixing object types, like integers and floats, or numbers and strings.  Results can also differ depending on the Python version you are running.  You can switch Python versions by going to *Runtime -> Change runtime type* in the menu. 

In [None]:
# In Python 2, dividing two ints results in an int. In Python 3, / always returns a float.
print( 21/3 ) # Two integers: 7 in Python 2, 7.0 in Python 3
print( 23/3 ) # In Python 2 the result is truncated to the integer 7. In Python 3 you get 7.666... as expected.
print( 23.0/3.0 ) # Two floats always result in a float
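
If you actually want integer division in Python 3, the `//` operator is available.  The short sketch below (the variable names are just for illustration) shows the division-related operators side by side.

```python
# True division always returns a float in Python 3
true_quotient = 23 / 3

# Floor division rounds down to the nearest whole number
# (this is what / did for two ints in Python 2)
floor_quotient = 23 // 3

# The modulo operator returns the remainder
remainder = 23 % 3

# divmod returns the floor quotient and the remainder together
quotient_and_remainder = divmod(23, 3)

print(true_quotient)
print(floor_quotient, remainder, quotient_and_remainder)
```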

In [None]:
# You can't concatenate strings and ints (this raises a TypeError)
print( z + x )

In [None]:
# But you can multiply them
print( z * x )

In [None]:
# Convert an int into a string
print( z + ' ' + str(x) + ' you' )

In [None]:
# A better way
print( 'Python is great %s you' % x )
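
Besides the `%` operator, Python offers `str.format` and, from Python 3.6 onward, f-strings.  The sketch below redefines `z` and `x` so that it runs on its own; the other names are only illustrative.

```python
z = 'Python is great'
x = 4

# str.format fills the {} placeholders in order
message_format = '{} {} you'.format(z, x)

# f-strings (Python 3.6+) evaluate the expression inside the braces
message_fstring = f'{z} {x} you'

# A format spec after the colon controls the number of decimal places
ratio = f'{x / 1.25:.2f}'

print(message_format)
print(message_fstring)
print(ratio)
```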

## Fun with Lists

In [None]:
my_list = [3, 4, 5, 9, 12]

Remember, Python uses 0-based indexing, so to grab the first element you use index 0, and the last element is at index n-1.  (In Matlab you would use 1 to n.)

In [None]:
# The first item
my_list[0]

In [None]:
# The last item
my_list[-1]

In [None]:
# Extract a subset
my_list[2:4]
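
Slices are half-open: `my_list[2:4]` includes index 2 and stops before index 4, so the Matlab equivalent would be `my_list(3:4)`.  A few more slicing patterns are sketched below, using the same list values as above (the variable names are just for illustration).

```python
my_list = [3, 4, 5, 9, 12]

middle = my_list[2:4]         # Indices 2 and 3
first_three = my_list[:3]     # Omitting the start means "from the beginning"
last_two = my_list[-2:]       # Negative indices count from the end
every_other = my_list[::2]    # The third value is the step size
backwards = my_list[::-1]     # A step of -1 reverses the list

print(middle, first_three, last_two, every_other, backwards)
```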

In [None]:
# Update a value
my_list[3] = 99
my_list

In [None]:
# Warning: assigning a list to a new variable creates a second reference to the same object, not a copy
my_second_list = my_list
print( my_second_list )

my_second_list[0] = 66
print( my_second_list )
print( my_list ) # The first list has changed too, because both names point to the same list

In [None]:
# To avoid this, create a copy of the list, which keeps the original intact
my_list = [3, 4, 5, 9, 12]
my_second_list = list(my_list) # You can also use copy.copy() or my_list[:]

my_second_list[0] = 66
print( my_second_list )
print( my_list )
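
One caveat: `list()`, `copy.copy()` and `[:]` all make a *shallow* copy, so lists nested inside the list are still shared.  The sketch below uses a made-up nested list to show the difference from `copy.deepcopy()`.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = list(nested)         # New outer list, but the inner lists are shared
deep = copy.deepcopy(nested)   # Inner lists are copied as well

nested[0][0] = 99

print(shallow)  # The change shows up here
print(deep)     # The deep copy is unaffected
```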

## Dictionaries

In [None]:
my_dict = {'temperature': 21, 'salinity':35, 'sensor':'CTD 23'}
my_dict

In [None]:
# Accessing a key/value pair
my_dict['sensor']

In [None]:
# Grab a list of dictionary keys
my_dict.keys()
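
A few more common dictionary operations are sketched below.  The `depth` and `pressure` keys are made up for illustration, and the dictionary is redefined so the cell runs on its own.

```python
my_dict = {'temperature': 21, 'salinity': 35, 'sensor': 'CTD 23'}

# Add a new key (or update an existing one) by assignment
my_dict['depth'] = 10

# .get() returns a default instead of raising a KeyError for a missing key
pressure = my_dict.get('pressure', 'not recorded')

# Loop over key/value pairs
for key, value in my_dict.items():
    print(key, '=', value)

# In Python 3, keys() returns a view; wrap it in list() if you need a real list
key_list = list(my_dict.keys())
print(key_list)
```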

## Functions, Conditions and Loops

If you're familiar with how to do these in Matlab or R, it's all very similar, just with a different syntax.  Remember, Python uses indentation to group statements into blocks, rather than parentheses, curly braces, or end statements.  You can indent with any consistent number of spaces; 4 is the standard convention (PEP 8), though 2 is also common.

In [None]:
def my_func(number):
  print('Running my_func')
  if isinstance(number, int):
    for i in range(number):
      print(i)
  else:
    print("Not an integer")

my_func(4)
my_func('hi')
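
Functions usually return a value rather than print it.  Here is a minimal sketch of the same idea with a `return` statement; `count_up` is a hypothetical name chosen for this example.

```python
def count_up(number):
    """Return the integers 0..number-1 as a list, or None if number is not an int."""
    if isinstance(number, int):
        # range() produces the values lazily; list() collects them
        return list(range(number))
    # Anything that is not an integer falls through to here
    return None

result = count_up(4)
print(result)
print(count_up('hi'))
```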

## NumPy
NumPy is an essential library for working with scientific data.  It provides an array object that is very similar to Matlab's array functionality, allowing you to perform mathematical calculations or run linear algebra routines.

In [None]:
import numpy as np

In [None]:
# NumPy has its own data types
xi = np.array([1, 2, 3], dtype=int) # We could also specify float (the old np.int and np.float aliases were removed in NumPy 1.24)
print(xi)
print(type(xi))
print(xi.dtype)

In [None]:
# Let's add the arrays
xi+xi

In [None]:
# Multiplication is easy too.  
# Note this is equivalent to xi.*xi in Matlab
xi*xi

In [None]:
# Inner product
# Matlab equivalent: x*x'
np.inner(xi,xi)

In [None]:
# Outer product
# Matlab equivalent: x'*x
np.outer(xi,xi)

In [None]:
# Cross product
np.cross(xi,xi)

In [None]:
# Initialize an array of zeros
x = np.zeros(shape=(4,5))
x

In [None]:
y = x+2
y

In [None]:
# Array dimensions
print( xi.shape )
print( xi.size )
print( y.shape )
print( y.size )

In [None]:
# Random numbers
z = np.round(np.random.random(x.shape)*5)
z

In [None]:
# Slicing
z[2:4, ::2] # Extract indices 2 and 3 on the first axis (rows), with a stride of 2 on the second axis (columns)
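
As with lists, be careful about copies: a basic NumPy slice is a *view* onto the original array, so writing to the slice changes the original.  The sketch below uses a small made-up array (`grid`) to show this, along with `.copy()` for when you need independent data.

```python
import numpy as np

grid = np.arange(20).reshape(4, 5)   # 4 rows, 5 columns, values 0..19

block = grid[2:4, ::2]               # Rows 2 and 3, every second column
print(block.shape)

# Writing to the slice also changes the original array
block[0, 0] = -1
print(grid[2, 0])

# Use .copy() to get an independent array
independent = grid[2:4, ::2].copy()
independent[0, 0] = 100
print(grid[2, 0])                    # Unchanged by the edit to the copy
```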

In [None]:
# Sum across each row (one total per row)
z_sum = z.sum(axis=1)
z_sum

In [None]:
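# Matrix multiplication uses the @ operator (or np.dot)
# Note that * would instead multiply element by element, with broadcasting
y.transpose() @ z_sum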
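
This is one of the biggest differences for Matlab users: in NumPy, `*` is always element-wise (Matlab's `.*`), while the `@` operator performs matrix multiplication (Matlab's `*`).  A small sketch with made-up 2x2 arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
v = np.array([1, 1])

elementwise = a * b      # Matlab: a .* b
matrix_product = a @ b   # Matlab: a * b
matvec = a @ v           # Matrix times vector

print(elementwise)
print(matrix_product)
print(matvec)
```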

In [None]:
# Several universal functions are included (e.g. sin, cos, exp)
np.sin(z_sum)

## SciPy
The SciPy library includes a huge collection of mathematical algorithms, functions and constants to enable scientific computing.  

It is built upon NumPy, so data is typically expected to be in a NumPy array format.  Some useful features include:
* **constants** - Physical and mathematical constants
* **fftpack** - Fast Fourier Transforms
* **integrate** - Integration and ODE solvers
* **interpolate** - Interpolation and smoothing
* **io** - File Input and Output
* **linalg** - Linear algebra
* **signal** - Signal processing
* **stats** - Statistics

For more information, check out the [SciPy Tutorial](https://docs.scipy.org/doc/scipy/reference/tutorial/index.html).
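
As a quick taste, here is a minimal sketch that uses two of the submodules listed above, assuming SciPy is installed (it is in Colab).  The integral chosen here is just an illustration.

```python
import numpy as np
from scipy import constants, integrate

# Numerically integrate sin(x) from 0 to pi; the exact answer is 2
area, error_estimate = integrate.quad(np.sin, 0, np.pi)
print(area, error_estimate)

# Physical constants are available by name, e.g. standard gravity in m/s^2
print(constants.g)
```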


## Pandas
[Pandas](https://pandas.pydata.org) is great for working with spreadsheet-like tables, such as Excel or CSV files.  

However, it is not great for multidimensional arrays (e.g. x,y,z).  For that you should use [Xarray](http://xarray.pydata.org).

For this example, we will demonstrate the power of pandas to quickly load a data file and play with it, using some archived meteorological data from [NDBC 44025](http://www.ndbc.noaa.gov/station_history.php?station=44025).

In [None]:
import pandas as pd

In [None]:
url = 'http://www.ndbc.noaa.gov/view_text_file.php?filename=44025h2017.txt.gz&dir=data/historical/stdmet/'
ndbc = pd.read_csv(url,sep=r'\s+',skiprows=[1]) # Whitespace-delimited; skip the second header line (units)

In [None]:
ndbc.head()

In [None]:
ndbc['WDIR'].head(3) # Just the first 3 elements of one variable

In [None]:
# Take the mean of each column
ndbc.mean(axis=0)

In [None]:
# Monthly means
ndbc['WSPD'].groupby(ndbc['MM']).mean()
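
If you want to experiment with `groupby` without downloading anything, the sketch below uses a tiny made-up table (`obs`) with the same `MM` and `WSPD` column names.  The values are invented for illustration.

```python
import pandas as pd

# A small synthetic table: month number and wind speed
obs = pd.DataFrame({
    'MM':   [1, 1, 2, 2, 2],
    'WSPD': [4.0, 6.0, 3.0, 5.0, 7.0],
})

# Same pattern as above: group one column by another and average
monthly_mean = obs['WSPD'].groupby(obs['MM']).mean()

# An equivalent and more common spelling
monthly_mean_alt = obs.groupby('MM')['WSPD'].mean()

# Several statistics at once
monthly_stats = obs.groupby('MM')['WSPD'].agg(['mean', 'max', 'count'])
print(monthly_stats)
```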

In [None]:
# Basic plotting is built into Pandas
ndbc['WSPD'].groupby(ndbc['MM']).mean().plot();

In [None]:
# Create a datetime vector
from datetime import datetime
df = pd.DataFrame({'year': ndbc['#YY'], 'month': ndbc['MM'], 'day': ndbc['DD'], 'hour': ndbc['hh'], 'minute': ndbc['mm']})
ndbc['datetime'] = pd.to_datetime(df)

# Change the array index to use the datetime column
ndbc = ndbc.set_index('datetime')

# Now we can easily plot the Air Temperature vs. datetime!
ndbc['ATMP'].plot();
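
Once a table has a datetime index, time-based operations such as resampling become one-liners.  The sketch below rebuilds the same steps on a tiny made-up table (`raw`), then computes daily means; the values are invented, and only the column names mirror the NDBC file.

```python
import pandas as pd

# Three synthetic observations with the date split across columns
raw = pd.DataFrame({
    '#YY':  [2017, 2017, 2017],
    'MM':   [1, 1, 1],
    'DD':   [1, 1, 2],
    'hh':   [0, 12, 0],
    'mm':   [50, 50, 50],
    'ATMP': [5.0, 7.0, 3.0],
})

# to_datetime can assemble dates from columns named year, month, day, hour, minute
parts = raw.rename(columns={'#YY': 'year', 'MM': 'month', 'DD': 'day',
                            'hh': 'hour', 'mm': 'minute'})
raw['datetime'] = pd.to_datetime(parts[['year', 'month', 'day', 'hour', 'minute']])
raw = raw.set_index('datetime')

# Average the air temperature for each calendar day
daily_mean = raw['ATMP'].resample('D').mean()
print(daily_mean)
```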

In [None]:
# Pandas even lets you do a whole bunch of quick analysis in one line
ndbc.describe()

## Matplotlib

[Matplotlib](https://matplotlib.org) is Python's Matlab-like plotting library.

Remember, you can add a semicolon to the last line to suppress unwanted output, which often happens when using matplotlib.

In [None]:
import matplotlib.pyplot as plt

In [None]:
# Let's make a simple plot
t = np.linspace(0,10,100)
y = np.cos(t)*np.sin(t)

plt.plot(t, y, '-rs'); # Line with red squares

In [None]:
# A slightly more advanced example
mu, sigma = 100, 15
x = mu + sigma * np.random.randn(100)

plt.plot(x,'r.') # Red dots, one per sample
plt.xlabel('Sample number')
plt.ylabel('Smarts')
plt.title('Simulated IQ scores')
plt.text(10, 135, r'$\mu=100, \ \sigma=15$'); # Annotations can support LaTeX
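
To see how samples like these are distributed, `plt.hist` bins the values and draws a histogram.  The sketch below is self-contained; the seeded random generator, the sample size and the number of bins are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# A seeded generator makes the random samples reproducible
rng = np.random.default_rng(0)
mu, sigma = 100, 15
samples = mu + sigma * rng.standard_normal(1000)

# density=True scales the bars so that the total area equals 1
counts, bin_edges, patches = plt.hist(samples, bins=20, density=True,
                                      color='r', alpha=0.6)
plt.xlabel('Smarts')
plt.ylabel('Probability density')
plt.title('Histogram of IQ');
```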

For more, check out the great list of [Matplotlib Examples](https://matplotlib.org/gallery/index.html), and also check out the [Seaborn](https://seaborn.pydata.org) library, which provides a great set of defaults to make your plots look better.  (In fact, Google Colab includes this by default.)

## Additional Resources

* [NumPy for Matlab users](https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html) - A quick introduction from SciPy
* [NumPy for Matlab users Cheatsheet](http://mathesaurus.sourceforge.net/matlab-numpy.html)
* [MATLAB vs. Python: Top Reasons to Choose MATLAB](https://www.mathworks.com/products/matlab/matlab-vs-python.html) - For another perspective
* [Pangeo Python Basics Tutorial](https://github.com/pangeo-data/pangeo-tutorial-sea-2018/blob/master/notebooks/1.0.scientific_python_ecosystem.ipynb)  - Many of the above examples come from this.
* [Python for Matlab Users](http://researchcomputing.github.io/meetup_fall_2014/pdfs/fall2014_meetup13_python_matlab.pdf) - A meetup presentation from CSU.  This is a good overview.  Some of the above examples also come from this.
* [Pyzo Python vs. Matlab](http://www.pyzo.org/python_vs_matlab.html)
* [Webinar: Python for MATLAB Users, What You Need to Know](https://www.youtube.com/watch?v=YkCegjtoHFQ)
* [Python Graph Gallery](https://python-graph-gallery.com) - For when you need more inspiration