<a href="https://colab.research.google.com/github/mattbaxter689/Univ6080/blob/main/ScientificPython.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Credit: this notebook is based on [slides created by Roland Memisevic](http://www.iro.umontreal.ca/~memisevr/teaching/mlvis2013/pythonintro.pdf). Some of the material related to objects and classes was adapted from [Stavros Korokithakis's tutorials](http://www.stavros.io/tutorials/python/).

# Introduction
“Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to combine remarkable power with very clear syntax, and its standard library is large and comprehensive. Its use of indentation for block delimiters is unique among popular programming languages.” -- Wikipedia

Some features of Python:

   * Interpreted Language
   * Strict Syntax
   * Indentation!
   * "Batteries Included"
   * Dynamic types
   * Mixes [imperative](http://en.wikipedia.org/wiki/Imperative_programming), object-oriented, and functional programming elements

The "IPython" interactive shell is *highly* recommended for any kind of interactive work! Note that here, I'm using a [Jupyter Notebook](http://jupyter.org/) running in the [Google Colaboratory](https://colab.research.google.com/) environment.

There are various implementations of Python available, and several (incompatible) versions. In this course, we recommend that you use the same version as Google Colaboratory's default (currently 3.6). Most of the materials we distribute should also run under Python 2.X, with the exception of newer features such as the f-strings shown below, which require Python 3.6 or later.

# "Hello World"

In [None]:
print("hello world")

In [None]:
print("hello world", 1, 2, 1 + 2)

In [None]:
a = 1
b = 1
print("hello world", a + b)

In [None]:
a = 1
b = "hello"
print("hello world", a + b)

What happened here?
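The cell above fails with a ``TypeError``: Python is dynamically typed, but it will not silently convert the integer ``a`` into a string. A minimal sketch of the failure and two common ways around it (the names ``error_message`` and ``fixed`` are only for illustration):

```python
a = 1
b = "hello"

# Adding an int and a str raises TypeError instead of guessing a conversion.
try:
    result = a + b
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Fix 1: convert explicitly before concatenating.
fixed = str(a) + b
print(fixed)

# Fix 2: pass the values to print() as separate arguments.
print("hello world", a, b)
```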

In [None]:
b = "world"
print(f"hello {b}")

This preferred way of formatting strings, called f-strings, was introduced in Python 3.6. An older way to format strings (compatible with the Python 2 series, but not recommended) is %-formatting:

In [None]:
print("hello %s" % b)

# Built-in data structures
## Tuples

In [None]:
T = (1, 2, 3, "hello")
print(T[0])

Note that in Python, indexing starts at 0!
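As a small sketch of zero-based indexing, the snippet below also shows negative indices (which count from the end) and what happens when you try to modify a tuple. The names ``first`` and ``last`` are illustrative only:

```python
T = (1, 2, 3, "hello")

first = T[0]    # index 0 is the first element
last = T[-1]    # negative indices count from the end
print(first, last)

# Tuples are immutable: assigning to an element raises TypeError.
try:
    T[0] = 99
except TypeError as err:
    print("cannot modify a tuple:", err)
```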

## Lists

In [None]:
L = [1, 2, 3, "hello"]
L[0] = "Lists are mutable"
print(L[0])

In [None]:
L.append("goodbye")
print(L[-1])

## Dictionaries ("Hashes")

In [None]:
D = {"a": 1, "b": 2}
print(D["a"])
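Dictionaries are mutable, like lists. A brief sketch of adding, overwriting, and looking up keys (the key ``"z"`` and the name ``missing`` are made up for this example):

```python
D = {"a": 1, "b": 2}

D["c"] = 3                 # add a new key
D["a"] = 10                # overwrite an existing value
missing = D.get("z", 0)    # .get returns a default instead of raising KeyError

# Iterate over key/value pairs.
for key, value in D.items():
    print(key, value)
```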

# Functions
## Function definition

In [None]:
def timesfour(x):
    return 4 * x

print(timesfour(2))

# Control structures
## If-then-else

In [None]:
s = "z"
if s == "y":
    print("y")
elif s == "z":
    print("z")
else:
    print("b")

## While

In [None]:
a = 1.0
s = "hello"
while a != 10.0 and s == "hello":
    a = a + 1.0
print(a)

## For-loops

In [None]:
for i in range(3):
    print(i)

In [None]:
for i in [1, 2, 'x', 3, 4, 'h', 5]:
    print(i)

Note that either double or single quotes can be used to delimit strings.

# Everything is an object
   * Functions, too.
   * Objects have member components (functions and attributes).

In [None]:
type(timesfour)
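Because functions are objects, they can be bound to other names, passed as arguments, and inspected like any other value. A minimal sketch, where ``apply_twice`` and ``alias`` are hypothetical names introduced only for this example:

```python
def timesfour(x):
    return 4 * x

def apply_twice(func, value):
    """Hypothetical helper: call func on value, then on the result."""
    return func(func(value))

alias = timesfour                  # bind a second name to the same function object
print(alias(2))                    # 8
print(apply_twice(timesfour, 2))   # 32
print(timesfour.__name__)          # timesfour
```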

## Defining classes

In [None]:
# here "object" specifies the superclass
class MyClass(object):
    common = 10
    # this is a constructor
    def __init__(self):
        self.myvariable = 3
    def myfunction(self, arg1, arg2):
        return self.myvariable

classinstance = MyClass()
# note that arguments passed to myfunction are ignored
print('classinstance.myfunction(1, 2): %s' % classinstance.myfunction(1, 2))

classinstance2 = MyClass()
# This variable is shared by all instances of the class.
print('classinstance.common: %s' % classinstance.common)
print('classinstance2.common: %s' % classinstance2.common)

In [None]:
# Note how we use the class name instead of the instance.
MyClass.common = 30
print('classinstance.common: %s' % classinstance.common)
print('classinstance2.common: %s' % classinstance2.common)

In [None]:
# This will not update the variable on the class;
# instead it creates a new instance attribute on
# classinstance that shadows the class attribute.
classinstance.common = 10
print('classinstance.common: %s' % classinstance.common)
print('classinstance2.common: %s' % classinstance2.common)

In [None]:
MyClass.common = 50
# This has not changed, because classinstance.common is
# now an instance variable.
print('classinstance.common: %s' % classinstance.common)
# but this has changed
print('classinstance2.common: %s' % classinstance2.common)

In [None]:
# This class inherits from MyClass. The example
# class above inherits from "object", which makes
# it what's called a "new-style class".
# You can read more about these here: http://stackoverflow.com/a/54873
# Multiple inheritance is declared as:
# class OtherClass(MyClass1, MyClass2, MyClassN)
class OtherClass(MyClass):
    # The "self" argument is passed automatically
    # and refers to the class instance, so you can set
    # instance variables as above, but from inside the class.
    def __init__(self, arg1):
        self.myvariable = 3
        print(arg1)

In [None]:
classinstance = OtherClass("hello")
print('classinstance.myfunction(1, 2): %s' % classinstance.myfunction(1, 2))

In [None]:
# This class doesn't have a .test member, but
# we can add one to the instance anyway. Note
# that this will only be a member of classinstance.
classinstance.test = 10
print('classinstance.test: %s' % classinstance.test)

However, note that we can get pretty far without needing classes!

# Getting help
When working interactively, the built-in ``help`` function prints the documentation for an object:

In [None]:
a = [1,2,3]
help(a)

``help`` expects the *object* you need help about. Just instantiate one, if you do not have it!

The ``?`` operator is also useful (only works in IPython and Jupyter):

In [None]:
list?

IPython also has a number of so-called "magic" functions: e.g. %autoreload, %paste, %debug, %hist, %timeit. Type %magic at a prompt to learn more.

In [None]:
%lsmagic

# Scripts, modules, packages
* Naming convention for scripts: ".py"
   * ``python myscript.py``
* In IPython we can use:
   * ``%run myscript.py``

In [None]:
%%file mytest.py
for i in range(3):
    print("hello %d" % i)

In [None]:
%run mytest.py

## Modules
* Combine common functionality in "modules" (= "libraries")
* Same naming convention as for scripts: ".py"
* Can combine modules in **packages**

To use a module:

In [None]:
import datetime
# imports a single object
from datetime import date
# imports everything (careful, pollutes namespace)
from datetime import *

Remember, everything is an object. Access the contents of modules accordingly.

In [None]:
from datetime import datetime
# calling now() returns a datetime object
now = datetime.now()
# its attributes are accessed with a dot
myobject = now.month
print(myobject)
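A module is itself an object, so its contents are reached with the same dot notation used for instance attributes. A small sketch using the standard-library ``math`` module (chosen here only for illustration; ``root`` and ``sqrt_func`` are made-up names):

```python
import math

print(type(math))                   # a module is an object of type 'module'
root = math.sqrt(16.0)              # functions are attributes of the module
sqrt_func = getattr(math, "sqrt")   # attribute lookup by name works too
print(root, sqrt_func is math.sqrt)
```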

# Packages for data crunching

## NumPy

NumPy is the fundamental package for scientific computing with Python.

## Matplotlib

Matplotlib is a Python plotting library, inspired by MATLAB. It can be used within a traditional Python program (via the object-oriented API), and also in a more convenient form in Jupyter and IPython via the PyPlot interface.

## PyPlot

PyPlot is a shell-like interface to Matplotlib, with a similar look and feel to MATLAB.

The most common way to import PyPlot (together with NumPy) is as follows:

```
import matplotlib.pyplot as plt
import numpy as np
```

## SciPy

SciPy is an ecosystem of open-source software for mathematics, science, and engineering. It includes NumPy, Matplotlib, IPython, SymPy and pandas. There is also a library called SciPy, part of this same ecosystem, which is a fundamental library for scientific computing. It addresses more specific needs, like Fourier transforms, special functions, etc.

## PyLab

The PyPlot and NumPy namespaces can be combined into a single one via PyLab. Its use is now discouraged, but we'll use it in this notebook for simplicity. In future notebooks we will import PyPlot and NumPy separately, as above.


In [None]:
from matplotlib.pylab import *

# NumPy arrays
* The central object for representing *data* is the NumPy array.
* It is an n-dimensional generalization of a matrix.
* It can hold data of various types, like int, float, string, etc.
* The most common use is for representing vectors and matrices filled with numbers (such as floats)
* One way to generate an array is to use ``numpy.array``
* But we'll see other ways soon

The wildcard import with the ``*`` above, although bad practice, imports everything from PyLab into the global namespace. That lets us refer to ``array`` instead of ``numpy.array``, and so on.

In [None]:
array([1,2,3])

In [None]:
ones((2,3))

In [None]:
zeros((3,2))

In [None]:
eye(4)

In [None]:
# draw from normal distribution with mean 0, std 1
randn(2,2)
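For comparison, here is a sketch of the same constructors written with an explicit ``np`` namespace, as recommended outside this notebook. The variable names are illustrative only:

```python
import numpy as np

v = np.array([1, 2, 3])
ones_2x3 = np.ones((2, 3))
zeros_3x2 = np.zeros((3, 2))
identity = np.eye(4)
# the bare randn() above corresponds to numpy.random.randn
noise = np.random.randn(2, 2)

print(v.shape, ones_2x3.shape, zeros_3x2.shape, identity.shape, noise.shape)
```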

There are many other useful commands built in, e.g. ``load()``, ``save()``, ``loadtxt()``, etc.
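A minimal round-trip sketch of these functions, written with the explicit ``np`` namespace. The file names ``a.npy`` and ``a.txt`` and the use of a temporary directory are assumptions made for the example:

```python
import os
import tempfile

import numpy as np

a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

with tempfile.TemporaryDirectory() as tmpdir:
    # Binary .npy format: preserves dtype and shape exactly.
    npy_path = os.path.join(tmpdir, "a.npy")
    np.save(npy_path, a)
    from_npy = np.load(npy_path)

    # Plain-text format: human-readable, one row per line.
    txt_path = os.path.join(tmpdir, "a.txt")
    np.savetxt(txt_path, a)
    from_txt = np.loadtxt(txt_path)

print(np.array_equal(a, from_npy))
print(np.allclose(a, from_txt))
```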

NumPy arrays have many useful member components:

In [None]:
a = array([[1,2,3],[4,5,6]])
print(a)
print(a.T)

In [None]:
a.mean(0)

In [None]:
a.mean(1)

In [None]:
a.mean()

In [None]:
a.std(0)

In [None]:
a.max()
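The integer passed to ``mean`` and ``std`` above is the axis that gets collapsed. A short sketch that spells this out with the ``axis`` keyword and the explicit ``np`` namespace (``col_means`` and ``row_means`` are illustrative names):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

col_means = a.mean(axis=0)   # collapse the rows: one mean per column
row_means = a.mean(axis=1)   # collapse the columns: one mean per row

print(col_means)        # [2.5 3.5 4.5]
print(row_means)        # [2. 5.]
print(a.mean())         # 3.5 (no axis: mean over all entries)
print(a.std(axis=0))    # [1.5 1.5 1.5]
```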

One of the most useful attributes is ``shape``:

In [None]:
print(randn(3,3).shape)
print(randn(2,5,7,3).shape)
print(array([1,2,3,4]).shape)

## Doing computations
To do computations on arrays, NumPy has functions like

* ``exp()``, ``log()``, ``cos()``, ``sin()``

and operators like

* ``+``, ``-``, ``*``, ``/``

These typically operate *elementwise*.

Other useful matrix operations include

* ``dot()``, ``numpy.linalg.svd()``, ``numpy.linalg.eig()``
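A common source of confusion for MATLAB users is that ``*`` multiplies elementwise, while the matrix product needs ``dot()``. A small sketch of the difference, using the explicit ``np`` namespace and two made-up matrices ``A`` and ``B``:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

elementwise = A * B              # multiplies matching entries
matrix_product = np.dot(A, B)    # true matrix product (equivalent to A @ B)

print(elementwise)
print(matrix_product)

# svd returns the factors U, s, Vh with A = U * diag(s) * Vh.
U, s, Vh = np.linalg.svd(A)
reconstructed = np.dot(U, np.dot(np.diag(s), Vh))
print(np.allclose(A, reconstructed))
```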

## Accessing your data

In [None]:
a = array([1,2,3])
print(a[0])
print(a[1])

In [None]:
a = array([[1,2,3],[4,5,6]])
print(a[0,0])
print(a[1,2])

In [None]:
a = randn(2,3,4)
print(a)
print(a[0,0,0])
print(a[0,1,3])

In [None]:
a = array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(a)
print(a[0, :])  # a "slice"
print(a[:, 0])  # another "slice"
print(a[1:3, 0])  # another "slice"
print(a[1:3, :])  # another "slice" (this is a 2-d block)
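Two details about slices are worth knowing: the end index is exclusive (``1:3`` selects rows 1 and 2), and a basic slice is a *view* onto the original data rather than a copy. A minimal sketch with the explicit ``np`` namespace (``row`` and ``block`` are illustrative names):

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row = a[0, :]      # a view onto the first row, not a copy
row[0] = 100       # writing through the view changes a
print(a[0, 0])     # 100

block = a[1:3, :].copy()   # .copy() gives independent data
block[0, 0] = -1
print(a[1, 0])     # 5 (the original is untouched)
```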

# Plotting

In [None]:
x = arange(0,pi,pi/180)
plot(x, cos(x))

In [None]:
x1 = 1 + 0.1*randn(20)
y1 = 1 + 0.1*randn(20)
x2 = -1 + 0.5*randn(20)
y2 = -1 + 0.5*randn(20)
scatter(x1,y1)
scatter(x2,y2,c='r',marker='x')
xlim(-2,2)
ylim(-2,2)
legend(['class 1', 'class 2'], loc='lower right')

In [None]:
subplot(1,2,1)
hist(randn(200),bins=10)
ylabel('counts')
xlabel('bins')
subplot(1,2,2)
boxplot(randn(200))
title('box and whisker plot')


In a *box and whisker plot*, the box extends from the lower to upper quartile values of the data, with a line at the median. The whiskers extend from the box to show the range of the data.  Outlier points are those past the end of the whiskers.
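The quantities drawn in the plot can be computed by hand. The sketch below assumes Matplotlib's default whisker setting (``whis=1.5``), where each whisker ends at the most extreme data point within 1.5 times the interquartile range of the box. The fixed random seed and all variable names are choices made for this example:

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed so the example is repeatable
data = rng.randn(200)

# Box: lower quartile, median, upper quartile.
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Assumed default: whiskers stop at the last point within 1.5 * IQR of the box.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
inside = data[(data >= lower_limit) & (data <= upper_limit)]
whisker_low, whisker_high = inside.min(), inside.max()

# Everything beyond the whiskers is drawn as an outlier point.
outliers = data[(data < lower_limit) | (data > upper_limit)]

print("box:", q1, median, q3)
print("whiskers:", whisker_low, whisker_high)
print("number of outliers:", len(outliers))
```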

# Broadcasting and newaxis
How do you add a $2 \times 5$ matrix and a $1 \times 5$ vector?

* In MATLAB, you would use ``repmat`` (or for the pros, ``bsxfun``). You can use NumPy's ``repmat`` equivalent, called ``tile``.
* However, NumPy comes with another possibility, called "broadcasting"
* NumPy compares the shapes dimension by dimension and virtually repeats any dimension of size 1 as often as needed to make the shapes match (no data is actually copied)

![title](https://scipy-lectures.org/_images/numpy_broadcasting.png)

In [None]:
print((randn(2,5) + randn(1,5)).shape)
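To see that broadcasting gives the same result as explicit tiling, here is a small sketch with the explicit ``np`` namespace. The arrays ``M`` and ``v`` are made-up inputs:

```python
import numpy as np

M = np.arange(10.0).reshape(2, 5)                 # shape (2, 5)
v = np.array([[10.0, 20.0, 30.0, 40.0, 50.0]])    # shape (1, 5)

tiled = M + np.tile(v, (2, 1))   # repmat-style: stack v twice along the rows
broadcast = M + v                # the size-1 dimension is stretched automatically

print(np.array_equal(tiled, broadcast))
```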


How about a $2 \times 5$ matrix plus a $1 \times 5 \times 3$ tensor?

In [None]:
print(randn(2,5).shape)
print(randn(2,5) + randn(1,5,3))

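Broadcasting aligns shapes starting from the last dimension, so here 5 is compared with 3 and the addition fails. We need to make the number of dimensions match. Solution: NumPy's ``newaxis``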

In [None]:
print(randn(2,5)[:,:,newaxis].shape)
print((randn(2,5)[:,:,newaxis] + randn(1,5,3)).shape)
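The same steps can be sketched with the explicit ``np`` namespace, catching the error from the failed addition instead of letting it stop the cell. Arrays of ones are used here only to keep the example deterministic:

```python
import numpy as np

a = np.ones((2, 5))
b = np.ones((1, 5, 3))

# Shapes are compared from the last axis backwards: 5 clashes with 3.
try:
    a + b
except ValueError as err:
    print("broadcast failed:", err)

a3 = a[:, :, np.newaxis]   # shape (2, 5, 1)
total = a3 + b             # (2, 5, 1) + (1, 5, 3) -> (2, 5, 3)
print(a3.shape, total.shape)
```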

# Style

Python has its own style convention called ["PEP8"](http://www.python.org/dev/peps/pep-0008). This is a great resource to check out if you are unsure about how to format your code. There are also command-line tools (e.g. ``pycodestyle``, formerly named ``pep8``) which will verify your code against the standard. Most serious open-source projects will expect you to adhere to the PEP8 standard before contributing code.

# Additional resources

Here's a short YouTube video about NumPy arrays:


In [None]:
from IPython.display import YouTubeVideo
# a short video about using NumPy arrays, from Enthought
YouTubeVideo('vWkb7VahaXQ')

* My favourite tutorial for scientific Python is the [Python Scientific Lecture Notes](http://scipy-lectures.github.io/)
* If you're already a MATLAB user, [NumPy for MATLAB users](https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html) is a useful reference
* J.R. Johansson's [Lectures on scientific computing with Python](https://github.com/jrjohansson/scientific-python-lectures)