# A Brief Introduction to Python, numpy, and matplotlib

In this lesson, we'll learn about the [Python programming language](https://www.python.org/) and the [Jupyter notebook](http://jupyter.org/), as well as the Python packages [numpy](http://www.numpy.org/) and [matplotlib](http://matplotlib.org/).

This isn't a very thorough introduction -- we'll just learn the things that are essential for the course.  Some resources for learning more about Python and its use in scientific computing are linked [at the end of this notebook](#Further-resources).

To run the code in this notebook, you'll need an installation of Python, numpy, and matplotlib.  The easiest way to get them all is to use [CoCalc](http://cocalc.com) -- just create a free account, start a new project, and upload this notebook file.  Then open this notebook there and you're off.

You can also install everything locally on your own machine.  For local installation, [Anaconda](https://www.anaconda.com/) is convenient, or you can just use pip.  All of these are free.  If you're new to Python, I recommend using Python version 3.7 or later.

## Python and the Jupyter Notebook

The code for this course is written in Python, which is a programming language designed to promote code that is easy to read and write.  Python has become one of the most important languages in scientific computing, and is arguably [the most popular programming language in the world](https://www.hpcwire.com/2018/08/07/python-remains-the-most-popular-programming-language/).  It is high-level like MATLAB, but unlike MATLAB it is free and is intended as a general-purpose language.

[Jupyter](http://www.jupyter.org) is a collection of tools for interactive programming in Python.  Most importantly for us, Jupyter includes a browser-based notebook.  The notebook (which you are using now) allows you to run Python code in your web browser; just click on a cell with code and hit shift+enter.  Try it with the cell below.

In [None]:
print("AMCS 252 is the best!")

Try changing the message in the box and running it again.

The box of code above is called a *cell*.  You can use the menu and toolbar near the top of your browser to insert or delete cells or save the notebook.  The text you're reading is also in a cell, which you can edit (just double-click on this text).  You'll also see math in these cells; the math is written using LaTeX.  For example:

$$e^{i \pi} = -1.$$

The menu bar above also has buttons to run cells or to stop code running.

You can find a huge collection of interesting Jupyter notebooks on a wide range of topics [here](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks).  Since notebooks combine text, mathematics, and executable code, they're a great way to learn.

If you prefer programming in an IDE, you may wish to try out [Jupyterlab](https://jupyterlab.readthedocs.io/en/stable/).

### Python basics

Python has many built-in functions, like `print`, but most functions are inside *modules*, which aren't loaded unless you `import` them:

In [None]:
from math import sqrt
print(sqrt(2))

We can also import a whole module:

In [None]:
import math
print(math.sqrt(2))

What else is in the `math` module?  You can find out using IPython's tab completion.  Just put your cursor at the end of the line below and hit `tab`:

In [None]:
math.

What does `math.atan2` do, for instance?  You can find out with IPython's magic help function, invoked by using the question mark after a function name.  When you're done reading the help, you can close the pager below by clicking on the "x".

In [None]:
math.atan2?

### Lists

A *list* is an ordered collection of values or objects.

In [None]:
x = ['apple',2,3,4]

You can ask for one or more items from a list as follows:

In [None]:
x[0]

In [None]:
x[:2]

In [None]:
x[:-1]

Two important things to remember:
1. Python lists are indexed starting from zero.
2. When you ask for a range of items (a *slice*), the item at the end index is not included: `x[:2]` returns only the items at indices 0 and 1.
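
To make these two rules concrete, here is a small sketch using the same list as above. The variable names (`first_two`, `all_but_last`, and so on) are only illustrative and do not appear elsewhere in this notebook.

```python
x = ['apple', 2, 3, 4]

first = x[0]            # index 0 is the first item
first_two = x[:2]       # items at indices 0 and 1; index 2 is excluded
all_but_last = x[:-1]   # everything up to, but not including, the last item
last = x[-1]            # negative indices count from the end of the list

print(first, first_two, all_but_last, last)
```

A slice `x[a:b]` always contains `b - a` items (as long as both indices are within the list), which is one reason the end index is excluded.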

You can quickly make lists of numbers using the `range` function:

In [None]:
y = list(range(5))
y

### Loops

In [None]:
for i in [10, 20, 'x']:
    print(i)
    print('in the loop')
    # still inside the loop

# Now outside the loop    
print('finished')

A *for loop* iterates over the items of a list (or of any other iterable, such as a `range`).  Notice how the contents of the loop are indented.  Python knows the loop has ended when it finds a line that is not indented.

To nest loops, indent again:

In [None]:
for i in range(5):
    print(i)
    for j in range(2):
        print('in the inner loop')
    
print('finished')
print(i)

Don't forget the colon!

### Functions

In [None]:
def square(x):
    return x**2

print(square(5))

In [None]:
def identity(x):
    return square(sqrt(x))

In [None]:
identity(5)

Notice how the contents of the function are indented (just like the for loop).  Python knows the function has ended when it finds a line that is not indented.

## Numpy
Numpy is a Python package for numerical computation (it is installed separately from Python itself) and will be an essential tool in this course. You may find the following links helpful:

- [Numpy docs tutorial](https://docs.scipy.org/doc/numpy/user/quickstart.html)
- [Another nice numpy tutorial](http://cs231n.github.io/python-numpy-tutorial/)

If you want something really exhaustive, read either of the books by Hans Petter Langtangen (KAUST has access to the e-books).  

To get started, we import the numpy module.  We will also tell Python that we want to refer to numpy by the short abbreviation "np":

In [None]:
import numpy as np

### Arrays

The most important Numpy class is the array (its full name is `ndarray`).  You can make arrays just like you would in MATLAB:

In [None]:
x = np.linspace(0, 1, 5)
print(x)

**Try changing the inputs to the *linspace* function and see if you can determine exactly what it does.**

In [None]:
y = np.arange(0, 1, 0.2)
print(y)

**Try changing the inputs to the *arange* function to determine exactly what it does.**  You can also use the help (via "?") to get an explanation.

Like `range`, `arange` omits the final value.
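
As a rough side-by-side sketch of the two functions, using the same inputs as the cells above: the third argument of `linspace` is the number of points, while the third argument of `arange` is the step size.

```python
import numpy as np

# linspace(start, stop, num): num evenly spaced points, endpoint included
x = np.linspace(0, 1, 5)

# arange(start, stop, step): values spaced by step, stopping before stop
y = np.arange(0, 1, 0.2)

print(x)               # spacing is 0.25 and the last entry is 1.0
print(y)               # spacing is 0.2 and the last entry is 0.8
print(len(x), len(y))  # both arrays happen to have 5 entries
```

Because of floating-point rounding, `linspace` is usually the safer choice when you need a specific number of points with a non-integer spacing.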

Arrays are like lists, except that you can perform math with them in an easier and faster way:

In [None]:
print(x+y)

In [None]:
print(x*y)

The syntax for creating a multidimensional array in numpy is also similar to MATLAB:

In [None]:
A = np.array([[1,2,3],[2,5,6],[3,8,9]])
print(A)

### Indexing

You can slice numpy arrays just as in Fortran 90 or MATLAB, but (as with lists) the arrays are indexed from zero and you don't get the last element of a slice.

In [None]:
print(A[1,0])

In [None]:
print(A[:,0])

You can index in some slightly fancier ways, too:

In [None]:
print(A[:,1:])

In [None]:
print(A[-1,:])

You can do lots of other things with arrays.  Type "A." (notice the period!) in a code cell and press `tab` to see some of them.

### Array and matrix multiplication
How does multiplication of arrays work? Let's see:

In [None]:
print(A*A)

By default, numpy just multiplies componentwise. If you want to do matrix-matrix (or matrix-vector) multiplication, use the *dot* function:  

In [None]:
np.dot(A,A)

In Python 3.5 and later, you can also just use the "@" symbol:

In [None]:
A@A

In [None]:
x = np.array([1,2,3])

In [None]:
np.dot(A,x)

In [None]:
A@x

Note that Numpy doesn't have different representations for "row" or "column" vectors.  It will interpret a vector whichever way is necessary so that the operation you ask for makes sense:

In [None]:
x@A

In [None]:
np.dot(x,x)

In [None]:
x@x

In [None]:
np.outer(x,x)

### Linear algebra
What else can you do with ndarrays? Besides basic arithmetic, there is a nice linear algebra package:  

In [None]:
np.linalg?

We can use this package to solve a linear system of equations: $Mx=b$

In [None]:
M = np.array([[0,2],[8,0]])
b = np.array([1,2])
print(M)
print(b)

In [None]:
x = np.linalg.solve(M,b)
print(x)

**Can you think of an easy way to check that $x$ is the correct solution?  Program your check in the box below.**

In [None]:
M@x
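
Comparing `M@x` with `b` by eye works for small systems. For larger ones, a possible approach (one option among several) is to look at the residual $Mx-b$, or to let `np.allclose` do the comparison up to rounding error:

```python
import numpy as np

M = np.array([[0, 2], [8, 0]])
b = np.array([1, 2])
x = np.linalg.solve(M, b)

residual = M @ x - b          # should be zero up to rounding error
print(residual)
print(np.allclose(M @ x, b))  # True if M@x equals b within a small tolerance
```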

We can also solve eigenvalue problems:

In [None]:
lamda, V = np.linalg.eig(M)
print(lamda)
print(V)

Notice how we have put two variables on the left of the equals sign, to assign the outputs of *eig()* to two different variables. What are the two outputs?  **How can you check that these outputs are correct?  Program it in the box below.**
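
If you get stuck, here is one possible check, offered as a sketch rather than the only answer. It assumes the first output holds the eigenvalues and the columns of the second output are the corresponding eigenvectors, so each pair should satisfy $Mv = \lambda v$:

```python
import numpy as np

M = np.array([[0, 2], [8, 0]])
lamda, V = np.linalg.eig(M)

# Check each eigenpair separately: column k of V goes with lamda[k]
for k in range(len(lamda)):
    v = V[:, k]
    print(lamda[k], np.allclose(M @ v, lamda[k] * v))

# The same check for all pairs at once: V * lamda scales column k by lamda[k]
print(np.allclose(M @ V, V * lamda))
```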

**Try playing around with some of the things you found in the box below. See if you can do the following:**   
1. Create a 10x10 matrix whose $(i,j)$ entry is equal to $i\times j$ without using a loop. Hint: use <em>np.fromfunction()</em>.  
2. Split your 10x10 matrix into five $10 \times 2$ matrices with one line of code. Hint: use <em>np.hsplit()</em>.

## Matplotlib

For plotting we will use the [matplotlib](http://matplotlib.org/) package.  It works a lot like MATLAB's plotting functions.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

The line beginning with a "%" is referred to as a magic function.  It makes plots appear in the browser, rather than in a separate window.  If you want to know about all of IPython's magic functions, just type "%magic".

Now for a very simple example. Suppose we want to plot the function $\sin(\exp(x))$ on the interval $x\in(0,4)$. We'll use the numpy versions of the sine and exponential functions, which operate on arrays (the math module versions operate only on scalars):  

In [None]:
x=np.linspace(0,4,1000)
f=np.sin(np.exp(x))
plt.plot(x,f,'--k');

We'll also see later how to make animations with Matplotlib and Jupyter.

### A note about speed

Like MATLAB, Python is relatively slow -- especially when using **loops** with many iterations, nested loops, or deeply nested function calls.  For the exercises in this course, Python will be sufficiently fast, but you should replace loops with numpy operations on whole arrays (such as slicing) whenever possible.

We will often need to compute the difference of each pair of successive entries of an array.  Here are two ways to do it.  Which is faster?  We can find out using another "magic" function, `%%timeit`:

In [None]:
N = 1000000
a = np.random.rand(N)
b = np.zeros(N-1)

In [None]:
%%timeit
for i in range(len(b)):
    b[i] = a[i+1]-a[i]  # Compute successive differences

In [None]:
%%timeit
b = a[1:]-a[:-1]
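
Before comparing timings, it is worth confirming that the two approaches give the same answer. The sketch below does that on a smaller array (the names `b_loop`, `b_slice`, and `b_diff` are just illustrative), and also shows numpy's built-in `np.diff`, which computes the same successive differences:

```python
import numpy as np

a = np.random.rand(1000)

# Loop version: one subtraction per iteration
b_loop = np.zeros(len(a) - 1)
for i in range(len(b_loop)):
    b_loop[i] = a[i + 1] - a[i]

# Slicing version: subtract two shifted views of the same array
b_slice = a[1:] - a[:-1]

# Built-in version
b_diff = np.diff(a)

print(np.allclose(b_loop, b_slice), np.allclose(b_slice, b_diff))
```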

For large-scale computational problems, you shouldn't rely on pure Python loops for code that needs to be fast. Instead, you can write or generate code in C or Fortran and use [cython](http://cython.org/) or [f2py](http://www.f2py.com/) to incorporate the compiled code into your Python program, or use a just-in-time compiler like [numba](https://numba.pydata.org/) to compile numerical Python functions directly.

## Why use Python?

Hopefully this brief adventure has convinced you that Python is a worthwhile programming language.  Some of the reasons I chose Python as the language for this course (and other courses I teach) are:
- Compared to lower-level languages like C, C++, and Fortran, Python code is easier to write and read (meaning fewer bugs!)
- A huge number of available libraries means it is easy to get things done in Python without writing a lot of code yourself
- Python is a general-purpose programming language, with features (like optional arguments with defaults) that lead to more elegant, maintainable code than what you can write with specialized languages like MATLAB.
- Python is free and comes installed on most systems.  You can run this whole course with only a web browser and access to the internet.
- It's easy to incorporate code written in other languages into your Python programs.
- It's often easy to [parallelize existing scientific codes](http://numerics.kaust.edu.sa/papers/pyclaw-sisc/pyclaw-sisc.html) using Python.
- All these advantages have made it [very](https://www.economist.com/graphic-detail/2018/07/26/python-is-becoming-the-worlds-most-popular-coding-language) [popular](https://stackoverflow.blog/2017/09/06/incredible-growth-python/)

For a longer discussion with links to some scientific studies of the value of Python, see [this blog post by Lorena Barba](http://lorenabarba.com/blog/why-i-push-for-python/).

## Further resources

### Python, numpy, and matplotlib

We've only scratched the surface of Python here.  You'll learn more in the rest of the course -- and it's great to learn by doing.  But if you want a really solid foundation, you may want to take a look at these:

- http://www.learnpython.org/ (free)
- http://www.codecademy.com/tracks/python (free)
- http://www.diveintopython.net/ (free)
- [A Primer on Scientific Programming with Python](http://www.amazon.com/Scientific-Programming-Computational-Science-Engineering/dp/3642302920) (book; not free)
- [Matplotlib tutorials](https://matplotlib.org/tutorials/index.html) (free)

### Other Python packages for science

Besides numpy and matplotlib, there are many other useful Python packages for scientific computing. Here is a short list:  

- [scipy](http://scipy.github.io/devdocs//) - optimization, ODEs, sparse linear algebra, etc.
- [sympy](http://sympy.org/) - symbolic computation
- Visualization: [yt](http://yt-project.org/), [vispy](http://vispy.org/), [Bokeh](http://bokeh.pydata.org/)
- [pandas](http://pandas.pydata.org/) - data analysis
- [mpi4py](https://mpi4py.readthedocs.io/en/stable/) - parallel computing
- [petsc4py](https://petsc4py.readthedocs.io/en/stable/), [pytrilinos](https://trilinos.org/packages/pytrilinos/) - Python bindings for the "big 2" parallel scientific libraries
- [pyCUDA](http://mathema.tician.de/software/pycuda), [pyOpenCL](http://mathema.tician.de/software/pyopencl) - GPGPU computing
- [FENiCS](http://fenicsproject.org/), [FiPy](http://www.ctcms.nist.gov/fipy/), [PyClaw](http://clawpack.github.io/pyclaw/) - solve complicated PDEs with very sophisticated numerical methods
- [networkX](http://networkx.github.com/), [pygraphviz](https://pygraphviz.github.io/) - graphs
- [astropy](http://www.astropy.org/), [biopython](http://biopython.org/wiki/Main_Page), [pychem](http://pychem.sourceforge.net/) - discipline-specific tools