# Introduction to Python using Jupyter

Python is a general-purpose programming language, but here we will focus on scientific computing with the NumPy package. In the Jupyter environment, Python code is organized in cells that can be run separately. Let's have a look at the following cell:

In [None]:
# (Comments in Python start with #)
# Run the below code by clicking this cell and hitting shift+enter
print('Hello world')

In what follows, the cells are meant to be run in the order they appear. Changing the ordering may cause errors. 

## Using Python as a calculator

An excellent way to learn Python is to read the 
[Python Tutorial](https://docs.python.org/3/tutorial/index.html). I have reproduced here a couple of examples from [Section 3.1](https://docs.python.org/3/tutorial/introduction.html#using-python-as-a-calculator) of the tutorial.

Expression syntax is straightforward: the operators `+`, `-`, `*` and `/` work just like in most other languages (for example, Pascal or C); parentheses `()` can be used for grouping. For example:

In [None]:
50 - 5*6

In [None]:
(50 - 5*6) / 4

In [None]:
8 / 5  # division always returns a floating point number
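
The same section of the tutorial also introduces floor division, the remainder operator and powers. A minimal sketch of these three operators (the numbers are arbitrary examples):

```python
# Floor division discards the fractional part of the quotient
quotient = 17 // 3
# The % operator returns the remainder of the division
remainder = 17 % 3
# Powers are computed with **
square = 5 ** 2

print(quotient, remainder, square)
# The quotient and the remainder reproduce the original number
print(quotient * 3 + remainder)
```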

## Variables 

A variable is a name that refers to a stored value. Let's look at some examples:

In [None]:
# We define variables a, b and c
a = 1
b = 1
c = a + b

# We can change the value of a
a = 2

# Print all three variables using a formatted string literal (f-string), see
# https://docs.python.org/3/reference/lexical_analysis.html#f-strings
print(f'{a = }, {b = }, and {c = }')

In [None]:
# We can also print the value of one of the variables simply by
a

In [None]:
# Get info on all the variables
%whos

## The help system

Appending `?` to a function or command name gives a short description of it. More detailed documentation can be found in the references at the bottom of the _Help_ menu. The JupyterLab environment consists of several components, and each of them is documented separately. The following should help you choose the right reference:
* Python for regular python functions such as `print`
* IPython for commands starting with `%`, for example `%whos`
* NumPy for functions related to matrices
* Matplotlib for functions related to plotting

In [None]:
# Get description of the print function
print?

In [None]:
# Get description of the %whos command
%whos?

In [None]:
# Appending ? does not work with operators but the following does
help('+')

In [None]:
# You may find the interactive help system useful; personally, I prefer googling.
# Note that running this opens up an input box and the kernel is in the busy state 
# (see the status line at the bottom) until you type 'quit' in the input box.
help()

## Constants

The imaginary unit is `1j`. Note that `j` by itself can be used as the name of a variable (but, to avoid confusion, you probably don't want to do this if you are also using complex numbers). The NumPy package defines many commonly used constants. Before we can use them, we need to import the package. (For more info on importing, see the [Modules section](https://docs.python.org/3/tutorial/modules.html#modules) of the Python Tutorial.)

To import NumPy run the following cell. **If you restart the kernel you need to run the cell again.** In other words, if you get the error message _"NameError: name 'np' is not defined"_, then run the below cell. Note that the import statement does not produce any visible output.

In [None]:
import numpy as np

Recall that Euler's identity says $e^{\pi i} + 1 = 0$. Let's write the left-hand side of the identity in Python.

In [None]:
# Note that power is ** instead of the more common ^  
# This should give zero up to the machine precision
err = np.e**(np.pi * 1j) + 1
err

The default data type for numbers in NumPy is called `float`. (Somewhat confusingly, it is essentially synonymous with `np.double`, and on a typical system it is `np.float64`. The older alias `np.float_` was removed in NumPy 2.0.) Let's get info on the precision and limits of this type.

In [None]:
print(np.finfo(float))

In [None]:
# Machine epsilon gives an upper bound on the relative error due to rounding
# in floating point arithmetic. For the default float type its value is
eps = np.finfo(float).eps
eps

In [None]:
# Check that the rounding error err above is indeed less than eps
abs(err) < eps

In [None]:
# Print eps after rounding to two significant digits
print(f'{eps:.1e}')
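
To see what machine epsilon means in practice, here is a minimal sketch. It relies on the fact that `eps` is the gap between `1.0` and the next larger floating point number, and it uses `np.isclose` as one possible way to compare floats with a tolerance:

```python
import numpy as np

eps = np.finfo(float).eps

# eps is the gap between 1.0 and the next larger float,
# so adding it to 1.0 gives a different number
gap_is_visible = (1.0 + eps) > 1.0

# Adding only half of the gap is rounded back to 1.0
half_is_lost = (1.0 + eps / 2) == 1.0

# A practical consequence: compare floats with a tolerance instead of ==
exact_equal = (0.1 + 0.2) == 0.3
close_enough = np.isclose(0.1 + 0.2, 0.3)

print(gap_is_visible, half_is_lost, exact_equal, close_enough)
```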

## Mathematical functions

See the [list of mathematical functions](https://numpy.org/doc/stable/reference/routines.math.html) in NumPy for documentation of many mathematical functions.

Recall that $\sin(\pi) = 0$. Let's write the left-hand side of the formula in Python.

In [None]:
# This should give zero up to the machine precision
np.sin(np.pi)

## First steps towards programming

Consult [Section 3.3](https://docs.python.org/3/tutorial/introduction.html#first-steps-towards-programming) of the tutorial for a detailed explanation of the below example. 

In [None]:
# Fibonacci series:
# the sum of two elements defines the next
a, b = 0, 1
while a < 10:
    print(a)
    a, b = b, a+b

## Control flow

Consult [Section 4](https://docs.python.org/3/tutorial/controlflow.html#more-control-flow-tools) of the tutorial for a detailed explanation of the below examples. 

In [None]:
x = int(input("Please enter an integer: "))

if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
elif x == 1:
    print('Single')
else:
    print('More')

In [None]:
for i in range(5):
    print(i)

In [None]:
for n in range(2, 10):
    for x in range(2, n):
        if n % x == 0:
            print(n, 'equals', x, '*', n//x)
            break
    else:
        # loop fell through without finding a factor
        print(n, 'is a prime number')
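
In the last cell the `else` belongs to the inner `for` loop, not to the `if`: it runs only when the loop finishes without reaching `break`. A minimal sketch that isolates this behaviour; the helper `smallest_factor` is a hypothetical name used for illustration only:

```python
def smallest_factor(n):
    """Return the smallest factor of n in 2..n-1, or None if there is none."""
    for x in range(2, n):
        if n % x == 0:
            found = x
            break
    else:
        # Reached only if the loop ended without break
        found = None
    return found

print(smallest_factor(9))
print(smallest_factor(7))
```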

## Functions

The below example is from [Section 4.1](https://docs.python.org/3/tutorial/controlflow.html#defining-functions) of the tutorial. Here the first statement of the function body is a string literal; this is the function’s documentation string, or docstring. (More about docstrings can be found in the section [Documentation Strings](https://docs.python.org/3/tutorial/controlflow.html#tut-docstrings).) One nice feature is that docstrings work with the help system. 

In [None]:
def fib2(n): 
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)    
        a, b = b, a+b
    return result

# Now call the function we just defined:
fib2(2000)

In [None]:
fib2?

In [None]:
# We can see the full definition of fib2 function as follows
fib2??
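
The `?` and `??` suffixes are IPython features. In plain Python the same docstring is available through the `__doc__` attribute of the function (or through the built-in `help` function). A small sketch that repeats the definition of `fib2` so that it runs on its own:

```python
def fib2(n):
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result

# The docstring is stored as an attribute of the function object
doc = fib2.__doc__
print(doc)
```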

## Matrices and vectors

NumPy implements the most common operations and functions related to matrices and vectors. If you are familiar with Matlab, the best place to start is the guide for [Matlab users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html#numpy-for-matlab-users). Even if you don't know Matlab, the guide is worth glancing at since it summarizes the features of NumPy well. In addition to what follows, you might also want to read the [NumPy: the absolute basics for beginners](https://numpy.org/doc/stable/user/absolute_beginners.html) document.

I have reproduced here some examples from the [NumPy quickstart](https://numpy.org/doc/stable/user/quickstart.html#numpy-quickstart) guide. Some of the examples have been modified slightly. In NumPy's documentation both vectors and matrices are called arrays.

In [None]:
# Define a vector with 4 elements
# Here the vector is created from a list
# See Section 3.1.3 of the Python Tutorial for info on lists:
# https://docs.python.org/3/tutorial/introduction.html#lists
x = np.array([1.2, 3.5, 5.1, 0])
x

In [None]:
# Create 3 x 4 matrix full of zeros
a = np.zeros((3, 4))
a

In [None]:
# Dimensions of x and a
print(f'{x.shape = } and {a.shape = }')
# Number of elements in x and a
print(f'{x.size = } and {a.size = }')

In [None]:
# Create a vector with 9 evenly spaced numbers between 0 and 2
np.linspace(0, 2, 9)

In [None]:
# If an array is too large to be printed,
# NumPy automatically skips the central part of the array and only prints the corners
np.linspace(0, 2, 1001)

In [None]:
# This behaviour is controlled by threshold in printoptions, see
# https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html#numpy-set-printoptions
# The function get_printoptions returns the options as a dictionary.
# For more info on dictionaries see Section 5.5 of the Python Tutorial
# https://docs.python.org/3/tutorial/datastructures.html#dictionaries
np.get_printoptions()['threshold']
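
If you want to change the threshold only temporarily, one option is the `np.printoptions` context manager, which restores the previous settings when the block ends. A minimal sketch; the threshold value 5 is an arbitrary choice, and the second printout assumes the default print options are otherwise in effect:

```python
import numpy as np

v = np.arange(10)

# Inside the block, arrays with more than 5 elements are summarized
with np.printoptions(threshold=5):
    short = str(v)

# Outside the block the previous settings apply again
full = str(v)

print(short)
print(full)
```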

In [None]:
# Arithmetic operators on arrays apply elementwise.
# A new array is created and filled with the result.
a = np.array([20, 30, 40, 50])
b = np.arange(4)
b

In [None]:
c = a - b
c

In [None]:
# Many mathematical functions apply elementwise
10 * np.sin(a)

In [None]:
# The matrix product can be performed using the @ operator
A = np.array([
    [1, 1],
    [0, 1]])
B = np.array([
    [2, 0],
    [3, 4]])
A @ B
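
Since `*` applies elementwise, it is easy to confuse it with the matrix product. A minimal sketch that contrasts the two operators on the same matrices as above:

```python
import numpy as np

A = np.array([
    [1, 1],
    [0, 1]])
B = np.array([
    [2, 0],
    [3, 4]])

elementwise = A * B  # multiplies entries in matching positions
matrix = A @ B       # rows of A times columns of B

print(elementwise)
print(matrix)
```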

In [None]:
# Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.
a = np.ones((2, 3)) # 2 x 3 matrix full of ones
# Let's use some random numbers for our next example matrix
rg = np.random.default_rng()  # create instance of default random number generator
b = rg.random((2, 3)) # 2 x 3 matrix full of uniform random numbers in the interval [0,1)
a *= 3
a

In [None]:
b += a
b
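
One thing to keep in mind with in-place operations: the result is stored in the existing array, so it must fit that array's data type. The NumPy quickstart guide points out that adding floats to an integer array in place fails. A minimal sketch of this, using a fixed seed (an arbitrary choice) to make the random numbers reproducible:

```python
import numpy as np

rg = np.random.default_rng(1)    # the seed 1 is arbitrary
a = np.ones((2, 3), dtype=int)   # integer array
b = rg.random((2, 3))            # float array

# b is a float array, so the integer values of a can be added in place
b += a

# The other direction fails: floats cannot be stored in an integer array
try:
    a += b
    failed = False
except TypeError as e:
    failed = True
    print(type(e).__name__)

print(b.dtype, a.dtype, failed)
```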

In [None]:
# If you had a look at the NumPy quickstart guide,
# you might wonder what class and method mean in the following passage:
# "Many unary operations, such as computing the sum of all the elements in the array, 
# are implemented as methods of the ndarray class."
# An explanation can be found in Section 9 of the Python Tutorial, see
# https://docs.python.org/3/tutorial/classes.html#classes
# We are not going to define any classes in this course, though, so this is not very important.

# By default, unary operations apply to the array as though it were a list of numbers, regardless of its shape. 
b.sum()    # sum of all elements

In [None]:
# However, by specifying the axis parameter you can apply an operation along the specified axis of an array:
b.sum(axis=0)    # sum of each column

In [None]:
b.min(axis=1)    # min of each row
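
Because `b` above contains random numbers, the effect of `axis` can be hard to check by eye. A minimal sketch with a small deterministic matrix (chosen only for illustration):

```python
import numpy as np

b = np.arange(12).reshape(3, 4)   # rows are [0..3], [4..7] and [8..11]

col_sums = b.sum(axis=0)   # collapses the rows: one value per column
row_mins = b.min(axis=1)   # collapses the columns: one value per row

print(b)
print(col_sums)
print(row_mins)
```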

In [None]:
# Arrays can be iterated over, indexed and sliced 
a = np.arange(10)**3
a

In [None]:
for i in a:
    print(i**(1 / 3.))  # Rounding errors are observed in the cube root

In [None]:
a[2]

In [None]:
a[2:5]

In [None]:
# from start to position 6, exclusive, set every 2nd element to 1000
a[:6:2] = 1000
a

In [None]:
a[::-1]  # reversed a

In [None]:
def f(x, y):
    return 10 * x + y
b = np.fromfunction(f, (5, 4))
b

In [None]:
b[2, 3]

In [None]:
b[0:5, 1]  # each row in the second column of b

In [None]:
b[:, 1]    # equivalent to the previous example

In [None]:
b[1:3, :]  # each column in the second and third row of b

In [None]:
b[-1]   # the last row. Equivalent to b[-1, :]

In [None]:
# Iterating over multidimensional arrays is done with respect to the first axis:
for row in b:
    print(row)

In [None]:
# However, if one wants to perform an operation on each element in the array, 
# one can use the flat attribute which is an iterator over all the elements of the array:
for element in b.flat:
    print(element)

In [None]:
# The diagonal of a 4 x 4 submatrix of b
d = np.diag(b[0:4])
d

In [None]:
# Create a diagonal matrix
np.diag(d, 0)

In [None]:
# Simple assignments or function calls make no copies.
a = np.array([
    [ 0,  1,  2,  3],
    [ 4,  5,  6,  7],
    [ 8,  9, 10, 11]])
b = a    # no new array is created
b is a   # a and b are two names for the array

In [None]:
b[0,0] = -100
a

In [None]:
# Slicing an array returns a view of it, not a copy.
s = a[:, 0:1]
s[0,0] = -200
a

In [None]:
# The copy method makes a complete copy of the array and its data.
d = a.copy()
d[0, 0] = 9999
a
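
If you are unsure whether two arrays share their data, you can ask NumPy directly. A minimal sketch using `np.shares_memory` and the `base` attribute of an array; the matrix of zeros is an arbitrary example:

```python
import numpy as np

a = np.zeros((3, 4))
s = a[:, 0:1]   # a slice is a view of a
d = a.copy()    # an explicit copy has its own data

view_shares = np.shares_memory(a, s)
copy_shares = np.shares_memory(a, d)
print(view_shares, copy_shares)

# The base attribute of a view refers to the array that owns the data
print(s.base is a)

s[0, 0] = 1.0   # changes a as well
d[1, 1] = 2.0   # leaves a untouched
print(a)
```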

In [None]:
# The shape of an array can be changed with various commands. 
# Note that the following three commands all return a modified array, 
# but do not change the original array.
# (If you want to change it, use resize.)
a = np.floor(10 * rg.random((3, 4)))
a

In [None]:
a.ravel()  # returns the array, flattened

In [None]:
a.reshape(6, 2)

In [None]:
a.T  # returns the array, transposed

In [None]:
print(f'{a.T.shape = } and {a.shape = }')
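
When reshaping, one of the dimensions can be given as `-1`, in which case NumPy computes it from the total number of elements. A minimal sketch with a deterministic matrix (chosen only for illustration):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# -1 asks NumPy to compute this dimension from the total size
b = a.reshape(2, -1)

# The original array keeps its shape
print(a.shape, b.shape)
print(a.ravel().shape)
```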

In [None]:
# Several arrays can be stacked together along different axes
# Here we use vstack (its alias row_stack is deprecated in NumPy 2.0).
# You can also experiment with column_stack and hstack, see
# https://numpy.org/doc/stable/user/quickstart.html#stacking-together-different-arrays
# For the opposite effect, that is, splitting, use vsplit and hsplit, see
# https://numpy.org/doc/stable/user/quickstart.html#splitting-one-array-into-several-smaller-ones
a = np.array([4., 2.])
b = np.array([3., 8.])
np.vstack((a, b))
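
To compare the stacking functions mentioned in the comments of the previous cell, here is a minimal sketch that applies three of them to the same two vectors:

```python
import numpy as np

a = np.array([4., 2.])
b = np.array([3., 8.])

rows = np.vstack((a, b))         # a and b become the rows of a matrix
cols = np.column_stack((a, b))   # a and b become the columns of a matrix
flat = np.hstack((a, b))         # a and b are joined into one longer vector

print(rows)
print(cols)
print(flat)
```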

For more advanced topics, see the [Less Basic](https://numpy.org/doc/stable/user/quickstart.html#less-basic) section of the NumPy quickstart guide. For still more advanced things, see the [NumPy user guide](https://numpy.org/doc/stable/user/index.html).

## Plotting

We will use Pyplot (a part of Matplotlib), which is a collection of functions that work like those in Matlab. For an introduction, see the [Pyplot tutorial](https://matplotlib.org/stable/tutorials/introductory/pyplot.html#pyplot-tutorial). (Note that when running Pyplot in Jupyter, you don't need to use the `show` function that you see in many examples.)

To enable plotting we need to import the Matplotlib package by running the following cell. **If you restart the kernel you need to run the cell again.** In other words, if you get the error message _"NameError: name 'plt' is not defined"_, then run the below cell. Note that the import statement does not produce any visible output.

In [None]:
import matplotlib.pyplot as plt

In [None]:
# Plot one period of the sin function
xs = np.linspace(0, 2*np.pi)
ys = np.sin(xs)
plt.plot(xs, ys); # ; hides the output of this command

In [None]:
# Plot both sin and cos with some tuning of the x axis
L = 2*np.pi
xs = np.linspace(0, L)
plt.plot(xs, np.sin(xs), 'b-')   # sin in blue solid line
plt.plot(xs, np.cos(xs), 'r--')  # cos in red dashed line
plt.gca().set(
    xlim = (0, L),   # remove white space on left and right
    xticks = np.pi * np.array([0, .5, 1, 1.5, 2]),
    xticklabels = ['0', r'$\pi/2$', r'$\pi$', r'$3\pi/2$', r'$2\pi$'],  # raw strings keep the backslashes
    ); 

In [None]:
# See the end of the docstring of plt.plot for more formatting options 
# like 'b-' and 'r--'
plt.plot?

In [None]:
# Log-log plots are useful, for example, for showing convergence rates
xs = np.linspace(1, 2)
ys = xs**2
plt.plot(xs, ys)    # plot the function f(x) = x^2 
plt.figure()        # create a new figure; otherwise loglog would draw on the previous axes
plt.loglog(xs, ys); # plot the same function f with log scales on both axes 
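
A straight line in a log-log plot corresponds to a power law $y = C x^p$, and the slope of the line is the exponent $p$. This is why such plots are handy for reading off convergence rates. A minimal sketch that estimates the slope numerically with a least-squares line fit; using `np.polyfit` for this is one possible choice, not the only one:

```python
import numpy as np

xs = np.linspace(1, 2)
ys = xs**2

# Fit a line to the points (log x, log y); the first coefficient is the slope
slope, intercept = np.polyfit(np.log(xs), np.log(ys), 1)
print(f'{slope:.3f}')
```

For the function $f(x) = x^2$ the estimated slope should be close to 2.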

In [None]:
# Plot a Gaussian function of two variables

# Enable the interactive features of matplotlib so that 
# we can rotate the plot by dragging 
# To revert back to the usual plots run %matplotlib inline
%matplotlib widget

from matplotlib import cm # import colormaps

xs = np.linspace(-1,1)
Xs, Ys = np.meshgrid(xs, xs)
c = 8
Zs = np.exp(-c*(Xs**2 + Ys**2))
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
surf = ax.plot_surface(Xs, Ys, Zs, cmap=cm.coolwarm)

# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5);

## Debugging

Debugging is the process of finding and resolving bugs (defects or problems that prevent correct operation).
Perhaps the simplest way to debug is to print some intermediate results and auxiliary info. While this can be done using the `print` function, it can be convenient to use the `logging` package, since then the auxiliary debug info can easily be turned on and off. For more info on logging, see the [Logging HOWTO](https://docs.python.org/3/howto/logging.html).

You can also use a debugger by clicking the bug icon in the top right corner. Then click a line number in the code below to set a breakpoint, and run the code. The debugger will stop at that line, and you can, for example, inspect variables and run the code line by line. See the documentation of [Jupyter](https://jupyterlab.readthedocs.io/en/stable/user/debugger.html#usage) for a brief animation illustrating this.

In [None]:
import logging
from logging import info
# Info messages can be turned off by changing INFO to WARNING in the line below
logging.getLogger().setLevel(logging.INFO)

def rowreduce(a):
    '''Naive Gaussian elimination of a matrix'''
    n = a.shape[0] # size of the matrix (that is assumed to be square)
    for j in range(n-1):
        d = a[j,j] # the jth diagonal element 
        #TODO We should check that d is not zero here!
        for k in range(j+1,n):
            mu = - a[k,j]/d
            a[k] = a[k] + mu*a[j]

        info(f'The matrix after elimination in col {j+1} \n{a}') 

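# Define a NumPy matrix (see the section on matrices and vectors above).
# Floating point entries are used so that the results of the elimination
# steps are not truncated to integers when they are written back into a
a = np.array([
    [1., 2., 3.],
    [2., 3., 4.],
    [3., 4., 6.],
    ])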
print(f'Original matrix\n{a}')
# You can add here a line saying 
# sys.stdout.flush() 
# to get the expected order of output. This requires importing sys.
rowreduce(a) # This changes a
print(f'Matrix after row reduction\n{a}')
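
The level of a logger decides which messages are shown: a message is processed only if its level is at least the level of the logger. A minimal sketch of turning info messages on and off; the logger name `demo` is a hypothetical name chosen for this example:

```python
import logging

logger = logging.getLogger('demo')  # 'demo' is an arbitrary example name

# With level INFO, info messages are processed
logger.setLevel(logging.INFO)
info_on = logger.isEnabledFor(logging.INFO)

# With level WARNING, info messages are dropped but warnings are kept
logger.setLevel(logging.WARNING)
info_off = logger.isEnabledFor(logging.INFO)
warning_on = logger.isEnabledFor(logging.WARNING)

print(info_on, info_off, warning_on)
```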

## Profiling

Profiling means evaluating the time (or other resource) required by a function to run. Let's have an example based on the `fib2` function that we defined above. (Rerun the cell defining `fib2` if you get the error *"NameError: name 'fib2' is not defined"*.) Commands `%time` and `%timeit` give simple ways to evaluate runtimes. The difference is that the latter gives an average over several runs while the former uses a single run. More sophisticated profiling is discussed for example in this [blog post](https://towardsdatascience.com/speed-up-jupyter-notebooks-20716cbe2025). 

In [None]:
# We don't care about the output of fib2, so we use _ to indicate a throwaway variable
%time _ = fib2(1e6)

In [None]:
%timeit _ = fib2(1e6) # Running this takes a while so be patient
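
`%time` and `%timeit` are IPython commands. Outside Jupyter, the standard library module `timeit` offers similar functionality. A minimal sketch that repeats the definition of `fib2` so that it runs on its own; the argument `1000` and the repetition count `100` are arbitrary choices made to keep the run short:

```python
import timeit

def fib2(n):
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result

# Total time in seconds for 100 calls of fib2(1000)
total = timeit.timeit(lambda: fib2(1000), number=100)
print(f'{total / 100:.2e} seconds per call')
```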