# Python and JupyterHub basics

Welcome to this `jupyter` notebook! A `jupyter` notebook is a way to organize Python code and text into blocks which can be run one by one. It's my personal favorite way to do lightweight coding with a lot of visual components. This notebook introduces some basic Python concepts we will need in this course.

## Running blocks of code

The first thing you need to know is how to run a code block. The grey block below is a _code block_. You can run it by clicking on the little play button that appears on the left when you click on the block. (You can also use Shift+Return as a shortcut.)

In [None]:
print('Welcome to TDA!')

If you ran it successfully, it should have printed a message for you. That printed message is stored in the notebook as well, so when you download it later on, it'll still be there.

One of the basic operations in Python is to store a value in a variable. For example, let's store the value `2.5` in a variable `x`. We can do this by executing the following code block.

In [None]:
x = 2.5

When you ran the block, nothing was printed as output. However, if you want to see what's in the variable `x`, you can just write a code block with that variable in it, and run it, like so.

In [None]:
x

This seems very basic once you know it, but it's a fundamental trick for looking inside your variables.
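
One caveat worth knowing: by default, a notebook only displays the value of the last expression in a block. Here is a minimal sketch of how `print` shows several values from one block. The second variable `y` is hypothetical and not part of this notebook.

```python
x = 2.5
y = 4  # hypothetical second variable, just for illustration

# print() shows each value explicitly, wherever it appears in the block
print(x)
print(y)

# type() tells you what kind of object a variable holds
print(type(x))
```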

You may be wondering how I made this text in between the code blocks. You can make a text block by clicking +Text above. You can also make a code block by clicking +Code. The text blocks use Markdown typesetting, but we don't need to get into that.

## Importing a library

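Before we can use a Python library, we need to import it. Libraries that are not part of Python's standard library, such as `gudhi`, also need to be installed in the environment you are working in. Try to execute the code block below to import `gudhi`. If it fails, let me know.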

In [None]:
import gudhi
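
If you would like to check whether a library is installed before importing it, one possible approach uses the standard `importlib` module. This is only a sketch, not a required step for this course. The variable name `gudhi_available` is an illustrative choice.

```python
import importlib.util

# find_spec returns None when no installed package has this name
gudhi_available = importlib.util.find_spec("gudhi") is not None

if gudhi_available:
    print("gudhi is installed and can be imported")
else:
    print("gudhi was not found; it may need to be installed first")
```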

## Exercise 1: making and plotting noisy data

As a basic exercise in Python, we are going to generate some noisy point cloud data and plot it. If you're already very familiar with Python, you can skip all this. 

First, let's import numpy, a great library for doing anything with vectors.

In [None]:
import numpy as np

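Let's generate a set of points which lie on a circle. We'll use list comprehension, which builds a list of numbers (or other objects) using a for loop. First, let's generate evenly spaced numbers between 0 and 1 using the numpy function `arange`. Note that `arange` returns a numpy array rather than a Python list, and that it stops before the end value, so 1 itself is not included.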

In [None]:
points_in_interval = np.arange(0,1,0.01)

How do we see what's inside `points_in_interval`? That's right, just put in the variable and run the code block.

In [None]:
points_in_interval
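
A few quick checks can tell you more about what `arange` produced. The sketch below only uses built-in functions and indexing; it repeats the definition so that it runs on its own.

```python
import numpy as np

points_in_interval = np.arange(0, 1, 0.01)

print(type(points_in_interval))  # a numpy array, not a Python list
print(len(points_in_interval))   # number of points generated
print(points_in_interval[0])     # first value
print(points_in_interval[-1])    # last value, one step short of 1
```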

Now we can generate the x and y coordinates of our circular data set using a list comprehension. Our x-coordinates need to be a list of $\cos(2 \pi t)$ values, where $t$ runs over the values in `points_in_interval`. Our y-coordinates are similar, with $\sin$ in place of $\cos$.

*Python tip:* lines with # signs are comments and won't be executed. Use them for making your code more readable.

In [None]:
#x-coordinates
X = [np.cos(t*2*np.pi) for t in points_in_interval]

In [None]:
#y-coordinates
Y = [np.sin(t*2*np.pi) for t in points_in_interval]
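
As an aside, numpy functions such as `np.cos` also accept a whole array at once, so the same coordinates can be computed without a list comprehension. The names `X_vec` and `Y_vec` below are illustrative and not used elsewhere in this notebook.

```python
import numpy as np

points_in_interval = np.arange(0, 1, 0.01)

# list comprehension version
X = [np.cos(t*2*np.pi) for t in points_in_interval]

# vectorized version: np.cos is applied to every entry of the array
X_vec = np.cos(2*np.pi*points_in_interval)
Y_vec = np.sin(2*np.pi*points_in_interval)

# both approaches give the same numbers (up to rounding)
print(np.allclose(X, X_vec))
```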

Let's check we've done it right by plotting our data. We'll use matplotlib, a plotting library.

In [None]:
import matplotlib.pyplot as plt

The function scatter() from matplotlib takes the x-coordinates and y-coordinates as arguments.

In [None]:
plt.scatter(X, Y)

It doesn't look quite circular, because the axes are not equally scaled. That's easy to fix if we really want to...

In [None]:
plt.scatter(X, Y)
plt.axis('equal')

Okay, time to noise up this data. We can use numpy's `random.normal` function to make an array of random values drawn from a normal distribution.
The `scale` parameter is the standard deviation of the distribution we're sampling from. We'll make the size of the array equal to the length of `X`, that is, `len(X)`.

In [None]:
X_noise = np.random.normal(scale=0.1, size=len(X))

Look at it!

In [None]:
X_noise

We can plot a histogram of it to check it looks "normal".

In [None]:
plt.hist(X_noise)
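
Besides the histogram, a quick numerical check is to compare the sample mean and standard deviation with the values we asked for. This sketch uses a larger sample and numpy's `default_rng` generator with a fixed seed so that the result is reproducible. Both are my choices for illustration, not part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, chosen arbitrarily
noise = rng.normal(scale=0.1, size=10000)

# for a normal distribution with scale=0.1 we expect
# a mean close to 0 and a standard deviation close to 0.1
print(noise.mean())
print(noise.std())
```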

Okay, let's make the Y noise and add the noise to our coordinates. `X` and `Y` are Python lists, and `+` on two lists concatenates them rather than adding entry by entry, so we first turn them into numpy arrays, i.e. vectors.

In [None]:
Y_noise = np.random.normal(scale=0.1, size=len(X))
#turn the X and Y into vectors and add the noise
X = np.array(X) + np.array(X_noise)
Y = np.array(Y) + np.array(Y_noise)
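
To see why the conversion to arrays matters, here is a small sketch with two made-up lists `a` and `b` (hypothetical names, not used elsewhere in this notebook).

```python
import numpy as np

a = [1, 2, 3]
b = [10, 20, 30]

# + on two Python lists joins them end to end
print(a + b)

# + on two numpy arrays adds them entry by entry
print(np.array(a) + np.array(b))
```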

We can use the same code as before to look at our noisy data.

In [None]:
plt.scatter(X, Y)
plt.axis('equal')

## Exercise 2: Handling arrays

We will frequently have to deal with values stored as a 2D array or matrix, and it helps to know how to manipulate these.

The basic data structure in `numpy` is an array. You can build an array from a list like this:

In [None]:
X = np.array([1,2,3,4,5])

In [None]:
X

We have now made `X` a 1D array. If we want a 2D array, we have to input a list of lists, like so.

In [None]:
M = np.array([[1,2,3],[4,5,6]])

In [None]:
M

What if I want to access the (1,2) entry of `M`, that is, the entry in row 1 and column 2? (Remember, Python indexes from 0, so this is the second row and the third column.) This is how you do that.

In [None]:
M[1,2]

By using a `:`, you can access whole rows or whole columns. 

In [None]:
M[1,:]

In [None]:
M[:,2]
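
To keep track of rows and columns, it can help to look at the array's `shape` attribute alongside the slices. The sketch below repeats the definition of `M` so that it runs on its own.

```python
import numpy as np

M = np.array([[1,2,3],[4,5,6]])

print(M.shape)   # (number of rows, number of columns)
print(M[1,:])    # row with index 1, i.e. the second row
print(M[:,2])    # column with index 2, i.e. the third column
print(M[-1,-1])  # negative indices count from the end
```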

A common error is to try to access a row that doesn't exist. Here's what happens if I try to access the row with index 5 of `M`, which only has rows 0 and 1.

In [None]:
M[5,:]

A crucial part of learning to code in a new environment is learning what different errors mean. Some are helpful, like the above. Some are not so helpful though. When in doubt, **Google the error**. You would be surprised how often researchers (e.g. me) do this while coding.
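
If you want your code to keep running after an error like this, Python's `try`/`except` lets you catch it and inspect the message. A minimal sketch, where the variable name `message` is my own choice:

```python
import numpy as np

M = np.array([[1,2,3],[4,5,6]])

try:
    row = M[5,:]
except IndexError as err:
    # numpy raises IndexError when an index is outside the array
    message = str(err)
    print("Caught an error:", message)
```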

There's too much basic Python to cover in a single notebook, but here is an important trick that I want to mention. Python _loves_ lists (things that look like `[a,b,c]`). It is set up to handle them very nicely. _List comprehension_ is a way of working through a list one element at a time and calculating some output. For example, suppose I have this list: 

In [None]:
l = [1,2,3,4,5]

I want to create a new list whose entries are the entries of `l` times two. I can do this with list comprehension like so.

In [None]:
m = [2*x for x in l]

In [None]:
m

I can also create a list consisting only of those elements which are greater than two like this.

In [None]:
m = [x for x in l if x > 2]

In [None]:
m

If you want to do something to a list, chances are you can do it with list comprehension. 
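
For comparison, here is a sketch of a list comprehension that combines both ideas (doubling and filtering) next to the same computation written as an ordinary for loop. The name `m_loop` is just for illustration.

```python
l = [1,2,3,4,5]

# list comprehension: double every entry that is greater than two
m = [2*x for x in l if x > 2]

# the same computation written as an ordinary for loop
m_loop = []
for x in l:
    if x > 2:
        m_loop.append(2*x)

print(m)
print(m_loop)
```

The two versions produce the same list; the comprehension is simply more compact.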

## Handing in your work

When you've finished a notebook, you need to download the .ipynb file and upload it to Canvas in the right assignment. Try this now. Download this file (the .ipynb file) and upload it to Assignment 0.