# A (Shorter) Crash Course in Python for Scientists

Heavily based on the original [Notebook (v6.0)](https://gist.githubusercontent.com/rpmuller/5920182/raw/97620296ff6c01d1fc9f5b2e67006fc239bf302c/Crash%2520Course%2520v0.5.ipynb.json) by 
[Rick Muller](http://www.cs.sandia.gov/~rmuller/), Sandia National Laboratories

Adapted and reduced by Luca Citi

This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/deed.en_US).

## What You Need to Install

These notes assume you have a Python distribution that includes:

* [Python](http://www.python.org) version 3.x;
* [Numpy](http://www.numpy.org), the core numerical extensions for linear algebra and multidimensional arrays;
* [Scipy](http://www.scipy.org), additional libraries for scientific programming;
* [Matplotlib](http://matplotlib.sf.net), excellent plotting and graphing libraries;
* [Jupyter](http://jupyter.org), with the additional libraries required for the notebook interface.


# I. Python Overview

The lessons that follow make use of the Jupyter (formerly IPython) notebooks.

Briefly, notebooks have code cells (that are generally followed by result cells) and text cells. The text cells are the stuff that you're reading now. The code cells start with "In []:" with some number generally in the brackets.

**If you put your cursor in the code cell and hit Shift-Enter, the code will run in the Python interpreter and the result will print out in the output cell.**

You can then change things around and see whether you understand what's going on. If you need to know more, see the [Jupyter notebook documentation](http://jupyter.org/documentation.html).



## Using Python as a Calculator

Many of the things I used to use a calculator for, I now use Python for:

In [None]:
2+2

In [None]:
(50 - 5 * 6) / 4

In Python 3, division between integers returns a float (as you would obtain on a handheld calculator). To obtain integer (floor) division instead, use a double slash:

In [None]:
print(7 / 3)
print(7 // 3)
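
To make the behaviour of the double slash a little more concrete, here is a small sketch (the variable names are only illustrative) that pairs it with the remainder operator `%` and the built-in `divmod()`. Note that `//` rounds down rather than toward zero, which matters for negative numbers.

```python
# Floor division and remainder for a positive and a negative dividend.
quotient = 7 // 3         # 2
remainder = 7 % 3         # 1
neg_quotient = -7 // 3    # -3: rounds down, not toward zero
neg_remainder = -7 % 3    # 2: the remainder takes the sign of the divisor

# divmod returns quotient and remainder together as a pair.
both = divmod(7, 3)       # (2, 1)

print(quotient, remainder, neg_quotient, neg_remainder, both)
```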

You can define variables using the equals (=) sign:

In [None]:
width = 20
length = 30
area = length*width
area

## Strings
Strings are sequences of characters, and can be defined using either single quotes

In [None]:
'Hello, World!'

or double quotes

In [None]:
greeting = "Hello, World!"

The **print** function is often used for printing character strings and other objects:

In [None]:
print(greeting, "The area is", area)

## Lists
Very often in a programming language, one wants to keep a group of similar items together. Python does this using a data type called **lists**.

In [None]:
days_of_the_week = ["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]

You can access members of the list using the **index** of that item:

In [None]:
days_of_the_week[2]

Python lists, like C, but unlike Fortran, use 0 as the index of the first element of a list. Thus, in this example, the 0 element is "Sunday", 1 is "Monday", and so on. If you need to access the *n*th element from the end of the list, you can use a negative index. For example, the -1 element of a list is the last element:

In [None]:
days_of_the_week[-1]

You can add additional items to the list using the .append() command:

In [None]:
languages = ["Fortran","C","C++"]
languages.append("Python")
print(languages)

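The **range()** command is a convenient way to make sequences of consecutive integers. In Python 3 it returns a range object, so we wrap it in **list()** to see the values as a list: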

In [None]:
list(range(10))

Note that range(n) starts at 0 and gives the sequential list of integers less than n. If you want to start at a different number, use range(start,stop):

In [None]:
list(range(2,8))
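
As a short sketch going one step beyond the text above, `range()` also accepts an optional third argument, the step, which may be negative. The variable names here are only illustrative.

```python
# range(start, stop, step): stop is always excluded.
evens = list(range(0, 10, 2))       # every second integer below 10
countdown = list(range(5, 0, -1))   # a negative step counts downwards

print(evens)
print(countdown)
```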

Lists do not have to hold the same data type. For example,

In [None]:
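a_list = ["Today",7,99.3,""]

You can find out how long a list is using the **len()** function. The **help()** function displays the documentation of any function:

In [None]: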
help(len)

In [None]:
len(a_list)

## Iteration, Indentation, and Blocks
One of the most useful things you can do with lists is to *iterate* through them, i.e. to go through each element one at a time. To do this in Python, we use the **for** statement:

In [None]:
for day in days_of_the_week:
    print(day)

This code snippet goes through each element of the list called **days_of_the_week** and assigns it to the variable **day**. It then executes everything in the indented block (in this case only one line of code, the call to print) using that variable assignment. When the program has gone through every element of the list, it exits the block.

(Almost) every programming language defines blocks of code in some way. In Fortran, one uses END statements (ENDDO, ENDIF, etc.) to define code blocks. In C, C++, and Perl, one uses curly braces {} to define these blocks.

**Python uses a colon (":"), followed by indentation level to define code blocks.** Consecutive lines at the same (deeper) level of indentation are taken to be in the same block, and the block ends at the first line that returns to the outer indentation level. In the above example the block was only a single line, but we could have had longer blocks as well:

In [None]:
for day in days_of_the_week:
    statement = "Today is " + day
    print(statement)

The **range()** command is particularly useful with the **for** statement to execute loops of a specified length:

In [None]:
for i in range(10):
    print("The square of ", i, " is ", i*i)
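
The following sketch illustrates where a block ends. The indented lines run once per iteration, while the final line, back at the outer indentation level, runs only once after the loop has finished. The names `total` and `square` are just illustrative.

```python
total = 0
for i in range(5):
    square = i * i            # inside the block: runs on every iteration
    total = total + square    # inside the block as well
print("Sum of squares:", total)   # outside the block: runs once, after the loop
```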

## Slicing
Lists and strings have something in common that you might not suspect: they can both be treated as sequences. You already know that you can iterate through the elements of a list.

The *slicing* operation can be used on any sequence. We already know that we can use *indexing* to get the first element of a list:

In [None]:
days_of_the_week[0]

If we want the list containing the first two elements of a list, we can do this via

In [None]:
days_of_the_week[0:2]

or simply

In [None]:
days_of_the_week[:2]

If we want the last items of the list, we can do this with negative slicing:

In [None]:
days_of_the_week[-2:]

which is logically consistent with negative indices accessing the last elements of the list.
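
Since slicing works on any sequence, the same notation applies to strings. The sketch below, with illustrative variable names, takes a slice from the middle of the list, uses an optional third slice argument (the step), and slices a string.

```python
days_of_the_week = ["Sunday", "Monday", "Tuesday", "Wednesday",
                    "Thursday", "Friday", "Saturday"]

workdays = days_of_the_week[1:6]      # items 1 to 5; index 6 is excluded
every_other = days_of_the_week[::2]   # the whole list, in steps of 2

greeting = "Hello, World!"
first_word = greeting[:5]             # the first five characters
last_chars = greeting[-6:]            # the last six characters

print(workdays)
print(every_other)
print(first_word, last_chars)
```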

## Booleans and Truth Testing
We have now learned a few data types. We have integers and floating point numbers, strings, and lists, a container that can hold any data type. We have learned to print things out, and to iterate over items in lists. We will now learn about **boolean** variables that can be either True or False.

We invariably need some concept of *conditions* in programming to control branching behavior, to allow a program to react differently to different situations. If it's Monday, I'll go to work, but if it's Sunday, I'll sleep in. To do this in Python, we use a combination of **boolean** variables, which evaluate to either True or False, and **if** statements, that control branching based on boolean values.

For example:

In [None]:
day = "Sunday" # single equal -> assignment

if day == "Sunday": # double equal -> comparison
    print("Sleep in")
elif day == "Saturday":
    print("Do chores")
else:
    print("Go to work")

In [None]:
50 == 2*25

In [None]:
3 < 3.14159

In [None]:
1 == 1.0

In [None]:
1 != 0

In [None]:
1 <= 2

In [None]:
1 >= 1
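
Comparisons like the ones above can be combined with the keywords `and`, `or`, and `not`. A minimal sketch, in which the `temperature` variable and the threshold of 20 are made up purely for illustration:

```python
day = "Saturday"
temperature = 25   # hypothetical value, only for illustration

# "or" is True if at least one side is True.
is_weekend = day == "Saturday" or day == "Sunday"

# "and" is True only if both sides are True.
nice_weekend = is_weekend and temperature > 20

# "not" flips a boolean value.
stay_home = not nice_weekend

print(is_weekend, nice_weekend, stay_home)
```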

## Functions
We can define a function with the **def** statement in Python:

In [None]:
def square_loss(y, target):
    "Return the value of the square loss"
    return (y - target)**2

We can now call **square_loss()** for different arguments:

In [None]:
square_loss(2, 3)

In [None]:
square_loss(-2, 3)
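
Functions can call other functions and contain loops. As a sketch, the hypothetical helper `mean_square_loss()` below (not part of the original notes) averages the square loss over two lists of equal length. It uses the built-in `zip()` to walk through both lists in parallel.

```python
def square_loss(y, target):
    "Return the value of the square loss"
    return (y - target)**2

def mean_square_loss(ys, targets):
    "Return the average square loss over paired predictions and targets"
    total = 0.0
    for y, target in zip(ys, targets):   # pairs up items from both lists
        total = total + square_loss(y, target)
    return total / len(ys)

mse = mean_square_loss([2, -2, 3], [3, 3, 3])
print(mse)
```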

## Dictionaries
**Dictionaries** are objects called "mappings" or "associative arrays" in other languages. Whereas a list associates an integer index with each of the objects it holds, a dictionary lets you choose the index yourself.

The index in a dictionary is called the *key*, and the corresponding dictionary entry is the *value*. A dictionary can use (almost) anything as the key. Whereas lists are formed with square brackets [], dictionaries use curly brackets {}:

In [None]:
ages = {"Rick": 46, "Bob": 86, "Fred": 21}
print("Rick's age is ",ages["Rick"])

There's also a convenient way to create dictionaries without having to quote the keys:

In [None]:
dict(Rick=46,Bob=86,Fred=20)
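
A few other common dictionary operations, shown as a brief sketch: adding an entry, testing whether a key is present, and iterating over key/value pairs with `.items()`. The extra entry "Alice" is made up for illustration.

```python
ages = {"Rick": 46, "Bob": 86, "Fred": 21}

ages["Alice"] = 33           # hypothetical new entry: assigning to a new key adds it
has_bob = "Bob" in ages      # "in" tests whether a key is present

# .items() yields (key, value) pairs.
for name, age in ages.items():
    print(name, "is", age)

# The key whose value is largest.
oldest = max(ages, key=ages.get)
print("Oldest:", oldest)
```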

# II. Numpy and Scipy

[Numpy](http://numpy.org) contains core routines for doing fast vector, matrix, and linear algebra-type operations in Python. [Scipy](http://www.scipy.org) contains additional routines for optimization, special functions, and so on. Both contain modules written in C and Fortran so that they're as fast as possible. Together, they give Python roughly the same capability that the [Matlab](http://www.mathworks.com/products/matlab/) program offers. (In fact, if you're an experienced Matlab user, there's a [guide to Numpy for Matlab users](http://www.scipy.org/NumPy_for_Matlab_Users) just for you.)

## Making vectors and matrices
Fundamental to both Numpy and Scipy is the ability to work with vectors and matrices. You can create vectors from lists using the **array** command:

In [None]:
import numpy as np

In [None]:
np.array([1,2,3,4,5,6])

You can pass in a second argument to **array** that gives the numeric type. There are a number of types [listed here](http://docs.scipy.org/doc/numpy/user/basics.types.html) that your matrix can be. Some of these are aliased to single character codes. The most common ones are 'd' (double precision floating point number), 'D' (double precision complex number), and 'i' (int32).
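
As a small sketch of the single character type codes (the variable names are illustrative), the `dtype` attribute of the resulting array shows which numeric type was actually used:

```python
import numpy as np

float_vec = np.array([1, 2, 3], 'd')     # double precision floats
complex_vec = np.array([1, 2, 3], 'D')   # double precision complex numbers
int_vec = np.array([1, 2, 3], 'i')       # C integers (32-bit on common platforms)

print(float_vec.dtype, complex_vec.dtype, int_vec.dtype)
```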

To build matrices, you can use the array command with lists of lists:

In [None]:
np.array([[0,1],[1,0]], 'd')

You can also form matrices of arbitrary shape filled with zeros (including row and column vectors, which are matrices with a single row or a single column), using the **zeros** command. The first argument is a tuple giving the shape:

In [None]:
np.zeros((3,3), 'd')

The **ones** command works the same way, but fills the array with ones. You can make row vectors:

In [None]:
np.ones((1,3), 'd')

or column vectors:

In [None]:
np.ones((3,1),'d')

There's also an **eye** (identity matrix) command:

In [None]:
np.eye(4)

## Linspace, matrix functions, and plotting
The **linspace** command makes a linear array of equally spaced points from a starting to an ending value, both included. If you provide a third argument, it takes that as the number of points in the space (the default is 50).

In [None]:
x = np.linspace(0,4,41)
x

**linspace** is an easy way to make coordinates for plotting. Functions in the numpy library (accessed here through the **np** prefix) can act on an entire vector (or even a matrix) of points at once. Thus,

In [None]:
y = np.sin(x)
y
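
To see what acting on an entire vector at once means, the sketch below compares `np.sin` applied to the whole array with an explicit loop that calls `math.sin` from the standard library one point at a time. Both give the same values. The variable names are illustrative.

```python
import math
import numpy as np

x = np.linspace(0, 4, 41)

# Vectorized: one call handles every element of x.
y_vectorized = np.sin(x)

# Equivalent explicit loop, one element at a time.
y_loop = []
for value in x:
    y_loop.append(math.sin(value))

same = np.allclose(y_vectorized, y_loop)
print(same)
```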

In conjunction with **matplotlib**, this is a nice way to plot things. (Depending on the browser and installation, replacing the *inline* option below with *notebook* allows interactive mode.)

In [None]:
#%matplotlib notebook
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
plt.figure()
plt.plot(x, y)
plt.show()

We can also plot histograms:

In [None]:
data = np.array([28, 16, 22,  8, 26, 21, 13, 15, 26, 16, 24,  1, 23, 15, 24, 5])

plt.figure(figsize=(6,4))
plt.hist(data, bins=np.linspace(start=0, stop=30, num=7))
plt.show()

And scatter plots:

In [None]:
plt.figure(figsize=(8,6))
plt.scatter(data, x[:len(data)])
plt.show()

## Matrix operations
Matrix objects act sensibly when multiplied by scalars:

In [None]:
0.125 * np.eye(3)

as well as when you add two matrices together. (However, the matrices have to be the same shape, or have shapes that Numpy can broadcast to a common one.)

In [None]:
np.eye(2) + np.array([[1,1],[1,2]])

Something that confuses Matlab users is that the times (*) operator gives element-wise multiplication rather than matrix multiplication:

In [None]:
np.eye(2) * np.ones((2,2))

To get matrix multiplication, you need the **dot** command:

In [None]:
np.dot(np.eye(2), np.ones((2,2)))

In Python 3.5 and later, the @ operator is a shortcut to perform matrix multiplications:

In [None]:
import numpy as np
print(np.eye(2) @ np.ones((2,2)))
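
Because the identity matrix hides some of the difference, here is a sketch with two small example matrices (chosen arbitrarily for illustration) where element-wise and matrix multiplication give visibly different results:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

elementwise = a * b       # multiplies matching entries
matrix_product = a @ b    # rows of a times columns of b, same as np.dot(a, b)

print(elementwise)
print(matrix_product)
```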

There are **determinant** (np.linalg.det), **inverse** (np.linalg.inv), and **transpose** (np.transpose) functions that act as you would suppose. Transpose can be abbreviated with ".T" at the end of a matrix object:

In [None]:
m = np.array([[1,2],[3,4]])
m.T
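
A minimal sketch of the determinant and inverse functions, using the same matrix `m`. Multiplying a matrix by its inverse should give the identity matrix, up to floating point rounding, which is why the check uses `np.allclose` rather than exact equality.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

det_m = np.linalg.det(m)    # 1*4 - 2*3 = -2, up to rounding
inv_m = np.linalg.inv(m)    # matrix inverse
identity_check = m @ inv_m  # should be close to the 2x2 identity

print(det_m)
print(inv_m)
print(np.allclose(identity_check, np.eye(2)))
```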