# Python for Science


Python is free, it is open source, and it has a huge community.

Python is one of the most popular and loved programming languages in the world!

In February 2016, the CodeEval blog published its data on the ["Most Popular Coding Languages"](http://blog.codeeval.com/codeevalblog/2016/2/2/most-popular-coding-languages-of-2016). It puts Python in first place in popularity, based on usage in the CodeEval community. Meanwhile, the Stack Overflow [trends graph](https://insights.stackoverflow.com/trends?tags=python) shows increasing interest in Python during the last 5 years.

Python can be used for many things: managing databases, creating graphical user interfaces, making websites, and much more, including science. Because of these many uses, the world of Python includes a great many **libraries**, and you load only the parts that you need.

In science, the two libraries that are king and queen of the world are [NumPy](http://www.numpy.org) and [Matplotlib](https://matplotlib.org).

## NumPy

NumPy is for working with data in the form of arrays (vectors, matrices). It has a large number of built-in functions and methods that work on arrays directly. To load the library into your current session of interactive Python, into a saved Python script, or into a Jupyter notebook, you use:

```python
import numpy
```

Tips:

* a one-dimensional array (vector) has the form: `[1.0, 0.5, 2.5]`
* a two-dimensional array (matrix) has the form: `[[ 1.0, 0.5, 2.5], [ 0.5, 1.1, 2.0]]`
* the elements in an array are numbered with an index that **starts at 0**
* the colon notation: in any index position, a `:` means "all elements in this dimension"
* once `numpy` is loaded, its built-in functions are called like this: `numpy.function(arg)` (where `arg` is the function argument: the arrays to operate on, and any parameters)
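
The tips above can be sketched in a few lines. This is only an illustration: the variable names are made up for the example, and `numpy.array` is used to turn a nested list into a NumPy array so that the colon notation works.

```python
import numpy

# A two-dimensional array with 2 rows and 3 columns.
a = numpy.array([[1.0, 0.5, 2.5],
                 [0.5, 1.1, 2.0]])

first_row = a[0]           # indices start at 0, so this is the first row
second_column = a[:, 1]    # ":" selects all rows; 1 selects the second column
largest = numpy.max(a)     # a built-in function, called as numpy.function(arg)

print(first_row)       # [1.  0.5 2.5]
print(second_column)   # [0.5 1.1]
print(largest)         # 2.5
```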

_Try it!_

In [None]:
import numpy

In [None]:
# By the way: comments in code cells start with a hash.
# Here are two arrays, saved as variables x and y.
# numpy.array turns a (nested) list into a NumPy array:
x = numpy.array([1.0, 0.5, 2.5])
y = numpy.array([[1.0, 0.5, 2.5], [0.5, 1.1, 2.0]])

In [None]:
# The print function works on arrays:
print(x)

In [None]:
print(y)

In [None]:
numpy.shape(y)

In [None]:
numpy.shape(x)

Let's review what happened there. We first loaded `numpy`, giving us the full power to use arrays. We created two arrays, `x` and `y`, and then printed each of them. They look nice.

NumPy has a built-in function to find out the "shape" of an array, which means: _how many elements does this array have in each dimension?_ We find that `y` is a two-by-three array (it has two dimensions), while `x` has three elements along its single dimension.
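
As a small sketch of how to read a shape, assuming the same values as above wrapped in `numpy.array` (the names `vector` and `matrix` are only illustrative):

```python
import numpy

vector = numpy.array([1.0, 0.5, 2.5])
matrix = numpy.array([[1.0, 0.5, 2.5], [0.5, 1.1, 2.0]])

# The shape is a tuple with one entry per dimension.
print(numpy.shape(vector))   # (3,)   one dimension with 3 elements
print(numpy.shape(matrix))   # (2, 3) 2 rows and 3 columns

# The tuple can be unpacked into separate names.
rows, columns = numpy.shape(matrix)
```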

What is the first element of `y`? We can use square brackets and the zero-index to find out:

In [None]:
y[0]

Right. The first element of `y` is an array of three numbers. To access the first element of _that_ array, we use:

In [None]:
y[0][0]

We learned that:

* The square brackets allow us to pick out the elements of an array using an index: `x[i]`
* For a two-dimensional array, we can use two indices: `y[i][j]`
* All indices start at zero.
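
Here is a short sketch of these indexing rules, with the values copied from above. For a NumPy array (created here with `numpy.array`), both indices can also be written inside a single pair of brackets, `y[i, j]`; this comma form does not work on a plain Python list.

```python
import numpy

y = numpy.array([[1.0, 0.5, 2.5], [0.5, 1.1, 2.0]])

print(y[0][0])   # 1.0, first row, first column
print(y[1][2])   # 2.0, second row, third column
print(y[1, 2])   # 2.0, the same element with both indices in one bracket
```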

This is super powerful!

## Matplotlib

Matplotlib is for making all kinds of plots. To get an idea of the great variety of plots possible, have a look at the online [Gallery](https://matplotlib.org/gallery.html). You can see that Matplotlib itself is a pretty big library. We can load a portion of the library (called a module) that has the basic plotting functions with:

```python
from matplotlib import pyplot
```

Once the `pyplot` module is loaded, its built-in functions are called like this: `pyplot.function(arg)` (where `arg` is the function argument).
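
As a minimal sketch of that calling pattern, the lines below pass a short list of made-up numbers to `pyplot.plot`. When only one sequence is given, Matplotlib plots it against the position of each value (0, 1, 2, and so on). In a Jupyter notebook with inline plotting enabled, the figure appears under the cell; in a plain script you would also call `pyplot.show()`.

```python
from matplotlib import pyplot

# Made-up values, used only to have something to draw.
values = [3.0, 2.5, 2.0, 1.5]

# pyplot.plot returns a list with one line object per curve drawn.
lines = pyplot.plot(values)

pyplot.xlabel('position in the list')   # label for the horizontal axis
pyplot.ylabel('value')                  # label for the vertical axis
```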

## An example: size of households in the US

Did you know that the size of households—that is, the number of people living in each household—has been steadily decreasing in the US and many other countries? This has perhaps surprising consequences. Even if population growth slows down, or stops altogether, the number of households keeps increasing at a fast rate.

More households means more $\mathrm{CO}_2$ emissions! This is bad for the planet.

In [None]:
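# Load the CSV file into a two-dimensional array: values are separated by
# commas, and the first line of the file is skipped.
occupants = numpy.loadtxt(fname='data/statistic_id183648_average-size-of-households-in-the-us-1960-2016.csv',
                          delimiter=',', skiprows=1)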

In [None]:
print(occupants)
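
To see what `delimiter` and `skiprows` do without needing the data file, here is a sketch that reads a tiny made-up table from a string. The column names and numbers are placeholders, not the real household data, and `io.StringIO` is used only to make a string behave like an open file.

```python
import io
import numpy

# Placeholder text in CSV form: one header line, then rows of numbers.
text = """year,size
2002,3.0
2001,3.5
2000,4.0
"""

# delimiter=',' splits each line at the commas;
# skiprows=1 ignores the header line, which cannot be read as numbers.
table = numpy.loadtxt(io.StringIO(text), delimiter=',', skiprows=1)

print(table.shape)   # (3, 2): three rows, two columns
print(table[:, 0])   # first column
print(table[:, 1])   # second column
```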


In [None]:
from matplotlib import pyplot
%matplotlib inline

In [None]:
# Plot the second column; the horizontal axis is just the row number.
pyplot.plot(occupants[:, 1])

In [None]:
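# flipud returns the array with its rows in reverse order; assign the result
# back to occupants so that the next plot uses the flipped data.
occupants = numpy.flipud(occupants)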
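
One detail worth spelling out: `numpy.flipud` reverses the order of the rows and returns the result; it does not change the array it was given. To keep the flipped version, assign it to a name. A sketch with placeholder numbers (not the real data):

```python
import numpy

# Placeholder table: the rows are listed from newest to oldest.
table = numpy.array([[2002, 3.0],
                     [2001, 3.5],
                     [2000, 4.0]])

# The rows of `flipped` are in reverse order; `table` itself is untouched.
flipped = numpy.flipud(table)

print(table[0])     # still the 2002 row
print(flipped[0])   # the 2000 row, which was last in `table`
```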

In [None]:
pyplot.plot(occupants[:,1])

In [None]:
# Plot the second column against the first column.
pyplot.plot(occupants[:, 0], occupants[:, 1])
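
A plot is easier to read with labelled axes. The sketch below uses a small placeholder table instead of the data file, and the label text is an assumption about what the two columns hold (the year and the average household size). `pyplot.xlabel` and `pyplot.ylabel` set the axis labels.

```python
import numpy
from matplotlib import pyplot

# Placeholder numbers standing in for the two columns of the data file.
table = numpy.array([[2000, 4.0],
                     [2001, 3.5],
                     [2002, 3.0]])

# First column on the horizontal axis, second column on the vertical axis.
curve = pyplot.plot(table[:, 0], table[:, 1])
pyplot.xlabel('Year')                       # assumed meaning of column 0
pyplot.ylabel('Average household size')     # assumed meaning of column 1
```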