## NumPy

Previously we discussed Python's built-in functions, which let us work faster and more efficiently.

However, some tasks are more specialized. For these you may want to use functions that are provided by a library.

You can think of a library as a box with different functions stored in it. If you bring the box into your program, you can use whatever is in that box.

Now we will use a library called `numpy`, which stands for Numerical Python and is widely used for scientific computing. It is a good choice whenever you need to do non-trivial things with numbers, especially if you have matrices or arrays.

To use a library, you first need to tell Python that you want to import it into your program. You should already have `numpy` on your computer, as it comes with the conda distribution:

In [1]:
import numpy as np

Here we told Python to import the library called `numpy` and to refer to it as `np` from now on.

To use functions from the numpy library we will now have to type `np.` followed by the function name.

Let's first create a simple numpy array:

In [2]:
x = np.array([2, 3, 4])
print(x)
print(type(x))

[2 3 4]
<class 'numpy.ndarray'>


### Lists vs Numpy arrays

It looks similar to a list. However, there are important differences between lists and arrays:

**list**
```
- is resizeable
- can contain elements of different types
```
**numpy array**
```
- takes up less space for the same numeric data
- is faster than a list for numerical operations on many elements
- comes with optimized functions built in, such as linear algebra operations
```

Let's check whether this is indeed true.

`range` is a Python built-in function that generates a sequence of integers.
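A minimal sketch of such a check is shown below. It builds the same integers as a list and as an array, then compares a rough memory estimate and the time needed to double every element. The names `numbers_list` and `numbers_array` and the size `n = 1000` are illustrative choices, and the timings will differ from machine to machine.

```python
import sys
import timeit

import numpy as np

n = 1000  # illustrative size, chosen arbitrarily

# Build the same sequence of integers as a list and as a NumPy array.
numbers_list = list(range(n))
numbers_array = np.arange(n)

# Rough memory estimate for the list: the list object itself
# plus every integer object it refers to.
list_bytes = sys.getsizeof(numbers_list) + sum(sys.getsizeof(v) for v in numbers_list)
# For the array we only count the raw data buffer.
array_bytes = numbers_array.nbytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")

# Time doubling every element, repeated 1000 times.
list_time = timeit.timeit(lambda: [v * 2 for v in numbers_list], number=1000)
array_time = timeit.timeit(lambda: numbers_array * 2, number=1000)

print("list :", list_time, "seconds")
print("array:", array_time, "seconds")
```

On most machines the array should come out both smaller and faster, but treat the numbers as indicative rather than exact.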

## Memory layout

A NumPy array is just a block of memory together with extra information on how to interpret its contents.
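That extra information can be inspected through attributes of the array. The sketch below uses a small example array `m` (an illustrative name) with an explicitly chosen 64-bit integer type so that the sizes are the same on every platform.

```python
import numpy as np

# A 2 x 3 array of 64-bit integers (8 bytes per element).
m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(m.dtype)     # how each element is interpreted: int64
print(m.shape)     # number of rows and columns: (2, 3)
print(m.itemsize)  # bytes per element: 8
print(m.nbytes)    # total size of the data buffer: 48
print(m.strides)   # bytes to skip to reach the next row / next column: (24, 8)
```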

To construct an array with pre-defined elements we can also use built-in helper functions.

`np.ones` and `np.zeros` return arrays filled with 1s or 0s:

In [11]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [12]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

**np.random.rand** creates an array of random numbers drawn uniformly from the interval [0, 1), so 0 can occur but 1 cannot:

In [15]:
np.random.rand(5)

array([0.72176459, 0.87648887, 0.59005231, 0.02461493, 0.49214193])

## Quick question

What can you do if you want to get random numbers between 0 and 10?

## Answer

np.random.rand(5)*10
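The same idea extends to any interval: scale by the width of the interval and shift by its lower bound. The sketch below assumes a hypothetical interval from 5 to 15; `low`, `high` and `samples` are illustrative names.

```python
import numpy as np

low, high = 5, 15  # hypothetical interval bounds

# rand gives values in [0, 1); scaling and shifting maps them to [low, high).
samples = low + (high - low) * np.random.rand(5)
print(samples)
```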

We can also construct arrays with two or more dimensions:

In [17]:
np.array([[1, 2],[3, 4]])

array([[1, 2],
       [3, 4]])

In [18]:
np.ones((2, 2))

array([[1., 1.],
       [1., 1.]])

Alternatively, an n-dimensional array can be obtained by reshaping a 1-D array:

In [19]:
a = np.arange(12)
a.reshape((4,3))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
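Two details of `reshape` are worth a quick sketch: it returns a new array object and leaves the original shape untouched, and one dimension can be given as `-1` to let NumPy infer it. For a contiguous array like this one the result is a view, so both names refer to the same data. The names `b` and `c` are illustrative.

```python
import numpy as np

a = np.arange(12)

b = a.reshape((4, 3))   # 4 rows, 3 columns
c = a.reshape((2, -1))  # 2 rows, number of columns inferred as 6

print(a.shape)  # (12,) - the original array keeps its shape
print(b.shape)  # (4, 3)
print(c.shape)  # (2, 6)

# b is a view on the data of a, so a change through b is visible in a.
b[0, 0] = 100
print(a[0])     # 100
```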

## Working with a dataset

We will now work with a dataset. It is stored in `.csv` format, which stands for comma-separated values. Microsoft Excel can also open this type of file.

To load this dataset into Python we will use a NumPy function called `loadtxt()`.

Here we pass two parameters to `numpy.loadtxt()`: the name of the file we want to read and the delimiter that separates values on a line. Both of them are character strings:

In [115]:
data = np.loadtxt(fname='paris_leap_year_weather.csv', delimiter=',')
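If you do not have the file at hand, the behaviour of `loadtxt` can be tried on a small in-memory example. This sketch assumes made-up values in `csv_text`. `loadtxt` accepts a file-like object as well as a file name, so `io.StringIO` stands in for a real file here.

```python
import io

import numpy as np

# Two lines of comma-separated values (made-up numbers).
csv_text = "1.5,2.0,3.5\n4.0,5.5,6.0\n"

# Each line becomes a row; the delimiter tells loadtxt where to split.
small = np.loadtxt(io.StringIO(csv_text), delimiter=',')

print(small)
print(small.shape)  # (2, 3)
```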

We will be studying the weather in Paris in the leap years of the first half of the 20th century.

For reference, the years are: 1904, 1912, 1916, 1920, 1924, 1928, 1932, 1936, 1940, 1944, 1948, 1952

## Explore array

In [122]:
print(data)

[[ 0.3  3.9  6.  ...  7.3 10.2  6.9]
 [ 7.4  8.3  7.4 ... 10.2  9.5  2.6]
 [13.7 13.9 14.1 ... 12.9 11.8 11.7]
 ...
 [ 8.6  9.4  9.6 ...  1.2  2.9  3.8]
 [11.6 13.8 14.4 ...  8.3  6.6  7. ]
 [ 8.7 10.4  5.7 ...  4.2  1.4  0.4]]


In contrast to lists, NumPy arrays can only store elements of a single, pre-determined type. The `type` function will only tell you that a variable is a NumPy array, not what kind of values are inside it. To find out the type of the data contained in the array, inspect its `dtype` attribute.
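A small sketch of the difference between `type` and `dtype`, using illustrative arrays `ints` and `mixed`. Note that the exact integer type NumPy picks by default depends on the platform.

```python
import numpy as np

ints = np.array([2, 3, 4])
mixed = np.array([2, 3.5, 4])  # one float is enough to make every element a float

print(type(ints))   # <class 'numpy.ndarray'> - says nothing about the contents
print(ints.dtype)   # an integer type, e.g. int64 (platform dependent)
print(mixed.dtype)  # float64

# Converting to an integer type drops the fractional part.
print(mixed.astype(int))  # [2 3 4]
```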

In [98]:
weather = np.loadtxt('paris_leap_year_weather.csv')

ValueError: could not convert string to float: '2.999999999999999889e-01,3.899999999999999911e+00,6.000000000000000000e+00,...,1.019999999999999929e+01,6.900000000000000355e+00'

Without the `delimiter=','` argument, `loadtxt` splits each line on whitespace, so the whole comma-separated row is treated as a single value that cannot be converted to a float.

In [23]:
weather.shape

(39859, 4)

In [28]:
weather[:, 3]

array([10.1, 12. ,  9.3, ..., 21.4, 19.4, 20.1])
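The `[:, 3]` indexing above selects every row of column 3. The sketch below shows the same idea on a tiny made-up table. The column meanings used here (year, month, day, value) are an assumption for illustration only, not a description of the real file.

```python
import numpy as np

# Hypothetical columns: year, month, day, value.
table = np.array([[1904, 1, 1, 0.3],
                  [1904, 1, 2, 3.9],
                  [1905, 1, 1, 7.4]])

print(table[:, 3])  # all rows, column 3: [0.3 3.9 7.4]
print(table[0, :])  # row 0, all columns

# A boolean mask keeps only the rows where the condition holds.
is_1904 = table[:, 0] == 1904
print(table[is_1904, 3])  # [0.3 3.9]
```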

In [83]:
# Group the values in column 3 by the year stored in column 0.
my_dict = {}
for idx, year in enumerate(weather[:, 0]):
    year = int(year)
    temp_this_year = my_dict.get(year, [])
    temp_this_year.append(weather[idx, 3])
    my_dict[year] = temp_this_year

# Despite its name, wrong_keys collects the years with 366 records (leap years).
wrong_keys = []

min_len = 1000
this_key = ''
for key in my_dict.keys():
    len_key = len(my_dict[key])

    if len_key == 366:
        wrong_keys.append(key)
    # Keep track of the year with the fewest records.
    if min_len > len_key:
        min_len = len_key
        this_key = key
print(wrong_keys)
print(min_len)
print(key)  # the last year visited by the loop

[1904, 1912, 1916, 1920, 1924, 1928, 1932, 1936, 1940, 1944, 1948, 1952]
1
2014
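Counting records per year can also be done without an explicit loop. The sketch below is an alternative technique, not the method used above. It relies on `np.unique` with `return_counts=True` and a made-up `years` column.

```python
import numpy as np

# Made-up year column: 1903 appears twice, 1904 three times, 1905 once.
years = np.array([1903, 1903, 1904, 1904, 1904, 1905])

# unique_years is sorted; counts[i] is how often unique_years[i] occurs.
unique_years, counts = np.unique(years, return_counts=True)

print(unique_years)  # [1903 1904 1905]
print(counts)        # [2 3 1]

# Select the years that have exactly three records.
print(unique_years[counts == 3])  # [1904]
```

On the real data, the same selection with `counts == 366` would pick out the years that have a record for every day of a leap year.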


In [110]:
# One row per leap year, one column per day of the year.
leap_weather = np.zeros((len(wrong_keys), 366))
for idx, key in enumerate(wrong_keys):
    leap_weather[idx, :] = my_dict[key]

In [111]:
leap_weather.shape

(12, 366)

In [112]:
np.savetxt("paris_leap_year_weather.csv", leap_weather, delimiter=",")
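A quick way to convince yourself that `savetxt` and `loadtxt` are a matching pair is a round trip through a temporary file. The sketch below uses made-up `values` and a hypothetical file name, `example.csv`, inside a temporary directory so that nothing is left behind.

```python
import os
import tempfile

import numpy as np

values = np.array([[0.3, 3.9], [7.4, 8.3]])  # made-up numbers

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.csv")  # hypothetical file name
    # Write with a comma delimiter and read back with the same delimiter.
    np.savetxt(path, values, delimiter=",")
    restored = np.loadtxt(path, delimiter=",")

print(np.array_equal(values, restored))  # True
```

By default `savetxt` writes every number in scientific notation with 18 digits after the decimal point, which is why the values in the file look like `2.999999999999999889e-01` rather than `0.3`.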