This introduction to NumPy is based on the following article:
    http://www.datadependence.com/2016/05/scientific-python-numpy/
    
For further details, refer to the official quickstart documentation:
https://docs.scipy.org/doc/numpy-dev/user/quickstart.html

# NumPy

NumPy is the fundamental package for scientific computing with Python. Among other things, it provides a powerful and efficient n-dimensional array object.


# Initialization of NumPy Arrays

You can initialize a one-dimensional NumPy array, for example, by passing a list or a tuple to `np.array`.

In [2]:
import numpy as np


a = np.array([0, 1, 2, 3, 4])
b = np.array((0, 1, 2, 3, 4))

print(a)
print(b)

[0 1 2 3 4]
[0 1 2 3 4]


Alternatively, you can create a one-dimensional NumPy array with the `np.arange` function, which is similar to its counterpart, Python's built-in `range`. Read its documentation with `np.arange?`.

In [3]:
c = np.arange(9, 30, 3)
print(c)

[ 9 12 15 18 21 24 27]
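
As a small sketch of how `np.arange` relates to `range`: for integer arguments both produce the same values, but `np.arange` returns an array and also accepts a non-integer step. The variable names below are illustrative only.

```python
import numpy as np

# Same start, stop, and step as in the cell above
from_range = list(range(9, 30, 3))
from_arange = np.arange(9, 30, 3)

print(from_range)             # a plain Python list
print(from_arange.tolist())   # the same values, converted back to a list

# Unlike range, np.arange also accepts a float step
halves = np.arange(0, 2, 0.5)
print(halves)
```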


Similar to `np.arange`, `np.linspace` creates an array of evenly spaced numbers over a specified interval. Instead of a step size, you pass the number of points.

In [6]:
pi_steps = np.linspace(0, 2 * np.pi, 5)
print(pi_steps)

[ 0.          1.57079633  3.14159265  4.71238898  6.28318531]
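
The following sketch illustrates one difference to `np.arange` that is easy to miss: by default `np.linspace` includes the stop value, and the `endpoint` parameter switches that off. The interval and the names are chosen only for illustration.

```python
import numpy as np

# Five points from 0 to 1; the stop value is included by default
with_end = np.linspace(0, 1, 5)

# endpoint=False leaves the stop value out, as np.arange does
without_end = np.linspace(0, 1, 5, endpoint=False)

print(with_end)
print(without_end)
```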


## Indexing Elements

In [4]:
c = np.arange(9, 30, 3)
print(c)
print(c[0])  # Get element at index position 0
print(c[1])  # Get element at index position 1
print(c[2:6])  # Get subarray from index 2 up to, but excluding, index 6
print(c[1:-1:2])  # Get every second element from index 1 up to, but excluding, the last index

[ 9 12 15 18 21 24 27]
9
12
[15 18 21 24]
[12 18 24]
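
One property of slices worth knowing: basic slicing returns a view on the same data rather than a copy. The sketch below, with illustrative names, shows the effect and how `.copy()` avoids it.

```python
import numpy as np

c = np.arange(9, 30, 3)

sub = c[2:6]        # basic slicing returns a view, not a copy
sub[0] = -1         # writing through the view changes c as well
print(c)

independent = c[2:6].copy()   # an explicit copy is detached from c
independent[1] = -2           # c is not affected by this assignment
print(c)

print(c[-1])        # negative indices count from the end
```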


# Multi-Dimensional Arrays

You can initialize multi-dimensional arrays explicitly, as in the following.

In [5]:
two_dim = np.array([[ 0,  1,  2,  3],
                    [ 4,  5,  6,  7],
                    [ 8,  9, 10, 11]])

three_dim = np.array([[[ 0,  1,  2,  3],
                       [ 4,  5,  6,  7],
                       [ 8,  9, 10, 11]],

                      [[12, 13, 14, 15],
                       [16, 17, 18, 19],
                       [20, 21, 22, 23]],

                      [[24, 25, 26, 27],
                       [28, 29, 30, 31],
                       [32, 33, 34, 35]]])
print(two_dim.shape)
print(three_dim.shape)

(3, 4)
(3, 3, 4)


Alternatively, you can create a one-dimensional array and `reshape` it into the desired form.

In [7]:
a = np.arange(11, 36).reshape(5,5)

print(a)
print(a[2, 4])
a.shape


[[11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]
 [26 27 28 29 30]
 [31 32 33 34 35]]
25


(5, 5)

When defining shapes, the dimensions are given in the order '*y*', '*x*' for two-dimensional arrays, '*z*', '*y*', '*x*' for three-dimensional arrays, and so on.

In [7]:
a = np.arange(60).reshape((4,3,5))
print(a)

[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]]

 [[15 16 17 18 19]
  [20 21 22 23 24]
  [25 26 27 28 29]]

 [[30 31 32 33 34]
  [35 36 37 38 39]
  [40 41 42 43 44]]

 [[45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]
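
To make the '*z*', '*y*', '*x*' order concrete, here is a minimal sketch that looks up one element and recomputes its value from the three coordinates. The chosen coordinates are arbitrary, and the use of `-1` in `reshape` is an addition not shown above.

```python
import numpy as np

a = np.arange(60).reshape((4, 3, 5))   # 4 blocks (z), 3 rows (y), 5 columns (x)

z, y, x = 2, 1, 3
print(a[z, y, x])

# In a freshly reshaped arange, the value equals the flat position
print(z * 3 * 5 + y * 5 + x)

# -1 lets NumPy infer one dimension from the total number of elements
b = np.arange(60).reshape((4, -1, 5))
print(b.shape)
```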


## Indexing Elements of Multi-dimensional Arrays

In [8]:
a = np.arange(11, 36).reshape(5,5)

print(a[0, 1:4])  # get one-dim. subarray of row 0
print(a[1:4, 0])  # get one-dim. subarray of column 0
print(a[::2,::2])  # get a two-dim. subarray of every second element by row and column
print(a[:, 1])  # get the second column

[12 13 14]
[16 21 26]
[[11 13 15]
 [21 23 25]
 [31 33 35]]
[12 17 22 27 32]


![slicing_img](https://1.bp.blogspot.com/-7a-mSPtfScw/Vze66QWv_MI/AAAAAAAAC7c/1PaKQQzwGwY6GIo4RX3u-rsDuCoFlkkaACLcB/s1600/numpy_2D_slicing_diagram.jpg)

In [9]:
a = np.arange(10, 70).reshape((4,3,5))
print(a)

print(a[0,1,2])  # equal to a[0][1][2]
print(a[:,2])  # the third row of each inner 2d-array, as a 2d-array
print(a[:,:,2])  # the third column of each inner 2d-array, as a 2d-array
print(a[::2,::2,::2])  # every second element along each of the three axes

[[[10 11 12 13 14]
  [15 16 17 18 19]
  [20 21 22 23 24]]

 [[25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]]

 [[55 56 57 58 59]
  [60 61 62 63 64]
  [65 66 67 68 69]]]
17
[[20 21 22 23 24]
 [35 36 37 38 39]
 [50 51 52 53 54]
 [65 66 67 68 69]]
[[12 17 22]
 [27 32 37]
 [42 47 52]
 [57 62 67]]
[[[10 12 14]
  [20 22 24]]

 [[40 42 44]
  [50 52 54]]]
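
The index expressions above can be written in several equivalent ways. The following sketch checks a few of them; the `...` (Ellipsis) notation is not used in the cells above and is shown here as an optional shorthand.

```python
import numpy as np

a = np.arange(10, 70).reshape((4, 3, 5))

# Trailing axes that are left out are taken in full
rows_equal = np.array_equal(a[:, 2], a[:, 2, :])
print(rows_equal)

# Ellipsis stands for "all remaining axes"
cols_equal = np.array_equal(a[:, :, 2], a[..., 2])
print(cols_equal)

# Chained indexing reaches the same element as the tuple form
same_element = a[0][1][2] == a[0, 1, 2]
print(same_element)
```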


# Auxiliary Array Properties

In [12]:
a = np.arange(10, 70, dtype=np.int8).reshape((4,3,5))

print(type(a))
print(a.dtype)
print(a.size)
print(a.shape)
print(a.itemsize)
print(a.ndim)
print(a.nbytes)

<class 'numpy.ndarray'>
int8
60
(4, 3, 5)
1
3
60
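
These properties are related: `nbytes` is `size` times `itemsize`, and `size` is the product of the entries of `shape`. A short sketch, comparing the `int8` array with an `int64` copy made only for illustration:

```python
import numpy as np

small = np.arange(10, 70, dtype=np.int8).reshape((4, 3, 5))
large = small.astype(np.int64)   # same values, 8 bytes per element

for arr in (small, large):
    # dtype, bytes per element, total bytes, and the recomputed total
    print(arr.dtype, arr.itemsize, arr.nbytes, arr.size * arr.itemsize)

print(small.size == np.prod(small.shape))
```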


## Basic Operators

In [13]:
a = np.ones((3,3))
a[1] = 2
a[2] = 3
b = np.arange(9).reshape(3,3)

print(a)
print(b)

[[ 1.  1.  1.]
 [ 2.  2.  2.]
 [ 3.  3.  3.]]
[[0 1 2]
 [3 4 5]
 [6 7 8]]


In [14]:
a = np.ones((3,3), dtype=int)
a[1] = 2
a[2] = 3
b = np.arange(9).reshape(3,3)

print(a)
print(b) 

print(a + b)

[[1 1 1]
 [2 2 2]
 [3 3 3]]
[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[ 1  2  3]
 [ 5  6  7]
 [ 9 10 11]]


In [15]:
print(a - b)

[[ 1  0 -1]
 [-1 -2 -3]
 [-3 -4 -5]]


In [16]:
print(a * b)

[[ 0  1  2]
 [ 6  8 10]
 [18 21 24]]


In [17]:
print(b / a)

[[ 0.          1.          2.        ]
 [ 1.5         2.          2.5       ]
 [ 2.          2.33333333  2.66666667]]


In [18]:
print(a ** 2)

[[1 1 1]
 [4 4 4]
 [9 9 9]]


In [19]:
print(a < b)

[[False False  True]
 [ True  True  True]
 [ True  True  True]]


In [20]:
print(a > b)

[[ True False False]
 [False False False]
 [False False False]]


In [21]:
print(a.dot(b))

[[ 9 12 15]
 [18 24 30]
 [27 36 45]]
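
Note the difference between `*` and `dot`: the former multiplies element by element, the latter computes the matrix product. The sketch below rebuilds the two arrays and also uses the `@` operator, which is an alternative spelling of the matrix product.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
b = np.arange(9).reshape(3, 3)

elementwise = a * b    # multiplies entries at the same position
matrix_prod = a @ b    # matrix product, same result as a.dot(b)

print(np.array_equal(matrix_prod, a.dot(b)))
print(np.array_equal(elementwise, matrix_prod))
```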


In [22]:
print(a)

print(a.sum())
print(a.cumsum())
print(a.min())
print(a.max())

[[1 1 1]
 [2 2 2]
 [3 3 3]]
18
[ 1  2  3  5  7  9 12 15 18]
1
3
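
Without arguments, these reductions work on the whole array. As a brief sketch, the optional `axis` argument restricts them to columns (`axis=0`) or rows (`axis=1`):

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

col_sums = a.sum(axis=0)      # sum down each column
row_sums = a.sum(axis=1)      # sum across each row
row_max = a.max(axis=1)       # largest value in each row
running = a.cumsum(axis=0)    # running total down each column

print(col_sums)
print(row_sums)
print(row_max)
print(running)
```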


## Fancy Indexing



In [23]:
a = np.arange(0, 100, 10)
indices = np.array([1, 5, -1])
b = a[indices]
print(indices)
print(a)
print(b)

[ 1  5 -1]
[ 0 10 20 30 40 50 60 70 80 90]
[10 50 90]
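
Two details of fancy indexing, sketched with the same array: reading with an index array yields a copy, while assigning through an index array writes into the original. The name `picked` is illustrative.

```python
import numpy as np

a = np.arange(0, 100, 10)
indices = np.array([1, 5, -1])

picked = a[indices]    # fancy indexing returns a copy
picked[0] = 999        # so this does not touch a
print(a[1])

a[indices] = 0         # on the left-hand side it assigns into a
print(a)
```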


## Boolean masking



In [13]:
%matplotlib notebook

import matplotlib.pyplot as plt
 

xs = np.linspace(0, 2 * np.pi, 50)
ys = np.sin(xs)
plt.plot(xs, ys)

mask = ys >= 0
plt.plot(xs[mask], ys[mask], 'bo')

mask = (ys <= 0) & (xs >= np.pi)
plt.plot(xs[mask], ys[mask], 'go')
plt.show()



## Incomplete Indexing

As you saw above, you can omit slice bounds. A missing start defaults to the beginning of the array and a missing stop to its end.

In [25]:
a = np.arange(0, 100, 10)
b = a[:5]
c = a[a >= 50]
print(b)
print(c)

[ 0 10 20 30 40]
[50 60 70 80 90]
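
A short sketch of what the omitted bounds stand for, together with a few common operations on the boolean mask used in `a[a >= 50]`. The names `mask` and `between` are illustrative.

```python
import numpy as np

a = np.arange(0, 100, 10)

# Omitted bounds are equivalent to spelling out start and stop
print(np.array_equal(a[:5], a[0:5]))
print(np.array_equal(a[5:], a[5:len(a)]))

mask = a >= 50
print(mask.sum())      # True counts as 1, so this counts the matches
print(a[~mask])        # ~ inverts the mask

between = a[(a >= 20) & (a < 60)]   # combine conditions with & and parentheses
print(between)
```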


## Where

The `where()` function is another useful way of retrieving elements of an array conditionally. Simply pass it a condition and it will return a tuple of index arrays, one per dimension, giving the positions where that condition is true.

In [26]:
a = np.arange(0, 100, 10)
b = np.where(a < 50) 
c = np.where(a >= 50)[0]
print(b)
print(c)

(array([0, 1, 2, 3, 4]),)
[5 6 7 8 9]
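
Since `np.where` with a single argument yields indices, you index the array with the result to get the values themselves. The sketch below also shows the three-argument form `np.where(condition, x, y)`, which is not used above and picks values from `x` where the condition holds and from `y` elsewhere.

```python
import numpy as np

a = np.arange(0, 100, 10)

idx = np.where(a >= 50)[0]   # positions where the condition is true
print(a[idx])                # the corresponding values

# Keep values below 50, replace all others by 50
capped = np.where(a < 50, a, 50)
print(capped)
```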


# Why do we care about NumPy?

Because we want to get things done. As you may remember from the last session, *14-Intro to Plotting.ipynb*, we had to write quite a bit of code to compute the histograms of the age distribution of Copenhagen citizens per year.

By representing our input data as a matrix, i.e., a two-dimensional array, and using boolean indexing, we can generate, for example, the histograms far more concisely and with far fewer lines of code.

In the following, we use Pandas to import the CSV file with the statistics about Copenhagen citizens and convert the data into a NumPy matrix. We will take a closer look at the Pandas library in the next session.

In [14]:
import os
import requests


def download_csv(url):
    filename = os.path.join('/tmp', os.path.basename(url))
    
    if not os.path.exists(filename):
        # Download the file
        r = requests.get(url)

        with open(filename, 'wb') as f:
            f.write(r.content)
    return filename

In [16]:
import pandas as pd


filename = download_csv('http://data.kk.dk/dataset/76ecf368-bf2d-46a2-bcf8-adaf37662528/resource/9286af17-f74e-46c9-a428-9fb707542189/download/befkbhalderstatkode.csv')

bef_stats_df = pd.read_csv(filename)
bef_stats_df

Unnamed: 0,AAR,BYDEL,ALDER,STATKODE,PERSONER
0,2015,1,0,5100,614
1,2015,1,0,5104,2
2,2015,1,0,5106,1
3,2015,1,0,5110,1
4,2015,1,0,5120,4
5,2015,1,0,5126,1
6,2015,1,0,5130,5
7,2015,1,0,5140,3
8,2015,1,0,5150,5
9,2015,1,0,5154,1


In [18]:
dd = bef_stats_df.to_numpy()  # as_matrix() was removed in pandas 1.0
dd

array([[2015,    1,    0, 5100,  614],
       [2015,    1,    0, 5104,    2],
       [2015,    1,    0, 5106,    1],
       ..., 
       [1992,   99,   89, 5100,    1],
       [1992,   99,   90, 5180,    1],
       [1992,   99,   93, 5100,    1]])

### Number of 18-year-old Danes in CPH?

For example, how can we quickly compute the number of all Danes (`STATKODE == 5100`) aged 18 in all neighbourhoods for the year 2015?

We just collect that data by expressing it as a conjunction of constraints and subsequently sum it up.


In [20]:
mask = (dd[:,0] == 2015) & (dd[:,2] == 18) & (dd[:,3] == 5100)
print(dd[mask])
np.sum(dd[mask][:,4])

[[2015    1   18 5100  378]
 [2015    2   18 5100  577]
 [2015    3   18 5100  513]
 [2015    4   18 5100  309]
 [2015    5   18 5100  428]
 [2015    6   18 5100  349]
 [2015    7   18 5100  406]
 [2015    8   18 5100  339]
 [2015    9   18 5100  359]
 [2015   10   18 5100  424]
 [2015   99   18 5100   20]]


4102
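
To see how the conjunction of constraints selects rows, here is a minimal sketch on a made-up five-row array with the same column layout (year, neighbourhood, age, citizenship code, persons). The numbers are invented for illustration and are not taken from the data set.

```python
import numpy as np

# Columns: AAR, BYDEL, ALDER, STATKODE, PERSONER (invented values)
toy = np.array([[2015, 1, 18, 5100, 10],
                [2015, 2, 18, 5100, 20],
                [2015, 1, 18, 5130,  3],
                [2014, 1, 18, 5100,  7],
                [2015, 1, 19, 5100,  5]])

# Each comparison yields one boolean per row; & keeps rows where all hold
mask = (toy[:, 0] == 2015) & (toy[:, 2] == 18) & (toy[:, 3] == 5100)
print(mask)

# Select the matching rows and the PERSONER column in one step
total = toy[mask, 4].sum()
print(total)
```

Indexing with `toy[mask, 4]` gives the same values as `toy[mask][:, 4]` but avoids building the intermediate array of full rows.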

### Distribution of French and Germans in CPH?


Similarly, in which neighbourhood did the most French and German citizens, respectively, live in 2015?

The following dictionary of neighbourhood codes is created from the explanatory text on the municipality's homepage.

In [21]:
neighb = {1: 'Indre By', 2: 'Østerbro', 3: 'Nørrebro', 4: 'Vesterbro/Kgs. Enghave', 
          5: 'Valby', 6: 'Vanløse', 7: 'Brønshøj-Husum', 8: 'Bispebjerg', 9: 'Amager Øst', 
          10: 'Amager Vest', 99: 'Udenfor'}

In [17]:
french_mask = (dd[:,0] == 2015) & (dd[:,3] == 5130)
german_mask = (dd[:,0] == 2015) & (dd[:,3] == 5180)

french = np.array([np.sum(dd[french_mask & (dd[:,1] == n)][:,4]) 
                   for n in neighb.keys()])
germans = np.array([np.sum(dd[german_mask & (dd[:,1] == n)][:,4]) 
                    for n in neighb.keys()])

index_max_fr = np.argmax(french)
index_max_de = np.argmax(germans)

msg = 'The majority of {} {} are living in {}'
print(msg.format(french.max(), 'Frenchmen', neighb[list(neighb.keys())[index_max_fr]]))
print(msg.format(germans.max(), 'Germans', neighb[list(neighb.keys())[index_max_de]]))

The majority of 346 Frenchmen are living in Indre By
The majority of 653 Germans are living in Nørrebro


### Which country do most immigrants in 2015 come from?

Popular belief says that most immigrants come from somewhere in the Middle East. But is that true?

To answer this question, convert the country code data from Danmarks Statistik (http://www.dst.dk/da/Statistik/dokumentation/Times/forebyggelsesregistret/statkode.aspx) into a Python dictionary.

In [22]:
country_codes = {0: 'Uoplyst (1)', 5001: 'Uoplyst (2)', 5100: 'Danmark', 5101: 'Grønland', 
                 5102: 'Udlandet uoplyst', 5103: 'Statsløs', 5104: 'Finland', 
                 5105: 'Island, ligeret dansk', 5106: 'Island', 5107: 'Liechtenstein', 
                 5108: 'Luxembourg', 5109: 'Monaco', 5110: 'Norge', 5114: 'Europa uoplyst', 
                 5115: 'Kongelig', 5120: 'Sverige', 5122: 'Albanien', 5124: 'Andorra', 
                 5126: 'Belgien', 5128: 'Bulgarien', 5129: 'Tjekkoslovakiet', 
                 5130: 'Frankrig', 5134: 'Grækenland', 5140: 'Nederlandene', 
                 5142: 'Irland', 5150: 'Italien', 5151: 'Serbien og Montenegro', 
                 5152: 'Jugoslavien', 5153: 'Malta', 5154: 'Polen', 5156: 'Portugal', 
                 5158: 'Rumænien', 5159: 'San Marino', 5160: 'Schweiz', 
                 5162: 'Sovjetunionen', 5164: 'Spanien', 5170: 'Storbritannien', 
                 5172: 'Tyrkiet', 5174: 'Ungarn', 5176: 'Vatikanstaten', 5180: 'Tyskland', 
                 5182: 'Østrig', 5199: 'Europa uoplyst', 5202: 'Algeriet', 5204: 'Angola', 
                 5207: 'Botswana', 5213: 'Burundi', 5214: 'Etiopien', 5215: 'Comorerne', 
                 5216: 'Eritrea', 5222: 'Gambia', 5228: 'Ghana', 5230: 'Ækvatorialguinea', 
                 5231: 'Guinea-Bissau', 5232: 'Guinea', 5233: 'Kap Verde', 5234: 'Kenya', 
                 5235: 'Lesotho', 5236: 'Liberia', 5238: 'Libyen', 5240: 'Mozambique', 
                 5242: 'Madagaskar', 5243: 'Mali', 5244: 'Marokko', 5245: 'Mauritius', 
                 5246: 'Nigeria', 5247: 'Namibia', 5248: 'Marshalløerne', 
                 5255: 'Sierra Leone', 5258: 'Sudan', 5259: 'Swaziland', 5260: 'Sydsudan', 
                 5262: 'Sydafrika', 5266: 'Tanzania', 5268: 'Tunesien', 5269: 'Uganda', 
                 5272: 'Egypten', 5273: 'Tuvalu', 5274: 'Kiribati', 5275: 'Vanuatu', 
                 5276: 'Centralafrikanske Republik', 5277: 'Cameroun', 
                 5278: 'Congo, Demokratiske Republik', 5279: 'Congo, Republikken', 
                 5281: 'Benin', 5282: 'Elfenbenskysten', 5283: 'Gabon', 5284: 'Mauretanien', 
                 5285: 'Niger', 5287: 'Rwanda', 5288: 'Senegal', 5289: 'Somalia', 
                 5292: 'Tchad', 5293: 'Togo', 5294: 'Burkina Faso', 5295: 'Zimbabwe', 
                 5296: 'Zambia', 5297: 'Malawi', 5298: 'Seychellerne', 
                 5299: 'Afrika uoplyst', 5302: 'Argentina', 5303: 'Bahamas', 
                 5304: 'Bolivia', 5305: 'Barbados', 5306: 'Brasilien', 5308: 'Guyana', 
                 5309: 'Antigua og Barbuda', 5310: 'Nauru', 
                 5311: 'Skt. Vincent og Grenadinerne', 5314: 'Canada', 5316: 'Chile', 
                 5318: 'Colombia', 5319: 'Syd- og Mellemamerika uoplyst', 
                 5322: 'Costa Rica', 5324: 'Cuba', 5326: 'Dominikanske Republik', 
                 5328: 'Ecuador', 5338: 'Guatemala', 5339: 'Grenada', 5342: 'Haiti', 
                 5344: 'Surinam', 5345: 'Dominica', 5347: 'Skt. Lucia', 5348: 'Honduras', 
                 5352: 'Jamaica', 5354: 'Mexico', 5356: 'Nicaragua', 5358: 'Panama', 
                 5364: 'Paraguay', 5366: 'Peru', 5372: 'El Salvador', 
                 5374: 'Trinidad og Tobago', 5376: 'Uruguay', 5390: 'USA', 
                 5392: 'Venezuela', 5395: 'Vestindiske Øer', 5397: 'Nordamerika uoplyst', 
                 5398: 'Syd- og Mellemamerika uoplyst', 5402: 'Yemen', 
                 5403: 'Forenede Arabiske Emirater', 5404: 'Afghanistan', 5406: 'Bahrain', 
                 5408: 'Bhutan', 5410: 'Bangladesh', 5412: 'Brunei', 5414: 'Myanmar', 
                 5416: 'Cambodja', 5418: 'Sri Lanka', 5422: 'Cypern', 5424: 'Taiwan', 
                 5432: 'Indien', 5434: 'Indonesien', 5435: 'Østtimor', 5436: 'Irak', 
                 5438: 'Iran', 5442: 'Israel', 5444: 'Japan', 5446: 'Jordan', 5448: 'Kina', 
                 5452: 'Kuwait', 5454: 'Laos', 5456: 'Libanon', 5457: 'Maldiverne', 
                 5458: 'Malaysia', 5459: 'Mongoliet', 5462: 'Oman', 5464: 'Nepal', 
                 5466: 'Nordkorea', 5468: 'Vietnam (1)', 5471: 'Asien uoplyst', 
                 5472: 'Pakistan', 5474: 'Filippinerne', 5478: 'Saudi-Arabien', 
                 5482: 'Singapore', 5484: 'Sydkorea', 5486: 'Syrien', 
                 5487: 'Mellemøsten uoplyst', 5488: 'Vietnam (2)', 5492: 'Thailand', 
                 5496: 'Qatar', 5499: 'Asien uoplyst', 5502: 'Australien', 5505: 'Tonga', 
                 5508: 'Fiji', 5514: 'New Zealand', 5522: 'Samoa', 5525: 'Djibouti', 
                 5526: 'Belize', 5534: 'Papua Ny Guinea', 5599: 'Øer i Stillehavet', 
                 5607: 'Estland', 5609: 'Letland', 5611: 'Litauen', 
                 5621: 'Sao Tome og Principe', 5623: 'Salomonøerne', 
                 5625: 'Skt. Kitts og Nevis', 5700: 'Rusland', 5704: 'Ukraine', 
                 5706: 'Hviderusland', 5708: 'Armenien', 5710: 'Aserbajdsjan', 
                 5712: 'Moldova', 5714: 'Usbekistan', 5716: 'Kasakhstan', 
                 5718: 'Turkmenistan', 5720: 'Kirgisistan', 5722: 'Tadsjikistan', 
                 5724: 'Georgien', 5750: 'Kroatien', 5752: 'Slovenien', 
                 5754: 'Bosnien-Hercegovina', 5756: 'Makedonien', 5757: 'Serbien', 
                 5758: 'Jugoslavien, Forbundsrepublikken', 5759: 'Montenegro', 
                 5761: 'Kosovo', 5776: 'Tjekkiet', 5778: 'Slovakiet', 5779: 'Cookøerne', 
                 5800: 'Land ukendt (2)', 5901: 'Færøerne uoplyst', 5902: 'Færøerne', 
                 5999: 'Land ukendt (1)'}

In [24]:
# create an array of the unique country codes of people living in Copenhagen
c_keys = np.unique(dd[:,3])
# we are interested in non-Danes only
c_keys = c_keys[c_keys != 5100]
# and again we only consider 2015
mask = (dd[:,0] == 2015)

no_per_c = np.array([(c_code, np.sum(dd[mask & (dd[:,3] == c_code)][:,4])) 
                     for c_code in c_keys])
msg = '{} inhabitants come from {}'
country_codes[no_per_c[np.argmax(no_per_c[:,1]),0]]

'Sverige'
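
The per-country sums can also be computed without a Python loop. The following is a sketch of one possible alternative, not the approach used above: `np.unique` with `return_inverse=True` assigns each row a group number, and `np.bincount` with `weights` adds up the persons per group. The toy array and its values are invented for illustration.

```python
import numpy as np

# Columns: AAR, BYDEL, ALDER, STATKODE, PERSONER (invented values)
toy = np.array([[2015, 1, 30, 5120,  4],
                [2015, 2, 31, 5120,  6],
                [2015, 1, 30, 5180,  5],
                [2015, 3, 40, 5130,  2],
                [2014, 1, 30, 5130, 50]])

rows = toy[toy[:, 0] == 2015]   # only consider 2015

# codes: sorted unique country codes; inverse: group number of each row
codes, inverse = np.unique(rows[:, 3], return_inverse=True)

# Sum PERSONER per group; bincount returns floats when weights are given
totals = np.bincount(inverse, weights=rows[:, 4]).astype(int)

print(codes)
print(totals)
print(codes[np.argmax(totals)])   # code with the largest total
```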

In [25]:
%matplotlib notebook
import matplotlib.pyplot as plt


sorted_distribution = no_per_c[no_per_c[:,1].argsort()[::-1]]

c_codes = sorted_distribution[:,0]
c_sums = sorted_distribution[:,1]

c_codes = [country_codes[c] for c in c_codes]

plt.plot(range(len(c_codes)), c_sums)
plt.xticks(range(len(c_codes)), c_codes, size='small', rotation=45)

plt.show()



### Computing a Histogram Concisely

As mentioned, we are using NumPy to express our data queries more concisely. The first three statements of the following program compute the histogram of ages in 2015.

In [22]:
%matplotlib notebook


mask = (dd[:,0] == 2015)
set_of_ages = np.unique(dd[mask][:,2])
freq_ages = np.array([np.sum(dd[mask & (dd[:,2] == age)][:,4]) 
                      for age in set_of_ages])

plt.bar(set_of_ages, freq_ages)
# plt.show()


<Container object of 107 artists>

And computing the increase of foreign population from 2014 to 2015 is similarly easy. Read and explain the following program.

In [23]:
%matplotlib notebook


mask = (dd[:,3] != 5100)
sum_14 = np.sum(dd[(dd[:,0] == 2014) & mask][:,4])
sum_15 = np.sum(dd[(dd[:,0] == 2015) & mask][:,4])

plt.axis([2013, 2016, 0, max([sum_14, sum_15]) + 2000])
plt.bar([2014, 2015], [sum_14, sum_15], width=0.8,  align='center')
plt.xticks([2014, 2015])
plt.ticklabel_format(useOffset=False)



# For what else can we use NumPy?

Images are just multi-dimensional arrays. So we can do image processing effectively and efficiently.

Let's, for example, have a look at some data on ice concentrations in the Arctic.

Let's start by collecting some data.

In [24]:
import os


base_url = 'ftp://osisaf.met.no/archive/ice/conc/201{}/02/ice_conc_{}h_polstere-100_multi_201{}02011200.nc'
current_dir = os.getcwd()

os.chdir('./data')
for idx in range(5, 8):
    for h in ['n', 's']:
        url = base_url.format(idx, h, idx)
        print('Downloading {}'.format(url))
        os.system('wget {}'.format(url))
        
os.chdir(current_dir)  # return to the directory we started in

Downloading ftp://osisaf.met.no/archive/ice/conc/2015/02/ice_conc_nh_polstere-100_multi_201502011200.nc
Downloading ftp://osisaf.met.no/archive/ice/conc/2015/02/ice_conc_sh_polstere-100_multi_201502011200.nc
Downloading ftp://osisaf.met.no/archive/ice/conc/2016/02/ice_conc_nh_polstere-100_multi_201602011200.nc
Downloading ftp://osisaf.met.no/archive/ice/conc/2016/02/ice_conc_sh_polstere-100_multi_201602011200.nc
Downloading ftp://osisaf.met.no/archive/ice/conc/2017/02/ice_conc_nh_polstere-100_multi_201702011200.nc
Downloading ftp://osisaf.met.no/archive/ice/conc/2017/02/ice_conc_sh_polstere-100_multi_201702011200.nc


In [28]:
%%bash

ls -ltr data

total 5317992
-rw-r--r-- 1 vagrant vagrant       8413 Aug  4 18:00 boliga_1050-1549_small.csv
-rw-r--r-- 1 vagrant vagrant       4225 Aug  4 18:00 boliga_1550-1799_small.csv
-rw-r--r-- 1 vagrant vagrant        436 Aug  7 11:07 example.html
-rw-r--r-- 1 vagrant vagrant       3721 Aug  7 13:18 boliga_1050-1549_2.md
-rw-r--r-- 1 vagrant vagrant       3683 Aug  7 13:18 boliga_1050-1549_1.md
-rw-r--r-- 1 vagrant vagrant       3512 Aug  7 13:18 boliga_1550-1799_1.md
-rw-r--r-- 1 vagrant vagrant       7066 Aug  7 13:20 boliga_1550-1799_1.html
-rw-r--r-- 1 vagrant vagrant       7374 Aug  7 13:20 boliga_1050-1549_1.html
-rw-r--r-- 1 vagrant vagrant       7412 Aug  7 13:20 boliga_1050-1549_2.html
-rw-r--r-- 1 vagrant vagrant       3163 Aug  7 14:40 1550-1799.csv
-rw-r--r-- 1 vagrant vagrant       6580 Aug  7 14:40 1050-1549.csv
-rw-r--r-- 1 vagrant vagrant      59657 Sep  1 10:06 california_cities.csv
-rw-r--r-- 1 vagrant vagrant     758801 Sep  1 13:55 zip_areas_zealand.geojson
-rw-r--r-- 1 vag

Now we read the data, which is distributed in NetCDF format, a file format that is often used in scientific computing. In case the library is not installed on your VM, install it with `conda install netcdf4`.

In [29]:
from netCDF4 import Dataset


nc = Dataset('./data/ice_conc_nh_polstere-100_multi_201502011200.nc')
nc2 = Dataset('./data/ice_conc_nh_polstere-100_multi_201602011200.nc')

In [31]:
%matplotlib notebook

import matplotlib.pyplot as plt


plt.subplot(1, 3, 3)
plt.title('2015')
plt.imshow(nc.variables['ice_conc'][0], interpolation='None')
plt.subplot(1, 3, 2)
plt.title('2016')
plt.imshow(nc2.variables['ice_conc'][0], interpolation='None')
plt.subplot(1, 3, 1)
plt.title('Diff')
plt.imshow(nc2.variables['ice_conc'][0] - nc.variables['ice_conc'][0], interpolation='None')

plt.show()
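
The 'Diff' panel relies on nothing more than element-wise subtraction of two arrays of equal shape. A minimal sketch on two made-up 4x6 arrays, which stand in for the ice concentration grids and are not real data:

```python
import numpy as np

# Two invented "concentration maps" with values in percent
first = np.zeros((4, 6))
first[1:3, 1:5] = 100.0    # a rectangular patch of full cover

second = first.copy()
second[1:3, 4] = 0.0       # the patch shrinks by one column

diff = second - first      # element-wise difference, same shape as the inputs
print(diff.shape)
print(diff)

lost = (diff < 0).sum()    # number of cells that lost cover
print(lost)
```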
