# Weather Station 15

**Primary Source:** *Python and HDF5* by Andrew Collette, O'Reilly 2013.

<a href="https://www.amazon.com/Python-HDF5-Collette/dp/1449367836/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr="><img src="./img/h5py.jpg"></a>

*Our first task is to store temperature and wind measurements from a network of weather stations.*

In [1]:
import numpy as np

The last 1024 temperature samples might look like this:

In [2]:
temperature = np.random.random(1024)

In [3]:
temperature

array([ 0.64248708,  0.32321578,  0.62666741, ...,  0.31430819,
        0.04177788,  0.1150275 ])

Ditto for the wind, this time with 2048 samples:

In [4]:
wind = np.random.random(2048)

Let's stick this into an HDF5 file, and make it a little more *informative* along the way!

In [5]:
import h5py

In [6]:
f = h5py.File("weather.h5", "w")

We store information from different stations in different **HDF5 group**s ("folders"). Assigning to the path `/15/temperature` creates the group `15` automatically.

In [7]:
f["/15/temperature"] = temperature
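The group for a station can also be created explicitly. The following is a minimal sketch, not part of the original notebook: the scratch file name `groups_demo.h5` and the second station `16` are made up for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, kept separate from weather.h5.
path = os.path.join(tempfile.mkdtemp(), "groups_demo.h5")

with h5py.File(path, "w") as demo:
    # Assigning to a nested path creates the intermediate group "15".
    demo["/15/temperature"] = np.zeros(4)
    implicit_is_group = isinstance(demo["/15"], h5py.Group)

    # The same layout built step by step for a made-up station "16".
    station = demo.create_group("16")
    station.create_dataset("temperature", data=np.zeros(4))

    station_names = sorted(demo.keys())
```

Both routes end up with the same kind of structure; the explicit form is handy when you want a handle on the group before adding several datasets to it.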

Keep track of *metadata*, e.g., the temperature unit and the sampling interval, in **HDF5 attribute**s.

In [8]:
f["/15/temperature"].attrs["unit"] = "celsius"

In [9]:
f["/15/temperature"].attrs["dt"] = 10.0  # Temperature sampled every 10 seconds

In [10]:
from time import asctime, gmtime

In [11]:
f["/15/temperature"].attrs["start_time"] = asctime(gmtime())  # GMT time stamp

In [12]:
f["/15/wind"] = wind

In [13]:
f["/15/wind"].attrs["dt"] = 5.0  # Wind sampled every 5 seconds
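Storing `dt` next to the data lets a reader rebuild the time axis later. Here is a rough sketch under a few assumptions: it uses an in-memory file (the `core` driver with `backing_store=False`) named `time_axis_demo.h5`, and zero-filled arrays stand in for the real measurements.

```python
import h5py
import numpy as np

# In-memory file: nothing is written to disk. The file name is illustrative.
with h5py.File("time_axis_demo.h5", "w", driver="core", backing_store=False) as demo:
    demo["/15/temperature"] = np.zeros(1024)
    demo["/15/temperature"].attrs["dt"] = 10.0
    demo["/15/wind"] = np.zeros(2048)
    demo["/15/wind"].attrs["dt"] = 5.0

    temp = demo["/15/temperature"]
    wind_ds = demo["/15/wind"]

    # Offset in seconds of each temperature sample from the first one.
    seconds = np.arange(temp.shape[0]) * temp.attrs["dt"]

    # Time span covered by each dataset: number of samples times interval.
    temp_span = temp.shape[0] * temp.attrs["dt"]
    wind_span = wind_ds.shape[0] * wind_ds.attrs["dt"]
```

With the sample counts and intervals used in this notebook, both datasets cover the same 10240 seconds.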

File and group objects support the Python dictionary syntax.

In [14]:
list(f.keys())

[u'15']

In [15]:
list(f["/15"].keys())

[u'temperature', u'wind']
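Instead of listing keys one level at a time, the whole hierarchy can be walked in one go with `visititems`. The sketch below assumes an in-memory stand-in file with the same layout as above; the callback name `record` is our own.

```python
import h5py
import numpy as np

found = []


def record(name, obj):
    # Called once per object below the root; name is the path relative to it.
    kind = "group" if isinstance(obj, h5py.Group) else "dataset"
    found.append((name, kind))


# In-memory stand-in for weather.h5; the file name is illustrative.
with h5py.File("walk_demo.h5", "w", driver="core", backing_store=False) as demo:
    demo["/15/temperature"] = np.zeros(4)
    demo["/15/wind"] = np.zeros(8)
    demo.visititems(record)

found.sort()
for name, kind in found:
    print("%s (%s)" % (name, kind))
```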

We can "slice and dice" datasets NumPy-style. 

In [16]:
dataset = f["/15/temperature"]

In [17]:
dataset[0:10]

array([ 0.64248708,  0.32321578,  0.62666741,  0.14530955,  0.04747263,
        0.62772381,  0.09123268,  0.60801067,  0.30068538,  0.87747458])

In [18]:
dataset[0:10:2]

array([ 0.64248708,  0.62666741,  0.04747263,  0.09123268,  0.30068538])
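Slicing a dataset reads the selected elements from the file and returns them as a NumPy array; the dataset object itself is only a handle. A small sketch of the distinction, assuming an in-memory stand-in file filled with `0, 1, 2, ...` so the results are easy to check:

```python
import h5py
import numpy as np

# In-memory stand-in file; the name and the data are illustrative.
with h5py.File("slice_demo.h5", "w", driver="core", backing_store=False) as demo:
    demo["/15/temperature"] = np.arange(1024, dtype="f8")
    dset = demo["/15/temperature"]

    handle_is_dataset = isinstance(dset, h5py.Dataset)

    first_ten = dset[0:10]      # reads 10 values
    every_other = dset[0:10:2]  # reads 5 values
    everything = dset[...]      # reads the whole dataset into memory
```

Because only the requested elements are read, slicing is the usual way to work with datasets that are too large to load at once.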

An object's attribute collection also supports the Python dictionary syntax.

In [19]:
list(dataset.attrs.keys())

[u'unit', u'dt', u'start_time']

In [20]:
for key, value in dataset.attrs.items():
    print("%s: %s" % (key, value))

unit: celsius
dt: 10.0
start_time: Tue Nov 22 17:00:59 2016


In [21]:
f.close()
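Once the file is closed, the data and its metadata can be read back in a later session. The sketch below is an assumption-laden illustration rather than part of the original notebook: it writes a small scratch file (`reopen_demo.h5` in a temporary directory) and reopens it read-only, using `with` blocks so the file is closed even if an error occurs.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, kept separate from weather.h5.
path = os.path.join(tempfile.mkdtemp(), "reopen_demo.h5")

# Write a small dataset with one attribute, then close the file.
with h5py.File(path, "w") as demo:
    demo["/15/temperature"] = np.arange(8.0)
    demo["/15/temperature"].attrs["unit"] = "celsius"

# Reopen read-only and pull the data and metadata back out.
with h5py.File(path, "r") as demo:
    dset = demo["/15/temperature"]
    unit = dset.attrs["unit"]
    values = dset[...]
```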