# Weather Station 15

**Primary Source:** *Python and HDF5* by Andrew Collette, O'Reilly 2013.

You might find the following titles helpful:
<table>
<tr>
<td><a href="https://www.amazon.com/Python-HDF5-Collette/dp/1449367836/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr="><img src="./img/h5py.jpg"></a></td>
<td><a href="https://www.amazon.com/Effective-Computation-Physics-Anthony-Scopatz/dp/1491901535/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr="><img src="./img/eff_comp_phys.jpg"></a></td>
<td><a href="https://www.amazon.com/Python-Essential-Reference-David-Beazley/dp/0672329786/ref=mt_paperback?_encoding=UTF8&me="><img src="./img/beazley.jpg"></a></td>
</tr>
</table>

*Our first task is to store temperature and wind measurements from a network of weather stations.*

In [1]:
import numpy as np

The last 1024 temperature samples might look like this:

In [2]:
temperature = np.random.random(1024)

In [3]:
temperature

array([ 0.62721836,  0.78295604,  0.05988774, ...,  0.34764036,
        0.50080988,  0.61436234])

Likewise for the last 2048 wind samples:

In [4]:
wind = np.random.random(2048)

Let's stick this into an HDF5 file, and make it a little more *informative* along the way!

In [5]:
import h5py

In [6]:
f = h5py.File("weather.h5", "w", libver="latest")  # libver="latest" selects the newest file format; believe it or not, the HDF5 format keeps changing (slowly).

We store information from different stations in different **HDF5 group**s ("folders").

In [7]:
f["/15/temperature"] = temperature
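The assignment above creates the group `15` implicitly. As a minimal sketch, the same layout can also be built explicitly with `create_group` and `create_dataset`. The scratch file name and the station `16` are assumptions made up for this illustration; they are not part of the weather example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "groups_demo.h5")

with h5py.File(path, "w") as demo:
    # Assigning to a nested path creates the intermediate group "15" implicitly.
    demo["/15/temperature"] = np.random.random(1024)

    # The same structure, built explicitly for a made-up station "16".
    station = demo.create_group("16")
    station.create_dataset("temperature", data=np.random.random(1024))

    # require_group returns the existing group instead of raising an error.
    same_station = demo.require_group("16")

    group_names = sorted(demo.keys())
    is_group = isinstance(demo["15"], h5py.Group)
    is_dataset = isinstance(demo["15/temperature"], h5py.Dataset)
    same_name = same_station.name == station.name

print(group_names, is_group, is_dataset, same_name)
```

Whether to create groups implicitly or explicitly is mostly a matter of taste; `require_group` is convenient when a cell may be run more than once.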

Keep track of *metadata*, e.g., the temperature unit and the sampling interval, in **HDF5 attribute**s.

In [31]:
f["/15/temperature"].attrs["unit"] = "celsius"

In [22]:
f["/15/temperature"].attrs["dt"] = 10.0  # Temperature sampled every 10 seconds

In [23]:
from time import asctime, gmtime

In [24]:
f["/15/temperature"].attrs["start_time"] = asctime(gmtime())  # GMT time stamp
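Attributes like `dt` become useful when the data is read back. The following sketch, which assumes a hypothetical scratch file rather than `weather.h5`, rebuilds a time axis (seconds since the first sample) from the stored sampling interval.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "attrs_demo.h5")

with h5py.File(path, "w") as demo:
    demo["/15/temperature"] = np.random.random(1024)
    dset = demo["/15/temperature"]
    dset.attrs["unit"] = "celsius"
    dset.attrs["dt"] = 10.0  # seconds between samples

    # Read the metadata back and derive the elapsed time of every sample.
    dt = dset.attrs["dt"]
    unit = dset.attrs["unit"]
    elapsed = np.arange(dset.shape[0]) * dt

print("unit:", unit)
print("last sample taken after", elapsed[-1], "seconds")
```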

In [25]:
f["/15/wind"] = wind

Run this cell only once. Assigning to a name that already exists in the file fails with `RuntimeError: Unable to create link (Name already exists)`.
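The `Name already exists` error appears whenever a dataset is assigned to a path that is already taken, for example when a cell is re-run. Two possible ways around it are sketched below, using a hypothetical scratch file: overwrite the values in place, or delete the link and create the dataset again. The exact exception type may differ between h5py versions, so the sketch checks for the name instead of catching the error.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "overwrite_demo.h5")

with h5py.File(path, "w") as demo:
    demo["/15/wind"] = np.random.random(2048)
    new_wind = np.random.random(2048)

    # Option 1: the name exists, so overwrite the values in place.
    # This only works if the new data fits the existing shape.
    if "15/wind" in demo:
        demo["15/wind"][...] = new_wind
    in_place = demo["15/wind"][...]

    # Option 2: remove the link, then create the dataset afresh.
    # This also allows a different shape.
    longer_wind = np.random.random(4096)
    del demo["15/wind"]
    demo["15/wind"] = longer_wind
    replaced_shape = demo["15/wind"].shape

print(replaced_shape)
```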

In [26]:
f["/15/wind"].attrs["dt"] = 5.0  # Wind sampled every 5 seconds

File and group objects support the Python dictionary syntax.

In [27]:
list(f.keys())

[u'15']

In [28]:
list(f["/15"].keys())

[u'temperature', u'wind']
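Calling `keys()` lists one level at a time. To walk the whole hierarchy in one go, h5py offers `visititems`, which calls a function for every object below a group. A minimal sketch, assuming a hypothetical scratch file with the same layout as above:

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "walk_demo.h5")

found = []

def record_datasets(name, obj):
    """Collect (path, shape) for every dataset; groups are skipped."""
    if isinstance(obj, h5py.Dataset):
        found.append((name, obj.shape))
    # Returning None tells visititems to keep going.

with h5py.File(path, "w") as demo:
    demo["/15/temperature"] = np.random.random(1024)
    demo["/15/wind"] = np.random.random(2048)
    demo.visititems(record_datasets)

for name, shape in sorted(found):
    print(name, shape)
```

The names passed to the callback are relative to the group on which `visititems` was called, so they carry no leading slash.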

We can "slice and dice" datasets NumPy-style. 

In [16]:
dataset = f["/15/temperature"]

In [17]:
dataset[0:10]

array([ 0.62721836,  0.78295604,  0.05988774,  0.69280468,  0.87805751,
        0.09602902,  0.44572149,  0.12817129,  0.46718643,  0.27554185])

In [18]:
dataset[0:10:2]

array([ 0.62721836,  0.05988774,  0.87805751,  0.44572149,  0.46718643])
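Slicing a dataset returns an ordinary NumPy array holding only the requested elements, and assigning to a slice writes to the file. The sketch below illustrates both directions on a hypothetical scratch file; the values written are made up.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "slice_demo.h5")

with h5py.File(path, "w") as demo:
    demo["/15/temperature"] = np.random.random(1024)
    dset = demo["/15/temperature"]

    first_ten = dset[0:10]      # NumPy array with 10 elements
    every_other = dset[0:10:2]  # NumPy array with 5 elements
    everything = dset[...]      # the whole dataset as a NumPy array

    # Assigning to a slice writes the new values into the file.
    dset[0:3] = [20.5, 21.0, 21.5]
    updated = dset[0:3]

print(type(first_ten), first_ten.shape, every_other.shape, everything.shape)
print(updated)
```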

An object's attribute collection also supports the Python dictionary syntax.

In [29]:
list(dataset.attrs.keys())

[u'unit', u'dt', u'start_time']

In [32]:
for key, value in dataset.attrs.items():
    print("%s: %s" % (key, value))

dt: 10.0
start_time: Mon Oct 24 13:17:19 2016
unit: celsius
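Other dictionary operations work on attributes as well. This sketch, again on a hypothetical scratch file, shows a membership test, `get` with a default, overwriting, and deletion. The attribute name `station_name` is invented for the example and is never stored.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "attrs_dict_demo.h5")

with h5py.File(path, "w") as demo:
    demo["/15/temperature"] = np.random.random(1024)
    attrs = demo["/15/temperature"].attrs
    attrs["unit"] = "celsius"
    attrs["dt"] = 10.0

    has_unit = "unit" in attrs

    # "station_name" is a made-up attribute that was never stored,
    # so get() falls back to the default.
    station_name = attrs.get("station_name", "unknown")

    # Unlike datasets, an existing attribute can simply be reassigned.
    attrs["dt"] = 5.0
    new_dt = float(attrs["dt"])

    del attrs["unit"]
    remaining = sorted(attrs.keys())

print(has_unit, station_name, new_dt, remaining)
```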


In [None]:
f.close()
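Instead of calling `close()` by hand, the file can be opened in a `with` block, which closes it even if an error occurs along the way. A minimal sketch, assuming a hypothetical scratch file:

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so this sketch does not touch weather.h5.
path = os.path.join(tempfile.mkdtemp(), "context_demo.h5")

temperature = np.random.random(1024)

with h5py.File(path, "w") as demo:
    demo["/15/temperature"] = temperature
    demo["/15/temperature"].attrs["unit"] = "celsius"

# Leaving the block closed the file; a closed File object is falsy.
is_closed = not bool(demo)

# Reopen read-only to check that the data reached the disk.
with h5py.File(path, "r") as demo:
    restored = demo["/15/temperature"][...]
    unit = demo["/15/temperature"].attrs["unit"]

print(is_closed, unit, np.array_equal(restored, temperature))
```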