# Weather Station 15

**Primary Source:** *Python and HDF5* by Andrew Collette, O'Reilly 2013.

<a href="https://www.amazon.com/Python-HDF5-Collette/dp/1449367836/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr="><img src="./img/h5py.jpg"></a>

*Our first task is to store temperature and wind measurements from a network of weather stations.*

In [1]:
import numpy as np

The last 1024 temperature samples might look like this:

In [2]:
temperature = np.random.random(1024)

In [3]:
temperature

array([ 0.08903496,  0.97260236,  0.20384326, ...,  0.07204098,
        0.89709195,  0.12329886])

The wind record is similar, here with 2048 samples:

In [4]:
wind = np.random.random(2048)

Let's stick this into an HDF5 file, and make it a little more *informative* along the way!

In [5]:
import h5py

In [6]:
f = h5py.File("weather.h5", "w")

We store information from different stations in different **HDF5 group**s ("folders").

In [7]:
f["/15/temperature"] = temperature
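
Assigning to a path such as `/15/temperature` creates the missing group `/15` on the way. The sketch below compares that implicit form with the explicit `create_group` / `create_dataset` calls. It writes to a hypothetical scratch file (`groups_sketch.h5` in a temporary directory) so the notebook's `weather.h5` is not touched, and station `16` is an invented name used only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, so the notebook's weather.h5 is left untouched.
path = os.path.join(tempfile.mkdtemp(), "groups_sketch.h5")

with h5py.File(path, "w") as g:
    # Explicit form: create the group first, then a dataset inside it.
    station = g.create_group("15")
    station.create_dataset("temperature", data=np.random.random(1024))

    # Implicit form: assigning to a path creates the missing group "16".
    # ("16" is a made-up station used only for this illustration.)
    g["/16/temperature"] = np.random.random(1024)

    station_names = sorted(g.keys())
    is_group = isinstance(g["/16"], h5py.Group)
    shape = g["/15/temperature"].shape

print(station_names, is_group, shape)
```

Both forms should leave the file with the same layout: one group per station, each holding a `temperature` dataset.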

Keep track of *metadata*, e.g., the temperature unit and the sampling interval, in **HDF5 attribute**s.

In [8]:
f["/15/temperature"].attrs["unit"] = "celsius"

In [9]:
f["/15/temperature"].attrs["dt"] = 10.0  # Temperature sampled every 10 seconds

In [10]:
from time import asctime, gmtime

In [16]:
when = asctime(gmtime())
print(when)

Thu Nov 24 09:14:01 2016


In [15]:
f["/15/temperature"].attrs["start_time"] = asctime(gmtime())  # GMT time stamp

In [17]:
f["/15/wind"] = wind

In [18]:
f["/15/wind"].attrs["dt"] = 5.0  # Wind sampled every 5 seconds
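
One use for the `dt` attribute is to rebuild a time axis for the samples later on. The following is a minimal sketch, assuming the samples are evenly spaced and that the first sample sits at offset zero (the notebook does not state this). The scratch file name `time_axis_sketch.h5` is hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "time_axis_sketch.h5")

# Write a wind dataset with its sampling interval, as in the cells above.
with h5py.File(path, "w") as g:
    g["/15/wind"] = np.random.random(2048)
    g["/15/wind"].attrs["dt"] = 5.0  # seconds between samples

# Later: reopen the file and derive the time offsets from the metadata.
with h5py.File(path, "r") as g:
    wind_ds = g["/15/wind"]
    dt = wind_ds.attrs["dt"]
    # Assumption: evenly spaced samples, first sample at offset 0 seconds.
    seconds = np.arange(wind_ds.shape[0]) * dt

print(seconds[:4], seconds[-1])
```

Because the interval travels with the dataset, a reader of the file does not need to know out of band that wind and temperature were sampled at different rates.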

File and group objects support the Python dictionary syntax.

In [19]:
list(f.keys())

[u'15']

In [20]:
list(f["/15"].keys())

[u'temperature', u'wind']
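
Calling `keys()` lists one level at a time. To walk the whole hierarchy in one go, h5py groups also offer `visititems`, which calls a function for every object below the group. This is a sketch on a hypothetical scratch file (`walk_sketch.h5`); the callback name `record` is made up for the example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "walk_sketch.h5")

found = []


def record(name, obj):
    """Callback for visititems: remember each object's path and kind."""
    kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
    found.append((name, kind))
    # Returning None tells visititems to keep going.


with h5py.File(path, "w") as g:
    g["/15/temperature"] = np.random.random(1024)
    g["/15/wind"] = np.random.random(2048)
    g.visititems(record)

for name, kind in sorted(found):
    print(name, kind)
```

The names passed to the callback are relative to the group being visited, so they carry no leading slash.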

We can "slice and dice" datasets NumPy-style. 

In [21]:
dataset = f["/15/temperature"]

In [22]:
dataset[0:10]

array([ 0.08903496,  0.97260236,  0.20384326,  0.16810597,  0.63059859,
        0.26480896,  0.52281932,  0.31947622,  0.33981204,  0.1268195 ])

In [23]:
dataset[0:10:2]

array([ 0.08903496,  0.20384326,  0.63059859,  0.52281932,  0.33981204])
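
A dataset object is a handle to data on disk, not an in-memory array: indexing it reads the selected elements and returns them as a NumPy array. The sketch below illustrates this on a hypothetical scratch file (`slice_sketch.h5`). It stores `np.arange(1024)` instead of random numbers so that the sliced values are predictable.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "slice_sketch.h5")

with h5py.File(path, "w") as g:
    # Predictable stand-in data (0.0, 1.0, 2.0, ...) instead of random samples.
    g["/15/temperature"] = np.arange(1024, dtype="f8")
    ds = g["/15/temperature"]

    # Shape and dtype are available without reading any samples.
    shape, dtype = ds.shape, ds.dtype

    # A strided slice returns a NumPy array holding just those elements.
    first = ds[0:10:2]

    # An Ellipsis reads the entire dataset into memory.
    everything = ds[...]

print(shape, dtype, first)
```

The arrays returned by slicing are ordinary copies, so they stay usable after the file is closed, whereas the dataset handle itself does not.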

An object's attribute collection also supports the Python dictionary syntax.

In [19]:
list(dataset.attrs.keys())

[u'unit', u'dt', u'start_time']

In [25]:
for key, value in dataset.attrs.items():
    print("%s: %s" % (key, value))

unit: celsius
dt: 10.0
start_time: Thu Nov 24 09:13:36 2016


In [26]:
f.close()
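
As an alternative to calling `close()` by hand, an h5py file can be used as a context manager, which closes it when the `with` block ends, even if an error is raised inside. A minimal sketch, using a hypothetical scratch file (`with_sketch.h5`) and then reopening it read-only to check what was stored:

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "with_sketch.h5")

# The file is closed automatically at the end of the with block.
with h5py.File(path, "w") as g:
    g["/15/temperature"] = np.random.random(1024)
    g["/15/temperature"].attrs["unit"] = "celsius"

# Reopen read-only ("r") to inspect the stored data and metadata.
with h5py.File(path, "r") as g:
    unit = g["/15/temperature"].attrs["unit"]
    n_samples = g["/15/temperature"].shape[0]

print(unit, n_samples)
```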

In [27]:
!ls weather.h5

weather.h5


In [29]:
!h5ls -vr weather.h5

Opened "weather.h5" with sec2 driver.
/                        Group
    Location:  1:96
    Links:     1
/15                      Group
    Location:  1:1072
    Links:     1
/15/temperature          Dataset {1024/1024}
    Attribute: dt scalar
        Type:      native double
        Data:  10
    Attribute: start_time scalar
        Type:      variable-length null-terminated ASCII string
        Data:  "Thu Nov 24 09:13:36 2016"
    Attribute: unit scalar
        Type:      variable-length null-terminated ASCII string
        Data:  "celsius"
    Location:  1:800
    Links:     1
    Storage:   8192 logical bytes, 8192 allocated bytes, 100.00% utilization
    Type:      native double
/15/wind                 Dataset {2048/2048}
    Attribute: dt scalar
        Type:      native double
        Data:  5
    Location:  1:10912
    Links:     1
    Storage:   16384 logical bytes, 16384 allocated bytes, 100.00% utilization
    Type:      native double
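
The `Storage` lines in the listing follow from the dataset shapes: a native double takes 8 bytes, so 1024 samples occupy 8192 bytes and 2048 samples occupy 16384 bytes. The same arithmetic can be sketched in Python, here on a hypothetical scratch file (`storage_sketch.h5`) with the same layout as `weather.h5`:

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "storage_sketch.h5")

with h5py.File(path, "w") as g:
    g["/15/temperature"] = np.random.random(1024)  # float64 samples
    g["/15/wind"] = np.random.random(2048)

    # Logical size = number of elements times bytes per element.
    logical_bytes = {
        name: ds.shape[0] * ds.dtype.itemsize
        for name, ds in g["/15"].items()
    }

print(logical_bytes)
```

This only reproduces the logical size. The allocated size reported by `h5ls` can differ once chunking or compression is used, which this notebook does not do.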
