# Loading data into memory

The loading API is central to many NILMTK operations and provides a great deal of flexibility. Let's look at the ways in which we can load data from a NILMTK DataStore into memory. To see the full range of possible queries, we'll use the [iAWE data set](http://iawe.github.io) (whose HDF5 file can be downloaded [here](https://copy.com/C2sIt1UfDx1mfPlC)).

The `load` function returns a *generator* of DataFrames loaded from the DataStore according to the conditions specified. If no conditions are specified, then all data from all columns is loaded. (If you have not come across Python generators, it might be worth reading [this quick guide to Python generators](http://stackoverflow.com/a/1756156/732596).)
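To make the generator behaviour concrete, here is a minimal standard-library sketch of chunked loading. It is not NILMTK code: `load_in_chunks` and the `readings` list are hypothetical stand-ins, used only to show that each chunk is produced when it is requested rather than all at once.

```python
from itertools import islice


def load_in_chunks(rows, chunk_size):
    """Yield successive lists of at most `chunk_size` rows (hypothetical helper)."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # no rows left, so the generator is exhausted
        yield chunk


# Stand-in for samples stored on disk: (time of day, watts)
readings = [("22:21:32", 85), ("22:21:38", 85), ("22:21:44", 84),
            ("22:21:50", 85), ("22:21:56", 85)]

chunks = load_in_chunks(readings, chunk_size=2)
first_chunk = next(chunks)   # only the first two rows are materialised here
remaining = list(chunks)     # the other chunks are produced on demand
```

Calling `next(...)` on the generator, as the cells in this section do, therefore retrieves only the first chunk; a `for` loop over the generator would visit every chunk in turn.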

In [2]:
from nilmtk import DataSet

# Note: this run loads the UK-DALE file (see the output below);
# the iAWE file would be '/nilmtk/data/iawe.h5'.
iawe = DataSet('/nilmtk/data/ukdale.h5')
elec = iawe.buildings[1].elec
elec

MeterGroup(meters=
  ElecMeter(instance=2, building=1, dataset='UK-DALE', appliances=[Appliance(type='boiler', instance=1)])
  ElecMeter(instance=3, building=1, dataset='UK-DALE', appliances=[Appliance(type='solar thermal pumping station', instance=1)])
  ElecMeter(instance=4, building=1, dataset='UK-DALE', appliances=[Appliance(type='laptop computer', instance=1), Appliance(type='laptop computer', instance=3)])
  ElecMeter(instance=5, building=1, dataset='UK-DALE', appliances=[Appliance(type='washer dryer', instance=1)])
  ElecMeter(instance=6, building=1, dataset='UK-DALE', appliances=[Appliance(type='dish washer', instance=1)])
  ElecMeter(instance=7, building=1, dataset='UK-DALE', appliances=[Appliance(type='television', instance=1)])
  ElecMeter(instance=8, building=1, dataset='UK-DALE', appliances=[Appliance(type='light', instance=1), Appliance(type='light', instance=2)])
  ElecMeter(instance=9, building=1, dataset='UK-DALE', appliances=[Appliance(type='HTPC', instance=1)])
  Ele

Let us see what measurements we have for the fridge:

In [8]:
fridge = elec['fridge']
fridge.available_columns()

[('power', 'active')]

## Loading data

### Load all columns (default)

In [9]:
df = next(fridge.load())
df.head()

physical_quantity,power
type,active
2012-12-14 22:21:32+00:00,85
2012-12-14 22:21:38+00:00,85
2012-12-14 22:21:44+00:00,84
2012-12-14 22:21:50+00:00,85
2012-12-14 22:21:56+00:00,85


### Load a single column of power data

Use `fridge.power_series()`, which returns a generator of one-dimensional `pandas.Series` objects, each containing power data for the most 'sensible' AC type:

In [10]:
series = next(fridge.power_series())
series.head()

2012-12-14 22:21:32+00:00    85
2012-12-14 22:21:38+00:00    85
2012-12-14 22:21:44+00:00    84
2012-12-14 22:21:50+00:00    85
2012-12-14 22:21:56+00:00    85
Name: (power, active), dtype: float32

or, to request a particular AC type explicitly (here active power, the only type this meter provides):

In [11]:
series = next(fridge.power_series(ac_type='active'))
series.head()

2012-12-14 22:21:32+00:00    85
2012-12-14 22:21:38+00:00    85
2012-12-14 22:21:44+00:00    84
2012-12-14 22:21:50+00:00    85
2012-12-14 22:21:56+00:00    85
Name: (power, active), dtype: float32

### Specify physical_quantity or AC type

In [13]:
df = next(fridge.load(physical_quantity='power', ac_type='active'))
df.head()

physical_quantity,power
type,active
2012-12-14 22:21:32+00:00,85
2012-12-14 22:21:38+00:00,85
2012-12-14 22:21:44+00:00,84
2012-12-14 22:21:50+00:00,85
2012-12-14 22:21:56+00:00,85


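To load voltage data (note that `available_columns()` above lists only active power for this meter, and the output shown below contains power rather than voltage):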

In [15]:
df = next(fridge.load(physical_quantity='voltage'))
df.head()

physical_quantity,power
type,active
2012-12-14 22:21:32+00:00,85
2012-12-14 22:21:38+00:00,85
2012-12-14 22:21:44+00:00,84
2012-12-14 22:21:50+00:00,85
2012-12-14 22:21:56+00:00,85


To load power data of any AC type:

In [16]:
df = next(fridge.load(physical_quantity='power'))
df.head()

physical_quantity,power
type,active
2012-12-14 22:21:32+00:00,85
2012-12-14 22:21:38+00:00,85
2012-12-14 22:21:44+00:00,84
2012-12-14 22:21:50+00:00,85
2012-12-14 22:21:56+00:00,85


### Loading by specifying AC type

In [17]:
df = next(fridge.load(ac_type='active'))
df.head()

physical_quantity,power
type,active
2012-12-14 22:21:32+00:00,85
2012-12-14 22:21:38+00:00,85
2012-12-14 22:21:44+00:00,84
2012-12-14 22:21:50+00:00,85
2012-12-14 22:21:56+00:00,85
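The filtering arguments used in the cells above can be pictured as a match against the `(physical_quantity, type)` column pairs reported by `available_columns()`. The sketch below illustrates that idea only; it is not NILMTK's implementation. `select_columns` is a hypothetical helper, and the `available` list is invented for the example (the fridge meter here reports only `('power', 'active')`).

```python
def select_columns(available, physical_quantity=None, ac_type=None):
    """Return the (physical_quantity, ac_type) pairs that match the filters.

    A filter left as None matches every column (hypothetical helper).
    """
    return [
        (quantity, ac)
        for quantity, ac in available
        if (physical_quantity is None or quantity == physical_quantity)
        and (ac_type is None or ac == ac_type)
    ]


# Invented column list for illustration only
available = [("power", "active"), ("power", "reactive"), ("voltage", "")]

all_columns = select_columns(available)
power_only = select_columns(available, physical_quantity="power")
active_only = select_columns(available, ac_type="active")
```

With no filters every column is kept, which mirrors the default behaviour of `load()` described at the start of this section.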


### Loading by resampling to a specified period

In [18]:
# resample to minutely (i.e. with a sample period of 60 secs)
df = next(fridge.load(ac_type='active', sample_period=60))
df.head()

physical_quantity,power
type,active
2012-12-14 22:21:00+00:00,84.800003
2012-12-14 22:22:00+00:00,85.300003
2012-12-14 22:23:00+00:00,89.0
2012-12-14 22:24:00+00:00,91.099998
2012-12-14 22:25:00+00:00,86.222221
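As a rough illustration of what resampling to a 60-second period does, the standard-library sketch below averages the raw samples that fall in each one-minute window. It assumes mean aggregation, which is consistent with the first row of the output above but is not taken from NILMTK's source; `resample_mean` is a hypothetical helper, and `raw` holds the five fridge readings shown earlier in this section.

```python
from collections import defaultdict
from datetime import datetime, timezone


def resample_mean(samples, sample_period):
    """Average (timestamp, value) pairs over windows of `sample_period` seconds."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        epoch = int(timestamp.timestamp())
        window_start = epoch - epoch % sample_period  # floor to window boundary
        buckets[window_start].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# The five raw readings printed by the earlier cells (6-second spacing)
raw = [
    (datetime(2012, 12, 14, 22, 21, 32, tzinfo=timezone.utc), 85),
    (datetime(2012, 12, 14, 22, 21, 38, tzinfo=timezone.utc), 85),
    (datetime(2012, 12, 14, 22, 21, 44, tzinfo=timezone.utc), 84),
    (datetime(2012, 12, 14, 22, 21, 50, tzinfo=timezone.utc), 85),
    (datetime(2012, 12, 14, 22, 21, 56, tzinfo=timezone.utc), 85),
]

minutely = resample_mean(raw, sample_period=60)
```

All five readings fall in the window starting at 22:21:00, and their mean is 424 / 5 = 84.8, matching the `84.800003` shown for that minute (the trailing digits come from the `float32` storage type).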
