# Loading data into memory

The loading API is central to many NILMTK operations and provides a great deal of flexibility. Let's look at the ways in which we can load data from a NILMTK DataStore into memory. To see the full range of possible queries, we'll use the [iAWE data set](http://iawe.github.io) (whose HDF5 file can be downloaded [here](https://copy.com/C2sIt1UfDx1mfPlC)).

The `load` function returns a *generator* of DataFrames loaded from the DataStore according to the conditions specified. If no conditions are specified, all data from all columns is loaded. (If you have not come across Python generators, it might be worth reading [this quick guide to Python generators](http://stackoverflow.com/a/1756156/732596).)
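To make the generator behaviour concrete, here is a minimal sketch that uses only the standard library. The `load_chunks` function and the readings are hypothetical stand-ins and are not part of nilmtk; they only illustrate how a generator hands out data one chunk at a time.

```python
from typing import Iterator, List


def load_chunks(rows: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Hypothetical loader: lazily yield `rows` in chunks of `chunk_size`."""
    for start in range(0, len(rows), chunk_size):
        # Nothing is produced until the caller asks for the next chunk.
        yield rows[start:start + chunk_size]


# Made-up readings, used only for illustration.
readings = [0.17, 0.17, 0.18, 0.18, 0.18, 0.19, 0.20]

# next() pulls only the first chunk from the generator.
first_chunk = next(load_chunks(readings, chunk_size=3))

# list() (or a for loop) consumes every chunk in turn.
all_chunks = list(load_chunks(readings, chunk_size=3))
```

By the same logic, a call such as `next(fridge.load())` retrieves only the first DataFrame from the generator; if the data spans several chunks, iterate over the generator to see all of them.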

**NOTE**: If you are on Windows, remember to escape the backslashes, use forward slashes, or use raw strings when passing paths in Python. Any one of the following would work:

```python
iawe = DataSet('c:\\data\\iawe.h5')
iawe = DataSet('c:/data/iawe.h5')
iawe = DataSet(r'c:\data\iawe.h5')
```
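The following standard-library sketch shows why the three spellings are interchangeable and what can go wrong otherwise. The path `c:\temp\new.h5` is a hypothetical example chosen because it contains the escape sequences `\t` and `\n`.

```python
from pathlib import PureWindowsPath

escaped = 'c:\\data\\iawe.h5'   # backslashes escaped
forward = 'c:/data/iawe.h5'     # forward slashes
raw = r'c:\data\iawe.h5'        # raw string

# The escaped and raw spellings produce the identical string.
same_string = escaped == raw

# PureWindowsPath treats forward slashes and backslashes as the same separator.
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)

# Without escaping, \t and \n are read as tab and newline characters,
# so this hypothetical path is silently corrupted.
broken = 'c:\temp\new.h5'
has_control_chars = '\t' in broken and '\n' in broken
```

`PureWindowsPath` only manipulates the path text, so the comparison behaves the same on any operating system.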

In [1]:
from nilmtk import DataSet

iawe = DataSet('/data/iawe.h5')
elec = iawe.buildings[1].elec
elec

MeterGroup(meters=
  ElecMeter(instance=1, building=1, dataset='iAWE', site_meter, appliances=[])
  ElecMeter(instance=2, building=1, dataset='iAWE', site_meter, appliances=[])
  ElecMeter(instance=3, building=1, dataset='iAWE', appliances=[Appliance(type='fridge', instance=1)])
  ElecMeter(instance=4, building=1, dataset='iAWE', appliances=[Appliance(type='air conditioner', instance=1)])
  ElecMeter(instance=5, building=1, dataset='iAWE', appliances=[Appliance(type='air conditioner', instance=2)])
  ElecMeter(instance=6, building=1, dataset='iAWE', appliances=[Appliance(type='washing machine', instance=1)])
  ElecMeter(instance=7, building=1, dataset='iAWE', appliances=[Appliance(type='computer', instance=1)])
  ElecMeter(instance=8, building=1, dataset='iAWE', appliances=[Appliance(type='clothes iron', instance=1)])
  ElecMeter(instance=9, building=1, dataset='iAWE', appliances=[Appliance(type='unknown', instance=1)])
  ElecMeter(instance=10, building=1, dataset='iAWE', appliances=[A

Let us see what measurements we have for the fridge:

In [2]:
fridge = elec['fridge']
fridge.available_columns()

[('current', None),
 ('power', 'active'),
 ('frequency', None),
 ('power factor', None),
 ('power', 'apparent'),
 ('power', 'reactive'),
 ('voltage', None)]
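Each column is identified by a `(physical_quantity, type)` pair, and the `physical_quantity` and `ac_type` arguments used with `load` later in this section select among these pairs. The sketch below is not nilmtk code: `select_columns` is a hypothetical helper that only illustrates the filtering idea on the list printed above.

```python
from typing import List, Optional, Tuple

Column = Tuple[str, Optional[str]]

# Column labels as reported for the fridge meter above.
AVAILABLE: List[Column] = [
    ('current', None),
    ('power', 'active'),
    ('frequency', None),
    ('power factor', None),
    ('power', 'apparent'),
    ('power', 'reactive'),
    ('voltage', None),
]


def select_columns(
    columns: List[Column],
    physical_quantity: Optional[str] = None,
    ac_type: Optional[str] = None,
) -> List[Column]:
    """Hypothetical helper: keep columns matching every filter that is set."""
    return [
        (quantity, ac)
        for quantity, ac in columns
        if (physical_quantity is None or quantity == physical_quantity)
        and (ac_type is None or ac == ac_type)
    ]


power_columns = select_columns(AVAILABLE, physical_quantity='power')
active_columns = select_columns(AVAILABLE, ac_type='active')
voltage_columns = select_columns(AVAILABLE, physical_quantity='voltage')
```

With no filters set, the helper returns every column, which mirrors the "load all columns" default described earlier.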

## Loading data

### Load all columns (default)

In [3]:
df = next(fridge.load())
df.head()

| physical_quantity | current | power | frequency | power | power | voltage |
| --- | --- | --- | --- | --- | --- | --- |
| type | | active | | apparent | reactive | |
| 2013-07-13 05:30:00+05:30 | 0.011 | 0.166925 | 50.157169 | 2.660094 | 2.652679 | 241.49472 |
| 2013-07-13 05:31:00+05:30 | 0.010981 | 0.169385 | 50.14846 | 2.647615 | 2.640115 | 242.189423 |
| 2013-07-13 05:32:00+05:30 | 0.011 | 0.177887 | 50.143394 | 2.672245 | 2.666358 | 243.750381 |
| 2013-07-13 05:33:00+05:30 | 0.010982 | 0.175929 | 50.095535 | 2.685518 | 2.677607 | 245.13179 |
| 2013-07-13 05:34:00+05:30 | 0.010978 | 0.177044 | 50.099998 | 2.694733 | 2.6882 | 246.001328 |


### Load a single column of power data

Use `fridge.power_series()`, which returns a generator of one-dimensional `pandas.Series` objects, each containing power data for the most 'sensible' AC type:

In [4]:
series = next(fridge.power_series())
series.head()

2013-07-13 05:30:00+05:30    0.166925
2013-07-13 05:31:00+05:30    0.169385
2013-07-13 05:32:00+05:30    0.177887
2013-07-13 05:33:00+05:30    0.175929
2013-07-13 05:34:00+05:30    0.177044
Name: (power, active), dtype: float32

or, to get reactive power:

In [5]:
series = next(fridge.power_series(ac_type='reactive'))
series.head()

2013-07-13 05:30:00+05:30    2.652679
2013-07-13 05:31:00+05:30    2.640115
2013-07-13 05:32:00+05:30    2.666358
2013-07-13 05:33:00+05:30    2.677607
2013-07-13 05:34:00+05:30    2.688200
Name: (power, reactive), dtype: float32
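One way to think about the 'sensible' default is as a preference order over the available AC types. The sketch below assumes the order active, apparent, reactive. That order is consistent with the output above (active power is returned when no `ac_type` is given), but it is an assumption here, and `pick_ac_type` is a hypothetical helper rather than nilmtk's own selection logic.

```python
from typing import Iterable, Optional

# Assumed preference order; an illustration, not nilmtk's documented rule.
PREFERENCE = ('active', 'apparent', 'reactive')


def pick_ac_type(available: Iterable[str], requested: Optional[str] = None) -> str:
    """Hypothetical helper: return the requested AC type, else the most preferred."""
    available = set(available)
    if requested is not None:
        if requested not in available:
            raise KeyError(f"AC type {requested!r} is not available")
        return requested
    for ac_type in PREFERENCE:
        if ac_type in available:
            return ac_type
    raise KeyError("no power columns available")


fridge_ac_types = ['active', 'apparent', 'reactive']
default_choice = pick_ac_type(fridge_ac_types)
explicit_choice = pick_ac_type(fridge_ac_types, requested='reactive')
fallback_choice = pick_ac_type(['apparent', 'reactive'])
```

Under this assumption, a meter that records only apparent and reactive power would fall back to apparent power.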

### Specify physical_quantity or AC type

In [6]:
df = next(fridge.load(physical_quantity='power', ac_type='reactive'))
df.head()

| physical_quantity | power |
| --- | --- |
| type | reactive |
| 2013-07-13 05:30:00+05:30 | 2.652679 |
| 2013-07-13 05:31:00+05:30 | 2.640115 |
| 2013-07-13 05:32:00+05:30 | 2.666358 |
| 2013-07-13 05:33:00+05:30 | 2.677607 |
| 2013-07-13 05:34:00+05:30 | 2.6882 |


To load voltage data:

In [7]:
df = next(fridge.load(physical_quantity='voltage'))
df.head()

| physical_quantity | voltage |
| --- | --- |
| type | |
| 2013-07-13 05:30:00+05:30 | 241.49472 |
| 2013-07-13 05:31:00+05:30 | 242.189423 |
| 2013-07-13 05:32:00+05:30 | 243.750381 |
| 2013-07-13 05:33:00+05:30 | 245.13179 |
| 2013-07-13 05:34:00+05:30 | 246.001328 |


To load all available power columns (active, apparent and reactive):

In [8]:
df = next(fridge.load(physical_quantity='power'))
df.head()

| physical_quantity | power | power | power |
| --- | --- | --- | --- |
| type | active | apparent | reactive |
| 2013-07-13 05:30:00+05:30 | 0.166925 | 2.660094 | 2.652679 |
| 2013-07-13 05:31:00+05:30 | 0.169385 | 2.647615 | 2.640115 |
| 2013-07-13 05:32:00+05:30 | 0.177887 | 2.672245 | 2.666358 |
| 2013-07-13 05:33:00+05:30 | 0.175929 | 2.685518 | 2.677607 |
| 2013-07-13 05:34:00+05:30 | 0.177044 | 2.694733 | 2.6882 |


### Loading by specifying AC type

In [9]:
df = next(fridge.load(ac_type='active'))
df.head()

| physical_quantity | power |
| --- | --- |
| type | active |
| 2013-07-13 05:30:00+05:30 | 0.166925 |
| 2013-07-13 05:31:00+05:30 | 0.169385 |
| 2013-07-13 05:32:00+05:30 | 0.177887 |
| 2013-07-13 05:33:00+05:30 | 0.175929 |
| 2013-07-13 05:34:00+05:30 | 0.177044 |


### Loading by resampling to a specified period

In [10]:
# Resample to one-minute data (i.e. a sample period of 60 seconds)
df = next(fridge.load(ac_type='active', sample_period=60))
df.head()

| physical_quantity | power |
| --- | --- |
| type | active |
| 2013-07-13 05:30:00+05:30 | 0.166925 |
| 2013-07-13 05:31:00+05:30 | 0.169385 |
| 2013-07-13 05:32:00+05:30 | 0.177887 |
| 2013-07-13 05:33:00+05:30 | 0.175929 |
| 2013-07-13 05:34:00+05:30 | 0.177044 |
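To illustrate what resampling to a sample period means, the sketch below averages finer-grained readings into 60-second bins using only the standard library. It assumes mean aggregation, which may differ from what nilmtk does internally; `resample_mean` is a hypothetical helper and the readings are made up rather than taken from iAWE.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean


def resample_mean(samples, sample_period):
    """Hypothetical helper: average (timestamp, value) pairs per period.

    `sample_period` is in whole seconds. Each timestamp is floored to the
    start of its period and the values within each period are averaged.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        epoch = int(timestamp.timestamp())
        buckets[epoch - epoch % sample_period].append(value)
    return [
        (datetime.fromtimestamp(start, tz=samples[0][0].tzinfo), mean(values))
        for start, values in sorted(buckets.items())
    ]


IST = timezone(timedelta(hours=5, minutes=30))
start = datetime(2013, 7, 13, 5, 30, tzinfo=IST)

# Made-up readings every 20 seconds (not taken from iAWE).
samples = [(start + timedelta(seconds=20 * i), float(i)) for i in range(6)]

minutely = resample_mean(samples, sample_period=60)
```

Here six readings at 20-second spacing collapse into two one-minute rows. In the iAWE output above the values are unchanged by `sample_period=60`, since the rows shown are already one minute apart.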
