# Loading data into memory

The loading API is central to many nilmtk operations and provides a great deal of flexibility. Let's look at the ways in which we can load data from a NILMTK DataStore into memory. To see the full range of possible queries, we'll use the [iAWE data set](http://iawe.github.io) (whose HDF5 file can be downloaded [here](https://copy.com/C2sIt1UfDx1mfPlC)).

The `load` function returns a *generator* of DataFrames loaded from the DataStore according to the conditions specified. If no conditions are specified, all data from all columns is loaded. (If you have not come across Python generators, it might be worth reading [this quick guide to Python generators](http://stackoverflow.com/a/1756156/732596).)
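
As a rough illustration of how a generator of chunks is consumed, here is a minimal plain-Python sketch. `load_chunks` is a hypothetical stand-in, not part of nilmtk, and the readings are made up; it only shows the two usual patterns: taking the first chunk with `next()`, and looping over every chunk so that only one is held in memory at a time.

```python
def load_chunks(values, chunk_size):
    """Hypothetical stand-in for a chunked loader (not a nilmtk function).

    Yields successive lists of at most `chunk_size` items.
    """
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


# Made-up power readings, in watts.
readings = [6.0, 6.0, 6.0, 50.0, 180.0, 6.0, 6.0]

# Pattern 1: take only the first chunk.
first_chunk = next(load_chunks(readings, chunk_size=3))

# Pattern 2: iterate over every chunk, accumulating a running mean
# without ever holding all chunks in memory at once.
total = 0.0
count = 0
for chunk in load_chunks(readings, chunk_size=3):
    total += sum(chunk)
    count += len(chunk)
mean_power = total / count
```

The examples in this guide use the first pattern, so each one shows only the first chunk of data.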

In [12]:
from nilmtk import DataSet

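# Path to a local HDF5 file; this particular run points at a REDD file,
# so the meters listed below report dataset='REDD'.
iawe = DataSet('/home/shifona/Downloads/mini_project/REDD/redd.h5')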
elec = iawe.buildings[1].elec
elec

MeterGroup(meters=
  ElecMeter(instance=1, building=1, dataset='REDD', site_meter, appliances=[])
  ElecMeter(instance=2, building=1, dataset='REDD', site_meter, appliances=[])
  ElecMeter(instance=5, building=1, dataset='REDD', appliances=[Appliance(type='fridge', instance=1)])
  ElecMeter(instance=6, building=1, dataset='REDD', appliances=[Appliance(type='dish washer', instance=1)])
  ElecMeter(instance=7, building=1, dataset='REDD', appliances=[Appliance(type='sockets', instance=1)])
  ElecMeter(instance=8, building=1, dataset='REDD', appliances=[Appliance(type='sockets', instance=2)])
  ElecMeter(instance=9, building=1, dataset='REDD', appliances=[Appliance(type='light', instance=1)])
  ElecMeter(instance=11, building=1, dataset='REDD', appliances=[Appliance(type='microwave', instance=1)])
  ElecMeter(instance=12, building=1, dataset='REDD', appliances=[Appliance(type='unknown', instance=1)])
  ElecMeter(instance=13, building=1, dataset='REDD', appliances=[Appliance(type='electric 

Let us see what measurements we have for the fridge:

In [3]:
fridge = elec['fridge']
fridge.available_columns()

[('power', 'active')]

## Loading data

### Load all columns (default)

In [4]:
df = next(fridge.load())
df.head()

physical_quantity,power
type,active
2011-04-18 09:22:13-04:00,6.0
2011-04-18 09:22:16-04:00,6.0
2011-04-18 09:22:20-04:00,6.0
2011-04-18 09:22:23-04:00,6.0
2011-04-18 09:22:26-04:00,6.0


### Load a single column of power data

Use `fridge.power_series()`, which returns a generator of one-dimensional `pandas.Series` objects, each containing power data for the most 'sensible' AC type:

In [5]:
series = next(fridge.power_series())
series.head()

2011-04-18 09:22:13-04:00    6.0
2011-04-18 09:22:16-04:00    6.0
2011-04-18 09:22:20-04:00    6.0
2011-04-18 09:22:23-04:00    6.0
2011-04-18 09:22:26-04:00    6.0
Name: (power, active), dtype: float32

Or, to get reactive power:

In [7]:
# Requires a meter that records reactive power (check `available_columns()`).
#series = next(fridge.power_series(ac_type='reactive'))
#series.head()

### Specify physical_quantity or AC type

In [5]:
#df = next(fridge.load(physical_quantity='power', ac_type='reactive'))
#df.head()

physical_quantity,power
type,reactive
2013-06-07 05:30:00+05:30,2.483
2013-06-07 05:30:01+05:30,2.547
2013-06-07 05:30:02+05:30,2.48
2013-06-07 05:30:03+05:30,2.444
2013-06-07 05:30:04+05:30,2.51


To load voltage data:

In [7]:
df = next(fridge.load(physical_quantity='voltage'))
df.head()

physical_quantity,voltage
type,
2013-06-07 05:30:00+05:30,235.070007
2013-06-07 05:30:01+05:30,235.020004
2013-06-07 05:30:02+05:30,234.979996
2013-06-07 05:30:03+05:30,235.0
2013-06-07 05:30:04+05:30,234.949997


To load every power column, whatever its AC type:

In [9]:
df = next(fridge.load(physical_quantity='power'))
df.head()

physical_quantity,power,power,power
type,apparent,active,reactive
2013-06-07 05:30:00+05:30,2.486,0.111,2.483
2013-06-07 05:30:01+05:30,2.555,0.2,2.547
2013-06-07 05:30:02+05:30,2.485,0.152,2.48
2013-06-07 05:30:03+05:30,2.449,0.159,2.444
2013-06-07 05:30:04+05:30,2.519,0.215,2.51
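
The columns of the loaded DataFrame form a two-level index of `(physical_quantity, type)`. The sketch below rebuilds a small frame with that layout by hand, using the first two rows of the output above, to show how individual columns can be picked out with plain pandas indexing. The frame is constructed manually for illustration and is not loaded through nilmtk.

```python
import pandas as pd

# Hand-built frame with the same two-level column layout as the output above.
columns = pd.MultiIndex.from_tuples(
    [("power", "apparent"), ("power", "active"), ("power", "reactive")],
    names=["physical_quantity", "type"],
)
df = pd.DataFrame(
    [[2.486, 0.111, 2.483], [2.555, 0.200, 2.547]],
    columns=columns,
)

# A full (physical_quantity, type) tuple selects one column as a Series.
active = df[("power", "active")]

# The first level alone selects a sub-frame whose columns are the AC types.
power = df["power"]
```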


### Loading by specifying AC type

In [10]:
df = next(fridge.load(ac_type='active'))
df.head()

physical_quantity,power
type,active
2013-06-07 05:30:00+05:30,0.111
2013-06-07 05:30:01+05:30,0.2
2013-06-07 05:30:02+05:30,0.152
2013-06-07 05:30:03+05:30,0.159
2013-06-07 05:30:04+05:30,0.215


### Loading by resampling to a specified period

In [10]:
# Resample to one-minute intervals (i.e. a sample period of 60 seconds)
df = next(fridge.load(ac_type='active', sample_period=60))
df.head()

physical_quantity,power
type,active
2011-04-18 09:22:00-04:00,
2011-04-18 09:23:00-04:00,6.0
2011-04-18 09:24:00-04:00,6.0
2011-04-18 09:25:00-04:00,6.0
2011-04-18 09:26:00-04:00,6.0
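
To make the idea of a 60-second sample period concrete, here is a minimal sketch of downsampling in plain pandas. It is not nilmtk's implementation: the readings are made up, and averaging each bin is an assumption chosen for illustration, since this guide does not state which aggregation `sample_period` applies.

```python
import pandas as pd

# Made-up readings every 20 seconds over two minutes.
index = pd.date_range(
    "2011-04-18 09:22:00", periods=6, freq=pd.Timedelta(seconds=20)
)
series = pd.Series([6.0, 6.0, 9.0, 180.0, 190.0, 200.0], index=index)

# Downsample to one value per 60 seconds by averaging each bin
# (the mean is an assumption made for this sketch).
per_minute = series.resample(pd.Timedelta(seconds=60)).mean()
```

Each output row is labelled with the start of its one-minute bin, so the six readings collapse to two rows, one for 09:22 and one for 09:23.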
