# Overview of the UK-DALE 2017 Dataset

NILMTK uses an open file format based on the HDF5 binary file format to store both the power data and the metadata. The very first step when using NILMTK is to convert your dataset to the NILMTK HDF5 file format.
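
If you start from the raw UK-DALE release rather than a ready-made HDF5 file, the conversion step could look like the sketch below. It assumes NILMTK's `convert_ukdale` converter from `nilmtk.dataset_converters`. The helper name `convert_raw_ukdale` and both paths are hypothetical and chosen only for illustration.

```python
import os


def convert_raw_ukdale(raw_dir, output_h5="ukdale.h5"):
    """Convert a raw UK-DALE directory to NILMTK's HDF5 format.

    Both arguments are hypothetical paths used for illustration.
    """
    if not os.path.isdir(raw_dir):
        raise FileNotFoundError(
            "Raw UK-DALE directory not found: {}".format(raw_dir)
        )

    # Imported lazily so the path check above works even without NILMTK installed.
    from nilmtk.dataset_converters import convert_ukdale

    convert_ukdale(raw_dir, output_h5)
    return output_h5
```

The data used in this notebook is already in h5 format, so this step would only be needed for a raw download.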

UK-DALE stands for UK Domestic Appliance-Level Electricity. This notebook uses the full UK-DALE disaggregated data in HDF5 (h5) format, which is about 5 GB in raw size.

Created at: **11/4/2022**

**References**
- [General Info for UK-DALE 2017](https://data.ukedc.rl.ac.uk/browse/edc/efficiency/residential/EnergyConsumption/Domestic/UK-DALE-2017/ReadMe_DALE-2017.html)
- [Download Raw Dataset at ukedc](https://data.ukedc.rl.ac.uk/browse/edc/efficiency/residential/EnergyConsumption/Domestic/UK-DALE-2017/UK-DALE-FULL-disaggregated)
- [More NILMTK APIs](http://nilmtk.github.io/nilmtk/master/index.html)
- HDF5 - Hierarchical Data Formats
  - [What is HDF5 format](https://en.wikipedia.org/wiki/Hierarchical_Data_Format)
  - [HDF5 for Python](https://www.h5py.org)
- Documentation
  - [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html)
  - [Markdown Cheat Sheet -MD](https://www.markdownguide.org/cheat-sheet/)
  - [pandas 0.25.3 Documentation](https://pandas.pydata.org/pandas-docs/version/0.25.3/)
  - [pandas 0.25.3 Series Documentation](https://pandas.pydata.org/pandas-docs/version/0.25.3/reference/series.html)
  - [Python 3.6.10 Documentation](https://docs.python.org/release/3.6.10/)


# Initialization for Python and NILMTK


In [1]:
import os
import platform
import sys

import pandas as pd

import nilmtk as ntk

## Get Version Information

In [2]:
print("Python Version\n{}".format(sys.version))

print("\nPandas Version\n{}".format(pd.__version__))

print("\nNILMTK Version\n{}".format(ntk.__version__))

print("\nOS Version\n{}".format(platform.platform()))

Python Version
3.6.10 |Anaconda, Inc.| (default, Mar 23 2020, 23:13:11) 
[GCC 7.3.0]

Pandas Version
0.25.3

NILMTK Version
0.4.0.dev1+git.303d45b

OS Version
Linux-5.13.0-44-generic-x86_64-with-debian-buster-sid


## Define Global, Constant Variables and Functions

In [4]:
FILENAME = "ukdale.h5"

## Open UK-DALE 2017 HDF5 file in NILMTK

In [5]:
ds = ntk.DataSet(FILENAME)

raw_file_size = os.path.getsize(FILENAME)
print("Raw h5 file size is ~{:.2f}GB".format(raw_file_size / 1024 / 1024 / 1024)) 

OSError: No such file as ukdale.h5
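
The `OSError` above indicates that `ukdale.h5` was not found relative to the working directory, so `ds` was never created and the cells that follow cannot run until the path is fixed. A minimal sketch of checking the path before opening the dataset is shown here. The helper name `h5_size_gib` is hypothetical.

```python
import os


def h5_size_gib(path):
    """Return the file size in GiB, or None if the file does not exist."""
    if not os.path.isfile(path):
        return None
    return os.path.getsize(path) / 1024 ** 3


# Same relative path as in the notebook; adjust it to where the file actually lives.
FILENAME = "ukdale.h5"

size = h5_size_gib(FILENAME)
if size is None:
    print("Cannot find {!r} in {}".format(FILENAME, os.getcwd()))
else:
    print("Raw h5 file size is ~{:.2f}GB".format(size))
```

Only when the check passes is it worth calling `ntk.DataSet(FILENAME)`.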

In [None]:
print("ds data type is {}".format(type(ds)))

# Exploring the Dataset

This section takes a quick look at what is inside the UK-DALE dataset object.

Reference:
- [Python 3.6 Doc - collections.OrderedDict](https://docs.python.org/3.6/library/collections.html#ordereddict-objects)
- [Python 3.6 Doc - dict](https://docs.python.org/3.6/tutorial/datastructures.html#dictionaries)

## Dataset.Metadata 
A lot of metadata is associated with the dataset, including information about the models of metering device the authors used to record UK-DALE.

In [None]:
print(type(ds.metadata))

### Print the metadata to see its keys and values

In [None]:
ntk.utils.print_dict(ds.metadata)

### Get "ds.buildings", a 'collections.OrderedDict'

In [None]:
ntk.utils.print_dict(ds.buildings)

In [None]:
print(type(ds.buildings))

In [None]:
len(ds.buildings)
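
Because `ds.buildings` is an `OrderedDict`, an expression such as `ds.buildings[1]` is a key lookup, not a positional index. The sketch below uses a stand-in dictionary with placeholder values instead of real building objects. It assumes the keys are integer building numbers starting at 1, as the `ds.buildings[1]` lookups in this notebook suggest.

```python
from collections import OrderedDict

# Stand-in for ds.buildings: integer building numbers mapped to placeholder strings.
# The number of entries is illustrative only.
buildings = OrderedDict(
    (number, "building {}".format(number)) for number in range(1, 6)
)

print(len(buildings))          # how many buildings the mapping holds
print(list(buildings.keys()))  # the building numbers, in insertion order
print(buildings[1])            # key lookup: the building numbered 1

# There is no key 0, so a positional mindset would fail here.
print(buildings.get(0))

# Iterate over building numbers and their values together.
for number, building in buildings.items():
    print(number, building)
```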

## Dataset.Building

Each building has a little bit of metadata associated with it (there isn't much building-specific metadata in UK-DALE):

In [None]:
# Select the building (house) by its number
mybuilding = ds.buildings[1]
mybuilding

In [None]:
type(ds.buildings[1])

### Access Building 1 metadata

In [None]:
type(mybuilding.metadata)

In [None]:
# Get all keys from dictionary
mybuilding.metadata.keys()

In [None]:
ds.buildings[1].metadata['energy_improvements']

#### Extract specified key - timeframe

In [None]:
# Get the value of 'timeframe' by specifying the key in dictionary
dict_metadata_timeframe = ds.buildings[1].metadata['timeframe']
print(dict_metadata_timeframe)

In [None]:
# Extract the value of 'start' in 'timeframe'
dict_timeframe_start = ds.buildings[1].metadata['timeframe']['start']
print(dict_timeframe_start)
print("")
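
The chained lookups above raise `KeyError` if a key is missing. A minimal sketch of a more defensive lookup follows. It assumes the timeframe values are ISO 8601 strings. The timestamps are made up for illustration and are not taken from UK-DALE, and the helpers `get_nested` and `parse_naive` are hypothetical.

```python
from datetime import datetime

# Placeholder metadata shaped like the nested dictionary accessed above.
metadata = {
    "timeframe": {
        "start": "2000-01-01T00:00:00+00:00",
        "end": "2000-01-08T12:00:00+00:00",
    }
}


def get_nested(mapping, *keys, default=None):
    """Walk nested dictionaries, returning `default` if any key is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def parse_naive(timestamp):
    """Parse the date and time part of an ISO 8601 string, ignoring the UTC offset."""
    return datetime.strptime(timestamp[:19], "%Y-%m-%dT%H:%M:%S")


start = parse_naive(get_nested(metadata, "timeframe", "start"))
end = parse_naive(get_nested(metadata, "timeframe", "end"))
print("Recording span: {}".format(end - start))

# A missing key falls back to the default instead of raising KeyError.
print(get_nested(metadata, "timeframe", "missing_key", default="n/a"))
```

Dropping the offset is only safe when both timestamps share the same offset. `pd.to_datetime` is an alternative that keeps the time zone.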

#### Extract specified key - rooms

In [None]:
mybuilding.metadata['rooms']

In [None]:
mybuilding.metadata['rooms'][1]
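
Unlike `ds.buildings`, which is keyed by building number, `metadata['rooms']` is indexed by position, so `[1]` returns the second entry. The sketch below assumes the rooms are stored as a list of dictionaries with `name` and `instance` keys. The entries are placeholders, not the actual UK-DALE rooms.

```python
# Placeholder list shaped like the assumed structure of metadata['rooms'].
rooms = [
    {"name": "kitchen", "instance": 1},
    {"name": "bedroom", "instance": 1},
    {"name": "bedroom", "instance": 2},
]

# List index 1 is the second entry, not a room numbered 1.
print(rooms[1])

# Filter the rooms by name.
bedrooms = [room for room in rooms if room["name"] == "bedroom"]
print(bedrooms)

# Collect the distinct room names.
names = sorted({room["name"] for room in rooms})
print(names)
```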

### Building identifier

In [None]:
mybuilding.identifier

In [None]:
type(mybuilding.identifier)

### Show building metadata

In [None]:
mybuilding.metadata

# Quick view of Building.MeterGroup

The building's MeterGroup is accessed through the Dataset class as the building's elec attribute. It groups the building's appliances into a single "nilmtk.metergroup.MeterGroup" object, which also shows details of the individual appliances in the building or house.

In [None]:
house_data = ds.buildings[1].elec

type(house_data)

In [None]:
# Each building has an elec attribute which is a MeterGroup object (much more about those soon!)

house_data

In [None]:
for item in house_data.appliances:
    print(item)
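
Printing every appliance can produce a long list. One way to summarise it is to count appliances per type, as in the sketch below. It assumes each appliance exposes an `identifier` with `type` and `instance` fields. The named tuples are stand-ins so the example runs without the dataset, and the helper `count_appliance_types` is hypothetical.

```python
from collections import Counter, namedtuple

# Stand-ins that mimic the assumed shape of an appliance and its identifier.
FakeID = namedtuple("FakeID", ["type", "instance"])
FakeAppliance = namedtuple("FakeAppliance", ["identifier"])


def count_appliance_types(appliances):
    """Count appliances per type, assuming each one has identifier.type."""
    return Counter(appliance.identifier.type for appliance in appliances)


# Placeholder appliances, not the actual contents of UK-DALE building 1.
sample = [
    FakeAppliance(FakeID("light", 1)),
    FakeAppliance(FakeID("light", 2)),
    FakeAppliance(FakeID("kettle", 1)),
]

for appliance_type, count in count_appliance_types(sample).most_common():
    print(appliance_type, count)
```

With the real dataset loaded, the same helper could be called as `count_appliance_types(house_data.appliances)`.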

In [None]:
ds.buildings[1].elec.appliances[0]

In [None]:
house = ds.buildings[1].elec

In [None]:
house.select_using_appliances(type=['light', 'kettle', 'toaster'])

In [None]:
house.select_using_appliances(room='bathroom')

In [None]:
house.select_using_appliances(category='lighting')
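
The three selections above can be compared side by side. The sketch below runs each set of criteria and records how many meters the result contains. It assumes the object returned by `select_using_appliances` has a `meters` list. The helper name `summarise_selections` is hypothetical.

```python
def summarise_selections(elec, criteria_list):
    """Return (criteria, meter count) pairs, one per selection.

    `elec` is expected to behave like the MeterGroup in ds.buildings[1].elec.
    """
    summary = []
    for criteria in criteria_list:
        selected = elec.select_using_appliances(**criteria)
        summary.append((criteria, len(selected.meters)))
    return summary


# The same criteria used in the cells above.
criteria_list = [
    {"type": ["light", "kettle", "toaster"]},
    {"room": "bathroom"},
    {"category": "lighting"},
]
```

With the `house` object defined above, `summarise_selections(house, criteria_list)` would give one count per selection.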

In [None]:
print(type(ds.buildings[1].elec))

In [None]:
house['toaster']

# Workout & Findings

After completing this notebook, prepare a simple presentation slide that summarises what you learned and any new findings.