Proposal for a 'simple' NILM Metadata schema #16

JackKelly · 2015-01-14T10:09:32Z

NILM Metadata tries to make it possible to capture pretty much any conceivable scenario. But, as more datasets become available, it appears that a large proportion of them could be described using a simpler metadata schema. It would be great to discuss the design of a "Simple NILM Metadata" schema which could exist alongside "NILM Metadata". CSV is perhaps even easier to read than YAML in Matlab, Java etc., so it might be nice if we can use CSV. We'd have a checklist to help people decide whether they require the full expressive power of "NILM Metadata" or whether they can get by with "Simple NILM Metadata".

The simple schema could also be used for adding metadata to the output of disaggregation algorithms (hence helping to simplify NILMTK disaggregation algorithm implementation); and for describing the training dataset and the responses for any future NILM competition or validation tool (I'm working with a group of MSc students who aim to produce a proof-of-concept NILM validation tool by the end of this term; here's the project spec.)

So, here's an initial proposal, using REDD as an example:

building1_labels.csv

This looks a little like labels.dat in the REDD format except that:

- we use a comma as a separator (which is standard for CSV, and also allows us to use spaces in strings without using quotes)
- we use the file suffix csv not dat (so that spreadsheet applications know how to open the file)
- we use our NILM Metadata controlled vocabulary for appliance names
- we give an instance number for each appliance
- if there are multiple appliances measured by a meter then separate them by a semicolon e.g. 6,television#1;light#1
- we could optionally use a third column to specify the submeter_of property. If this is not specified then we assume that anything that isn't a site meter is downstream of all site meters, and that all site meters should be summed to get the total whole-house power demand. Or maybe we should keep "Simple NILM Metadata" as simple as possible and say that any non-standard wiring hierarchy simply cannot be expressed using "Simple NILM Metadata"?

meter instance,label

1,site meter
2,site meter
3,electric oven#1
4,electric oven#1
5,fridge#1
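To give a feel for how little code a labels file like this would need, here is a minimal Python sketch that parses it with the standard csv module. The function names (`parse_labels`, `site_meter_ids`) and the returned structure are hypothetical, not part of NILM Metadata or NILMTK. Row 6 is borrowed from the semicolon example in the list above, and treating a missing instance number as 1 is an assumption.

```python
import csv
import io

# Example building1_labels.csv content from the proposal; row 6 is taken
# from the semicolon example (one meter measuring two appliances).
LABELS_CSV = """meter instance,label
1,site meter
2,site meter
3,electric oven#1
4,electric oven#1
5,fridge#1
6,television#1;light#1
"""


def parse_labels(text):
    """Map each meter instance to its site-meter flag and appliance list."""
    meters = {}
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        meter, label = int(row[0]), row[1].strip()
        appliances = []
        if label != "site meter":
            for item in label.split(";"):
                name, _, instance = item.strip().partition("#")
                # Assumption: an appliance without "#n" is instance 1.
                appliances.append((name, int(instance) if instance else 1))
        meters[meter] = {
            "site_meter": label == "site meter",
            "appliances": appliances,
        }
    return meters


def site_meter_ids(meters):
    """Meters whose readings are summed for whole-house demand by default."""
    return sorted(m for m, info in meters.items() if info["site_meter"])


labels = parse_labels(LABELS_CSV)
```

With no submeter_of column, every meter not returned by `site_meter_ids` would be treated as downstream of all site meters, as described above.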

meter_devices.csv

We also need to specify what is measured in each data file. In NILM Metadata this is done in meter_devices.yaml. In "Simple NILM Metadata" this could be done in a meter_devices.csv file. The file would contain three columns; each row would be a <meter device name>,<key>,<value> tuple, e.g.:

meter device name,key,value

site meters,sample period,1
site meters,measurements,active power;apparent power
submeters,sample period,3
submeters,measurements,active power
submeters,model,eMonitor
submeters,manufacturer,Powerhouse Dynamics
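As a rough sketch of how the key/value rows could be loaded, the Python below folds them into one dictionary per meter device. The function name `parse_meter_devices` is hypothetical. Splitting measurements on semicolons follows the convention used for labels; converting sample period to a number is an assumption, and the proposal does not state its unit (presumably seconds).

```python
import csv
import io

# Example meter_devices.csv content from the proposal.
METER_DEVICES_CSV = """meter device name,key,value
site meters,sample period,1
site meters,measurements,active power;apparent power
submeters,sample period,3
submeters,measurements,active power
submeters,model,eMonitor
submeters,manufacturer,Powerhouse Dynamics
"""


def parse_meter_devices(text):
    """Fold (device, key, value) rows into {device: {key: value}}."""
    devices = {}
    for row in csv.DictReader(io.StringIO(text)):
        key, value = row["key"], row["value"]
        if key == "measurements":
            # Several measurements share one cell, separated by semicolons.
            value = [m.strip() for m in value.split(";")]
        elif key == "sample period":
            # Assumption: numeric; the unit is not stated in the proposal.
            value = float(value)
        devices.setdefault(row["meter device name"], {})[key] = value
    return devices


devices = parse_meter_devices(METER_DEVICES_CSV)
```

Here `devices["submeters"]` would hold the sample period, measurements, model and manufacturer for every meter that is not labelled site meter.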

The assumption would be that all meters with the label site meter would take attributes from site meters and all other meters would take attributes from submeters. If this is not the case (e.g. if there are several types of submeter) then we could do the following (and we'd only have to specify this for the meters for which the default assumption does not hold).

meter_devices_mapping.csv

building instance,meter instance,meter device name

1,1,Current Cost
1,2,SCPM
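The lookup rule described above (use the mapping file where a row exists, otherwise fall back to the site meters / submeters default) could be sketched in Python as follows. The names `load_overrides` and `device_for` are hypothetical, and keying the overrides on (building, meter) is just one possible design.

```python
import csv
import io

# Example meter_devices_mapping.csv content from the proposal.
MAPPING_CSV = """building instance,meter instance,meter device name
1,1,Current Cost
1,2,SCPM
"""


def load_overrides(text):
    """Return {(building, meter): meter device name} for the listed meters."""
    overrides = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (int(row["building instance"]), int(row["meter instance"]))
        overrides[key] = row["meter device name"]
    return overrides


def device_for(building, meter, label, overrides):
    """Pick the meter device name for one meter.

    An explicit mapping row wins; otherwise the default assumption applies:
    site meters use "site meters", everything else uses "submeters".
    """
    if (building, meter) in overrides:
        return overrides[(building, meter)]
    return "site meters" if label == "site meter" else "submeters"


overrides = load_overrides(MAPPING_CSV)
```

For example, `device_for(1, 1, "site meter", overrides)` resolves to the Current Cost entry, while a meter with no mapping row, such as `device_for(1, 5, "fridge#1", overrides)`, falls back to submeters.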

Any thoughts? If you use Matlab / Java / Scala / Julia / C++ etc, would you find it easier to load metadata described using CSV files rather than YAML files? If you maintain a dataset, is there anything in your own dataset that the proposal above cannot express?


JackKelly · 2015-01-14T14:36:51Z

Some feedback from Peter Davies over email:

My vote would be csv. Very universal and everyone understands it.

eleijonmarck · 2015-02-06T15:32:29Z

Hey, I would also prefer CSV files. It would be great to have a universal way of interacting with NILM data, since the field seems to be just hitting the market and starting fresh.

JackKelly · 2015-02-06T15:46:22Z

Cool, thanks for the reply! Just to clarify: we do have the NILM Metadata schema already, which does work. It's just that it can be a little over-complex for simple domestic installations, which is why we're considering building a simpler schema. Also, of course, we have NILMTK for loading and playing with NILM data.

Artform · 2015-02-13T14:53:07Z

+1 for CSV

JackKelly changed the title ~~Proposal for a 'simple' version of the metadata~~ Proposal for a 'simple' NILM Metadata schema Jan 14, 2015

JackKelly mentioned this issue Jun 17, 2015

Missing control components #23

Open
