Proposal for a 'simple' NILM Metadata schema #16

Open
JackKelly opened this issue Jan 14, 2015 · 4 comments

Comments

@JackKelly
Contributor

NILM Metadata tries to make it possible to capture pretty much any conceivable scenario. But, as more datasets become available, it appears that a large proportion of datasets could be described using a simpler metadata schema. It would be great to discuss the design of a "Simple NILM Metadata" schema which could exist alongside "NILM Metadata". CSV is perhaps easier than YAML to read in Matlab, Java etc., so it might be nice if we can use CSV. We'd have a checklist to help people decide whether they require the full expressive power of "NILM Metadata" or whether they can get by with "Simple NILM Metadata".

The simple schema could also be used for adding metadata to the output of disaggregation algorithms (hence helping to simplify NILMTK disaggregation algorithm implementation); and for describing the training dataset and the responses for any future NILM competition or validation tool (I'm working with a group of MSc students who aim to produce a proof-of-concept NILM validation tool by the end of this term; here's the project spec.)

So, here's an initial proposal, using REDD as an example:

building1_labels.csv

This looks a little like labels.dat in the REDD format except that:

  • we use a comma as a separator (which is standard for CSV, and also allows us to use spaces in strings without using quotes)
  • we use the file suffix csv not dat (so that spreadsheet applications know how to open the file)
  • we use our NILM Metadata controlled vocabulary for appliance names
  • we give an instance number for each appliance
  • if there are multiple appliances measured by a meter then separate them by a semicolon e.g. 6,television#1;light#1
  • we could optionally use a third column to specify the submeter_of property. If this is not specified then we assume that anything that isn't a site meter is downstream of all site meters, and that all site meters should be summed to get the total whole-house power demand. Or maybe we should keep "Simple NILM Metadata" as simple as possible and say that any non-standard wiring hierarchy simply cannot be expressed using "Simple NILM Metadata"?

meter instance,label
1,site meter
2,site meter
3,electric oven#1
4,electric oven#1
5,fridge#1
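
To make the label conventions above concrete, here is a minimal Python sketch of how a loader might read such a file. It is only an illustration of the proposal, not part of NILM Metadata or NILMTK: the function name `parse_labels` and the returned structure are hypothetical, and the extra row for meter 6 is borrowed from the semicolon example in the list above.

```python
import csv
import io

# Sample file contents; row 6 reuses the multi-appliance example from the list.
LABELS_CSV = """\
meter instance,label
1,site meter
2,site meter
3,electric oven#1
4,electric oven#1
5,fridge#1
6,television#1;light#1
"""


def parse_labels(text):
    """Return {meter_instance: [(appliance_name, appliance_instance), ...]}.

    A site meter has no '#<instance>' suffix, so it is stored as
    ("site meter", None). This structure is an assumption of the sketch.
    """
    meters = {}
    # skipinitialspace tolerates a header written as "meter instance, label".
    for row in csv.DictReader(io.StringIO(text), skipinitialspace=True):
        appliances = []
        for item in row["label"].split(";"):  # several appliances on one meter
            name, _, instance = item.strip().partition("#")
            appliances.append((name, int(instance) if instance else None))
        meters[int(row["meter instance"])] = appliances
    return meters


labels = parse_labels(LABELS_CSV)
# Meters whose only label is "site meter" are treated as whole-house meters.
site_meters = [m for m, apps in labels.items() if apps == [("site meter", None)]]
```

Under the default wiring assumption, everything not in `site_meters` would be treated as downstream of all site meters.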

meter_devices.csv

We also need to specify what is measured in each data file. In NILM Metadata this is done in meter_devices.yaml. In "Simple NILM Metadata" this could be done in a meter_devices.csv file. The file would contain three columns, and each row would be a <meter device name>,<key>,<value> tuple, e.g.:

meter device name,key,value
site meters,sample period,1
site meters,measurements,active power;apparent power
submeters,sample period,3
submeters,measurements,active power
submeters,model,eMonitor
submeters,manufacturer,Powerhouse Dynamics
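
As a rough sketch of how little code the key/value layout needs, the rows above can be folded into one dictionary per meter device. The function name `parse_meter_devices` is hypothetical, and two details are assumptions of this sketch rather than part of the proposal: that `measurements` is a semicolon-separated list (as in the labels file), and that `sample period` should be read as a number.

```python
import csv
import io

METER_DEVICES_CSV = """\
meter device name,key,value
site meters,sample period,1
site meters,measurements,active power;apparent power
submeters,sample period,3
submeters,measurements,active power
submeters,model,eMonitor
submeters,manufacturer,Powerhouse Dynamics
"""

# Assumption: only these keys hold semicolon-separated lists.
MULTI_VALUE_KEYS = {"measurements"}


def parse_meter_devices(text):
    """Return {meter_device_name: {key: value}} from <name>,<key>,<value> rows."""
    devices = {}
    for row in csv.DictReader(io.StringIO(text)):
        key, value = row["key"], row["value"]
        if key in MULTI_VALUE_KEYS:
            value = value.split(";")
        elif key == "sample period":
            value = float(value)  # assumption: numeric sample period
        devices.setdefault(row["meter device name"], {})[key] = value
    return devices


devices = parse_meter_devices(METER_DEVICES_CSV)
```

All other keys (such as `model` and `manufacturer`) are kept as plain strings.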

The assumption would be that all meters with the label site meter would take attributes from site meters and all other meters would take attributes from submeters. If this is not the case (e.g. if there are several types of submeter) then we could do the following (and we'd only have to specify this for the meters for which the default assumption does not hold).

meter_devices_mapping.csv

building instance,meter instance,meter device name
1,1,Current Cost
1,2,SCPM
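
The default rule plus the optional override file could be resolved along these lines. This is a sketch under the assumptions stated in the proposal (site meters default to `site meters`, everything else to `submeters`); the function names `load_mapping` and `meter_device_name` are hypothetical.

```python
import csv
import io

MAPPING_CSV = """\
building instance,meter instance,meter device name
1,1,Current Cost
1,2,SCPM
"""


def load_mapping(text):
    """Return {(building_instance, meter_instance): meter_device_name}."""
    return {
        (int(row["building instance"]), int(row["meter instance"])):
            row["meter device name"]
        for row in csv.DictReader(io.StringIO(text))
    }


def meter_device_name(building, meter, label, mapping):
    """Pick the meter device for one meter.

    An explicit row in the mapping file wins; otherwise fall back to the
    default assumption based on the meter's label.
    """
    override = mapping.get((building, meter))
    if override is not None:
        return override
    return "site meters" if label == "site meter" else "submeters"


mapping = load_mapping(MAPPING_CSV)
```

With this layout, a dataset that follows the default assumption needs no mapping file at all, and `load_mapping` could simply be skipped in favour of an empty dictionary.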

Any thoughts? If you use Matlab / Java / Scala / Julia / C++ etc, would you find it easier to load metadata described using CSV files rather than YAML files? If you maintain a dataset, is there anything in your own dataset that the proposal above cannot express?

@JackKelly changed the title from "Proposal for a 'simple' version of the metadata" to "Proposal for a 'simple' NILM Metadata schema" on Jan 14, 2015
@JackKelly
Contributor Author

Some feedback from Peter Davies over email:

My vote would be csv. Very universal and everyone understands it.

@eleijonmarck

Hey, I would also prefer CSV files. It would be great to have a universal way of interacting with NILM data as it seems to just hit the market and start fresh.

@JackKelly
Contributor Author

Cool, thanks for the reply! Just to clarify: we already have the NILM Metadata schema, which does work. It's just that it can be a little over-complex for simple domestic installations, which is why we're considering building a simpler schema. Also, of course, we have NILMTK for loading and playing with NILM data.

@Artform

Artform commented Feb 13, 2015

+1 for CSV
