Proposal for a 'simple' NILM Metadata schema #16
Comments
Some feedback from Peter Davies over email:

Hey, I would also prefer CSV files. It would be great to have a universal way of interacting with NILM data as it seems to just hit the market and start fresh.

Cool, thanks for the reply! Just to clarify: we do have the NILM Metadata schema already, which does work. It's just that it can be a little over-complex for simple domestic installations, which is why we're considering building a simpler schema. Also, of course, we have NILMTK for loading and playing with NILM data.

+1 for CSV
NILM Metadata tries to make it possible to capture pretty much any conceivable scenario. But, as more datasets become available, it appears that a large proportion of datasets could be described using a simpler metadata schema. It would be great to discuss the design of a "Simple NILM Metadata" schema which could exist alongside "NILM Metadata". Perhaps CSV is even easier to read than YAML in Matlab, Java, etc., so it might be nice if we can use CSV. We'd have a checklist to help people decide whether they require the full expressive power of "NILM Metadata" or if they can get by with "Simple NILM Metadata".
The simple schema could also be used for adding metadata to the output of disaggregation algorithms (hence helping to simplify NILMTK disaggregation algorithm implementation); and for describing the training dataset and the responses for any future NILM competition or validation tool (I'm working with a group of MSc students who aim to produce a proof-of-concept NILM validation tool by the end of this term; here's the project spec.)
So, here's an initial proposal, using REDD as an example:
building1_labels.csv
This looks a little like labels.dat in the REDD format except that:
the file extension is csv, not dat (so that spreadsheet applications know how to open the file)
6,television#1;light#1
submeter_of property. If this is not specified then we assume that anything that isn't a site meter is downstream of all site meters, and that all site meters should be summed to get the total whole-house power demand. Or maybe we should keep "Simple NILM Metadata" as simple as possible and say that any non-standard wiring hierarchy simply cannot be expressed using "Simple NILM Metadata"?
meter_devices.csv
We also need to specify what is measured in each data file. In NILM Metadata this is done in meter_devices.yaml. In "Simple NILM Metadata" this could be done in a meter_devices.csv file. The file would contain three columns; each row would be a <meter device name>,<key>,<value> tuple. e.g.:
The assumption would be that all meters with the label site meter would take attributes from site meters and all other meters would take attributes from submeters. If this is not the case (e.g. if there are several types of submeter) then we could do the following (and we'd only have to specify this for the meters for which the default assumption does not hold).
Any thoughts? If you use Matlab / Java / Scala / Julia / C++ etc, would you find it easier to load metadata described using CSV files rather than YAML files? If you maintain a dataset, is there anything in your own dataset that the proposal above cannot express?
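To make the proposal a bit more concrete, here is a rough Python sketch of how the two CSV files might be loaded, using only the standard library. This is an illustration, not part of NILM Metadata or NILMTK: the function names are hypothetical, the sample file contents are made up, the `sample_period` key and its values are placeholders, and reading `;` as a separator between appliances that share one meter is my interpretation of the `6,television#1;light#1` row.

```python
import csv
import io
from collections import defaultdict

# Hypothetical file contents, for illustration only (not real REDD metadata).
LABELS_CSV = """\
1,site meter
2,site meter
5,fridge#1
6,television#1;light#1
"""

# Each row is a <meter device name>,<key>,<value> tuple.
METER_DEVICES_CSV = """\
site meters,sample_period,1
submeters,sample_period,3
"""


def parse_labels(text):
    """Map meter id -> list of labels (assumes ';' separates shared appliances)."""
    labels = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        meter_id, label_field = row[0], row[1]
        labels[int(meter_id)] = [s.strip() for s in label_field.split(";")]
    return labels


def parse_meter_devices(text):
    """Map meter device name -> {key: value} from three-column rows."""
    devices = defaultdict(dict)
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        name, key, value = row
        devices[name][key] = value
    return dict(devices)


def attributes_for(meter_labels, devices):
    """Default assumption: meters labelled 'site meter' take their attributes
    from 'site meters'; every other meter takes them from 'submeters'."""
    group = "site meters" if "site meter" in meter_labels else "submeters"
    return devices.get(group, {})


labels = parse_labels(LABELS_CSV)
devices = parse_meter_devices(METER_DEVICES_CSV)
for meter_id, meter_labels in sorted(labels.items()):
    print(meter_id, meter_labels, attributes_for(meter_labels, devices))
```

Values are kept as strings here; a real loader would presumably need a rule for converting types, and a way to override the default for datasets with several types of submeter.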