brickify TableHandler: should we disallow 'query' as an operation? #59

epaulson · 2021-07-19T18:31:05Z

In looking at the Brickify TableHandler, it supports running 'query' as an operation from the YAML/JSON config. This seems like it's a little odd, since every entry in 'operation' is run once for every line in the CSV, instead of just once over the graph like it is in the Handler class. So what we're doing is running 'query' on the graph over and over again, even in the beginning when the graph might be very small.

I suppose you can be very clever about how you create your CSV and use the 'conditions' option for the 'operation', but this feels kinda dangerous since it'd be relying on the TableHandler always processing the CSV one line at a time, top to bottom.

Maybe it'd be better to disallow 'query' as an operation in the TableHandler family of classes, or have a new flavor of 'operation' that is guaranteed to only run once - like in a TableHandler we could run the 'query' operations but only after all of the rows from the input CSV files have loaded and been run through the 'translate' function?

As a related crazy idea, maybe we add a way for Brickify to load some sort of 'base' graph first? Like you'd load up a base graph and then run the TableHandler to append the triples from the CSV data, processed line by line, to that base graph? I think the 'query' operation would make more sense in that case?

shreyasnagare · 2021-07-20T03:29:29Z

Technically, every operation processed by the TableHandler generates a SPARQL Update query. Even operations with data/template blocks are used to create update queries (wrapped in INSERT DATA { .* }) that add triples to the base graph (blank, in the current implementation):

operations:
  -
    data: |-
      bldg:{VAV name}  brick:hasPoint bldg:{temperature sensor} .
      bldg:{VAV name}  brick:hasPoint bldg:{temperature setpoint} .
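
To make the row-wise behavior concrete, here is a minimal sketch of how one CSV row and the data block above could turn into a single update string. The helper name row_to_update and the plain string substitution are assumptions for illustration, not the actual TableHandler code, and PREFIX declarations are left out.

```python
# Hypothetical helper, not part of py-brickschema: it only illustrates how a
# 'data' block plus one CSV row could become one SPARQL Update string.
DATA_BLOCK = (
    "bldg:{VAV name}  brick:hasPoint bldg:{temperature sensor} .\n"
    "bldg:{VAV name}  brick:hasPoint bldg:{temperature setpoint} ."
)


def row_to_update(data_block, row):
    """Fill each {column name} placeholder from the row, then wrap the result."""
    triples = data_block
    for column, value in row.items():
        # Plain substitution is an assumption; the real handler may differ.
        triples = triples.replace("{" + column + "}", value)
    # PREFIX declarations are omitted; a real update needs the prefixes bound.
    return "INSERT DATA {\n" + triples + "\n}"


# One row of a made-up CSV with the three columns used by the data block.
row = {
    "VAV name": "VAV1",
    "temperature sensor": "TS1",
    "temperature setpoint": "SP1",
}
update = row_to_update(DATA_BLOCK, row)
print(update)
```

Each row yields its own update, which is why a 'query' entry in the same list ends up being executed once per row as well.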

But I think it makes sense to let the data and template (Jinja) blocks be specifically for row-wise insertions (exclusive to TableHandler and its subclasses?) and use the query blocks for performing queries (one-time, like Handler does) after the tabular insertion is complete.
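
A rough sketch of that split, assuming operations are plain dicts keyed by 'data' or 'query' as in the YAML config. The function plan_updates is hypothetical and only builds the ordered list of update strings; template (Jinja) blocks are left out to keep it dependency-free.

```python
# Hypothetical two-phase planner, not the current TableHandler behavior:
# 'data' blocks are expanded once per row, 'query' blocks run once at the end.
def fill_row(data_block, row):
    """Substitute {column name} placeholders and wrap as INSERT DATA."""
    triples = data_block
    for column, value in row.items():
        triples = triples.replace("{" + column + "}", value)
    return "INSERT DATA { " + triples + " }"


def plan_updates(operations, rows):
    """Return update strings in execution order: row-wise inserts, then queries."""
    row_ops = [op for op in operations if "data" in op]
    query_ops = [op for op in operations if "query" in op]

    updates = []
    # Phase 1: every row goes through every row-wise operation.
    for row in rows:
        for op in row_ops:
            updates.append(fill_row(op["data"], row))
    # Phase 2: each query is scheduled exactly once, after all rows are loaded.
    for op in query_ops:
        updates.append(op["query"])
    return updates


# Made-up config and rows; the query text is only a placeholder example.
operations = [
    {"query": "DELETE { ?s a brick:RVAV } INSERT { ?s a brick:VAV } WHERE { ?s a brick:RVAV }"},
    {"data": "bldg:{VAV name} rdf:type brick:RVAV ."},
]
rows = [{"VAV name": "VAV1"}, {"VAV name": "VAV2"}]

plan = plan_updates(operations, rows)
for update in plan:
    print(update)
```

With this ordering the position of the query entry in the config no longer matters, and nothing depends on the CSV being processed top to bottom.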

py-brickschema/tests/data/brickify/jinja2/template.yml

Lines 12 to 16 in 8d1740f

    
      data: |-
        bldg:{VAV name} rdf:type brick:RVAV .
    - template: |-
        {{ num_triples(value['VAV name'], "brick:hasPoint", value['temperature sensor'], value['sensors'], "brick:Temperature_Sensor") }}

> As a related crazy idea, maybe we add a way for Brickify to load some sort of 'base' graph first?

I didn't think of it at the time but it seems like having non-blank base graphs preloaded could be really useful.

epaulson assigned shreyasnagare Jul 19, 2021
