brickify TableHandler: should we disallow 'query' as an operation? #59

Open
epaulson opened this issue Jul 19, 2021 · 1 comment
@epaulson
Contributor

In looking at the Brickify TableHandler, it supports running 'query' as an operation from the YAML/JSON config. This seems like it's a little odd, since every entry in 'operation' is run once for every line in the CSV, instead of just once over the graph like it is in the Handler class. So what we're doing is running 'query' on the graph over and over again, even in the beginning when the graph might be very small.
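To make the concern concrete, here is a minimal sketch in plain Python. It is not Brickify's code: the set-of-tuples "graph" and the names `insert_row`, `count_points`, and `run_row_wise` are hypothetical stand-ins. It only shows that a query-style operation placed in a row-wise loop runs once per CSV row, each time against a partially built graph.

```python
# Toy model: the "graph" is a set of (subject, predicate, object) tuples.
# None of these names come from Brickify; they only illustrate the control flow.

def insert_row(graph, row):
    """Row-wise insertion: add one triple for the given CSV row."""
    graph.add((row["vav"], "brick:hasPoint", row["sensor"]))

def count_points(graph, log):
    """Stand-in for a 'query' operation: record the graph size it saw."""
    log.append(len(graph))

def run_row_wise(rows):
    """Run every operation once per row, as the issue describes for TableHandler."""
    graph, log = set(), []
    for row in rows:
        insert_row(graph, row)
        count_points(graph, log)  # executed len(rows) times
    return graph, log

rows = [
    {"vav": "bldg:VAV1", "sensor": "bldg:TS1"},
    {"vav": "bldg:VAV2", "sensor": "bldg:TS2"},
    {"vav": "bldg:VAV3", "sensor": "bldg:TS3"},
]
graph, log = run_row_wise(rows)
print(log)  # [1, 2, 3]: the query ran three times, twice on an incomplete graph
```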

I suppose you can be very clever about how you create your CSV and use the 'conditions' option for the 'operation', but this feels kinda dangerous since it'd be relying on the TableHandler always processing the CSV one line at a time, top to bottom.

Maybe it'd be better to disallow 'query' as an operation in the TableHandler family of classes, or have a new flavor of 'operation' that is guaranteed to only run once - like in a TableHandler we could run the 'query' operations but only after all of the rows from the input CSV files have loaded and been run through the 'translate' function?

As a related crazy idea, maybe we add a way for Brickify to load some sort of 'base' graph first? Like you'd load up a base graph and then run the TableHandler to append the CSV data, processed line by line, to that base graph? I think the 'query' operation would make more sense in that case?

@shreyasnagare
Member

Technically, every operation processed by the TableHandler generates a SPARQL Update query. Even operations with data/template blocks are used to create update queries, wrapped by INSERT DATA { .* }, for adding triples to the base graph (blank, in the current implementation):

operations:
  -
    data: |-
      bldg:{VAV name}  brick:hasPoint bldg:{temperature sensor} .
      bldg:{VAV name}  brick:hasPoint bldg:{temperature setpoint} .
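As a rough illustration of that wrapping, the sketch below fills the `{column name}` placeholders from one CSV row and wraps the result in `INSERT DATA { ... }`. The helper `build_update` is hypothetical, and using Python's `str.format_map` for the substitution is an assumption, not necessarily how Brickify does it. Prefix declarations are omitted.

```python
def build_update(data_block, row):
    """Fill {column name} placeholders from one CSV row, then wrap in INSERT DATA.

    Hypothetical helper; Brickify's real substitution logic may differ.
    """
    triples = data_block.format_map(row)
    # Concatenate rather than format again, so the literal braces stay intact.
    return "INSERT DATA {\n" + triples + "\n}"

data_block = (
    "bldg:{VAV name} brick:hasPoint bldg:{temperature sensor} .\n"
    "bldg:{VAV name} brick:hasPoint bldg:{temperature setpoint} ."
)
# One CSV row, keyed by column header (made-up values).
row = {
    "VAV name": "VAV1",
    "temperature sensor": "TS1",
    "temperature setpoint": "SP1",
}
update = build_update(data_block, row)
print(update)
```

Every row yields its own update like this, which is why a 'query' block in the same list also ends up running once per row.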

But I think it makes sense to let the data and template (Jinja) blocks be specifically for row-wise insertions (exclusive to TableHandler and its subclasses?) and use the query blocks for performing queries (one-time, like Handler does) after the tabular insertion is complete.

operations:
  - data: |-
      bldg:{VAV name} rdf:type brick:RVAV .
  - template: |-
      {{ num_triples(value['VAV name'], "brick:hasPoint", value['temperature sensor'], value['sensors'], "brick:Temperature_Sensor") }}
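The split discussed in this thread (row-wise data/template insertions first, one-time query operations afterwards, optionally starting from a preloaded base graph) could look roughly like the sketch below. It reuses a toy set-of-tuples graph; `run_two_phase`, `add_point`, and `add_inverse` are hypothetical names, not Brickify APIs.

```python
def run_two_phase(rows, row_ops, query_ops, base_graph=None):
    """Apply row_ops to every row, then run each query op exactly once."""
    graph = set(base_graph or ())  # start from the base graph, or blank
    for row in rows:
        for op in row_ops:
            op(graph, row)         # phase 1: row-wise insertions
    for query in query_ops:
        query(graph)               # phase 2: one-time queries on the full graph
    return graph

def add_point(graph, row):
    """Row-wise op: link a VAV to its temperature sensor."""
    graph.add((row["VAV name"], "brick:hasPoint", row["temperature sensor"]))

calls = []  # graph size seen by each query invocation

def add_inverse(graph):
    """Query-style op: add brick:isPointOf for every brick:hasPoint triple."""
    calls.append(len(graph))
    for s, p, o in list(graph):
        if p == "brick:hasPoint":
            graph.add((o, "brick:isPointOf", s))

base = {("bldg:AHU1", "brick:feeds", "bldg:VAV1")}  # made-up base graph
rows = [
    {"VAV name": "bldg:VAV1", "temperature sensor": "bldg:TS1"},
    {"VAV name": "bldg:VAV2", "temperature sensor": "bldg:TS2"},
]
graph = run_two_phase(rows, [add_point], [add_inverse], base_graph=base)
print(calls)       # [3]: the query ran once, after both rows were loaded
print(len(graph))  # 5
```

With this ordering the result no longer depends on the CSV being processed top to bottom, since the query only ever sees the complete graph.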

> As a related crazy idea, maybe we add a way for Brickify to load some sort of 'base' graph first?

I didn't think of it at the time but it seems like having non-blank base graphs preloaded could be really useful.

2 participants