Looking at the Brickify TableHandler, I see it supports running 'query' as an operation from the YAML/JSON config. This seems a little odd, since every entry in 'operation' is run once for every line in the CSV, instead of just once over the whole graph as in the Handler class. So we end up running 'query' on the graph over and over again, even at the beginning when the graph might be very small.
I suppose you can be very clever about how you create your CSV and use the 'conditions' option for the 'operation', but this feels kinda dangerous since it'd be relying on the TableHandler always processing the CSV one line at a time, top to bottom.
Maybe it'd be better to disallow 'query' as an operation in the TableHandler family of classes, or have a new flavor of 'operation' that is guaranteed to only run once - like in a TableHandler we could run the 'query' operations but only after all of the rows from the input CSV files have loaded and been run through the 'translate' function?
As a related crazy idea, maybe we add a way for Brickify to load some sort of 'base' graph first? Like you'd load up a base graph and then run the TableHandler to append the CSV data, processed line by line, to that base graph? I think the 'query' operation would make more sense in that case?
Technically, every operation processed by the TableHandler generates a SPARQL Update query. Even operations with data/template blocks are used to create update queries (wrapped by INSERT DATA { .* }) for adding triples to the base graph (blank, in the current implementation).
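To make the row-wise behaviour concrete, here is a rough sketch of how a data block could become one update query per CSV row. This is not Brickify's actual code: the function name `row_to_update`, the column names, and the prefixes are all made up for illustration, and it uses plain `str.format` where Brickify uses Jinja templates.

```python
# Hypothetical sketch, not Brickify's implementation.
# str.format stands in for the Jinja templating Brickify actually uses.

def row_to_update(data_template: str, row: dict) -> str:
    """Fill one CSV row into a triple template and wrap it as a SPARQL Update."""
    triples = data_template.format(**row)
    # The braces of INSERT DATA are added by concatenation so they are
    # not mistaken for format placeholders.
    return "INSERT DATA {\n" + triples + "\n}"


# Assumed example data: two rows of a CSV with 'name' and 'type' columns.
template = "bldg:{name} a brick:{type} ."
rows = [
    {"name": "AHU1", "type": "AHU"},
    {"name": "VAV1", "type": "VAV"},
]

# One update query is generated per row.
updates = [row_to_update(template, row) for row in rows]
print(updates[0])
```

Each row yields its own INSERT DATA update, which is fine for insertions but is also why a 'query' operation in the same list ends up running once per row.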
But I think it makes sense to let the data and template (Jinja) blocks be specifically for row-wise insertions (exclusive to TableHandler and its subclasses?) and use the query blocks for performing queries (one-time, like Handler does) after the tabular insertion is complete.
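A minimal sketch of what that two-phase flow might look like, with the optional 'base' graph from the original post included. Everything here is an assumption for discussion: `process_table`, `add_type`, and `tag_ahus` are hypothetical names, and a plain Python set of triples stands in for the real graph and for SPARQL queries.

```python
# Hypothetical sketch of the proposed behaviour, not Brickify's code.
# A set of (subject, predicate, object) tuples stands in for the RDF graph.

def process_table(rows, row_ops, query_ops, base_graph=None):
    """Run row_ops once per row, then run query_ops once over the full graph."""
    graph = set(base_graph or ())
    # Phase 1: row-wise insertions (the data/template blocks).
    for row in rows:
        for op in row_ops:
            graph |= op(row)
    # Phase 2: one-time operations (the query blocks), after all rows are in.
    for op in query_ops:
        op(graph)
    return graph


def add_type(row):
    """Row-wise operation: emit a type triple for the row."""
    return {(row["name"], "rdf:type", row["type"])}


calls = []  # records the graph size each time the query operation runs


def tag_ahus(graph):
    """One-time operation: tag every AHU found in the completed graph."""
    calls.append(len(graph))
    for s, p, o in list(graph):  # copy, since the graph is modified below
        if p == "rdf:type" and o == "brick:AHU":
            graph.add((s, "brick:hasTag", "tag:AHU"))


rows = [
    {"name": "bldg:AHU1", "type": "brick:AHU"},
    {"name": "bldg:VAV1", "type": "brick:VAV"},
]
base = {("bldg:site", "rdf:type", "brick:Site")}
result = process_table(rows, [add_type], [tag_ahus], base_graph=base)
```

With this split, `tag_ahus` runs exactly once and sees the base graph plus every row, so it no longer depends on the CSV being processed top to bottom.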