Idea : how-to implement the SERVICE keyword for federated querying #352
The question of the population of lists or autocomplete widgets also needs to be dealt with.
But this implies that the data is homogeneously represented in all possible services, since the datasource query remains the same. To be completely open, the property should be associated with multiple datasources, each associated with a different endpoint. A possibility for configuration is the following:
|
Regarding #352 (comment) Nr: 1 Regarding #352 (comment) Nr.2: This way it would also be possible to keep using optional/not exists together with the SERVICE keyword. If further use cases then arise where advanced users would like to choose the endpoint (which implies the user knows what information each selectable KG contains, since he/she must decide whether he/she wants the label, or similar, from Wikidata or DBpedia), then we can still implement it. |
Yes, my assumption is that they are exclusive, also for visual reasons (it would be hard to design a border that combines, e.g., optional + service).
I agree that we can limit it to a single service per property, so that the user does not have to choose from different remote services (or we could also simply say : "if only one service is configured on the property, then don't show the service dropdown, otherwise show it").
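The rule suggested above (hide the dropdown when only one service is configured) can be sketched as a small decision function. This is only an illustration: the `FederatedService` and `ServiceUi` types and the `serviceUiFor` name are hypothetical and do not come from the Sparnatural codebase.

```typescript
// Hypothetical shapes: Sparnatural's real configuration model may differ.
interface FederatedService {
  label: string;
  endpoint: string;
}

type ServiceUi =
  | { kind: "none" }                                   // property is not federated
  | { kind: "implicit"; service: FederatedService }    // single service, no dropdown
  | { kind: "dropdown"; choices: FederatedService[] }; // several services, user picks one

// Decide what to render for a property, given the services configured on it.
function serviceUiFor(services: FederatedService[]): ServiceUi {
  if (services.length === 0) return { kind: "none" };
  if (services.length === 1) return { kind: "implicit", service: services[0] };
  return { kind: "dropdown", choices: services };
}
```

With a single configured service the function returns the `implicit` variant, so the user never has to choose between remote services.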
Isn't it also an advanced use case? I suggest we don't allow combining optional/not exists with service, so that we can reuse the same visual green arrow components. Otherwise we need to find another visual solution. |
My suggestion is actually to not have any visual component at all for the SERVICE keyword. It is up to the person who configures Sparnatural to decide when the SERVICE keyword is injected. Additionally, imagine the following scenario: a user goes to Sparnatural and sees the Service arrow rendered when he/she starts to build a query. The service button asks whether the query should be run on KG_B, since this KG is configured as a service in the config. Now how does the user know whether he/she should click the arrow? The user probably doesn't even know he/she is querying KG_A. This scenario implies the user being aware of multiple things:
Of course, with a different name for the SERVICE keyword, it might be easier to understand. But even then I'm certain it's going to be difficult for people to choose between different KGs.
Yes, it is a bit advanced, but again only for the person configuring Sparnatural. The user doesn't even realize he/she is querying multiple KGs. They can build their query as if it were one single KG (which, I think, is the true strength of the SERVICE keyword). Instead of an arrow, I would recommend the following workaround for selecting endpoints: |
Yes yes all of this makes total sense. I agree with the approach. I see 2 questions now: 1/ the configuration and 2/ the query execution ordering.
Configuration
In order to configure the service I suggest that we rely on the SPARQL service description vocabulary Service class, coupled with a new
And I suggest that the existing
Query execution ordering
Query execution ordering is key when working with SERVICE, and implies using subqueries. Subqueries are the only way to control the ordering of query execution. There are 2 scenarios involving SERVICE, each requiring a different ordering in query execution:
SERVICE as an additional criterion (executed before main query)
Typical use case: "I want all the Museums located in Country where [SERVICE] Country part of Europe [end SERVICE]". We want the "Country part of Europe" criterion executed first, and then joined with our local list of countries. The SERVICE keyword needs to be put in a subquery.
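A minimal sketch of how such a query could be generated, with the SERVICE clause wrapped in a subquery so that it is logically evaluated first. The `ex:` namespace, `ex:Museum`, `ex:locatedIn` and the `serviceFirstQuery` function are assumptions made for illustration; only `dbo:partOf` and `dbpedia:Europe` come from the discussion.

```typescript
// Builds a SPARQL query where the SERVICE criterion sits in a subquery,
// so it is logically evaluated before being joined with the local data.
// ex:Museum and ex:locatedIn are hypothetical local terms.
function serviceFirstQuery(endpoint: string): string {
  return [
    "PREFIX ex: <http://example.org/>",
    "PREFIX dbo: <http://dbpedia.org/ontology/>",
    "PREFIX dbpedia: <http://dbpedia.org/resource/>",
    "SELECT ?museum ?country WHERE {",
    "  {",
    "    SELECT ?country WHERE {",
    `      SERVICE <${endpoint}> {`,
    "        ?country dbo:partOf dbpedia:Europe .",
    "      }",
    "    }",
    "  }",
    "  ?museum a ex:Museum ;",
    "          ex:locatedIn ?country .",
    "}",
  ].join("\n");
}

console.log(serviceFirstQuery("https://dbpedia.org/sparql"));
```

The subquery only fixes the logical evaluation order; how a given engine actually schedules the remote call remains implementation-dependent.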
Which also implies that the datasource for dbo:partOf needs to be associated with the DBpedia service to properly fetch the value dbpedia:Europe.
SERVICE to fetch additional metadata (executed after main query)
Typical use case: "I want all the Museums located in Country where Country part of Europe and [SERVICE] Country has population Number [end SERVICE]". We want the "Museums located in Country part of Europe" criteria executed first, and then we want the SERVICE clause executed to fetch the population of those Countries only (and not all Countries).
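For this second scenario the nesting is inverted: the local criteria go into the subquery and the SERVICE clause follows it. Again a sketch under assumptions: `ex:Museum`, `ex:locatedIn`, the use of `dbo:populationTotal` for the population, and the `serviceLastQuery` name are illustrative choices, not part of the proposal.

```typescript
// Builds a SPARQL query where the local criteria sit in a subquery and the
// SERVICE clause comes afterwards, so the remote lookup logically applies
// only to the countries already selected locally.
// ex:Museum, ex:locatedIn and dbo:populationTotal are illustrative choices.
function serviceLastQuery(endpoint: string): string {
  return [
    "PREFIX ex: <http://example.org/>",
    "PREFIX dbo: <http://dbpedia.org/ontology/>",
    "PREFIX dbpedia: <http://dbpedia.org/resource/>",
    "SELECT ?museum ?country ?population WHERE {",
    "  {",
    "    SELECT ?museum ?country WHERE {",
    "      ?museum a ex:Museum ;",
    "              ex:locatedIn ?country .",
    "      ?country dbo:partOf dbpedia:Europe .",
    "    }",
    "  }",
    `  SERVICE <${endpoint}> {`,
    "    ?country dbo:populationTotal ?population .",
    "  }",
    "}",
  ].join("\n");
}

console.log(serviceLastQuery("https://dbpedia.org/sparql"));
```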
Discussion
How do we know which situation we are in? If there is no filtering criterion (no value selected, only the eye clicked), then we are in the second situation. Otherwise we are in the first. |
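This heuristic could be expressed roughly as follows. The `ServiceCriterion` shape and the `placementFor` function are hypothetical names used only to make the rule concrete.

```typescript
// Where the SERVICE clause goes relative to the main query.
type ServicePlacement = "service-in-subquery" | "service-after-main-query";

// Hypothetical view of one criterion carrying a service in the query builder.
interface ServiceCriterion {
  selectedValues: string[]; // values picked in the widget; empty if none
  eyeClicked: boolean;      // whether the variable is selected for output
}

// A selected value means the criterion filters (first situation);
// no value means it only fetches extra data (second situation).
function placementFor(criterion: ServiceCriterion): ServicePlacement {
  return criterion.selectedValues.length > 0
    ? "service-in-subquery"
    : "service-after-main-query";
}
```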
Configuration: Yes, I like the configuration proposal a lot. Query execution ordering:
OR
My thoughts:
This actually only says that subqueries are logically evaluated first. It doesn't say anything about how to technically retrieve these values. Anyway, I don't think it is wise to engage in query optimization. The query plan behind Sparnatural can change depending on the implementation. Query optimization is not really a responsibility Sparnatural should handle; it belongs to the implementation. If query execution planning and optimization is needed, I would suggest looking for the proper execution engine. I saw that some execution engines allow for query planning and optimization in their configs: I propose the following: implementing SERVICE without optimization first (leaving it up to the implementation of the query execution engine). What do you think? |
This is what I meant. Everywhere I wrote "executed first", please read "logically executed first". This is the only thing we care about. Of course, we can progress step-by-step, write the vanilla query first, hit a wall, and then climb the wall to progress, so I agree with your proposal of implementing the SERVICE without anything else. See this discussion on the SPARQL 1.2 group where I actually learned about the "subquery trick" : w3c/sparql-dev#21 |
Okay cool then let's start simple first and work our way up. I'll proceed with the implementation then. |
My ideas to implement the "SERVICE" keyword:
Step 1: a new option similar to "optional" and "not exists" is proposed on properties where federation makes sense:
Step 2: when user clicks on "Service..." a dropdown is displayed with available federation services
Step 3: user selects a federation service, and the "service" arrow is renamed to the selected service and gets activated
We note that the SERVICE would apply to the whole "subtree" like the optional or not exists.
To do that this would require: