Fix #321 : Add blank nodes to attribute definition #324

Merged
mboudet merged 15 commits into askomics:dev from fix_321 on May 10, 2022

Conversation

mboudet
Contributor

@mboudet mboudet commented Apr 4, 2022

Fixes #321

This is an overhaul of the way "attributes" are described.
As explained in the linked issue, the current translation to TTL causes a mix-up when the same attribute name is used by different entities with different types. Askomics then cannot tell which type belongs to which entity, which can lead to a wrong query field type (an integer instead of a string, for instance). This is especially problematic if the attribute is declared as a 'Category' for one entity, since every entity will then display it as a category.
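
The mix-up can be reproduced with a minimal sketch. It assumes, for illustration only, that old-style definitions are a set of triples whose subject is the attribute URI itself. The `:Gene` entity, its `xsd:decimal` range, and the `ranges_for` helper are hypothetical and are not taken from the Askomics code base.

```python
# Old-style definitions: the attribute URI is the subject, so two entities
# that share an attribute name write their triples onto the same node.
old_style_triples = {
    (":Expression", "rdfs:domain", ":DifferentialExpression"),
    (":Expression", "rdfs:range", "xsd:string"),
    (":Expression", "rdfs:domain", ":Gene"),       # hypothetical second entity
    (":Expression", "rdfs:range", "xsd:decimal"),  # hypothetical second type
}


def ranges_for(triples, entity, attribute):
    """Return every range reachable for (entity, attribute) in old-style data."""
    if (attribute, "rdfs:domain", entity) not in triples:
        return set()
    # Nothing ties a given range to a given domain, so all ranges come back.
    return {o for s, p, o in triples if s == attribute and p == "rdfs:range"}


ambiguous = ranges_for(old_style_triples, ":Gene", ":Expression")
print(sorted(ambiguous))
```

Both ranges are returned for `:Gene`, which is the ambiguity described above: the data no longer says which type belongs to which entity.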

This PR proposes writing attributes using blank nodes, so that there is no mix-up between entities. Each blank node is linked to the attribute URI with the askomics:uri predicate.
Namely, an attribute will be written as follows (e.g. the Expression attribute of the DifferentialExpression entity):

_:blank_node  askomics:uri  :Expression ;
              a owl:DatatypeProperty ;
              rdfs:label "Expression" ;
              rdfs:domain :DifferentialExpression ;
              rdfs:range xsd:string .
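
As a rough sketch of why the blank node removes the ambiguity, the snippet below builds one fresh node per (entity, attribute) pair and resolves the range through that node. The blank-node naming scheme, the `:Gene` entity, and both helper functions are assumptions made for illustration, not the PR's actual implementation.

```python
import itertools

_counter = itertools.count(1)


def attribute_triples(entity, attribute, label, range_):
    """Build new-style triples: one fresh blank node per (entity, attribute)."""
    node = f"_:b{next(_counter)}"  # hypothetical blank-node naming
    return [
        (node, "askomics:uri", attribute),
        (node, "rdf:type", "owl:DatatypeProperty"),
        (node, "rdfs:label", label),
        (node, "rdfs:domain", entity),
        (node, "rdfs:range", range_),
    ]


triples = (
    attribute_triples(":DifferentialExpression", ":Expression", "Expression", "xsd:string")
    + attribute_triples(":Gene", ":Expression", "Expression", "xsd:decimal")
)


def range_for(triples, entity, attribute):
    """Resolve the range through the blank node tied to this entity and attribute."""
    nodes = {s for s, p, o in triples if p == "askomics:uri" and o == attribute}
    nodes &= {s for s, p, o in triples if p == "rdfs:domain" and o == entity}
    return {o for s, p, o in triples if s in nodes and p == "rdfs:range"}
```

Because the domain and the range now hang off the same blank node, each entity gets back only its own type even though the attribute URI is shared.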

To maintain backward compatibility with existing datasets, the SPARQL query listing relations has been slightly modified to return both new and old relations.
(There is no way to convert old datasets automatically, since the information is lost. The files need to be re-integrated.)

The new SPARQL query is:

        SELECT DISTINCT ?graph ?entity_uri ?attribute_uri ?attribute_type ?attribute_faldo ?attribute_label ?attribute_range ?category_value_uri ?category_value_label
        WHERE {
            # Graphs
            ?graph askomics:public ?public .
            ?graph dc:creator ?creator .
            GRAPH ?graph {
                ?node a ?attribute_type .
                VALUES ?attribute_type { owl:DatatypeProperty askomics:AskomicsCategory }
                ?node rdfs:label ?attribute_label .
                ?node rdfs:range ?attribute_range .
                # Retrocompatibility
                OPTIONAL {?node askomics:uri ?attribute_uri}
                BIND( IF(isBlank(?node),?attribute_uri, ?node) as ?attribute_uri )
                # Faldo
                OPTIONAL {
                    ?node a ?attribute_faldo .
                    VALUES ?attribute_faldo { askomics:faldoStart askomics:faldoEnd askomics:faldoStrand askomics:faldoReference }
                }
                # Categories (DK)
                OPTIONAL {
                    ?attribute_range askomics:category ?category_value_uri .
                    ?category_value_uri rdfs:label ?category_value_label .
                }
            }
            # Attribute of entity (or motherclass of entity)
            {
                ?node rdfs:domain ?mother .
                ?entity_uri rdfs:subClassOf ?mother .
            } UNION {
                ?node rdfs:domain ?entity_uri .
            }
        }

Both old-style and new-style relations will work for now.
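
The retrocompatibility step in the query (the `OPTIONAL` followed by the `BIND`) boils down to a small decision, sketched here in Python. The function name and the convention that blank nodes are strings starting with `_:` are assumptions for illustration only.

```python
def resolve_attribute_uri(node, askomics_uri=None):
    """Mirror BIND(IF(isBlank(?node), ?attribute_uri, ?node)).

    node: the subject carrying rdfs:label / rdfs:range.
    askomics_uri: the value of askomics:uri for that node, or None when
                  the OPTIONAL clause matched nothing (old-style data).
    """
    is_blank = node.startswith("_:")  # assumed textual convention for blank nodes
    # New style: the blank node points to the real URI through askomics:uri.
    # Old style: the node itself is the attribute URI.
    return askomics_uri if is_blank else node


new_style = resolve_attribute_uri("_:b1", ":Expression")
old_style = resolve_attribute_uri(":Expression")
```

Both calls resolve to `:Expression`, which is why datasets integrated before and after this change can be listed by the same query.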

Since the new query is slightly cleaner, the relations query (from #268) was updated in the same way.

@mboudet mboudet changed the title [WIP] Fix #321 : Add blank nodes to attribute definition Fix #321 : Add blank nodes to attribute definition Apr 4, 2022
@mboudet
Contributor Author

mboudet commented Apr 4, 2022

@ofilangi if you want to check the code.

@ofilangi
Contributor

ofilangi commented Apr 5, 2022

This implementation looks good. Managing attributes with blank nodes is a great idea.
Beware of the intensive use of the "OPTIONAL" keyword: if the triplestore contains a lot of data, there may be performance issues.

@mboudet
Contributor Author

mboudet commented Apr 5, 2022

Agreed on the OPTIONAL cost (though the only new lines are these two):

OPTIONAL {?node askomics:uri ?attribute_uri}
BIND( IF(isBlank(?node),?attribute_uri, ?node) as ?attribute_uri )

But it's the only way to manage backward compatibility.
We could add a LEGACY_ABSTRACTION config option that would toggle this part on/off, I guess?

@ofilangi
Contributor

ofilangi commented Apr 5, 2022

Not needed. No triples match this OPTIONAL clause if Askomics has been deployed the old way, so it should be quick.

@mboudet mboudet merged commit cdc4f3b into askomics:dev May 10, 2022
@mboudet mboudet deleted the fix_321 branch May 3, 2023 09:38
Development

Successfully merging this pull request may close these issues.

Mismatch when using the same column name for different columns types (and entities)
2 participants