Mapping to QUDT2 #60

Open
brandonnodnarb opened this issue Oct 26, 2017 · 15 comments

Comments

@brandonnodnarb
Member

This issue relates to #23 and #55. How extensive is QUDT's coverage of units compared to SWEET's? Could QUDT be imported, either partially (modular) or entirely, instead of replicating its content in SWEET?

An initial mapping should provide a basis for further informed discussion.

@SirkoS

SirkoS commented Oct 26, 2017

Hi,
I'm one of the authors involved in the work leading to #55.
Part of that work also examined the overlap in units (and related concepts) between several ontologies.
So I can give you some numbers for the comparison between SWEET 2.3 and QUDT 1.1.
(I'm not sure if you're targeting QUDT 2 here, but we excluded that for now as it is a work in progress.)

Concept                 # Unique to SWEET 2.3   # Unique to QUDT 1.1
Field of Application    49                      0
Dimension               0                       128
Prefix                  0                       16
Kind of Quantity        769                     164
System                  0                       11
Unit                    53                      678

For a detailed list of individuals, have a look at this list. It is the output of a tool we built to compare two versions of an ontology. I repurposed it here to show the difference between SWEET 2.3 and QUDT 1.1, so "added" marks individuals present in QUDT but not in SWEET, and "removed" marks the reverse.

PS: A publication for the complete ontology evaluation is currently under review. Most of the (data) results, however, are already publicly available as are the scripts to generate them.

@lewismc
Member

lewismc commented Oct 26, 2017

Thank you, @SirkoS. Would you be interested in joining the ESIPFed org on GitHub and being notified of updates to SWEET?

@SirkoS

SirkoS commented Oct 28, 2017

@lewismc I already added this repo to my watchlist. I think this should send me messages regarding any updates.
I currently have no time to get involved in another project, but I will provide you with data and explanations of our work, if needed.

@lewismc lewismc modified the milestones: 3.2.0, 3.3.0 Mar 7, 2018
@lewismc lewismc removed this from the 3.3.0 milestone Jul 18, 2019
@nicholascar
Collaborator

The Geological Survey of Queensland where I work is keen to see SWEET/QUDT integration. They will use QUDT for basic units of measure and will declare specialised geo units in QUDT form as well as contribute a QUDT Geo discipline (http://qudt.org/doc/2019/10/DOC_VOCAB-DISCIPLINES-v2.1.html).

They are already using SWEET objects here and there, e.g.

<borehole/1>
    sosa:hasUltimateFeatureOfInterest <http://sweetontology.net/realmEarthReference/EarthCrust> ;

But now we have to choose units of measure, and we see QUDT as a deep ontology with strong mathematical backing, whereas SWEET's units are a bit lighter. SWEET units do have mathematical elements, e.g. hasScalingNumber and baseUnit in http://sweetontology.net/reprSciUnits/kilometer, but the QUDT equivalent, http://qudt.org/vocab/unit/KiloM, has more: conversionMultiplier, conversionOffset and strong classing (qudt:DerivedUnit & qudt:LengthUnit) etc.

Some touch points are:

@prefix qudt: <http://qudt.org/schema/qudt/> . 
@prefix unit: <http://qudt.org/vocab/unit/> .

Classes:
http://sweetontology.net/propQuantity/Quantity <--> qudt:Quantity

Properties:
http://sweetontology.net/relaSci/hasUnit <--> qudt:unit

Unit instances:
http://sweetontology.net/reprSciUnits/kilometer <--> unit:KiloM
http://sweetontology.net/reprSciUnits/MHz <--> unit:MegaHZ
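
For anyone who wants these touch points in machine-readable form, here is a minimal sketch that writes them out as Turtle. The choice of skos:exactMatch as the mapping predicate is my assumption (the list above only says "<-->"), and the function name is made up for illustration.

```python
# Namespaces as given in the comment above.
QUDT = "http://qudt.org/schema/qudt/"
UNIT = "http://qudt.org/vocab/unit/"
SWEET = "http://sweetontology.net/"

# The four touch points listed above, as (SWEET IRI, QUDT IRI) pairs.
TOUCH_POINTS = [
    (SWEET + "propQuantity/Quantity", QUDT + "Quantity"),
    (SWEET + "relaSci/hasUnit", QUDT + "unit"),
    (SWEET + "reprSciUnits/kilometer", UNIT + "KiloM"),
    (SWEET + "reprSciUnits/MHz", UNIT + "MegaHZ"),
]


def to_turtle(pairs, predicate="skos:exactMatch"):
    """Serialise mapping pairs as Turtle triples (hypothetical helper).

    The predicate is an assumption; a stronger or weaker relation may be
    more appropriate for classes and properties than for unit instances.
    """
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> .", ""]
    lines += [f"<{sweet}> {predicate} <{qudt}> ." for sweet, qudt in pairs]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle(TOUCH_POINTS))
```

Keeping the predicate as a parameter makes it easy to switch relations once the modelling question is settled.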

@lewismc
Member

lewismc commented Dec 4, 2019

@SirkoS, do you have an update on your work? Thank you.

@lewismc lewismc added the SCIWS2 label Dec 4, 2019
@SirkoS

SirkoS commented Dec 5, 2019

We only created mappings for the unit instances. Other classes etc. would have to be added. The mappings we created at the time can be found in the respective GitHub repo in the file a100 ontology - unitMapping.ttl. (I just fixed the mappings there, as only a subset was uploaded previously by mistake.)

I extracted the mappings between SWEET and QUDT into the following file: SWEET3 - QUDT mapping.txt. This also contains the mappings within SWEET and QUDT themselves, as we found some duplicates within the ontologies at the time.

This is also supported by the provided scripts. So if you need another mapping subset, you can either run the scripts yourself (available here - again the a001 ... file would need a small change at the top of the script) or ping me again.

I was not able to rerun the whole analysis but just reused our old results. So the mappings are based on SWEET 3.1 - I don't know if there were any major changes to the units themselves since then.

We were/are also planning to push all our mappings to Wikidata, which might be relevant for #156 . However, there are two obstacles right now:

  • The property for mapping QUDT needs to be fixed. Its formatter URL only works for some mappings: it has a fixed namespace, whereas QUDT uses several. So before we can push, this needs to be taken care of.
  • We never really curated the Wikidata mappings for completeness. That's a manual process and Wikidata has lots of units and almost as many issues with regard to this (duplicates/typos/homonyms etc).
  • Lack of time right now 😉

As for the general outlook: @jmkeil is working on generalizing the approach, but I have no clue about progress and schedule for that.

@dr-shorthair
Collaborator

  1. The QUDT mappings a100 ontology - unitMapping.ttl appear to be broken - the first in the list is http://qudt.org/vocab/unit#MinuteAngle which does not exist at QUDT - think the URI is http://qudt.org/vocab/unit/ARCMIN

  2. Similarly, looking in SWEET3 - QUDT mapping.txt I see http://qudt.org/vocab/unit#DegreeFahrenheit which should be http://qudt.org/vocab/unit/DEG_F

So I'm not sure if these tabulations are helpful at present.

@SirkoS

SirkoS commented Dec 6, 2019

  1. The QUDT mappings a100 ontology - unitMapping.ttl appear to be broken - the first in the list is http://qudt.org/vocab/unit#MinuteAngle which does not exist at QUDT - think the URI is http://qudt.org/vocab/unit/ARCMIN

  2. Similarly, looking in SWEET3 - QUDT mapping.txt I see http://qudt.org/vocab/unit#DegreeFahrenheit which should be http://qudt.org/vocab/unit/DEG_F

So I'm not sure if these tabulations are helpful at present.

I guess you are referring to QUDT2. As I tried to mention, I just ran the respective script based on our old results, which only included QUDT1.1 at the time. QUDT1.1 still has those units in (see http://qudt.org/1.1/vocab/OVG_units-qudt-(v1.1).ttl ):

I will try a rerun for QUDT2, but this will take some time. Also I currently do not have the time for a manual revision. So there might be some mappings missing, but maybe it can serve as the basis for your efforts here.

@dr-shorthair
Collaborator

I guess you are referring to QUDT2

I was looking at the thing that the base URI http://qudt.org/vocab/unit resolves to. Now I fully understand that QUDT was a bit late to the party in publishing their resources as linked data, but anyone who doesn't know about the OVG file would likely try to resolve the URIs in the mapping file first.

@dr-shorthair
Collaborator

dr-shorthair commented Dec 6, 2019

Meanwhile, I've submitted an issue to QUDT to add sameAs links from the old URIs to the new ones into the graph.
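
As a rough illustration of what such links could look like, here is a small sketch that emits owl:sameAs triples from the old QUDT 1.1 style URIs to the new ones. It only contains the two pairs suggested earlier in this thread; the helper names are hypothetical and nothing here is taken from QUDT's own graph.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Only the two old -> new pairs suggested earlier in this thread.
OLD_TO_NEW = {
    "http://qudt.org/vocab/unit#MinuteAngle": "http://qudt.org/vocab/unit/ARCMIN",
    "http://qudt.org/vocab/unit#DegreeFahrenheit": "http://qudt.org/vocab/unit/DEG_F",
}


def same_as_triples(old_to_new):
    """Return N-Triples lines linking each old URI to its new URI."""
    return [
        f"<{old}> <{OWL_SAME_AS}> <{new}> ."
        for old, new in sorted(old_to_new.items())
    ]


def upgrade(iri, old_to_new):
    """Return the new URI if one is known, otherwise the input unchanged."""
    return old_to_new.get(iri, iri)


if __name__ == "__main__":
    for line in same_as_triples(OLD_TO_NEW):
        print(line)
```

The same lookup table could be applied to a mapping file with upgrade() so that every QUDT URI in it resolves, while unknown URIs pass through untouched for manual review.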

@SirkoS

SirkoS commented Dec 14, 2019

So I have now managed to rerun the scripts for QUDT2 as well. Again, a warning: this has not been validated as carefully as the results in our previous paper, but it may serve as a starting point for you. I found some issues in QUDT2 (see the issue in their repo), which should not affect the mapping - none of the affected IRIs are contained in the mapping.

Mapping SWEET - QUDT2

PS: Due to the structure of our scripts this also includes the mappings between duplicates within SWEET (e.g., meter and metre).

PPS: Right now the scripts find ~71% of SWEET's units in QUDT2, whereas it only sees 10% of QUDT2's units in SWEET. So right now QUDT2 has a lot more units: 901 vs 137 (just by IRIs).
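
To make clear how percentages like these can be counted, here is a minimal sketch, not our actual scripts. The identifiers in the toy data are placeholders, and the function name is made up. It also shows why the two directions need not match: duplicates such as meter and metre count twice on one side but map to a single unit on the other.

```python
def coverage(pairs, left_units, right_units):
    """Share of units on each side that take part in at least one mapping."""
    valid = [(l, r) for l, r in pairs if l in left_units and r in right_units]
    left_mapped = {l for l, _ in valid}
    right_mapped = {r for _, r in valid}
    return len(left_mapped) / len(left_units), len(right_mapped) / len(right_units)


# Toy data with placeholder identifiers (NOT the real unit lists).
sweet_units = {"sweet:kilometer", "sweet:MHz", "sweet:meter", "sweet:metre",
               "sweet:exampleUnmapped"}
qudt_units = {"qudt:A", "qudt:B", "qudt:C", "qudt:D", "qudt:E"}
pairs = [
    ("sweet:kilometer", "qudt:A"),
    ("sweet:MHz", "qudt:B"),
    ("sweet:meter", "qudt:C"),
    ("sweet:metre", "qudt:C"),  # duplicate label, same target unit
]

if __name__ == "__main__":
    sweet_share, qudt_share = coverage(pairs, sweet_units, qudt_units)
    print(f"SWEET units found in QUDT: {sweet_share:.0%}")
    print(f"QUDT units found in SWEET: {qudt_share:.0%}")
```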

@lewismc
Member

lewismc commented Dec 15, 2019

@SirkoS this is excellent.

I have a few questions

  1. Regarding the PS, do you think we should remove these 'duplicates'? Do you have any idea of how many of them exist?
  2. Do you think it would be worthwhile for us to notify the QUDT2 devs that there is a chunk (~29%) of (somewhat poorly defined, e.g. no definitions) knowledge which exists in SWEET and should maybe be added to QUDT2?
  3. Generally, it would be great to run this alignment maybe prior to every release of SWEET such that the alignment graph is updated to reflect changes in both SWEET and more likely QUDT2. Do you have a documentation process which we could follow to reproduce your result?

Thanks

@lewismc lewismc removed the SCIWS2 label Dec 15, 2019
@lewismc lewismc changed the title from "Mapping to QUDT" to "Mapping to QUDT2" Dec 15, 2019
@dr-shorthair
Collaborator

QUDT development has recently moved to an open GitHub repository here: https://github.com/qudt/qudt-public-repo - @SirkoS and I have contributed issues there

@SirkoS

SirkoS commented Dec 15, 2019

  1. Regarding the PS, do you think we should remove these 'duplicates'? Do you have any idea of how many of them exist?

I exported a full list of "duplicates" here. However, I want to add a word of caution: I think most (all?) of them are already connected via owl:sameAs or owl:equivalentClass (that's how we extracted them). The question is rather whether these require multiple IRIs or whether alternative labels would be enough in most cases. There are a few exceptions, though (just listing examples - I have not checked all in detail):

/phenEnergy/Geothermal /realmLandVolcanic/Geothermal
/reprSciUnits/permil /reprSciUnits/perMil
/propChemical/Alkalic /propChemical/Alkaline /propChemical/Alkalinity /propChemical/Basic /propChemical/Basicity

The first seems to be an undetected duplicate across different namespaces.
The second one looks like a typo.
The last one is actually something I disagree with. The chain starts with the associations in /propChemical/Alkalinity, which is connected via owl:equivalentClass to /propChemical/Basic. /propChemical/Alkalic is connected to /propChemical/Alkalinity in the same way. By transitivity, /propChemical/Alkalic and /propChemical/Basic end up equivalent. I think a weaker relationship would be in order here.
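
To illustrate the transitivity effect, here is a small sketch that groups IRIs into equivalence clusters with a union-find. It is not the code from our scripts, the function name is made up, and it only uses the two equivalences described above.

```python
def equivalence_clusters(axioms):
    """Group IRIs connected (directly or transitively) by equivalence axioms."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in axioms:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return [members for members in clusters.values() if len(members) > 1]


# The two owl:equivalentClass statements described above.
AXIOMS = [
    ("/propChemical/Alkalinity", "/propChemical/Basic"),
    ("/propChemical/Alkalic", "/propChemical/Alkalinity"),
]

if __name__ == "__main__":
    for cluster in equivalence_clusters(AXIOMS):
        print(sorted(cluster))
```

Although no axiom links /propChemical/Alkalic and /propChemical/Basic directly, both land in the same cluster, which is exactly the inferred equivalence I would question.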

  2. Do you think it would be worthwhile for us to notify the QUDT2 devs that there is a chunk (~29%) of (somewhat poorly defined, e.g. no definitions) knowledge which exists in SWEET and should maybe be added to QUDT2?

Right now I don't think that there is one ontology with the claim to be the universal unit ontology (although most names suggest otherwise). Each one caters to its specific domain and will in time contain the units relevant there. As QUDT2 now has a more open process compared to the first version, I expect it to stabilize as well after a while.
If you really want to have a unit ontology "to rule them all", you would first have to discuss the modelling. There are some more decisions made for QUDT2 that I disagree with.
Furthermore, it is then not only about those few SWEET units missing in QUDT2. Even in our old study, Wikidata included over 4000 units [1]. Do you want to include all of them? If not, by which criteria would you decide?
So in summary, I don't think it is worthwhile to push larger sets of units into an ontology without a specific (application-driven) demand for that.

[1] I'm currently unable to update the results for Wikidata, as I keep running into timeouts. I fear this will require at least a substantial rewrite of the queries, maybe even of the scripts, so we can split the queries into multiple smaller ones.

  3. Generally, it would be great to run this alignment maybe prior to every release of SWEET such that the alignment graph is updated to reflect changes in both SWEET and more likely QUDT2. Do you have a documentation process which we could follow to reproduce your result?

In theory, all scripts are openly available here. I had to make some minor adjustments to make them run again, but in general the instructions should suffice. However, it still requires quite some manual effort to validate all results, which we could not automate at the time.

Regarding your proposal, @jmkeil and I have already discussed something similar in the past. As mentioned before, he is still working in that direction. I'll try to talk to him about whether he sees this as a potential application of his work and whether he wants to join the discussion here. Nevertheless, I think this can only be some kind of dashboard giving you hints towards issues and not a fully automated quality-check pipeline.

@dr-shorthair
Collaborator

To assist with unit mappings I added UCUM codes in #240 and #238.

(This is not a QUDT mapping as such, but since both SWEET and QUDT carry UCUM codes, mapping SWEET to QUDT becomes much easier.)
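
A minimal sketch of that idea, assuming each side can be reduced to a table of unit IRI to UCUM code. The two tables below are illustrative only: the IRIs are ones mentioned earlier in this thread, and the codes "km" and "MHz" are the standard UCUM codes for those units, not values read from either ontology. The function name is made up.

```python
# Illustrative tables: unit IRI -> UCUM code (not extracted from the ontologies).
SWEET_UCUM = {
    "http://sweetontology.net/reprSciUnits/kilometer": "km",
    "http://sweetontology.net/reprSciUnits/MHz": "MHz",
}
QUDT_UCUM = {
    "http://qudt.org/vocab/unit/KiloM": "km",
    "http://qudt.org/vocab/unit/MegaHZ": "MHz",
}


def join_on_ucum(left, right):
    """Pair up IRIs from both sides that share the same UCUM code.

    UCUM codes are case sensitive, so they are compared exactly.
    """
    by_code = {}
    for iri, code in right.items():
        by_code.setdefault(code, []).append(iri)
    return sorted(
        (left_iri, right_iri)
        for left_iri, code in left.items()
        for right_iri in by_code.get(code, [])
    )


if __name__ == "__main__":
    for sweet_iri, qudt_iri in join_on_ucum(SWEET_UCUM, QUDT_UCUM):
        print(sweet_iri, "<-->", qudt_iri)
```

Units without a UCUM code on either side simply drop out of the join, so the result would still need a manual pass for the remainder.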
