Create and document selection criteria for inclusion in the ontology #7

mellybelly · 2019-02-27T19:07:38Z

We should have a modular strategy for inclusion of ontology content. What we don't want is a random smattering of content from numerous resources as these will not have good semantic interoperability. We want to be able to do analytics on these data, not just link them.

My understanding is that the goal here is purely an application ontology for our discovery tools. As such, we should not need any new content.

Some considerations:
For library-related document types we should use whatever is standard in the LinkedData4Libraries project or in MARC. @eichmann please weigh in on this.

There could be a priority order, such as VIVO-ISF > OBI > NCIT, etc. I would avoid one-off content from domain-specific sources, such as a FlyBase controlled vocabulary.

We should choose content that is resolvable and has definitions, synonyms, and is updated regularly.
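A minimal sketch of how the priority order and the criteria above could be applied together when several ontologies offer a term for the same concept. The `Candidate` fields, the `pick_term` helper, and the example IRIs are hypothetical and only illustrate the idea; the priority list simply reuses the example order suggested in this comment.

```python
from dataclasses import dataclass
from typing import List, Optional

# Example priority order from the comment above; the real order is still undecided.
PRIORITY = ["VIVO-ISF", "OBI", "NCIT"]


@dataclass
class Candidate:
    """A candidate term for one concept (all fields are illustrative)."""
    ontology: str
    iri: str
    resolvable: bool
    has_definition: bool
    has_synonyms: bool
    updated_regularly: bool


def meets_criteria(c: Candidate) -> bool:
    # Resolvable, has a definition and synonyms, and comes from a maintained source.
    return c.resolvable and c.has_definition and c.has_synonyms and c.updated_regularly


def pick_term(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the eligible candidate from the highest-priority ontology, or None."""
    eligible = [c for c in candidates if c.ontology in PRIORITY and meets_criteria(c)]
    if not eligible:
        return None  # nothing suitable: a new term request may be needed
    return min(eligible, key=lambda c: PRIORITY.index(c.ontology))


candidates = [
    Candidate("NCIT", "http://example.org/ncit/term", True, True, True, True),
    Candidate("VIVO-ISF", "http://example.org/vivo/term", True, False, True, True),
    Candidate("OBI", "http://example.org/obi/term", True, True, True, True),
]
chosen = pick_term(candidates)
print(chosen.ontology if chosen else "no eligible term")
```

In this made-up example the VIVO-ISF candidate is skipped because it lacks a definition, so the OBI term wins over the NCIT one.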

Let's also think about the modules for import purposes: we can't maintain things very well without modularity, and it's also more useful to others if it's modular.

marijane · 2019-02-27T20:01:09Z

I think the Outputs & Activities Mapping spreadsheet might be a good place to prioritize which ontologies we want to borrow from: https://docs.google.com/spreadsheets/d/1Mw8gK2NUGM8po7GGRtJTShRM2QNFoR19VJrjQuVCDW8/edit#gid=1412211690
We're using ROBOT to create most of the extraction modules. You can see them in the src/ontology/imports folder. They've all been generated from downloadable OWL files for the source ontologies, so they are resolvable, but I do think some of them may be missing definitions. There are a couple of ontologies from which we are only taking a single term. We have created modules for those as well, but manually rather than with ROBOT.
We need to make some decisions about what kind of extraction we want. The initial extractions brought in a bunch of extra content, basically everything mentioned in class annotations, I think.
There are some output types for which we have not been able to find existing ontology terms, so those have been created as new classes.

nicolevasilevsky · 2019-02-27T20:06:09Z

From my discussion with Melissa, it sounds like we want to extract terms from only a limited number of ontologies, and if we cannot find existing terms, we should make new term requests to the ontologies we will use.

nicolevasilevsky · 2019-02-27T20:07:22Z

I created a new tab in the spreadsheet; see NISO output-ROOmapping2019-02-27.

I am making note of all the ontologies we've used so far, and noting where we need new term requests (marked as NTR).

@marijane we can discuss further when we meet today at 4pm,

and the rest of us can discuss when we meet tomorrow.

nicolevasilevsky · 2019-02-27T20:54:22Z

In this spreadsheet, you can see we are currently using the following ontologies in ROO:

Ontology	Number of classes in ROO currently
Bibo	6
BRO	1
CLO	1
Edam	1
Fabio	1
IAO	4
MeSH	1
NCIT	55
OBI	3
OMIT	17
SIO	3
VIVO-ISF	32

There are 35 terms that didn't map to existing terms. We added them as new classes with ROO IDs, which I guess can be considered placeholders. We'll need to submit new term requests to our preferred ontologies once we determine what those are.

Questions/Blockers

1. @mellybelly and all: which ontologies do we want to remove from the list above (and re-map those terms)?
2. What other ontologies/vocabularies should we try to use?
2a. What is the priority for each ontology, i.e., VIVO-ISF > NCIt, etc.?
3. Can someone review the mappings (when they are redone) and confirm they are happy with these mappings from the ontologies we do decide to use?

nicolevasilevsky · 2019-02-27T21:10:56Z

@tricfran can we discuss this ticket on our call tomorrow?

mellybelly · 2019-02-28T20:04:07Z

Can we first hear from some of the LinkedData4Libraries folks (@eichmann, @marijane) or folks at NWU about what is used most for bibliographic info?

Will review spreadsheet in meantime.
@marijane don't we want the class annotations?

nicolevasilevsky · 2019-02-28T20:05:49Z

thanks for reviewing the spreadsheet @mellybelly

marijane · 2019-02-28T20:08:53Z

@eichmann can you post some links to the LD4L stuff you showed us in the call today?

@mellybelly we want the labels/comments/etc., but ROBOT appears to have pulled in all of the logical definitions as well, along with everything referenced in them. Whole trees of stuff got pulled in, and I'm not sure we want that. You can see what I mean if you open the OWL file in Protege.
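One option worth testing for a slimmer module is ROBOT's MIREOT extraction method, which keeps the named terms and their position in the hierarchy rather than the full set of logical axioms. The sketch below only assembles such a command line so it can be reviewed before running; the file names are hypothetical, and whether MIREOT is the right choice for these modules is an assumption, not something decided in this thread.

```python
import shlex
from typing import List


def robot_extract_command(source_owl: str, terms_file: str, output_owl: str,
                          method: str = "MIREOT") -> List[str]:
    """Build (but do not run) a `robot extract` invocation."""
    # MIREOT reads its term list via --lower-terms; the SLME methods
    # (STAR, BOT, TOP) read theirs via --term-file.
    term_flag = "--lower-terms" if method == "MIREOT" else "--term-file"
    return [
        "robot", "extract",
        "--method", method,
        "--input", source_owl,
        term_flag, terms_file,
        "--output", output_owl,
    ]


# Hypothetical paths, for illustration only.
cmd = robot_extract_command("obi.owl", "obi_terms.txt", "imports/obi_import.owl")
print(shlex.join(cmd))
```

Comparing the output of this against the existing module in Protege would show how much of the extra content comes from the extraction method itself.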

mellybelly · 2019-02-28T20:20:26Z

OK, I made notes in the document. I think we should prioritize bibliographic resources, resources with synonyms/text definitions, and those where we'd use more than one term.

We also need some basic requirements analysis for what this application ontology structure should be. There is no benefit to having an ontology if there is no classification; otherwise it's just a tag library (which might be fine also!)
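As a quick check of the "more than one term" point, here is a small sketch that uses the class counts posted earlier in this thread to list the ontologies currently contributing a single class. The counts are copied from that table; the variable names are just for illustration.

```python
# Classes per source ontology in ROO, as posted earlier in this thread.
class_counts = {
    "Bibo": 6, "BRO": 1, "CLO": 1, "Edam": 1, "Fabio": 1, "IAO": 4,
    "MeSH": 1, "NCIT": 55, "OBI": 3, "OMIT": 17, "SIO": 3, "VIVO-ISF": 32,
}

# Ontologies from which only one class is used: candidates for re-mapping.
single_term_sources = sorted(name for name, n in class_counts.items() if n == 1)

# Total number of imported classes (excludes the 35 new ROO placeholder classes).
total_imported = sum(class_counts.values())

print("Single-term sources:", ", ".join(single_term_sources))
print("Total imported classes:", total_imported)
```

By this count, five of the twelve sources (BRO, CLO, Edam, Fabio, MeSH) supply one class each.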

mellybelly assigned marijane, nicolevasilevsky and eichmann Feb 27, 2019

mellybelly mentioned this issue Feb 27, 2019

Create and document selection criteria for inclusion in the ontology data2health/architecting_attribution#19

Closed

mellybelly mentioned this issue Feb 28, 2019

requirements analysis and documentation for ROO #8

Open
