
20180823 Ontology Change Improvement Call

marijane white edited this page Aug 23, 2018 · 1 revision

2018.08.23

Attendees: Marijane White, Damaris Murry, Ralph O'Flinn, Tatiana Walther, Mike Conlon, Brian Lowe, Christian Hauschke, Muhammad Javed

Agenda: https://wiki.duraspace.org/display/VIVO/2018-08-23+Ontology+Improvement+Call

Ongoing Issues: these are in the background, ongoing, moving on to item 2.

Language skills:

  • Ralph suggested we use an existing ISO standard for this
  • Mike notes there are existing URIs for this
  • We're also talking about internationalization on the ontology side
  • Ralph wants to make sure we're using the same set of information for languages everywhere
  • Mike notes we also use international standard language tags on text strings. It's a different standard, but it is still a standard.
  • Ralph says it's about using an international standard.
  • We all seem to be in agreement about this
  • We need to enumerate what kind of language skills we want to capture. Reading, speaking, etc.
  • Brian notes there is a scale in European CVs, and we should try to map to those
  • Ralph says Elements tracks this too, they've included a standard for this
  • Mike asks if there is a standard for this, Brian and Ralph will look into it
  • Brian shared https://en.wikipedia.org/wiki/Common_European_Framework_of_Reference_for_Languages
  • There will of course be modeling decisions about how we want to represent these in the ontology
  • Christian says the European standard is well known, and we should look for an existing ontology
  • Ralph shared http://id.loc.gov/vocabulary/iso639-5.html (ISO Standard)
  • First step is finding the standards, second step is finding some models
  • Tatiana says we should also have languages for publications
  • Ralph notes the ISO standard incorporates the European standard
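
The points above suggest a record that combines a standard language identifier, a kind of skill, and a proficiency level. A minimal Python sketch follows, assuming ISO 639-2 codes under the Library of Congress id.loc.gov namespace and the six levels of the Common European Framework of Reference. The class names, field names, and the list of skill types are hypothetical and are not part of the VIVO ontology.

```python
from dataclasses import dataclass
from enum import Enum

# Assumption: ISO 639-2 three-letter codes in the id.loc.gov vocabulary.
ISO639_2_BASE = "http://id.loc.gov/vocabulary/iso639-2/"


class SkillType(Enum):
    # Hypothetical list; the group still has to enumerate the skill types.
    READING = "reading"
    WRITING = "writing"
    SPEAKING = "speaking"
    LISTENING = "listening"


class CefrLevel(Enum):
    # The six CEFR levels, ordered from lowest to highest proficiency.
    A1 = 1
    A2 = 2
    B1 = 3
    B2 = 4
    C1 = 5
    C2 = 6


@dataclass(frozen=True)
class LanguageSkill:
    person_uri: str      # local URI of the person (illustrative)
    language_code: str   # ISO 639-2 code, e.g. "eng"
    skill: SkillType
    level: CefrLevel

    def language_uri(self) -> str:
        """Build the language URI from the ISO 639-2 code."""
        return ISO639_2_BASE + self.language_code.lower()


skill = LanguageSkill(
    "http://example.org/person/1", "ENG", SkillType.READING, CefrLevel.B2
)
print(skill.language_uri())
```

Keeping the skill type and the level as separate fields lets one person hold, for example, a higher reading level than speaking level in the same language. How these would become classes and properties in the ontology is a modeling decision the sketch does not settle.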

Identifiers:

  • TIB has been discussing issues around using identifiers for years. They have a German authority file, GND, and they would like to use its identifiers for concepts, person IDs, and sometimes organization IDs. They want to model this in the ontology and they are not sure how best to do it. Should they create a new property and link directly to the GND ID URI as an object property rather than a datatype property?
  • Mike says having it as a property is very useful for queries.
  • ORCiD is a URI in VIVO, which still needs more work, but most identifiers are properties
  • Brian is not completely convinced we're doing this the right way, it might be nice to not have so many. Seems nice to have at least a sameAs link to the URIs to local data so you can serve data from your local server and have data you control while also linking out to a standard URI.
  • Mike notes we have a lot of old-fashioned US IDs like the ERA Commons ID, which will never be anything but text strings.
  • But also there are things like ISSNs which are currently literals but there is a standard URI scheme we could also use.
  • We need to better understand our two use cases in VIVO and perhaps think about how it could be improved. Things that are URIs and then also text strings that are properties of entities instead of literals. Need to understand what intentions were. Mike expects we will find inconsistencies
  • This will definitely be a project, there is not a simple answer. Mike likes the sameAs idea.
  • TIB has an internal conflict between someone who is a big sameAs fan and Christian, who is not.
  • Marijane shared http://patterns.dataincubator.org/book/equivalence-links.html
  • Mike is interested in cleaning up the mess, and the GND identifiers are an opportunity to look at how we're doing things and do a better job. The idea that there needs to be a good global solution for identifiers can be decoupled from the idea that we need to implement the GND identifiers.
  • Tatiana shared https://www.w3.org/2009/12/rdf-ws/papers/ws21
  • Mike thinks we need a use case paragraph from Christian describing what TIB is trying to get done. Christian will try to do this.
  • Mike will do a review of identifiers in the VIVO ontology. It will cover the proliferation of identifiers, which we're unhappy about, the ORCiD implementation, and the mystery of how vivo:identifier is being used. None of this may address TIB's issue.
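
The two modeling options discussed above can be compared side by side. A minimal sketch in plain Python emits N-Triples for both. The `gndId` property and all `example.org` URIs are hypothetical, the identifier value is a made-up placeholder, and `https://d-nb.info/gnd/` is assumed as the GND URI prefix.

```python
from typing import List

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
GND_BASE = "https://d-nb.info/gnd/"              # assumed GND URI prefix
EX_GND_ID = "http://example.org/ontology#gndId"  # hypothetical datatype property


def literal_triple(subject: str, predicate: str, value: str) -> str:
    """Option 1: the identifier is a text string (datatype property)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


def uri_triple(subject: str, predicate: str, obj: str) -> str:
    """Option 2: the identifier is a link to another resource."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def gnd_statements(local_uri: str, gnd_id: str) -> List[str]:
    """Return both representations for one local entity."""
    return [
        literal_triple(local_uri, EX_GND_ID, gnd_id),
        uri_triple(local_uri, OWL_SAME_AS, GND_BASE + gnd_id),
    ]


# Placeholder identifier, not a real GND record.
for line in gnd_statements("http://example.org/person/1", "000000000"):
    print(line)
```

The literal form is easy to filter on in queries, which is the benefit Mike describes. The sameAs form keeps locally controlled data while linking out to the standard URI, which is the benefit Brian describes. Nothing prevents asserting both, at the cost of more statements to maintain.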

The Arts:

  • We have two examples of this (see links in agenda at the Duraspace wiki, linked above)
  • We were calling this Humanities, but it turns out the examples are more about the Arts
  • Mike would still like to understand the needs of the Humanities if we can find examples
  • The Arts includes things like performance art, etc.
  • Things like Art History can have both the publication needs of the Humanities and the event needs of The Arts
  • Damaris put together some slides on Duke's Artistic Works Ontology
  • Just adding simple types of works was too simple
  • An attempt to work with Brown also became very complex. Works change over time, and people want to model very detailed things, like what works are derived from. The Artistic Works Ontology is a middle ground. It models Works, Roles, and Events with venue/date info as their own entities, because doing otherwise muddled things. It covers mostly visual arts, vs performing arts, with a few things that speak to the humanities like exhibits, audio recordings, etc. Faculty outside of the Arts hate that these things are called artistic works because they don't feel very creative, so Duke calls it "Artistic Works and Non-Print Media". Ultimately everyone wants it to be called publication, or Mike suggests "scholarly works".
  • Mike notes some of these things already exist in VIVO.
  • Mike asks how the types have held up? Damaris says pretty well, they've added maybe two or three over the 4-5 years they've been using it with a variety of faculty. Duke encourages faculty to pick as many as they want.
  • Role was even more important. People often have more than one role, and they change over time, like performer for a season for something that runs many seasons, people want to specify roles so they don't misrepresent their work.
  • Mike asks what a "Restager" is. Damaris says a restager takes an existing play or piece of choreography and fixes it, producing an original presentation that goes beyond the original stage direction. This is something that happens all the time, and it is considered a creative activity.
  • Damaris says a challenge in picking the types and roles was not wanting to add too much duplication, like "a painter of a painting", which is implied. They kept the role Creator, which gets used a lot. You can call yourself a painter but the role is creator.
  • Duke also added the idea of related works, which can be things that change over time or a piece of a larger whole, like when someone is representing a production, there are light designers, costume designers, composers, which all may have really well fleshed out representations, Duke wants them to link to those entities rather than describe them themselves.
  • Cited Artists/Collaborators - They try to link to people as they can, but collaborators may not be at Duke, so those are text strings. Mike likes that they don't use words like "External" or "Other", this is really well done.
  • Commissioned By is also a text string. Duke could not find anything existing like this. Mike says Florida makes these Organizations, so you can find all the works an organization has commissioned, but it means having to add them as they go. Similar to the Cited Artists implementation
  • Events were initially attributes of the work itself, but many faculty felt events were the same thing as a work, so they weren't showing up on profiles the way they wanted. Now Events and Artistic Works are separate sections, and there's a lot of duplication of information for each individual performance. Mike notes this is analogous to Presentations in VIVO, which is an event where their slide deck is associated with each event.
  • Javed did some work on this kind of stuff with BIBFRAME, where they consider the event and performances at the event as two separate works.
  • Mike notes the recording of the performance is another separate thing.
  • Damaris says Duke allows people to do something like that, but they can't add too much detail, events have to be added separately.
  • There's a lot to think about here, we need to proceed carefully.
  • Duke ran into some workflow issues around allowing people to share works, find others with profiles and linking to them, but this group is not there yet.
  • Mike says Duke is very good at workflow, which helps get good data, so it's worth paying attention to it.
  • Damaris notes that if a company is selling a copy of a performance, it may have an identifier, and people want to be able to add that. Mike says he's had people who want to turn their VIVO profile into their store, he encourages them to link to their store instead.
  • "That's a good idea, we'll look into it" is always a good way to respond to faculty suggestions.
  • Mike asks TIB for the German perspective, Christian says this is outside his area of expertise.
  • Mike says the Duke model represents years of experience.
  • Christian and Mike want to share Damaris' slides with colleagues to get feedback.
  • Tatiana shared https://www.performing-arts.eu/ as another example
  • We also have the Carnegie Hall model, which was presented at the VIVO conference. It may have more detail than we need. It is more focused on music.

Other things:

  • Marijane notes that Harvard is adding stuff to the ISF for eagle-i this summer
  • Brian asks if there is an obvious place on the wiki to document the triplestore update process. Mike says we can move any pages that might be created.

The VIVO-ISF ontology is an information standard for representing scholarly work.
