kim pham edited this page Nov 8, 2017 · 10 revisions

Time/Place

This meeting is a hybrid teleconference and IRC chat. Anyone is welcome to join. Here is the info:

Attendees

  • Rachel Tillay
  • Jonathan Roby
  • Mike Bolam
  • Amanda Lehman
  • Bethany Seeger
  • Kim Pham
  • Melissa Anez
  • Jared Whiklo 🌟
  • Jon Green
  • Natkeeran
  • A. Soroka

Agenda

  1. Go back to just 1:00 Eastern all the time?
  2. Second Install Sprint: https://github.com/orgs/Islandora-CLAW/projects/2
  3. Metadata Interest Group: We'd like your collaboration with how we can represent metadata in RDF, and yet still display and facet on useful parameters. For instance, if I have a digitized copy of Anne of Green Gables then I want to use Linked Data to say dcterms:creator http://id.loc.gov/authorities/names/n81018346. Or maybe it's a local history and the author doesn't have a LOC uri, so I made an "authority" for them myself and state that dcterms:creator <myurl.org/people/1023>. I (as the user and metadata manager) should not need to enter any more than this URI into the book's metadata. Logically, that single URI connects the information the system knows about the author (the spelling of their name, their dates, etc) with the book. But now I want to display the author on the book's landing, or facet, or search by name. Unless the string "Montgomery, Lucy Maud" is indexed somewhere with the book, we're going to be in some Special User Hell of navigating a linked data graph (see: British Museum) instead of a repository with human-readable metadata. What is our plan? Is there a microservice that can solve this?

Minutes

  1. Meeting time

    Attendance at the 9:00 am ET meeting is pretty low. The point was to encourage participation from Europe after the IslandoraCamp, but no one from Europe is attending that call. Most people who do attend are already attending the 1:00 pm ET call anyway. So can we cancel that time and move all calls back to 1:00 pm ET?

    Consensus to move back to all 1:00 pm ET calls for now.

  2. Second install sprint

    • Focus is feature parity with CLAW vagrant so that we can deprecate CLAW vagrant.
    • working to allow multi-server
    • Centos/RHEL support
    • When the sprint is done if we are at feature parity (between claw-playbook and claw_vagrant) then we will deprecate claw_vagrant and push for using the Ansible playbook.
    • If you are interested in following along, then please do so.
  3. Handling fields that might be a string OR a URI.

    Options for turning

    • make it a string
    • make it a URI to some controlled vocabulary

    Most have a mixture of things we could map to URIs, but if we did, we'd need to create local URIs for the values that are one-offs.

    Could we use a linked data element that allows a broader type of content, and have Islandora check whether the value is a URI? If it is, display the result of resolving that URI instead of the raw string. If nothing resolves, display the URI as a string.
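    The check-then-fall-back idea above could be sketched roughly as follows. This is only an illustration, not CLAW code: `looks_like_uri`, `display_value`, and the `LABELS` dictionary (standing in for whatever would actually resolve a URI to a label) are all hypothetical names.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a real resolver: maps a URI to a display label.
LABELS = {
    "http://id.loc.gov/authorities/names/n81018346": "Montgomery, Lucy Maud",
}


def looks_like_uri(value):
    """Return True when the value parses as an absolute http(s) URI."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def display_value(value, labels):
    """Choose what to show for a field that may hold a string or a URI."""
    if not looks_like_uri(value):
        return value  # plain string: show it as entered
    uri = value.strip()
    # Resolved URI: show its label. Unresolved URI: show the URI itself.
    return labels.get(uri, uri)


print(display_value("http://id.loc.gov/authorities/names/n81018346", LABELS))
print(display_value("Anne Smith", LABELS))
print(display_value("http://myurl.org/people/1023", LABELS))
```

    Note that this only decides what to display. It does not address the concern about one predicate having two different ranges.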

    What makes that difficult is having two different ranges for the predicate.

    It also makes it difficult to share your metadata with others, because the result depends on how they interpret your string.

    Bethany - Our plan is we are going to mint a new URI for that resource.

    Could you have multiple RDF mappings?

    Are you thinking that if it is a URI then mint a new resource for that?

    Some of the MIG might have concerns around identical names (e.g. Anne Smith): if you created an object each time that name occurred and then meshed them together, you'd end up with one resource for 3 different people.

    Adam - The fundamental problem is that people have been using names as identifiers, and names aren't identifiers. Now we are trying to change that, and to move the metadata from using names as identifiers to treating them as just names.

    It seems that we can't generically solve that problem.

    This could be a migration problem which we could solve later. Do we just mint a new URI for every instance of "Anne Smith", knowing that some are duplicates? These will all display the correct name in indexing, so that is not an issue. Once you want to start de-duplicating, you need to find which URIs reference the same person and fix them. You can do this in a piecemeal fashion.

    It is better to mint more URIs than fewer, because it is easier to say "these 3 things refer to the same resource" than to do the reverse: work out which of these 3 resources are different and tear them apart.
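    As a rough sketch of "mint first, merge later": every occurrence of a name gets its own URI, and de-duplication is recorded afterwards as a "same as" link, one pair at a time. The `AuthorityStore` class and the `http://example.org/people/` base URI are assumptions made for illustration only.

```python
import itertools


class AuthorityStore:
    """Toy store: one minted URI per name occurrence, merged later."""

    def __init__(self, base="http://example.org/people/"):
        self._base = base
        self._ids = itertools.count(1)
        self.names = {}     # uri -> name string, used for display and indexing
        self._same_as = {}  # duplicate uri -> uri it was merged into

    def mint(self, name):
        """Always create a new URI, even if the name was seen before."""
        uri = f"{self._base}{next(self._ids)}"
        self.names[uri] = name
        return uri

    def canonical(self, uri):
        """Follow merge links to the URI that currently represents this one."""
        while uri in self._same_as:
            uri = self._same_as[uri]
        return uri

    def merge(self, keep, duplicate):
        """Record that two URIs refer to the same person."""
        keep, duplicate = self.canonical(keep), self.canonical(duplicate)
        if keep != duplicate:
            self._same_as[duplicate] = keep


store = AuthorityStore()
a, b, c = (store.mint("Anne Smith") for _ in range(3))  # three separate URIs
store.merge(a, c)  # later, someone confirms a and c are the same person
print(store.canonical(c) == a, store.canonical(b) == b)
```

    The second "Anne Smith" stays separate until someone decides otherwise, which is the piecemeal clean-up described above.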

    We should be able to migrate based on what editing ability is available at the time: some might be able to migrate early, while others might require more advanced editing capabilities to merge/edit some of these resources.

    You're always going to have the capacity of editing stuff, so you will be able to do some of these. Change the display, etc.

    What about external vocabularies? Could we use them?

    There aren't a lot of people doing very much with this yet; we are on the cutting edge.

    Why would you not just link to the LoC name authority? There are new names added all the time, so we may not know at the time of migration.

    Make sure we also include the issue of linking to a different vocabulary: situations where you need to create a city locally, and the state it is in has an external URI but the city does not. Will you need to ingest the parent object or parent object hierarchy to get the faceting to work, or will it be possible to have the indexer do the work?

    If you are doing this at ingest, then you are either doing a lookup at indexing time or caching the values locally so you can access them.
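    One way to picture the "look up at indexing time, but cache locally" option is sketched below. All names here are hypothetical: `LabelCache`, `build_index_doc`, the `uris` key on the record, and the `fetch` callable that stands in for a real remote lookup.

```python
class LabelCache:
    """Remember labels locally so each URI is looked up at most once."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: uri -> label or None (stand-in for a remote lookup)
        self._labels = {}
        self.remote_calls = 0

    def label_for(self, uri):
        if uri not in self._labels:
            self.remote_calls += 1
            self._labels[uri] = self._fetch(uri)
        return self._labels[uri]


def build_index_doc(record, cache):
    """Copy the record and add a `<field>_label` string for each URI-valued field."""
    doc = dict(record)
    for field, uri in record.get("uris", {}).items():
        # Fall back to the URI itself when no label can be found.
        doc[f"{field}_label"] = cache.label_for(uri) or uri
    return doc


# Fake remote source for the example; a real one would query the authority.
remote = {"http://id.loc.gov/authorities/names/n81018346": "Montgomery, Lucy Maud"}
cache = LabelCache(remote.get)

creator = "http://id.loc.gov/authorities/names/n81018346"
book1 = build_index_doc({"title": "Anne of Green Gables", "uris": {"creator": creator}}, cache)
book2 = build_index_doc({"title": "Anne of Avonlea", "uris": {"creator": creator}}, cache)
print(book1["creator_label"], cache.remote_calls)
```

    With the label string stored alongside the book in the index document, display, faceting, and search by name do not need to walk the graph at query time.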

    Would you append some local data to your authority, but still reference an external authority?

    For the most part, I've never run into a situation where we have more information about a creator than LoC does. I don't know how frequently it would come up.

    We have a lot of local history stuff that contains names, places, etc that aren't important enough to rise to a national level.

    Previously in MIG we were going in the direction of mapping our data to the linked data elements that allowed a string. Now it seems that we should instead use an element that holds a URI, and later there will be some way to merge/edit the URIs.

    Each element range (ie. Author, Place, etc) would be a Drupal content-type so when we migrate your data, part of the migration will be to generate a new URI for each instance of author/place.

    Possibly keep track of what you have as you go and possibly re-use entities during the migration or create if they don't exist.
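    The "re-use or create" bookkeeping during a migration might look something like the sketch below. The `get_or_create` helper, the `registry` dictionary, and the `http://example.org/` URIs are illustrative assumptions, not part of any existing migration code.

```python
import itertools

_counter = itertools.count(1)


def mint(content_type):
    """Hypothetical minting function: returns a fresh local URI."""
    return f"http://example.org/{content_type}/{next(_counter)}"


def get_or_create(registry, content_type, label, mint):
    """Re-use the entity already created for this label, or mint a new one."""
    # Normalize whitespace and case so trivially different spellings match.
    key = (content_type, " ".join(label.split()).casefold())
    if key not in registry:
        registry[key] = mint(content_type)
    return registry[key]


registry = {}
first = get_or_create(registry, "place", "Toronto", mint)
second = get_or_create(registry, "place", "  toronto ", mint)  # re-used
third = get_or_create(registry, "person", "Toronto", mint)     # different content type
print(first == second, first != third)
```

    Matching on the label alone would merge every "Anne Smith" into one entity, so this kind of re-use is probably safer for places or controlled terms than for personal names.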

    Each institution's mapping will be different. It will be great to re-use predicates where people are describing the same data, but there will be unique differences.

    Kim - UTSC is working on moving some metadata into CLAW. "where we have a flat list of vocabularies that become nodes, then we ingest records of people/documents that have fields that use these vocabularies that reference those nodes or generate new nodes if those vocabulary terms don’t exist" https://github.com/digitalutsc/dragomans_migrate

    Bethany - in case it's helpful, this is a great website to show linked data in action on a digital repo - and how you can keep going through related information: http://lod.library.unlv.edu/nav/jhp/#http://ld.library.unlv.edu/jhp000065

This is an archive. For new Tech Call notes, click here

