Nick Ruest edited this page Apr 20, 2016 · 8 revisions

Time/Place

This meeting is a hybrid teleconference and IRC chat. Anyone is welcome to join. Here is the info:

Attendees

  • Nick Ruest
  • Melissa Anez
  • Marcus Barnes
  • Ben Rosner
  • Jared Whiklo 🌟
  • Peter Murray
  • Esmé Cowles
  • Trey Pendragon
  • Ed Fugikawa

Agenda

  1. PCDM Works & CLAW/Hydra alignment
  2. Docker work
  3. Sprint discussion
  4. ... (feel free to add agenda items)

Minutes

  1. Discussion at LDCX about making sure that PCDM is implemented in a consistent manner in Islandora & Hydra. Figure out how we can coordinate to ensure it remains interoperable.

    A big piece of the discussion is that Hydra has settled on the Works extension, which adds a couple of restrictions and different names. There are only so many restrictions that you can include in an ontology, so it is more about how you combine them.

    PCDM does not allow a PCDM:File to have any descriptive metadata, so the solution was to have a FileSet in front of the Files to include that metadata. Therefore a "book" PCDM:Object does not have a file directly, instead it has a PCDMWorks:FileSet and then the PCDM:Files are inside that FileSet. The File is a PCDM object representing the bitstream (the file on the disk).
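
    A minimal sketch of that layering, using plain Python tuples as stand-in triples. The `ex:` names and the `dcterms:title` value are made up for illustration, and linking the FileSet to the object with `pcdm:hasMember` is an assumption, since the call did not name the predicate.

```python
# Illustrative triples only; every ex: resource is hypothetical.
triples = [
    ("ex:book", "rdf:type", "pcdm:Object"),
    ("ex:book", "pcdm:hasMember", "ex:fileset1"),      # assumed linking predicate
    ("ex:fileset1", "rdf:type", "works:FileSet"),
    ("ex:fileset1", "dcterms:title", "Page 1 scan"),   # descriptive metadata sits on the FileSet
    ("ex:fileset1", "pcdm:hasFile", "ex:file1"),
    ("ex:file1", "rdf:type", "pcdm:File"),             # the bitstream
]


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def files_of(obj):
    """Follow object -> FileSet -> File; the object never holds a File directly."""
    return [
        f
        for fileset in objects(obj, "pcdm:hasMember")
        for f in objects(fileset, "pcdm:hasFile")
    ]
```

    Here `files_of("ex:book")` reaches the File only through the FileSet, while `objects("ex:book", "pcdm:hasFile")` is empty.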

    We are kind of doing that in this Islandora diagram.

    The FileSet would be between the /files and the pcdm:Object in that diagram.

    Could a FileSet be an RdfSource and an Indirect Container?

    A FileSet needs to be a BasicContainer with direct containment. There has been some discussion about whether files should be directly contained or could be indirectly contained.

    It would allow you to maintain different types of files for the same object.

    The FITS output is an object that describes another object. The current HydraWorks code does characterization and stores the information on the FileSet, but they might change this to put the FITS information on the File. If you want to have a File that describes another File, then you should put it somewhere else in the repository (perhaps a common area) and have it iana:describes the first file.
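
    As a rough illustration of the "File that describes another File" option, again with stand-in triples. The paths under `ex:` are hypothetical, and the only predicate taken from the discussion is `iana:describes`.

```python
# Illustrative triples only; both resource paths are hypothetical.
triples = [
    ("ex:files/page1.tiff", "rdf:type", "pcdm:File"),
    ("ex:common/page1-fits.xml", "rdf:type", "pcdm:File"),
    # The FITS file lives in a common area and points at the file it describes.
    ("ex:common/page1-fits.xml", "iana:describes", "ex:files/page1.tiff"),
]


def described_by(target):
    """Return the resources that declare iana:describes on the target."""
    return [s for s, p, o in triples if p == "iana:describes" and o == target]
```

    Looking up `described_by("ex:files/page1.tiff")` finds the FITS file without it having to live next to the file it describes.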

    The restriction in PCDM is that you can only have technical metadata on a pcdm:File, so all other metadata is stored on the FileSets.

    Hydra is implementing the Technical Metadata Working Group specs, they think. Wherever Hydra needed a predicate, they looked to the Working Group's recommendations first.

    The other piece of PCDM:Works is a recommendation that there is a Work for every realistic piece of the object.

    The old way was, for example, a "Book" as a Work and 10 FileSets each having a page, with a label and a picture of the page.

    Now you do it as parts. You get a Book (a Work), which has 10 Works; each Work has a FileSet for the image, and each FileSet has a File which stores the actual bitstream. Editor's note: my apologies if I butchered this or the previous example.
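
    The "parts" layout could be sketched like this. It is an assumption-laden illustration: the `ex:` names are invented, and `pcdm:hasMember` is assumed as the link between a Work and its parts.

```python
def build_book(n_pages):
    """Return illustrative triples for a book whose pages are themselves Works."""
    triples = [("ex:book", "rdf:type", "works:Work")]
    for n in range(1, n_pages + 1):
        page = f"ex:page{n}"
        fileset = f"{page}/fileset"
        image = f"{fileset}/image"
        triples += [
            ("ex:book", "pcdm:hasMember", page),   # assumed linking predicate
            (page, "rdf:type", "works:Work"),
            (page, "pcdm:hasMember", fileset),
            (fileset, "rdf:type", "works:FileSet"),
            (fileset, "pcdm:hasFile", image),
            (image, "rdf:type", "pcdm:File"),      # the actual bitstream
        ]
    return triples


def objects(triples, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


book = build_book(10)
pages = objects(book, "ex:book", "pcdm:hasMember")
```

    Each entry in `pages` is a Work with one FileSet, and each FileSet has one File; the Book itself has no FileSet of its own in this sketch.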

    Chapters are done using TopRanges and Ranges and reusing the Pages from the core Book, because these are a different representation or segmentation from the physical one.

    Islandora is not using ore:Aggregation correctly as we don't implement ore:ResourceMaps.

    Hydra is not implementing ore:ResourceMaps as a separate object. They are using the object as both an ore:Aggregation and an ore:ResourceMap.

    The core problem Hydra found with PCDM is that there is no way to tell where you are in the structure with only two classes. The good thing that PCDM has is the optional ordering and predicates for interoperability.

    HydraWorks allows you to tell where you are in the structure. The Hydra developer community felt that if there had been more time to flesh out the PCDM model, it would have been closer to HydraWorks.

    If everyone thinks that PCDM is not enough to use by itself, then we should fold HydraWorks (now PCDMWorks) into core PCDM.

    How does Hydra handle access to a specific page of an ordered object? They don't currently have a use case for access to a page by page number. Rather, you want a page from a Work with specific text. This can come from the index.

    A FileSet holds common metadata for all Files in the FileSet. FileSets are a way to group automatically generated derivatives: you could upload a big TIFF and derive OCR, thumbnail and JPEG2000 versions. If all of those Files represent the same thing, then they fit in the same FileSet.

    If you have multiple thumbnails from a single File, then you might need different FileSets for each.

    Use a predicate to say what the original file is. The alternative is that you don't say which is the original and instead choose the best version of the file for your particular needs. For example, when generating derivatives, the original is the highest-quality version available.
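
    The two options could be combined roughly as follows. Everything here is hypothetical: `FileInfo`, its `quality` rank and the `is_original` flag (standing in for whatever predicate would mark the original) were not specified on the call.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    quality: int               # hypothetical rank: higher means better quality
    is_original: bool = False  # stands in for an "original file" predicate


def pick_source(files):
    """Prefer a File marked as the original; otherwise take the best quality."""
    marked = [f for f in files if f.is_original]
    if marked:
        return marked[0]
    return max(files, key=lambda f: f.quality)


# A FileSet where nothing is marked as the original.
fileset = [
    FileInfo("thumb.jpg", 1),
    FileInfo("page.jp2", 2),
    FileInfo("page.tiff", 3),
]
```

    With no File marked, `pick_source(fileset)` falls back to the highest-ranked File; once one File carries the flag, that File wins regardless of rank.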

  2. Another sharing conversation is the Islandora-CLAW Docker/Ansible stuff.

    This is now live and this type of devops stuff could be shared between the two communities.

We should check in every couple of months to make sure the two communities are staying on-track.

Also Ben Armintor put out a call for PCDM:Works to get use cases to work into test fixtures. This can be used for a TCK for PCDM compliance.

This compilation of minutes is the recollection of Jared. Until such time as someone corrects them, they shall be considered facts.

This is an archive. For new Tech Call notes, click here
