Archive to IIIF Presentation model transforms

jabrah edited this page Jan 29, 2015 · 6 revisions
Store                                 top Collection
  - BookCollection                        - Collection
      - Book                                  - Manifest
                                              - Sequence
                                              - Canvases
                                              - Annotation Lists
                                              - Ranges
                                              - Layers

Collections are transformed naturally. Books, however, contain more information than a Manifest alone holds. A book in the archive has all the information needed to build a manifest, a sequence, all canvases in the sequence, other structures such as ranges and layers, and all annotations of interest, collected into annotation lists. The transforms themselves are fairly straightforward.

Store to Top Collection

Store.listCollections()   ---->    (top) Collection

At the top level is the archive, which is not directly modeled in rosa-archive-model. Instead, it is given to the Store when the Store is initialized. All of the collections are held in the archive and are accessible through the Store. A list of all collections is kept in the top-level Collection in rosa-iiif-presentation-model. This is a regular Collection object that always has the label 'top' and holds a list of URIs pointing to the other Collections.
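As a rough illustration of this transform, the sketch below builds a top Collection from a list of collection names. It is written in Python for brevity and is not the project's code: the `Collection` class, the `build_top_collection` function, the URL layout, and the collection names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Collection:
    """Simplified stand-in for a presentation-model Collection (hypothetical)."""
    id: str
    label: str
    collections: List[str] = field(default_factory=list)  # URIs of child Collections


def build_top_collection(base_url: str, collection_names: List[str]) -> Collection:
    # The URL layout used here is an assumption made for this sketch.
    return Collection(
        id=f"{base_url}/collection/top",
        label="top",  # the top Collection always carries this label
        collections=[f"{base_url}/collection/{name}" for name in collection_names],
    )


# Hypothetical names, standing in for the result of Store.listCollections()
top = build_top_collection("http://example.org/iiif", ["collectionA", "collectionB"])
```

The top Collection holds only URIs, so a client would fetch each child Collection separately.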

BookCollection to Collection

BookCollection                     Collection
  - list of books                    - list of Manifests
  - list of languages                - supported languages in TextValue
  - missing_image.tif                - (used for substitute images in Manifests)
  - narrative_sections.csv           - ()
  - illustration_titles.csv          - (used to generate annotations for illustrations in Manifests)
  - character_names.csv              - (used to generate annotations for illustrations in Manifests)

Each BookCollection in the Store has a list of books, which is turned into a list of URIs pointing to a Manifest for each book. The rest of the data is shared between all the books and is used while transforming Books into Manifests plus related objects.
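A minimal sketch of this step, assuming a plain dictionary as the output and a made-up URL layout: the function name `collection_from_book_collection` and its parameters are hypothetical, and only the two mappings described above (books to Manifest URIs, languages carried over) are shown.

```python
from typing import Dict, List


def collection_from_book_collection(
    base_url: str, collection_id: str, book_ids: List[str], languages: List[str]
) -> Dict[str, object]:
    # Each book becomes a URI pointing at its Manifest, not an embedded Manifest.
    # The "<base>/<collection>/<book>/manifest" layout is an assumption.
    manifests = [f"{base_url}/{collection_id}/{book_id}/manifest" for book_id in book_ids]
    return {
        "id": f"{base_url}/collection/{collection_id}",
        "label": collection_id,
        "manifests": manifests,
        "languages": list(languages),  # languages supported for text values
    }


# Hypothetical collection and book identifiers
col = collection_from_book_collection(
    "http://example.org/iiif", "collectionA", ["BookOne", "BookTwo"], ["en", "fr"]
)
```

The shared files (missing_image.tif and the CSV files) do not appear in the resulting Collection; they are consulted later, while each Book is transformed.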

Book to Manifest

Book                                  Manifest
                                        - type = sc:Manifest
                                        - viewing hint = paged
                                        - viewing direction = left-to-right
                                        - default sequence = the only sequence generated
                                        - thumbnail = thumbnail of the generated sequence
                                        - ranges set according to image names, illustrations, etc
  - book collection ID + book ID        - URL ID
  - image / cropped images              - a Sequence
  - illustration tagging                - used to generate annotations while generating annotation lists
  - book metadata                       - used to fill in Manifest data
      - common name                         - manifest label
      - repository + shelfmark              - manifest description
      - other metadata                      - manifest metadata map
  - permissions                         - Manifest attribution statement
  - AoR data                            - used to generate annotations while building annotation lists

The Manifest is used to represent an entire book. It embeds all canvases inside a single sequence. This is the best place to attach book metadata.
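The field mapping in the table above could be sketched as follows. This is an illustration only: the dictionary keys, the `manifest_from_book` function, the URL layout, and the way repository and shelfmark are joined into a description are assumptions, not the behavior of the actual transformer.

```python
from typing import Dict

FIXED_KEYS = ("common_name", "repository", "shelfmark")


def manifest_from_book(base_url: str, collection_id: str, book: Dict) -> Dict:
    md = book["metadata"]
    return {
        # URL ID built from book collection ID + book ID (layout is an assumption)
        "id": f"{base_url}/{collection_id}/{book['id']}/manifest",
        "type": "sc:Manifest",
        "viewing_hint": "paged",
        "viewing_direction": "left-to-right",
        "label": md["common_name"],
        # Joining with ", " is an assumption made for this sketch
        "description": f"{md['repository']}, {md['shelfmark']}",
        # Everything else goes into the generic metadata map
        "metadata": {k: v for k, v in md.items() if k not in FIXED_KEYS},
        "attribution": book["permission"],
    }


# Hypothetical book record
book = {
    "id": "BookOne",
    "permission": "Images courtesy of Example Library",
    "metadata": {
        "common_name": "Book One",
        "repository": "Example Library",
        "shelfmark": "MS 1",
        "date": "15th century",
    },
}
manifest = manifest_from_book("http://example.org/iiif", "collectionA", book)
```

The sequence, ranges, and thumbnail are left out here because they come from the image list rather than from the book metadata.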

ImageList to Sequence, Range

Book/ImageList                       Sequence
                                       - label = "reading-order"
                                       - starting canvas = first content page
  - book collection ID + book ID        - URL ID
  - each BookImage                     - a Canvas

The sequence defines which images are viewable and the order in which to view them. The image list in each book contains all relevant images for the book in a definite order, so the archive image list maps naturally onto the presentation sequence. This transform generates the only sequence for these books.
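The sketch below shows one way an ordered image list could become a sequence. The `is_content` flag used to find the first content page is a hypothetical field introduced for the example, as are the function name and the URL layout; how the real transform identifies the first content page is not stated here.

```python
from typing import Dict, List


def sequence_from_image_list(
    base_url: str, collection_id: str, book_id: str, images: List[Dict]
) -> Dict:
    prefix = f"{base_url}/{collection_id}/{book_id}"
    # One canvas per image, in the same order as the archive image list
    canvases = [f"{prefix}/canvas/{img['id']}" for img in images]
    # Index of the first image flagged as content; fall back to the first image
    first = next((i for i, img in enumerate(images) if img["is_content"]), 0)
    return {
        "id": f"{prefix}/sequence/reading-order",
        "label": "reading-order",
        "canvases": canvases,
        "start_canvas": canvases[first] if canvases else None,
    }


# Hypothetical image list: one binding image followed by two content pages
images = [
    {"id": "binding.frontcover", "is_content": False},
    {"id": "001r", "is_content": True},
    {"id": "001v", "is_content": True},
]
seq = sequence_from_image_list("http://example.org/iiif", "collectionA", "BookOne", images)
```

Here the starting canvas is 001r, while the front cover remains the first canvas in the viewing order.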

The manifest can have other structures outside of the main sequence, such as ranges. A range is similar to a sequence in that it lists canvases that are related to each other in some way, but it is a logical grouping of canvases rather than a list specifying order. Ranges can overlap one another and can contain fragments of canvases. The names of the images in an image list are used to generate several ranges that help users navigate the canvases more effectively. An obvious use of ranges is a table of contents, with one range for each section. These ranges are created by taking advantage of the image naming scheme of the archive.
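To make the idea concrete, here is a small sketch that groups image names into ranges. The naming scheme it relies on is an assumption invented for the example (a leading section keyword followed by a dot, with unprefixed names treated as body pages); the archive's real naming rules are not reproduced here.

```python
from typing import Dict, List


def ranges_from_image_names(names: List[str]) -> Dict[str, List[str]]:
    """Group image names by a leading section keyword (hypothetical scheme)."""
    ranges: Dict[str, List[str]] = {}
    for name in names:
        # "frontmatter.flyleaf.001r" -> "frontmatter"; "001r" -> "body"
        section = name.split(".")[0] if "." in name else "body"
        ranges.setdefault(section, []).append(name)
    return ranges


names = [
    "binding.frontcover",
    "frontmatter.flyleaf.001r",
    "001r",
    "001v",
    "endmatter.flyleaf.001r",
]
toc = ranges_from_image_names(names)
```

Each key would correspond to one range, and the order of names within a range follows the image list.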

BookImage to Canvas

Book                                 Canvas
  - image tagging                      - possible text annotations on the canvas
BookImage                            
  - width/height                       - dimensions of the canvas
  - ID/is missing?                     - default image to display

The Canvas will always have at least one image associated with it. If the image is not present in the archive, this image will be the missing_image.tif from the collection. The presence of a reference to an annotation list for 'other content' is determined by the presence of illustrations on the page and the presence of AoR data for the page.
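The fallback and the conditional annotation list reference might look like the sketch below. The field names, the `.tif` file naming, and the `canvas_from_image` function are hypothetical. The sketch also assumes that either illustrations or AoR data alone is enough to trigger the reference, which is one reading of the rule above.

```python
from typing import Dict, Set


def canvas_from_image(
    image: Dict,
    illustrated_pages: Set[str],
    aor_pages: Set[str],
    missing_image: str = "missing_image.tif",
) -> Dict:
    page = image["id"]
    canvas = {
        "label": page,
        "width": image["width"],
        "height": image["height"],
        # Substitute the collection's placeholder when the image is absent
        "image": missing_image if image["missing"] else f"{page}.tif",
    }
    # Assumption: illustrations OR AoR data on the page is sufficient
    if page in illustrated_pages or page in aor_pages:
        canvas["other_content"] = [f"{page}.all"]
    return canvas


present = canvas_from_image(
    {"id": "001r", "width": 3000, "height": 4000, "missing": False}, {"001r"}, set()
)
absent = canvas_from_image(
    {"id": "002v", "width": 3000, "height": 4000, "missing": True}, {"001r"}, set()
)
```

In this example 001r gets its own image plus an annotation list reference, while 002v falls back to the placeholder image and has no 'other content'.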

Image tagging and AoR data to Annotations

The default annotation list attached to each canvas is named after the page it is associated with, e.g. 001r.all, 002v.all. The .all designation means that all annotations are included, except images. This covers the tagging for illustrations on the page and all AoR data for the page. All of these are transformed into text annotations that can be displayed on the canvas.
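A minimal sketch of assembling such a list, assuming both sources have already been reduced to plain strings: the `build_annotation_list` function and the dictionary shape are hypothetical and do not reflect the structure of the real annotation objects.

```python
from typing import Dict, List


def build_annotation_list(
    page: str, illustration_titles: List[str], aor_notes: List[str]
) -> Dict:
    # Illustration tagging and AoR data both end up as text annotations
    texts = list(illustration_titles) + list(aor_notes)
    return {
        "name": f"{page}.all",  # page identifier plus the .all designation
        "annotations": [{"type": "text", "on": page, "text": t} for t in texts],
    }


# Hypothetical illustration title and AoR note for page 001r
alist = build_annotation_list("001r", ["Author portrait"], ["Marginal note"])
```

Image annotations are deliberately absent from the result, matching the rule that the .all list includes everything except images.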