This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

Archipelago Roadmap 2019 #6

Open
DiegoPino opened this issue Apr 29, 2019 · 14 comments
Labels: Documentation (Help make this easier for people to use), enhancement (New feature or request), help wanted (Extra attention is needed), Roadmap (Things to be considered in our Official Roadmap)

@DiegoPino
Member

DiegoPino commented Apr 29, 2019

Archipelago 2019 Roadmap

Tentative enumeration of concrete tasks, per Component and Service, for public evaluation (and comments).

All tasks listed here lead to our first stable production release in July 2019 and a full-feature release at the end of 2019. Checked tasks are ready; unchecked ones are in progress or planned. The order does not imply priority.

There are a lot more things involved and many are already done and coded, but I wanted to have a list that was closer to a features checklist than the actual code milestones we are managing internally. Please feel free to comment, request more info or ask for clarification. Feature requests are also highly appreciated and taken into account (always, please!).

Strawberryfield

  • Field Property exposure to Drupal strategies
    • JSON KEY Provider (flattener)
    • JSON Flatten Keys
    • JSONPATH/JMESPATH
    • Entity Reference Casting Provider (Using UUID loading and configurable entity type)
    • JSON stored Service Endpoints with extended logic (e.g. HOCR)
    • Multi Map/join: many properties to a single one, e.g. all keys (Authorities) referring to creators, contributors, etc. unified as Agents keys. This leads to Fractal Ontologies and our Buckets approach.
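A minimal sketch of the flattener and Multi Map/join ideas above, in Python rather than the module's own code. The function names, the dotted key paths and the sample record are assumptions made for illustration, not the Strawberryfield implementation.

```python
def flatten_keys(data, prefix=""):
    """Collect every leaf value of a nested JSON structure under a dotted key path."""
    if isinstance(data, dict):
        children = [(f"{prefix}.{key}" if prefix else key, value)
                    for key, value in data.items()]
    elif isinstance(data, list):
        # List items share the path of the list itself.
        children = [(prefix, item) for item in data]
    else:
        return {prefix: [data]}
    flat = {}
    for path, child in children:
        for key, values in flatten_keys(child, path).items():
            flat.setdefault(key, []).extend(values)
    return flat


def join_keys(flat, source_keys, target_key):
    """Multi Map/join: expose several flattened keys as one unified key."""
    joined = dict(flat)
    joined[target_key] = [value for key in source_keys
                          for value in flat.get(key, [])]
    return joined


# Hypothetical descriptive metadata record.
record = {
    "label": "Harbor view",
    "creator": [{"name": "Ana"}, {"name": "Luis"}],
    "contributor": {"name": "Mei"},
}
flat = flatten_keys(record)
exposed = join_keys(flat, ["creator.name", "contributor.name"], "agents")
```

Here `exposed["agents"]` gathers creators and contributors under one property, which is the kind of unified "Agents" key the last bullet describes.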

JSON representation and enrichment

  • Better File management (Better than Drupal)
    • File referencing via UUID instead of via Entity ID
    • Handle temporary files when moving from TEMP storage to PERMANENT
    • Increment file usage count on new versions
    • Decrement file usage count on version removal
    • Change file usage on Delete, EDIT on existing active content and versions
    • Add Webform based UI management (reorder, replace, delete) for files
    • File based Plugins callable by Webhooks
      • TECHMD
      • ZIP/UNZIP
      • Derivative for larger MEDIA (video and Sound)
      • Pronom Service/Preservation
  • New JSON Service Architecture reference
  • On Node save, deposit/save the whole, self-sustainable Strawberry JSON blob in S3/Minio/FileSystem
  • Keep track of Service and action on Ingest/edit using Activity Streams
  • Add more agent information on our activity streams for provenance and tracking.
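The file usage bullets above (increment on new versions, decrement on version removal) could be sketched as follows. This is an illustration only: the `FileUsage` class, its method names and the placeholder identifiers are assumptions, and Drupal's own file usage service is not used here.

```python
from collections import Counter


class FileUsage:
    """Track how many content versions reference each file."""

    def __init__(self):
        self.counts = Counter()

    def add_version(self, file_uuids):
        # A new version increments usage once per referenced file.
        for file_uuid in set(file_uuids):
            self.counts[file_uuid] += 1

    def remove_version(self, file_uuids):
        # Removing a version decrements usage, never below zero.
        for file_uuid in set(file_uuids):
            if self.counts[file_uuid] > 0:
                self.counts[file_uuid] -= 1

    def unused(self):
        """Files no version references any more (candidates for cleanup)."""
        return sorted(u for u, n in self.counts.items() if n == 0)


usage = FileUsage()
usage.add_version(["file-a", "file-b"])   # first version uses two files
usage.add_version(["file-a"])             # second version keeps only one
usage.remove_version(["file-a", "file-b"])  # first version is removed
```

After these calls `file-a` is still referenced by the second version, while `file-b` shows up in `usage.unused()`.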

Webforms integration

  • Webform Driven UI Ingest with custom handler and widget
  • Create a set of Demo Webforms that cover base of our GLAM source data needs
  • Allow Webform Field Widget selection be driven by RDF type and permissions.
  • Create new, better, LoD Webform elements
    • WIKIDATA
    • LoC
    • WIKIDATA Agents with LD Roles
    • Viaf
    • Getty
  • Create Stub (temporary) WIKIDATA entities if query shows desired WIKIDATA entity does not exist upstream.
    • "publish" to wikibase functionality
    • Replace repo wide stub uri with official one once pushed.
    • Keep track, on the stub, of what is referencing it (bidirectional reference?)
  • Move Strawberryfield harvest Webform handler's logic to plugins
    • Deal with as:images
    • Deal with as:documents, as:video, as:sound, as:dataset elements
    • Deal with as:models
  • Allow anonymous submits to be converted into proper Nodes by Admin (Self deposit, crowd sourced metadata)
  • Make Webform API Interaction more versatile for our use. Use as schema validator.
  • Add JS to prevent the main node CRUD form from submitting/validating the embedded Webform widget

Media Displays

  • Add expected mime/type output to Media displays. This allows tagging Media displays as JSON, XML or HTML only.
    • React to mime type to allow JSON or XML output to be downloaded too.
  • Add new Data Views Plugin integration to allow Media Displays to preprocess values on views exposed as API endpoints
  • Version Media Displays
  • Provide example Twig templates for DC, MODS and JSON-LD (schema based)

Field Formatters

  • Static IIIF Images
  • Open Seadragon IIIF Images
    • Add thumbnail navigation
  • IABookreader IIIF Images
    • Integrate Flavor based Solr Search
  • Panorama via IIIF
  • Metadata up-casters
  • Metadata up-casters with download endpoint
  • Video (HTML5)
  • Audio (HTML5)
  • Web annotations (IIIF)
  • Complex nested structures (Whole graphs)
  • 3D! (Three + JSM)

API Ingest, Migration and backup

  • Strawberryfield Normalizer: expands JSON string as a JSON when exporting
  • Strawberryfield denormalizer: string-ify JSON when importing
  • Wrap JSONAPI in a set of Drush scripts (Strawberry Seeds) to
    • Allow file and node ingest from a single command-line invocation
    • Create virtual field Entity "buckets" to allow Media to be ingested into those as links and routed to internal Strawberryfield elements (utility methods for ingest)
  • AMI (Archipelago Multi Import) First iteration
    • API Source (Other repos, ContentDM, Solr)
    • Google Spreadsheets (same as IMI)
    • Complete Drush 9 integration
  • Filesystem drop-and-forget ingest. You save a JSON file into S3, Archipelago creates entities and relationships.
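The normalizer/denormalizer pair at the top of this list can be pictured with a small round trip. This sketch assumes the field holds its JSON as a string; the field name `field_metadata` and both function names are hypothetical.

```python
import json


def normalize(entity, json_field):
    """Export: expand the stored JSON string into nested JSON."""
    expanded = dict(entity)
    expanded[json_field] = json.loads(entity[json_field])
    return expanded


def denormalize(payload, json_field):
    """Import: string-ify the nested JSON again before storing it."""
    stored = dict(payload)
    stored[json_field] = json.dumps(payload[json_field], sort_keys=True)
    return stored


# Hypothetical stored entity: the metadata field is a JSON string.
stored_entity = {
    "title": "Harbor view",
    "field_metadata": '{"label": "Harbor view", "type": "Photograph"}',
}
exported = normalize(stored_entity, "field_metadata")
reimported = denormalize(exported, "field_metadata")
```

With the expansion in place, an API client can read `exported["field_metadata"]["type"]` directly instead of parsing an escaped string.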

Service Architecture (Strawberry Runners)

  • Develop webhook driven notification service for derivatives
  • Document/deploy webhook triggers for minio S3 per mimetype
  • Document/deploy webhook triggers for AWS S3 (via lambda) per mimetype
  • Develop Shell processing using Symfony callables, user configurable for each case (rule system)
  • Generate JSON reference-able Services for complex non descriptive metadata and data
    • HOCR
    • TECHMD
    • Web Annotations
    • Tabular datasets
    • Transcripts (similar to Web Annotations, mostly dependent)
    • Build slim Content entity that can be used to index natively that content into Solr via search API
    • Allow Services to self-describe their capabilities. Only GET will be allowed
    • Allow Drupal to discover hierarchically via EDISMAX

SEO and API

  • Allow Media displays output to be embedded in HTML head for SEO
  • Test/Develop nested DATA VIEWS integration for OAI-ORE and OAI-PMH
  • Create (TWIG, metadata displays) and expose as endpoints full set of IIIF API JSON outputs. UPDATED
    • Add helper methods and twig extensions to allow Metadata displays to access pre existing views (like object listings for a collection) to help build those lists.

ACL / Permissions

  • Integrate custom ACL with JSON Paths into per NODE ACL. This allows permissions to be applied to individual metadata elements/paths.
  • Same but needs better UI for referenced Services and Media
  • Allow Metadata (rule) to trigger ACL permissions, e.g. if embargo_date == bla bla, remove public access
  • Allow for ACL inheritance (from parent, recursive) without hard copies.
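To make the last two bullets concrete, here is a rough sketch of a metadata-driven rule and of ACL inheritance without hard copies. Everything in it is an assumption for illustration: the reading of `embargo_date` as an ISO date that blocks public access until it has passed, the `acl` and `parent` keys, and the function names.

```python
from datetime import date


def public_access_allowed(metadata, today):
    """Rule sketch: public access is removed while an embargo date lies in the future."""
    embargo = metadata.get("embargo_date")
    if not embargo:
        return True
    return date.fromisoformat(embargo) <= today


def effective_acl(node_id, nodes):
    """Walk up the parents until a node defines its own ACL (no copies stored)."""
    seen = set()
    current = node_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against circular parent references
        node = nodes[current]
        if "acl" in node:
            return node["acl"]
        current = node.get("parent")
    return {}


# Hypothetical hierarchy: only the collection carries an ACL.
nodes = {
    "collection": {"acl": {"public": ["view"]}},
    "object": {"parent": "collection"},
    "page": {"parent": "object"},
}
page_acl = effective_acl("page", nodes)
embargoed = not public_access_allowed({"embargo_date": "2020-01-01"}, date(2019, 4, 29))
```

Because the page resolves its ACL through its parents at read time, changing the collection's ACL changes it for every descendant without rewriting them.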

Deployment and DevOPS

  • Sync Configurations and remove non used ones for minio branch / periodic for each Drupal release
  • Site-build and remove orphan blocks
  • Add more utility views
  • Enable JSONAPI by default on minio branch
  • Create jsonapi user with jsonapi credentials for minio branch
  • Create basic scripts to automate Docker/Bash operations
  • Update AWS deployer to match minio including docs and Cloud Services integration

Batch Operations

  • Bulk Batch Views JSONPATH plugin to
    • Replace existing JSON values
    • Add to existing Values
    • Respect data-type cast values (entities, file references)
  • Bulk Batch Views MEDIA plugin to
    • Replace Media
    • Add Media
  • Bulk Batch Views ACL plugin to
    • Replace ACL and inheritance
    • Replace ACL individual Control List Elements
    • Add ACL individual Control List Elements
  • Integrate into Solr Results and Strawberryfield Taxonomy Term pages
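The "replace" and "add" operations of the JSONPATH bulk plugin might behave roughly like this. The sketch uses plain dotted paths instead of real JSONPath expressions, assumes every intermediate value on the path is an object, and uses hypothetical function names.

```python
import copy


def _parent_of(doc, path):
    """Return the object holding the last path segment, creating objects on the way."""
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    return node, leaf


def replace_value(doc, path, value):
    """Replace (or create) the value at a dotted path; the input is left untouched."""
    updated = copy.deepcopy(doc)
    node, leaf = _parent_of(updated, path)
    node[leaf] = value
    return updated


def add_value(doc, path, value):
    """Add to the existing value at a dotted path, turning single values into lists."""
    updated = copy.deepcopy(doc)
    node, leaf = _parent_of(updated, path)
    existing = node.get(leaf)
    if existing is None:
        node[leaf] = [value]
    elif isinstance(existing, list):
        existing.append(value)
    else:
        node[leaf] = [existing, value]
    return updated


# Hypothetical batch: the same operation applied to every selected record.
batch = [{"rights": {"statement": "unknown"}}, {"subject": "maps"}]
replaced = [replace_value(doc, "rights.statement", "CC0") for doc in batch]
added = [add_value(doc, "subject", "harbors") for doc in batch]
```

Working on deep copies keeps the original records intact, which would make it easy to store the previous state as a version before saving.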

Future roadmap

  • D9 readiness 😄
  • Solr Cloud/ Consortial ensemble
  • Native Wikibase/Wikidata publishing

Documentation:

  • Devops and new repository deployers
  • Migration to and from.
  • Backup and restoring
  • Permissions, access and ACLs.
  • Metadata Professionals, JSON schema and schema-less. AS, DR and AP internal ontologies. UPDATED
  • Metadata Professionals, Key concepts of Archipelago
  • Metadata, Ingest and edit workflows.
  • Displays, Formatters and Media Plugins (Twig)
  • Views Integration (Solr and Blocks)
  • Strawberry Field Exposed Keys and Plugins
    • Property Exposing strategies and configs
  • Media Management
  • Solr and Discovery
  • Extending and Coding
  • SEO
@DiegoPino DiegoPino added this to the Release 1.0.0 milestone Apr 29, 2019
@DiegoPino DiegoPino self-assigned this Apr 29, 2019
@DiegoPino DiegoPino added Documentation Help make this easier for people to use enhancement New feature or request help wanted Extra attention is needed Roadmap Things to be considered in our Official Roadmap labels Apr 29, 2019
@DiegoPino DiegoPino pinned this issue Apr 29, 2019
@giancarlobi
Collaborator

Field Formatters
Static IIIF Images
Open Seadragon IIIF Images
IABookreader IIIF Images
Panorama via IIIF
Metadata up-casters
Metadata up-casters with download endpoint
Video (HTML5)
Audio (HTML5)
Web annotations (IIIF)
Complex nested structures (Whole graphs)

I'm thinking about one main image and some secondary images, a recurring need in my experience. I know you can use the OpenSeadragon paging feature. In addition, do we have to add a specific Multiple Images field formatter?

@giancarlobi
Collaborator

SEO and API
Allow Media displays output to be embedded in HTML head for SEO
Test/Develop nested DATA VIEWS integration for OAI-ORE and OAI-PMH

What about also adding the IIIF Presentation API here?

@giancarlobi
Collaborator

The original one
[image]
... a colored note among lines of code!

@DiegoPino
Member Author

That is a great image!!! 😍

Answering some questions with more questions =)

I'm thinking about one main image and some secondary images, a recurring need in my experience. I know you can use the OpenSeadragon paging feature. In addition, do we have to add a specific Multiple Images field formatter?

Right now the Open Seadragon IIIF Images formatter can display all images in a single viewer (with next, prev buttons) or can display every image in its own independent OpenSeadragon viewer instance (many viewers, not too useful I think if there are too many, but I'm pretty sure people could find a use). So my questions are:
1. Would you like a different Open Seadragon Formatter that deals with many images in a different way?
2. Or are you thinking about one or many static images like this, with better options? What options do you think could be useful there?
3. Or are you thinking about a totally different formatter? (We can do that too, of course!) If so, what would that formatter allow?
4. About main versus secondary: we could add another key in the as:images JSON structure that allows people to select which is the primary (for example, for thumbnail use), like what we do with the "sequence" element for images when used as pages. Then we could allow all formatters to show only the ones tagged with "something" as an option. Like a config option that says: "which key to use as filter" = 'role', 'use'? And what to filter against? ('primary', 'thumbnail'). We could, as we do now, simply default to no filter and get either the first or all of them.

A lot of questions from me! Let's figure this out here and we can add that new formatter.
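To make question 4 easier to discuss, here is a rough Python sketch of the proposed "which key to use as filter" option. The `role` key, its values and the shape of the sample `as:images` items are assumptions; only the `sequence` and `url` keys come from this thread.

```python
def filter_images(as_images, filter_key=None, filter_values=()):
    """Return only the images whose filter_key matches one of filter_values.

    With no filter configured (the current default), every image is returned.
    """
    if filter_key is None:
        return dict(as_images)
    return {
        name: image
        for name, image in as_images.items()
        if image.get(filter_key) in filter_values
    }


# Hypothetical as:images structure; keys and URLs are placeholders.
as_images = {
    "image-1": {"url": "https://example.org/iiif/1", "sequence": 1, "role": "primary"},
    "image-2": {"url": "https://example.org/iiif/2", "sequence": 2, "role": "secondary"},
    "image-3": {"url": "https://example.org/iiif/3", "sequence": 3},
}
everything = filter_images(as_images)
thumbnails = filter_images(as_images, "role", ("primary", "thumbnail"))
```

A formatter could then render `thumbnails` when the option is set and fall back to `everything` otherwise.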

SEO and API
Allow Media displays output to be embedded in HTML head for SEO
Test/Develop nested DATA VIEWS integration for OAI-ORE and OAI-PMH

True, that was planned; we need to add it to the list. We already have an IIIF Manifest Metadata display in Twig, and our IABookreader can use it already. We would just need to

  1. generate the rest of the IIIF structures (collections, etc.) as Twig templates, with some helper methods we could pass to Twig so that users can decide on paging, etc., and
  2. solve "Add new Data Views Plugin integration to allow Media Displays to preprocess values on views exposed as API endpoints". That would allow us to expose media displays as per-node endpoints.

I will add this to the list. Thanks.

@DiegoPino
Member Author

@giancarlobi I will also add my thoughts on collection management today. Since collection membership is just a value inside a JSON key, I was almost thinking of not dealing with it as a separate feature list (it is implicit in many of the existing tasks), but I will add a tiny list explaining the basic exposed ways of interacting (move, migrate, batch update, ingest "into") to help with documentation and UI too. We already have that included in the main ingest webform workflow, but it won't hurt if we generalize it even more.

@giancarlobi
Collaborator

I'm thinking about one main image and some secondary images, a recurring need in my experience. I know you can use the OpenSeadragon paging feature. In addition, do we have to add a specific Multiple Images field formatter?

Right now the Open Seadragon IIIF Images formatter can display all images in a single viewer (with next, prev buttons) or can display every image in its own independent OpenSeadragon viewer instance (many viewers, not too useful I think if there are too many, but I'm pretty sure people could find a use). So my questions are:
1. Would you like a different Open Seadragon Formatter that deals with many images in a different way?
2. Or are you thinking about one or many static images like this, with better options? What options do you think could be useful there?
3. Or are you thinking about a totally different formatter? (We can do that too, of course!) If so, what would that formatter allow?

As a first step (we have a lot of other important issues to solve), I think the Open Seadragon IIIF Images formatter is a good choice for multiple images if we can add (I don't know if OSD includes this feature) small clickable thumbnails of the images, so that clicking a thumbnail scrolls to the selected image. This option could be second-level priority.

4. About main versus secondary: we could add another key in the as:images JSON structure that allows people to select which is the primary (for example, for thumbnail use), like what we do with the "sequence" element for images when used as pages. Then we could allow all formatters to show only the ones tagged with "something" as an option. Like a config option that says: "which key to use as filter" = 'role', 'use'? And what to filter against? ('primary', 'thumbnail'). We could, as we do now, simply default to no filter and get either the first or all of them.

In any case, we need a primary/thumbnail image when multiple images are present, so I think we can use A) the first one, but as you know JSON order is not so reliable, or B) the "sequence" key as for pages, where sequence number 1 is the primary/thumbnail image.

@giancarlobi
Collaborator

@giancarlobi I will also add my thoughts on collection management today. Since collection membership is just a value inside a JSON key, I was almost thinking of not dealing with it as a separate feature list (it is implicit in many of the existing tasks), but I will add a tiny list explaining the basic exposed ways of interacting (move, migrate, batch update, ingest "into") to help with documentation and UI too. We already have that included in the main ingest webform workflow, but it won't hurt if we generalize it even more.

I fully agree. About the membership value in the JSON: are you planning to use UUIDs, as for images, nodes, etc.? Finally, the collection is another type of DO and so a node, right?

@DiegoPino
Member Author

DiegoPino commented Apr 30, 2019

As a first step (we have a lot of other important issues to solve), I think the Open Seadragon IIIF Images formatter is a good choice for multiple images if we can add (I don't know if OSD includes this feature) small clickable thumbnails of the images, so that clicking a thumbnail scrolls to the selected image. This option could be second-level priority.

True. Thumbnail overview is an OpenSeadragon extension; we just need to enable the JS, some CSS and add the option to the formatter config. Will add it as a dependent feature to the list. I think it will be appreciated by a lot of folks. There is also so much we can do with Views, Slideshows, etc. We should provide a few (famous, infamous Carousels too).

In any case, we need a primary/thumbnail image when multiple images are present, so I think we can use A) the first one, but as you know JSON order is not so reliable, or B) the "sequence" key as for pages, where sequence number 1 is the primary/thumbnail image.

Totally right. Our as:images items are JSON objects, so we can't trust order. Thanks for being the active part of my brain! I will figure out both options: tagging and sequence. We can allow both, no issue at all. Will add to the features list.
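A possible way to combine both options, sketched in Python: prefer an explicitly tagged image and fall back to the lowest "sequence" number, never relying on JSON order. The `role` tag, the function name and the sample data are assumptions, and sequence values are assumed to be numbers.

```python
def pick_primary(as_images, tag_key="role", tag_value="primary"):
    """Return the key of the primary image, or None when there are no images."""
    if not as_images:
        return None
    tagged = [name for name, image in as_images.items()
              if image.get(tag_key) == tag_value]
    candidates = tagged or list(as_images)
    # Lowest sequence wins; images without a sequence sort last, ties break by key.
    return min(
        candidates,
        key=lambda name: (as_images[name].get("sequence", float("inf")), name),
    )


# Hypothetical items, deliberately stored out of order.
pages = {
    "image-b": {"sequence": 2},
    "image-a": {"sequence": 1},
    "image-c": {"sequence": 3, "role": "primary"},
}
tagged_choice = pick_primary(pages)
sequence_choice = pick_primary(pages, tag_key="use", tag_value="thumbnail")
```

With the tag present, `image-c` is chosen even though its sequence is 3; when nothing carries the configured tag, sequence number 1 (`image-a`) wins.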

I fully agree. About the membership value in the JSON: are you planning to use UUIDs, as for images, nodes, etc.? Finally, the collection is another type of DO and so a node, right?

Right. Right now, because of the way Drupal deals with referencing entities (mess!), we are using entity IDs (primary keys). That was my first iteration to prove the concept and show how we make collections and connect the Archipelago-generated JSON structure to Webform-generated form elements. But this week that will change: all Drupal entities will be permanently stored in the strawberry JSON referencing UUIDs instead. I will also, for backwards compatibility, allow people to use IDs if they need to (if someone invents a new ingest method or uses some custom entity reference field), and I will convert them to UUIDs on the fly on node save. This means we will always have only UUIDs permanently stored, which makes migration and ingest via JSONAPI easier; actually it's the only way. To make this cross compatible I will also transform them on the fly back to IDs when the widget that is going to interact with them requires that (Webform reference fields, e.g.). I also want to allow a third option: full URIs. Who says that an object cannot be part of another server's collection? =) Or that we cannot reference an external image? We kinda have that already in the as:image structure (url key), but I want to formalize that as code inside the formatters and the node save handlers.
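The three reference forms described above (IDs accepted on input, UUIDs stored, full URIs passed through) could be handled along these lines. This is a sketch under assumptions: the in-memory `ID_TO_UUID` table stands in for Drupal's entity lookup, the UUID value is made up, and the function names are hypothetical.

```python
import uuid

# Stand-in for the entity lookup; the ID and UUID are made-up examples.
ID_TO_UUID = {42: "0b1e8f6a-5c1d-4a57-9e3b-2f6d7c8a9b10"}


def to_stored(reference, id_to_uuid):
    """On node save: IDs become UUIDs, full URIs and valid UUIDs are kept."""
    if isinstance(reference, int):
        return id_to_uuid[reference]
    if reference.startswith(("http://", "https://")):
        return reference
    # Raises ValueError when the string is not a valid UUID.
    return str(uuid.UUID(reference))


def to_widget(reference, id_to_uuid):
    """For widgets that need IDs: map known UUIDs back, leave the rest as is."""
    uuid_to_id = {value: key for key, value in id_to_uuid.items()}
    return uuid_to_id.get(reference, reference)


stored_from_id = to_stored(42, ID_TO_UUID)
stored_from_uri = to_stored("https://example.org/collection/1", ID_TO_UUID)
widget_value = to_widget(stored_from_id, ID_TO_UUID)
```

Only `to_widget` ever produces an ID again, so whatever is permanently stored stays portable across sites.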

@DiegoPino
Member Author

@giancarlobi I checked some of the roadmap items here; I have been doing this slowly as we advance. Still not everything is green (of course...), and for 1.0.0-beta I will target the essential features == what other repositories have + what makes us awesome.

@DiegoPino
Member Author

@giancarlobi 3D Formatters are up! esmero/format_strawberryfield@50063da
3D Archipelago

@giancarlobi
Collaborator

@DiegoPino you are the best coder and you are making Archipelago the best framework!!

@DiegoPino
Member Author

@giancarlobi totally a group effort, and I'm so thankful for your kind words. Thanks for all the support, code reviews and good architectural ideas; this would have been impossible without you. This is a long path we have to walk: a marathon, and a sightseeing route too, to make sure we keep our eyes open to address what people need and to make good, fair, open and inclusive software. It is not a sprint, even when there is an implicit rush to get this out before our effort loses momentum; that is something I cannot control. I'm aware there are many things we can do better, simplify and rethink in the future, but I'm still happy that our original ideas and code are keeping their promises. Extending and creating new formatters and ways of interacting is getting easier and easier. We will have to work twice as hard to build a balanced community around this. I just hope energy can be kept high enough for the challenges that are ahead.

@natehill

I am here to help build the community around this, @DiegoPino and @giancarlobi - this is what METRO is for. Thanks, I have such tremendous respect for both of your work. It’s amazing.

@DiegoPino
Member Author

  • Open Seadragon IIIF Images
    • Add thumbnail navigation

Done with config option.

I will now devote some quality time (2-3 hours) to the IABookreader to finish the search piece. I will be derailed to other projects and deadlines later but will come back tonight.
