OAI-ORE support #999

Closed
kaplun opened this Issue · 2 comments

2 participants

@kaplun
Collaborator

Originally on 2012-04-10

Open Archives Initiative Object Reuse and Exchange defines standards for the description and exchange of aggregations of Web resources [...]

In Invenio aggregation might come from different sources:

  • the DB:
    • collections aggregate records
    • records link to documents
    • documents link to revisions of documents
    • revisions of documents link to formats of documents
  • MARC 76x-78x fields: at CERN, for example, these are used to link a record to its official publication, a conference to its talks, a talk to its contribution and proceedings, or a photo to a photoshoot. In OpenAIREplus, datasets will be linked to publications.

Implementation details will be added as comments to this ticket.

@lnielsen
Owner

Originally on 2013-03-08

OAI-ORE prototype notes

Use cases of OAI-ORE:

  • Data exchange (primary):
    • CDS, Inspire, ADS, arXiv
    • OpenAIREplus Orphan Repository, OpenAIREplus repository.
  • Visualisation (secondary):
    • Enhanced publications: browsing the archives via a Firefox plugin (small showcase).

Candidate aggregation examples:

  • General examples:
    • Collection aggregating:
      • Collections
      • Records
      • Feed
    • Record aggregating:
      • Metadata record (perhaps via OAI-PMH)
      • Authors
      • Documents (PDFs, Images, Videos, Audio)
      • Bibliographic descriptions: BibTex, MARC, MARCXML
      • Comments
      • External links
      • Similar to relationship: DOI, arXiv id
      • See also
    • Record citations (this could also just be added as a relationship to other resources)
      • Records
    • Record translations
      • Records
    • Documents aggregating:
      • Revisions
    • Revisions aggregating:
      • Formats
  • Specific examples:
    • Logs
    • Login information?
    • Photo shoot aggregating:
      • Photos (isn't this the same as a record aggregating documents?)
    • Conference aggregating:
      • Contributions
      • Proceeding
      • Notes
      • Posters
      • Talks
      • Slides
    • Book aggregating:
      • Chapters
    • Periodical
      • Journals
        • Volumes -> Issues -> Record
    • OpenAIRE:
      • Funding scheme aggregating
        • Projects
          • Records (data, publications)
      • Publications aggregating
        • Data
        • Project(s)
        • Funding scheme
      • Data aggregating
        • Publications
        • Project(s)
        • Funding scheme
  • See also videos
  • Similar records
  • https://twiki.cern.ch/twiki/bin/view/Inspire/TalkORE

Abstract data model

  • Resource: anything of interest - resources are identified by HTTP URIs
    • Information resource: any kind of document, image, video, etc. whose URI returns information when accessed (i.e. the web as we know it).
    • Non-information resource: The HTTP URI doesn't return information; it is just a name for a "real-world" object
  • Aggregation: a set of resources (a non-information resource).
  • Aggregated resource: a resource in an aggregation (which can be an aggregation). Important: Anything that should be in an aggregation, must have a URL (e.g project, funding scheme, etc)
  • Resource map: a description of one aggregation (i.e an information resource)
  • Proxy: used for ordering
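
The data model above can be sketched as a few Python classes. This is only an illustration of the relationships between the terms, not Invenio code: the class names, field names, and example URIs are all assumptions, and proxies are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Anything of interest, identified by an HTTP URI."""
    uri: str


@dataclass
class Aggregation(Resource):
    """A set of resources; itself a non-information resource."""
    # Aggregated resources may themselves be aggregations.
    aggregated: List[Resource] = field(default_factory=list)


@dataclass
class ResourceMap(Resource):
    """An information resource describing exactly one aggregation."""
    describes: Aggregation


# Hypothetical example: a collection aggregating a record,
# which in turn aggregates a document.
record = Aggregation("http://foo/aggregation/a")
record.aggregated.append(Resource("http://foo/aggregation/a/paper.pdf"))

collection = Aggregation("http://foo/collection/c")
collection.aggregated.append(record)

rem = ResourceMap("http://foo/aggregation/a.rdf", describes=record)
```

Because an aggregation is itself a resource, nesting (collection, record, document) needs no special handling, but every level needs its own URI.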

HTTP implementation

  • Each URI defined in resource maps must resolve
  • Separate resource maps
    • Model 1:
      • http://foo/aggregation/a (aggregation - redirects with 303 via content negotiation)
      • http://foo/aggregation/a.html (resource)
      • http://foo/aggregation/a.rdf (resource map)
      • http://foo/aggregation/a.atom (resource map)
    • Model 2:
      • http://foo/aggregation/a.rdf#aggregation (aggregation)
      • http://foo/aggregation/a.html (resource)
      • http://foo/aggregation/a.rdf (resource map)
    • Pros:
      • Clear standalone resource map
    • Cons:
      • Redirects will degrade harvester performance
  • Embedded resource map via RDFa:
    • Model 3 (without redirect):
      • http://foo/aggregation/a.html#aggregation (aggregation)
      • http://foo/aggregation/a.html (resource map + resource)
    • Model 4 (with redirect):
      • http://foo/aggregation/a (aggregation)
      • http://foo/aggregation/a.html (resource + resource map)
    • Pros:
      • Resource map is embedded in splash page (no redirects needed)
    • Cons:
      • Size of HTML (perhaps with gzip compression it's negligible).
      • Depending on size
      • Load issues during harvesting
  • Resource Map discovery:
    • Generate site map xml
    • Generate atom feed
    • Via OAI-PMH (could possibly avoid redirects from aggregation to resource map)
    • Insert link-tag in HTML
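
As a rough sketch of Model 1, the function below picks the redirect target for a request to the aggregation URI based on the Accept header. The function name and the suffix table are assumptions made for illustration; it takes the first recognised media type in header order and ignores q-values, which a real implementation would need to honour.

```python
# Media types mapped to the resource map suffixes used in Model 1.
RESOURCE_MAP_SUFFIXES = {
    "application/rdf+xml": ".rdf",
    "application/atom+xml": ".atom",
}


def negotiate(aggregation_uri, accept_header):
    """Return (status, location) for a GET on an aggregation URI.

    The aggregation is a non-information resource, so the answer is
    always a 303 redirect: to a resource map if one of its media types
    is requested, otherwise to the HTML page.
    """
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.9" (q-values are ignored here).
        media_type = item.split(";")[0].strip().lower()
        suffix = RESOURCE_MAP_SUFFIXES.get(media_type)
        if suffix:
            return 303, aggregation_uri + suffix
    return 303, aggregation_uri + ".html"


status, location = negotiate("http://foo/aggregation/a", "application/rdf+xml")
```

Every harvested aggregation costs two requests under this model (the redirect plus the resource map), which is the performance concern listed under Cons.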

Risks/concerns

  • Inclusion of other relationships and metadata:
    • How much (see 4.5 Relationships to other Resources and Types)? Citation links, translations.
  • Exporting very large aggregations
  • HTTP implementation of OAI-ORE incompatible with Invenio URL scheme?
  • Efficiency of protocol
    • Redirects in resource map discovery
    • One aggregation per resource map (means lots of HTTP requests to harvest #records).
  • Enforcing structural constraints of aggregation graph

Relation with OAI-PMH

  • OAI-PMH can be used to support resource map discovery
  • OAI-ORE can be used to include a link to an OAI-PMH metadata record

Integration in Invenio

  • URL Scheme + Data model
    • Anything that needs to be referenced from an aggregation needs a HTTP URI (there are ways to express relationships with other entities though).
    • The data model and URL scheme are tightly coupled.
  • Resource Map generation framework:
    • Mapping of Invenio data to resource maps
    • Module for mapping anything in Invenio to the OAI-ORE data model
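
A minimal sketch of what the resource map generation step could look like, using only the Python standard library. The function name and the example URIs are hypothetical; the `ore:describes` and `ore:aggregates` terms come from the OAI-ORE vocabulary. A complete resource map would also carry metadata about the map itself (such as its creator and modification time), which is omitted here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ORE = "http://www.openarchives.org/ore/terms/"


def build_resource_map(rem_uri, aggregation_uri, aggregated_uris):
    """Serialise one aggregation as an RDF/XML resource map string."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("ore", ORE)
    root = ET.Element(f"{{{RDF}}}RDF")

    # The resource map describes exactly one aggregation.
    rem = ET.SubElement(root, f"{{{RDF}}}Description",
                        {f"{{{RDF}}}about": rem_uri})
    ET.SubElement(rem, f"{{{ORE}}}describes",
                  {f"{{{RDF}}}resource": aggregation_uri})

    # The aggregation lists every aggregated resource by URI.
    agg = ET.SubElement(root, f"{{{RDF}}}Description",
                        {f"{{{RDF}}}about": aggregation_uri})
    for uri in aggregated_uris:
        ET.SubElement(agg, f"{{{ORE}}}aggregates",
                      {f"{{{RDF}}}resource": uri})

    return ET.tostring(root, encoding="unicode")


xml_text = build_resource_map(
    "http://foo/aggregation/a.rdf",
    "http://foo/aggregation/a",
    ["http://foo/aggregation/a/paper.pdf", "http://foo/aggregation/a/slides.pdf"],
)
```

The mapping module would then only need to supply the three arguments for each Invenio entity (collection, record, document, revision).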
@lnielsen lnielsen added the r_someday label
@lnielsen
Owner

This is an idea, and as far as I'm aware no one is planning to work on it, so I'll close it as an idea for some day.

@lnielsen lnielsen closed this