Signposting for the scholarly web

Signposting the Scholarly Web

How do we link scholarly information on the web in a way that humans and machines can find their way around?

We are moving to a world with most scholarly information on the web and toward common use of identifiers for works/instances of both papers and data (CrossRef and DataCite DOIs in particular), and for people (ORCID). There are initiatives in versioning and connection to independent archives (Memento), and in annotation (W3C OA and Hypothes.is). Work on the semantic web is shifting to a more practical focus on linked open data (including LDP and JSON-LD). A persistent problem is that linking practices are very inconsistent: when a machine or user gets to a particular resource, it is often hard to work out what the resource is and what its context is. There are several web standards that might help with this, but they often solve only part of the problem, are little understood, and are inconsistently used. Can we work out patterns of linking, using HTTP Link headers in particular, that would help solve some key use cases?

WARNING - EARLY THOUGHTS, SKETCHY DRAFTS

Background

Use Cases

  1. Citation, Altmetrics, Annotations - follow rel="canonical" and, if necessary, rel="collection" links, which would usually lead up to a DOI or similar identifier for the citable object (work or expression).

Example story: A user pastes a splash page URI into a citation management system. How can the system understand that there is a DOI for this item and offer the option to cite that instead, so that the resulting citation is more robust and more likely to be associated with the work in question?
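
As a rough illustration of this story, the sketch below parses an HTTP Link header value and picks a citable identifier, preferring rel="canonical" and falling back to rel="collection". It is only a minimal sketch: the parser handles just the simple `<uri>; name="value"` form, and the function names, the example.org URIs and the DOI shown are hypothetical placeholders, not anything defined by this project.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
# This is a simplified parser for illustration, not a full Web Linking parser.
_LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)')
_PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def parse_link_header(value):
    """Return a list of (target_uri, params) pairs from a Link header value."""
    links = []
    for target, raw_params in _LINK_RE.findall(value):
        params = {}
        for name, quoted, bare in _PARAM_RE.findall(raw_params):
            # Keep the first occurrence of each parameter name.
            params.setdefault(name.lower(), quoted or bare)
        links.append((target, params))
    return links


def targets_with_rel(links, rel):
    """Return target URIs whose rel parameter includes the given relation."""
    return [t for t, p in links if rel in p.get("rel", "").lower().split()]


def citable_uri(links):
    """Prefer rel="canonical", fall back to rel="collection", else None."""
    for rel in ("canonical", "collection"):
        found = targets_with_rel(links, rel)
        if found:
            return found[0]
    return None


# Hypothetical Link header as a splash page might return it.
header = (
    '<https://doi.org/10.5555/12345678>; rel="canonical", '
    '<https://example.org/article.pdf>; rel="item"; type="application/pdf"'
)
print(citable_uri(parse_link_header(header)))
```

A citation manager following this pattern would offer the returned URI as the thing to cite, and keep the pasted splash page URI only if no such link is found.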

  2. Preservation - A use case like LOCKSS needs to answer the question: What are all the components of this work that should be preserved? Follow all rel="describedby" and rel="item" links (potentially through multiple levels of describedby and item). This could also be done with ORE aggregations, so perhaps include ore:aggregates links too.

  3. Crawler with preferred formats - look for rel="alternate" links to preferred formats and understand that content in different formats is equivalent. (Note that alternate is intended to be transitive per http://www.w3.org/TR/html5/links.html#rel-alternate.)
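
The preservation and crawler use cases could be sketched roughly as follows. To keep the example runnable without network access, the links are held in an in-memory dictionary standing in for the Link headers that each resource would return; the URIs, the `LINKS` table and both function names are hypothetical. The alternate lookup inspects only the links on one resource and does not follow alternate transitively.

```python
from collections import deque

# Hypothetical link graph: URI -> list of (target URI, link params).
# In practice each entry would come from that resource's HTTP Link header.
LINKS = {
    "https://example.org/article": [
        ("https://example.org/article.pdf", {"rel": "item", "type": "application/pdf"}),
        ("https://example.org/article.html", {"rel": "item", "type": "text/html"}),
        ("https://example.org/article.xml", {"rel": "describedby", "type": "application/xml"}),
    ],
    "https://example.org/article.html": [
        ("https://example.org/fig1.png", {"rel": "item", "type": "image/png"}),
        ("https://example.org/article.pdf", {"rel": "alternate", "type": "application/pdf"}),
    ],
}


def get_links(uri):
    """Look up the links for a URI; unknown URIs have no links."""
    return LINKS.get(uri, [])


def components_to_preserve(start, get_links, rels=("describedby", "item")):
    """Breadth-first walk over the given relations, returning every URI reached."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        for target, params in get_links(uri):
            if target in seen:
                continue
            # A rel value may list several space-separated relation types.
            if set(params.get("rel", "").split()) & set(rels):
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


def preferred_alternate(uri, get_links, preferred_types):
    """Return the rel="alternate" target whose type ranks highest in preferred_types."""
    alternates = [
        (target, params.get("type"))
        for target, params in get_links(uri)
        if "alternate" in params.get("rel", "").split()
    ]
    for wanted in preferred_types:
        for target, media_type in alternates:
            if media_type == wanted:
                return target
    return None


for uri in components_to_preserve("https://example.org/article", get_links):
    print(uri)
print(preferred_alternate("https://example.org/article.html", get_links, ["application/pdf"]))
```

Passing the relations as a parameter leaves room to add others, such as ore:aggregates, without changing the traversal itself.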

Where now?