kltm/lala
LALA (Light Application-Level API)

Current work and implementations

Presentations and context

This proposal has its origins in initial work connecting Noctua to Textpresso Central, for the use cases of 1) adding annotons and 2) enriching the evidence captured by Noctua with TPC textual evidence IDs. Later work built on that base for communication between Noctua and PubAnnotation, at BioHackathon 16, capturing text spans to add as evidence to models. Full generalization was started in the build-up to, and during, BLAH 3.

Overview

Much thought and ink have been spent on sharing information between resources, from the use of common identifiers to the use of common data stores. We would like to explore an area that has not, to the authors' knowledge, been as well explored: the sharing of simple information directly between web-based applications. For example, a curator may wish to import IDs from one tool into another or curate the same paper using different tools, or a researcher may wish to analyze the same piece of genome in different analysis tools.

More concretely, the use case we wish to look at is how to obtain small packets of specific information from an external resource that has its own associated web application; we do this by round-tripping a packet of information (acting as a "black box" or "piggy bank") through the external application using basic HTTP methods, allowing an easy high-level "federation". In an environment where much effort is spent creating rich web applications tailored to the habits of specific groups of scientists and curators, users and engineers have much to gain from being able to reuse functionality from external applications in their workflows and stacks; a common specification such as this also has the added benefit of driving traffic to the external web applications that implement it.

In creating this protocol, we want to capture the following qualities:

  • Easy to implement
    Complexity can be a barrier to adoption and implementation
  • Basic HTTP tooling
    Any system should have easy access to the tools necessary
  • "Stateless"
    Simplifying debugging and implementation
  • Minimal need for initial coordination
    Beyond what data is to be returned, the external application does not need to understand the transiting packet
  • No need for the calling application to coordinate its own API changes after initial coordination
    As the calling application is responsible for decoding the information it initially sent, and for the location it is sent to, major API changes can occur without coordinating with any other resource
  • Have the ability to perform operations "remotely" while still logged-in to the calling application, without the need to coordinate cross-site logins
    This can be done by placing an authorization token in the transiting packet

Taking inspiration from the methods used by Galaxy to pull data in from external applications, we'd like to propose (and get feedback on) a general pattern for passing relatively simple data directly between applications. Variations of this proposal have been implemented or explored in applications such as: Noctua, Textpresso Central, AmiGO, and PubAnnotation.

Details

The tickets above go a long way toward details and discussion, but a very brief summary would be:

  • caller passes "black box" to external
  • external holds on to "black box" during workflow, adding data to "black box"
  • external passes "black box" back to caller

A more detailed summary would be:

  • caller links/POSTs to external with the additional parameters endpoint_url (a URL of the caller's choice) and endpoint_arguments (an encoded JSON blob)
  • in the external application, a button (or what have you) is created that will POST the data in endpoint_arguments, plus external's additions into the requests field, to endpoint_url, using whatever workflow is made available
  • endpoint_url resolves to an endpoint that recreates the calling state from caller and runs the additional data from external in the context of the passed-through data
  • depending on a (TBD) passed parameter, external either continues the session or closes and passes control back to caller (e.g. forwards the user to the next page in the workflow or returns a raw JSON response to the external application)
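The first step above, forming the link from caller to external, can be sketched in a few lines. This is a minimal Python sketch, not any application's actual code; the URLs, the helper name, and the packet contents are all hypothetical, with only the endpoint_url and endpoint_arguments parameter names coming from the proposal itself:

```python
import json
import urllib.parse

def build_outgoing_url(external_base, endpoint_url, packet):
    """Pack the caller's return endpoint and an arbitrary JSON packet
    into GET parameters on the external application's URL."""
    params = urllib.parse.urlencode({
        "endpoint_url": endpoint_url,
        "endpoint_arguments": json.dumps(packet),
    })
    return external_base + "?" + params

# Hypothetical usage: send a user to the external application with a
# session token and an empty requests list to be filled.
url = build_outgoing_url(
    "https://external.example/curate",
    "https://caller.example/lala/return",
    {"token": "uzrtkn", "requests": []},
)
```

Note that the packet rides along as an opaque, URL-encoded string; the external application only needs to know where to find the requests field, not what anything else means.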

A canned interaction example (somewhat more complicated, taken from the Textpresso Central interaction with Noctua) might have the outgoing URL GET parameters:

endpoint_url = URL encoded location of caller endpoint (POST to this URL)
endpoint_arguments = URL encoded stringified JSON blob structured as below
{
    "token": "uzrtkn",
    "provided-by": [
        "http://foo.bar"
    ],
    "x-user-id": "http://orcid.org/foo",
    "x-model-id": "gomodel:01234567",
    "x-client-id": "tpc",
    "requests": []
}
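On the external application's side, the handling of such a packet amounts to: decode it, hold it as an opaque black box, and append additions to requests before POSTing it back. A sketch of that step, under the assumption of a Python external application (the function name and request shape are illustrative):

```python
import json
import urllib.parse

def prepare_return(query_string, addition):
    """Decode the transiting packet from the incoming query string,
    append this application's data to the 'requests' list, and return
    (endpoint_url, packet) ready for a POST back to the caller. All
    other fields are passed through untouched."""
    q = urllib.parse.parse_qs(query_string)
    endpoint_url = q["endpoint_url"][0]
    packet = json.loads(q["endpoint_arguments"][0])
    packet.setdefault("requests", []).append(addition)
    return endpoint_url, packet
```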

Incoming/returning data, POSTed to the endpoint_url, would be the same as the outgoing data above, except with requests added in the proper location:

{
    "token": "uzrtkn",
    "provided-by": [
        "http://foo.bar"
    ],
    "x-user-id": "http://orcid.org/foo",
    "x-model-id": "gomodel:01234567",
    "x-client-id": "tpc",
    "requests": [
        {
            "database-id": "UniProtKB:A0A005",
            "evidence-id": "ECO:0000314",
            "class-id": "GO:0050689",
            "reference-id": "PMID:666333",
            "external-id": "XXX:YYYYYYY",
            "comments": [
                "foo",
                "bar"
            ]
        }
    ]
}
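The endpoint behind endpoint_url then recovers the calling state from the round-tripped packet and acts on the additions. A hypothetical caller-side handler might look like the following; the function, the exact validation, and the choice of status codes are a sketch, not part of the specification:

```python
import json

def handle_return(body, expected_token):
    """Caller-side endpoint: decode the returned packet, check the
    round-tripped session token against the one we issued, and hand
    the external application's 'requests' additions to our own logic.
    Returns (HTTP status code, requests to act on)."""
    try:
        packet = json.loads(body)
    except ValueError:
        return 400, []   # malformed packet: bad request
    if packet.get("token") != expected_token:
        return 403, []   # token mismatch: reject
    return 200, packet.get("requests", [])
```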

Keep in mind that the actual contents of the passed packet are "arbitrary" and completely decided by the caller, except for the requests list field, which can be filled by the external application.

Control after POSTing to the caller's API would be handled by the caller using HTTP codes. It might be interesting to consider a more interactive use case for calling applications, such as passing control back to the external application after the data has been processed by the caller. The flow of control is left to the calling application, as it may have opened a new window for the external application that it wants closed, or have some other process in place to get the user re-oriented.

A fuller explanation, with examples, would look an awful lot like the tickets above. A good place to start might be the later pubannotation/pubannotation#3.

Open questions

While we have worked out the details to satisfy our core use cases, if not implement them all, we are interested in federating more generally with the wide range of biological and curation applications available. We would like input on how that could happen from the community. Ideas include:

  • better documentation
  • changes to the specification
  • concrete implementations and examples of, and/or use of, the full specification
  • generic implementations/middleware
  • a central registry or interaction with a current registry

As well, there are some edge cases, especially around passing control (or not) at the end of an interaction that we would like to clarify by working through implementations in the real world.

Availability

GitHub

License

This work is a proposal for a software protocol, not yet a concrete or generalized implementation. As such, the text of this work is available under the CC-BY 4.0 license.

About

A description of a light application-level API for passing control and simple data between web-based scientific applications.
