mtholder edited this page Sep 9, 2014 · 16 revisions

Overview

For a general schematic of the open tree architecture, see this [architectural diagram] 1. The basic goal of this repo is to provide web-service interfaces to the corpus of phylogenetic studies. The corpus is referred to as phylesystem.

The phylesystem-api provides basic read/write access and format conversion. Search functionality is supplied by oti.

Most of the business logic of dealing with the phylesystem corpus is coded in peyotl. This is nice because it facilitates code reuse and testing that does not require running the full web stack. This is a pain because devs will need to coordinate merges of peyotl branches and the phylesystem-api branches that depend on them.

Workflows

Study editing

A typical series of study edit operations, as choreographed by the open tree curator app (which is running the code in the curator subdir of the opentree repo), is shown below. We are in the process of moving from v1 of the API to v2, so some of the URLs used could be stale. The template configuration file holds the patterns used to construct the actual URLs used by the curator app, so you should use that if you need the exact URLs.

  1. request brief list of studies and metadata from oti's findAllStudies service
  2. user selects a study, and curator app fetches a "NexSON with extra info" using a GET to phylesystem-api's v1/study/{STUDY_ID}.
  3. the user corrects various deficiencies of the study, and the curator app saves these changes using a PUT to phylesystem-api's v1/study/{STUDY_ID}
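
Steps 2 and 3 above can be sketched as two HTTP requests. This is a minimal sketch, not the curator app's code: the base URL is a placeholder, the study ID pg_10 in the usage note is a made-up example, and sending starting_commit_SHA as a query parameter is an assumption (this page only says the client must send that arg). The requests are built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host (assumption); the real URL patterns live in the
# curator app's template configuration file.
API_BASE = "https://phylesystem.example.org"


def study_url(study_id, api_base=API_BASE):
    """Build the v1/study/{STUDY_ID} URL quoted in the steps above."""
    return "{}/v1/study/{}".format(api_base.rstrip("/"), study_id)


def build_fetch_request(study_id):
    """Step 2: a GET for the 'NexSON with extra info'."""
    return Request(study_url(study_id), method="GET")


def build_save_request(study_id, nexson, starting_commit_sha):
    """Step 3: a PUT carrying the edited NexSON.

    Assumption: the parent commit SHA travels as a query parameter
    and the NexSON is the JSON request body.
    """
    query = urlencode({"starting_commit_SHA": starting_commit_sha})
    body = json.dumps(nexson).encode("utf-8")
    return Request(
        "{}?{}".format(study_url(study_id), query),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the result of build_fetch_request("pg_10") to urllib.request.urlopen would send the request; against the placeholder host this will fail, so substitute the URL from your own configuration first.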

Study creation

  1. the curator app prompts the user to enter a new study from scratch or upload a file.

  2. No studies "in the wild" will be in NexSON. If the user uploads data to be imported, the curator app uses its own controllers to convert the inputs to NexSON. These calls are documented in the opentree/curator README. Briefly, they are:

    1. to_nexson is called with the blob of input; it uses NCL to convert the input to NeXML and peyotl to convert the NeXML to open tree NexSON.

    2. If there is a previous NexSON blob associated with this study (e.g. if the user is uploading trees as separate Newick trees in a series of operations), then merge_otus is called because the conversion of "external" sources to NexSON is not aware of previously created IDs.

  3. Alternatively, the user can create a new OT study using a TreeBASE ID; in this case the curator app just prompts the user for that ID.

  4. A POST to phylesystem-api's v1/study will validate the input, create a new study ID, and return a receipt with the ID and git SHAs for the new study.
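
The ID problem behind the merge_otus call in step 2 can be illustrated with a toy example. The function below is hypothetical and does not reproduce what merge_otus actually does; it only shows why two independently converted inputs can reuse the same IDs, and one simple way to reconcile them. The "otuN" ID format and the {id: label} layout are assumptions made for the illustration.

```python
def merge_otu_ids(existing, incoming):
    """Toy merge of two {otu_id: label} mappings (hypothetical helper).

    Each conversion numbers its OTUs without knowing about earlier
    uploads, so IDs can collide. Colliding incoming IDs get a fresh ID.
    Returns the merged mapping and the {old_id: new_id} renames.
    """
    merged = dict(existing)
    taken = set(existing) | set(incoming)  # IDs that must not be reused
    renames = {}
    counter = 0
    for old_id, label in incoming.items():
        new_id = old_id
        if old_id in existing:
            # Find the next "otuN" that neither input already uses.
            while True:
                counter += 1
                new_id = "otu{}".format(counter)
                if new_id not in taken:
                    break
            taken.add(new_id)
            renames[old_id] = new_id
        merged[new_id] = label
    return merged, renames


first_upload = {"otu1": "Homo sapiens", "otu2": "Pan troglodytes"}
second_upload = {"otu1": "Gorilla gorilla", "otu3": "Pongo abelii"}
merged, renames = merge_otu_ids(first_upload, second_upload)
```

Here the second upload's otu1 clashes with the first upload's otu1, so it is renamed, while otu3 is kept. Any tree in the second upload that refers to the old ID would need the same rename applied.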

Overview of the phylesystem-api's implementation of its part of the workflows

Step 2 of editing - the GET call to a study in phylesystem-api:

On the server side this triggers several calls to peyotl's Phylesystem wrapper. The key ones are:

  1. phylesystem.return_study
  2. phylesystem.add_validation_annotation
  3. phylesystem.get_version_history_for_study_id

In terms of the actions performed on the server, these steps entail:

  1. the phylesystem-api waits for lock on the phylesystem git repo
  2. the master branch is checked out
  3. the requested study is read.
  4. If no cached validation annotation for the study is available, then one is generated.
  5. The annotation is injected into the NexSON
  6. the version history of the study is constructed (this is where the call will be after https://github.com/OpenTreeOfLife/phylesystem-api/issues/107 is fixed).
  7. the phylesystem git repo is unlocked.
  8. the "extra info" is added to the response JSON which will also hold the NexSON
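
The server-side sequence above can be sketched with an in-memory stand-in for the git repo. Everything here is hypothetical: FakePhylesystem is not peyotl's Phylesystem class, the validation is a stub, and the response keys "nexson" and "history" are made-up names. The sketch only shows the ordering: lock, read, reuse or generate the annotation, inject it, gather history, unlock, then assemble the response.

```python
import threading


class FakePhylesystem:
    """In-memory stand-in for the git-backed corpus (hypothetical)."""

    def __init__(self, studies, histories):
        self._lock = threading.Lock()
        self._studies = studies          # study_id -> NexSON dict
        self._histories = histories      # study_id -> list of commit SHAs
        self._annotation_cache = {}      # study_id -> validation annotation
        self.validations_run = 0         # counts cache misses

    def _validate(self, nexson):
        """Stub validator; the real checks live in peyotl."""
        self.validations_run += 1
        return {"errors": [], "checked_keys": sorted(nexson)}

    def get_study(self, study_id):
        with self._lock:  # steps 1 and 7: wait for the lock, release on exit
            # Step 2 (checking out master) has no in-memory analogue.
            nexson = dict(self._studies[study_id])           # step 3: read
            if study_id not in self._annotation_cache:       # step 4
                self._annotation_cache[study_id] = self._validate(nexson)
            nexson["annotation"] = self._annotation_cache[study_id]  # step 5
            history = list(self._histories[study_id])        # step 6
        # Step 8: the "extra info" travels alongside the NexSON.
        return {"nexson": nexson, "history": history}


corpus = FakePhylesystem({"pg_10": {"nexml": {}}}, {"pg_10": ["sha1"]})
first = corpus.get_study("pg_10")
second = corpus.get_study("pg_10")
```

The second call reuses the cached annotation, so the stub validator runs only once. The stored study is copied before the annotation is injected, so the copy on "disk" is left unchanged.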

Step 3 of editing - the PUT to a study in phylesystem-api:

  1. make sure that the client sent in a valid starting_commit_SHA arg that will identify the parent commit for this edit. This should be the commit SHA of the version of the study that was shown to the user, so that the history correctly reflects the lineage of the files being edited.
  2. call peyotl.phylesystem.git_workflows.validate_and_convert_nexson to validate the NexSON and convert it to the version of NexSON syntax that is being used by the phylesystem-api.
  3. call peyotl.phylesystem.annotate_and_write to make the new commit.
  4. If the commit can be merged to master, then a deferred "push to github" call will be spawned. Merging should hopefully succeed almost all the time; the only exceptions should be if 2 users are editing the same study at the same time. In that case the first PUT should be merge-able, but the second will not be.
  5. return the info about the commit.
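
The role of starting_commit_SHA in step 4 can be sketched with a toy commit store. This is not how peyotl or git decides mergeability: the class below is hypothetical, and it simplifies "can be merged to master" to "master has not moved since the parent commit". It does reproduce the case described above, where two users start from the same version and only the first PUT merges.

```python
import hashlib
import json


class FakeStudyRepo:
    """Toy commit history for a single study (hypothetical)."""

    def __init__(self, nexson):
        self.commits = {}  # sha -> (parent_sha, nexson)
        self.master = self._commit(None, nexson)

    def _commit(self, parent_sha, nexson):
        """Store a commit and return a SHA-1 of its parent and content."""
        payload = json.dumps([parent_sha, nexson], sort_keys=True)
        sha = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        self.commits[sha] = (parent_sha, nexson)
        return sha

    def put(self, nexson, starting_commit_sha):
        # Step 1: the parent commit must be one we know about.
        if starting_commit_sha not in self.commits:
            raise ValueError("unknown starting_commit_SHA")
        # Step 2 (validation and syntax conversion) is omitted here.
        new_sha = self._commit(starting_commit_sha, nexson)  # step 3
        # Step 4, simplified: mergeable only if master is still the parent.
        merged = starting_commit_sha == self.master
        if merged:
            self.master = new_sha  # the real service would also defer a push
        return {"sha": new_sha, "merged": merged}  # step 5


repo = FakeStudyRepo({"title": "draft"})
shown_to_both_users = repo.master
first_put = repo.put({"title": "edit by user A"}, shown_to_both_users)
second_put = repo.put({"title": "edit by user B"}, shown_to_both_users)
```

Both commits are recorded with the correct parent, so the lineage is preserved even for the edit that could not be merged.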

Side note (or slide note, if you prefer)

The slide presentation containing the [architectural diagram] 1 is posted at http://phylo.bio.ku.edu/slides/ot.html and it can be regenerated by running https://github.com/OpenTreeOfLife/phylesystem-api/blob/master/docs/build-presentations.sh