
EDA Compute

Plugin Development

Adding a Plugin

  1. Add the plugin RAML to new or existing files in the plugins schema directory.
    (See the example plugin schema)

  2. Add the new plugin endpoint to the project’s API Definition RAML.

  3. Run the command make raml-gen-code to generate the new Java source code for the plugin being added.

  4. Add a new package for your plugin in the plugins package
    (See the example plugin package)

  5. Follow the steps outlined in the plugins package readme.

  6. Add an endpoint for your new plugin in the Plugin Controller (a rough sketch follows this list).
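
As an illustration of the final step, a controller endpoint for the example plugin might look like the sketch below. The response type, method name, and submitJob helper are assumptions for illustration only; mirror the existing endpoints in the Plugin Controller and the RAML-generated sources rather than copying this verbatim.

Example Controller Endpoint (hypothetical sketch).
// All names below are assumptions based on typical RAML-generated code.
@Override
public PostExampleResponse postExample(ExamplePluginRequest entity) {
  // Queue the job (or locate its cached result) and report its status.
  return PostExampleResponse.respond200WithApplicationJson(submitJob(entity));
}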

Plugin RAML

TODO

API RAML

Plugins MUST add at least one endpoint to the api.raml file. This endpoint MUST accept a POST request with a request body that extends the ComputeRequestBase type defined in the computes.raml RAML library file. This endpoint MUST specify, at minimum, a 200 status response returning the RAML type lib.JobResponse in the media type application/json.

Additionally, each plugin execution endpoint definition MUST contain a sub-endpoint that will be used to download result files from that plugin’s completed jobs. This sub-endpoint SHOULD accept a URI parameter used to specify the type or name of the file to be downloaded.

Example Plugin Definition
  /example:
    post:
      body:
        application/json: lib.ExamplePluginRequest
      responses:
        200:
          body:
            application/json: lib.JobResponse
    /{file}:
      uriParameters:
        file:
          type: string
          description: MUST be one of "meta", "tabular", "statistics".
      post:
        body:
          application/json: lib.ExamplePluginRequest
        responses:
          200:
            body:
              text/plain: any

Plugin Context

On instantiation, plugin implementations are given a PluginContext instance which provides the following:

  • Access to the plugin job’s input data.

  • Access to the plugin job’s local filesystem workspace.

  • A method for starting/running arbitrary shell commands in the plugin job’s context.
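
To make these concrete, a plugin body might use the context along the lines of the sketch below. getWorkspace() and readAsString are shown under Input Data later in this document; the commented-out calls are hypothetical stand-ins, since the exact command-builder and output-writing APIs are described in the plugins package readme rather than reproduced here.

Using the Plugin Context (sketch).
// Inside an AbstractPlugin subclass; a sketch, not the confirmed API.
@Override
protected void execute() throws Exception {
  // Read the tabular input named by one of this plugin's StreamSpecs.
  String input = getWorkspace().readAsString("foobar");

  // Run an external tool in the job's context; this call is a hypothetical
  // stand-in for the ComputeProcessBuilder utility described under Output
  // Data below (its STDERR is routed to error.log automatically).
  // buildCommand("some-tool", "arg1").run();

  // Write result files into the workspace for persistence to S3.
  // getWorkspace().writeString("output-stats", "...");  // hypothetical
}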

Input Data

On execution, plugins will be provided with the following values, in addition to the named tabular data files generated from the StreamSpec list the plugin defines.

Standard Inputs
  • The original HTTP request sent in to trigger the execution of a job.

  • The job configuration (pulled from and additionally available as part of the original HTTP request).

  • Study metadata retrieved from the EDA Subsetting service.

The tabular data retrieved from the EDA Merge Service based on the plugin’s provided StreamSpec instances will be written to the plugin’s local scratch workspace and will be available via the workspace property provided on the AbstractPlugin base type.

For example, if your plugin defines a StreamSpec instance with the name foobar, on plugin start, the data retrieved from the EDA Merge Service for your StreamSpec would be available by doing the following.

Opening an Input.
// Open the input as an InputStream
getWorkspace().openStream("foobar")

// Open the input as a Reader
getWorkspace().openReader("foobar")

// Get full input as a string
getWorkspace().readAsString("foobar")

Validating Inputs

Compute plugins may want to declare constraints on the types of inputs they can accept. These constraints can be declared in the plugin’s PluginProvider implementation via the getConstraintSpec method. The constraints are exposed through the /computes endpoint, which enables clients to determine the validity of inputs.
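
As an illustration, a constraint declaration could look roughly like the following. The builder methods shown are assumptions modeled on other VEuPathDB EDA services, not this project’s confirmed API; consult the ConstraintSpec type referenced by PluginProvider for the real shape.

Declaring Constraints (hypothetical sketch).
// The builder calls below are assumptions for illustration only.
@Override
public ConstraintSpec getConstraintSpec() {
  return new ConstraintSpec()
    .pattern()
      .element("xAxisVariable")
        .types(APIVariableType.NUMBER, APIVariableType.DATE)
      .element("yAxisVariable")
        .types(APIVariableType.NUMBER)
    .done();
}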

Output Data

Plugins are expected to output specific target files, which will be persisted to a cache in S3.

Plugins are not required to output every file listed below; however, only the listed files will be persisted.

Expected Output Files
output-stats

Statistics data generated by the plugin execution.

This file will be made available for download via the HTTP API.

output-meta

Metadata generated by the plugin execution.

This file will be made available for download via the HTTP API.

output-data

Tabular data generated by the plugin execution.

This file will be made available for download via the HTTP API.

error.log

STDERR output from the execution of a shell command via the Compute Service’s CLI call API.

Plugins do not need to, and should not, write to this file directly; the ComputeProcessBuilder utility made available through the provided plugin context will handle configuring external processes to write to this file.

exception.log

Exception stacktrace output. This file is created and populated with the stacktrace of any uncaught exception thrown by a plugin’s execution.

Plugins may choose to write to this file if they handle their own exceptions internally and do not throw uncaught exceptions.
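
Putting the above together, a plugin’s execution might end by writing its results into the workspace along the lines of the sketch below. writeString is a hypothetical write-side counterpart to the readAsString method shown earlier, not a confirmed workspace API, and the three variables are placeholders for plugin-produced content.

Writing Output Files (hypothetical sketch).
// writeString is an assumed counterpart to readAsString; statsJson,
// metaJson, and tabularResult stand in for plugin-produced content.
getWorkspace().writeString("output-stats", statsJson);
getWorkspace().writeString("output-meta", metaJson);
getWorkspace().writeString("output-data", tabularResult);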

Plugin Workspace

When executed, a plugin job will be provided with a temporary local scratch workspace. Plugins are expected to write their output data into this workspace, from which it will be persisted to the S3 cache.

On completion of the plugin’s execution, the workspace will be deleted.

Plugins may use this workspace for any additional filesystem-based operations they need, provided those operations do not extend beyond the lifecycle of the source job itself.

Metrics

EDA Compute Specific Metrics

  Metric             Description
  plugin_exec_time   Histogram of plugin execution time in seconds by plugin.
  plugin_successes   Counter of successful plugin executions by plugin.
  plugin_failures    Counter of failed plugin executions by plugin.

Async Platform Metrics

  Metric          Description
  queue_time      Histogram of times spent queued by queue.
  queued_jobs     Gauge of the number of currently queued jobs by queue.
  job_successes   Counter of successful job executions by queue.
  job_failures    Counter of failed job executions by queue.

Project Development

Project Structure

This project is written in two languages, Java and Kotlin, and is divided accordingly. The core of the service and its internals are written in Kotlin, while the segment of the project intended for plugin writers is in Java. The two source sets live under src/main/java and src/main/kotlin.

The intention is a clear separation between service internals and plugin code, allowing plugin developers to work entirely in Java in a space free from service implementation clutter.