Skip to content

ngageoint/seed-cli

Repository files navigation

Seed CLI

seed seed cli

The Seed team provides a fully featured Command-Line Interface (CLI) for algorithm developers looking to offer Seed compliant packaging of their jobs. Using the CLI will enable you to quickly iterate without getting bogged down learning how to interface directly with the underlying container technologies (Docker). It also simplifies the process of attaching Seed metadata to your algorithm prior to publish.

The following sections provide a brief overview of the most commonly used Seed CLI commands. For more detailed explanations, refer to the Seed user guide.

Usage

Seed CLI offers sub-commands that correlate to common operations performed by an algorithm developer. These can be seen from the command-line by launching the CLI without any arguments:

seed

Note that many seed commands call docker in the background, and those docker commands often require elevated privileges. When necessary, seed will attempt to run these commands as sudo and you may be prompted for a password.

The full list of available commands will be returned as output. High-level overview of each command and its expected usage can be found in the following sections.

Build

The first step when starting to package an algorithm for Seed compliance is to define the requirements and interface. This is done by a combination of execution environment configuration (Dockerfile), resource requirement definition and input / output specification (seed.manifest.json). By default, the seed build command assumes both a Dockerfile seed.manifest.json reside in the current working directory. Using these two files in conjunction the build command is able to construct an re-usable Seed compliant image.

A simple example of this would be the addition-job image.

seed build -d examples/addition-job

This will result in a new Docker image that contains com.ngageoint.seed.manifest LABEL and is named according to spec constraint: addition-job-0.0.1-seed:1.0.0

This image can now be executed via the seed run command or pushed to a remote image registry by way of seed publish.

The seed build command also provides the option to automatically publish the image after building via the -publish flag. All flags specified by the seed publish command are available for use.

Init

The init command will initalize a directory with a template seed.manifest.json file.

The following command will put a template seed.manifest.json file in the examples/job directory:

seed init -d examples/job

Run

The primary purpose of the CLI is to easily enable algorithm execution. The common stumbling blocks for new developers are being able to feed data into and retrieve out of the containers as a part of execution. The seed run command facilitates this through introspection of the Seed interface indicating exactly what input data is required, allowing for simply specifying the system locations of data and handling all mounting and output validation and capture operations for the developer.

If gpu resources are requested in the seed manifest file then seed run will automatically attempt to allocate the requested GPU resources. Nvidia-docker(2.0) not being installed or insufficient GPU resources available will result in a docker error. Instructions for installing nvidia-docker(2.0) can be found here: https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(version-2.0)

For a Seed interface with a single inputData.files element with a name of MY_INPUT the seed run command would be as follows:

seed run -in process-file:0.1.0-seed:0.1.0 -i MY_INPUT=/tmp/file_input.txt -o /tmp/outputs

This will volume mount /tmp/file_input.txt as read only and /tmp/outputs into the container, replacing these values with the container relative locations and injecting into the defined args placeholders for consumption by the algorithm.

Batch

Related to the run command, the seed batch command will run an image multiple times with varying inputs. It will take an image name and either a directory with files to use as input or a csv file specifying keys and files (preferred).

For an Seed interface with two inputData.files elements with a name of MY_INPUT and MY_INPUT2 the seed batch command would be as follows:

seed batch -in process-file:0.1.0-seed:0.1.0 -b batch.csv -o /tmp/outputs

and the batch.csv file would look something like this:

MY_INPUT, MY_INPUT2
/path/to/input1.txt, input2.txt
/path/to/different/input.txt, input2.txt
currentDirectoryInput.txt, input2.txt

The image will be run three times and success or failure will be reported for each run along with the location of any output.

List

Simple command to list the local Seed compliant images. It can be run with the following command:

seed list

Allows for discovery of Seed compliant images hosted within a Docker registry (default is docker.io).

The 'seed search' command will search a given registry for repositories (images and all their versions) that end in "-seed". Here is an example of searching quay.io:

seed search -r https://quay.io

If no registry url is supplied, the search command uses docker hub, which requires a organization (or user) to be specified:

seed search -o geoint

If a registry is private, a username and password can be specified with -u and -p options:

seed search -r http://localhost:5000 -u testuser -p testpassword

Publish

Provides a convenient way for algorithm developers to push a Seed image to a registry. This command will tag a seed image appropriately and push it to the specified registry. If successful, it will then remove the local image as well.

Here is an example of publishing the extractor image to the geoint organization on docker hub:

seed publish -in extractor-0.1.0-seed:0.1.0 -r docker.io -o geoint

Publishing will check if an image with the same name and tag exists in the registry and will fail if one is found unless either the force flag (-f) is set or a deconflict tag is specified to increase a version number. A common use case for seed algorithm developers is to publish new versions of their image and this can be done by specifying one of the job or package version flags. Here is an example of updating the extractor image to 1.0.0:

seed publish -in extractor-0.1.0-seed:0.1.0 -r docker.io -o geoint -A -P

This will rebuild the extractor image with the appropriate name & label and publish the image extractor-1.0.0-seed:1.0.0 to docker hub. Finally, if the registry is private, a username and password can be specified with -u and -p options:

seed publish -in extractor-0.1.0-seed:0.1.0 -r localhost:5000 -u testuser -p testpassword

Pull

The Pull command will pull the a Seed compliant image from the remote Docker registry.

This will pull the extractor-0.1.0-seed:1.0.0 image from the docker.io/geoint registry:

seed pull -in extractor-0.1.0-seed:0.1.0 -r docker.io -o geoint

Validate

The Validate command will validate a Seed json file against the Seed schema. This is also done as part of the build and run commands, but if a user is having problems getting their Seed file to validate this can be useful to debug without those additional steps.

This command will validate the Seed file in the examples/extractor directory using the schema built-in to the Seed CLI tool:

seed validate -d examples/extractor

To use a different schema, pass it in using the -s flag:

seed validate -d examples/extractor -s schema/0.1.0/seed.manifest.schema.json

Version

The version command will print the version of the Seed CLI tool:

seed version

Development

If you wish develop on the Seed CLI, you will need an installation of Golang 1.6+ (for vendoring support). Once you have a GOPATH defined, the following will allow you to clone and build the CLI project:

# Clone repo and retrieve dependencies
git clone https://github.com/ngageoint/seed-cli.git $GOPATH/src/github.com/ngageoint/seed-cli
cd $GOPATH/src/github.com/ngageoint/seed-cli
go get ./...

# Build binary
./build-cli.sh

# Optionally add it to your local system binary folder for easy execution
cp -f output/seed-linux-amd64 /usr/local/bin/seed