diff --git a/docs/03.apps_ci_jenkins.md b/docs/03.apps_ci_jenkins.md
index bd91dee..d11fd81 100644
--- a/docs/03.apps_ci_jenkins.md
+++ b/docs/03.apps_ci_jenkins.md
@@ -4,30 +4,211 @@ title: Continuous Integration with Jenkins
 tagline:
 ---
-The SD2E project uses [Jenkins](http://jenkins.sd2e.org/) for continuous
-integration (CI). It is now standard practice to set up CI for all Agave apps.
-This will ensure your deployed app is always up to date with the master branch
-of your git repo, and it will alert you if jobs are not working.
+Jenkins is an automation server for running continuous integration tests. The
+SD2E project uses [Jenkins](http://jenkins.sd2e.org/) for continuous integration
+(CI) to ensure any changes that have been made to apps do not break their core
+functionality. It is now standard practice to set up CI for all Agave apps. This
+will ensure your deployed app is always up to date with the master branch of
+your git repo, and it will alert you if jobs are not working. This guide will
+help you integrate CI testing into any app you would like to deploy, but is not
+meant to be a replacement for the
+[Jenkins documentation](https://jenkins.io/doc/).
-#### Header 1
+#### The Jenkins file
+
+The Jenkins file defines the stages and environment variables of your Jenkins
+job. The Jenkins file is written in [Groovy](http://groovy-lang.org/), and
+should be located in the top-level directory of your app. For the previous
+FastQC app example, the Jenkins file would be in the `~/fastqc-app/` directory.
+Here is an example of a Jenkins file for the FastQC app:
 ```
-% code block
+#!groovy
+
+pipeline {
+    agent any
+    environment {
+        AGAVE_JOB_TIMEOUT = 900
+        AGAVE_JOB_GET_DIR = "job_output"
+        AGAVE_DATA_URI = "agave://data-sd2e-community/sample/sailfish/test/read1.fastq"
+        CONTAINER_REPO = "fastqc"
+        CONTAINER_TAG = "test"
+        AGAVE_CACHE_DIR = "${HOME}/credentials_cache/${JOB_BASE_NAME}"
+        AGAVE_JSON_PARSER = "jq"
+        AGAVE_TENANTID = "sd2e"
+        AGAVE_APISERVER = "https://api.sd2e.org"
+        AGAVE_USERNAME = "sd2etest"
+        AGAVE_PASSWORD = credentials('sd2etest-tacc-password')
+        REGISTRY_USERNAME = "sd2etest"
+        REGISTRY_PASSWORD = credentials('sd2etest-dockerhub-password')
+        REGISTRY_ORG = credentials('sd2etest-dockerhub-org')
+        PATH = "${HOME}/bin:${HOME}/sd2e-cloud-cli/bin:${env.PATH}"
+    }
+    stages {
+
+        stage('Create Oauth client') {
+            steps {
+                sh "make-session-client ${JOB_BASE_NAME} ${JOB_BASE_NAME}-${BUILD_ID}"
+            }
+        }
+        stage('Build container') {
+            steps {
+                sh "apps-build-container -O ${REGISTRY_USERNAME} --image ${CONTAINER_REPO} --tag ${CONTAINER_TAG}"
+            }
+        }
+        stage('Deploy to TACC.cloud') {
+            steps {
+                sh "apps-deploy -T -O ${REGISTRY_USERNAME} --image ${CONTAINER_REPO} --tag ${CONTAINER_TAG}"
+                sh "cat deploy-*"
+            }
+        }
+        stage('Run a test job') {
+            steps {
+                sh "run-test-job deploy-${AGAVE_USERNAME}-job.json ${AGAVE_JOB_TIMEOUT}"
+                sh "get-test-job-outputs deploy-${AGAVE_USERNAME}-job.json.jobid ${AGAVE_JOB_GET_DIR}"
+            }
+        }
+        stage('Validate results') {
+            steps {
+                sh "python -m pytest tests/validate-job --job-directory ${AGAVE_JOB_GET_DIR}"
+            }
+        }
+    }
+    post {
+        always {
+            sh "delete-session-client ${JOB_BASE_NAME} ${JOB_BASE_NAME}-${BUILD_ID}"
+        }
+        success {
+            deleteDir()
+        }
+    }
+}
+
 ```
+Copy and paste the above text into a file called `Jenkinsfile` in the top level
+of your `~/fastqc-app/` directory.
+
+The file is divided into three sections: `environment`, `stages`, and `post`.
+The `environment` section defines environment variables needed by the Jenkins
+server to run the test job. Most of the variables in this section are specific
+to the Jenkins server and should be left alone. The `CONTAINER_REPO` and
+`CONTAINER_TAG` variables should, collectively, point to a "test" repository
+location so that the versioned app is not overwritten. It is good practice to
+use the app name (e.g. "`fastqc`") as the `CONTAINER_REPO`, and "`test`" as the
+`CONTAINER_TAG`.
+
+You may also need to change `AGAVE_DATA_URI` if your data is located on some
+other system or in a different path. Also note that if your test data is located
+in your private storage system, `data-tacc-work-username`, you will need to grant
+`READ` access to the `sd2etest` user with the following command:
+
+```
+systems-roles-addupdate -u sd2etest -r USER data-tacc-work-username
+```
+
+The `stages` section of the Jenkins file will also remain largely unchanged. You
+will typically need the following sections:
+1. `Create Oauth client`
+2. `Build container`
+3. `Deploy to TACC.cloud`
+4. `Run a test job`, and
+5. `Validate results`
+
+The first four steps depend on scripts that exist on the Jenkins server, and
+should work the same for most apps. The final step must be written by the app
+developer, and will be different from app to app. More details on this step are
+included in the next section below.
+
+Finally, the `post` section uses one more script on the Jenkins server to clean
+up the session and exit the Jenkins test. This section should not change.
-#### Header 2
+#### Validating results with pytests
+
+To validate your results, you will need to define pytests that will be run to
+verify your app is functional. For the FastQC app, navigate to the tests
+directory `~/fastqc-app/tests`, and create a new sub-directory called
+`validate-job`:
+```
+cd ~/fastqc-app/tests
+mkdir validate-job
+cd validate-job
+```
-[Example link](https://url/)
+Create two files in that directory, `conftest.py` and `test_files.py`. You can
+copy and paste these two examples directly:
+`conftest.py`:
+```
+import pytest
+
+def pytest_addoption(parser):
+    parser.addoption("--job-directory", action="store", default="job_output",
+                     help="Directory containing output to evaluate")
+
+@pytest.fixture
+def job_directory(request):
+    return request.config.getoption("--job-directory")
+```
+
+`test_files.py`:
+```
+'''Test for specific files existence in a directory'''
+import pytest
+import os
+
+'''Parameterize the test with a list of required files'''
+@pytest.mark.parametrize("file_list", [
+    (['reads1_fastqc.html', 'reads1_fastqc.zip'])
+])
+def test_files(job_directory,file_list):
+    '''checks job_directory for existence of all contents of file_list'''
+    # Existence
+    listdir = os.listdir(job_directory)
+    assert(len(list(set(listdir) & set(file_list))) == len(file_list)), \
+        "Missing files"
+    # Files are readable and not zero length
+    for f in file_list:
+        try:
+            fstat = os.stat(os.path.join(job_directory, f))
+        except OSError:
+            raise IOError("Couldn't stat {}".format(f))
+        assert (fstat.st_size > 0), "Zero length file: {}".format(f)
+```
+
+The `conftest.py` file adds a `--job-directory` option to pytest, which points to
+the location of the output files that are created by running your app. The test
+in `test_files.py` checks that the files exist and are larger than 0 bytes.
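
The two checks in `test_files.py` (existence, then non-zero size) can also be sketched as a standalone helper, which may be convenient for trying the logic on a local directory before wiring it into Jenkins. This is only an illustrative sketch: the function name `find_problem_files` and the extra file name `summary.txt` are hypothetical and are not part of the SD2E tooling or the FastQC app.

```python
import os
import tempfile


def find_problem_files(job_directory, required_files):
    """Return (missing, empty) lists for the required output files.

    Hypothetical helper for illustration; it mirrors the two checks in
    test_files.py: each file must exist and must not be zero length.
    """
    present = set(os.listdir(job_directory))
    missing = [f for f in required_files if f not in present]
    empty = [
        f for f in required_files
        if f in present
        and os.path.getsize(os.path.join(job_directory, f)) == 0
    ]
    return missing, empty


if __name__ == "__main__":
    # A throwaway directory stands in for the job output directory.
    with tempfile.TemporaryDirectory() as job_dir:
        with open(os.path.join(job_dir, "reads1_fastqc.html"), "w") as handle:
            handle.write("<html></html>")
        # Create the zip file but leave it empty to trigger the size check.
        open(os.path.join(job_dir, "reads1_fastqc.zip"), "w").close()

        missing, empty = find_problem_files(
            job_dir, ["reads1_fastqc.html", "reads1_fastqc.zip", "summary.txt"]
        )
        print("missing:", missing)
        print("empty:", empty)
```

Reporting missing and empty files separately makes it easier to tell a job that never produced output from one that produced truncated output.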
+
+If you are creating pytests for a different app, you can simply change the output
+file names on line 7 of `test_files.py` to the file names that are output by your app:
+
+```
+'''Parameterize the test with a list of required files'''
+@pytest.mark.parametrize("file_list", [
+    (['reads1_fastqc.html', 'reads1_fastqc.zip'])
+])
+```
+In this example, Jenkins is testing for the existence of `reads1_fastqc.html`
+and `reads1_fastqc.zip`, which indicate that the FastQC app ran successfully.
-#### Header 3
+#### Setting up the Jenkins server
+
+Now that you have created a Groovy file and defined pytests, you will need to
+add your repo to the Jenkins server and define when it should run tests. We
+typically have the server run a test every time a change is pushed to the
+master branch, or any time a merge request to the master branch is made. It may
+also be a good idea to schedule weekly or monthly builds to ensure your app
+continues working even when no changes have been made to the source repo.
+
+To set up Jenkins tests for your app, follow the instructions in this video tutorial:
+
 ---
diff --git a/docs/03.create_app_04.md b/docs/03.create_app_04.md
index fa89a3b..139b9ea 100644
--- a/docs/03.create_app_04.md
+++ b/docs/03.create_app_04.md
@@ -95,6 +95,7 @@ First, create a template job json file:
 ```
 % jobs-template username-fastqc-0.11.7 > job.json
 ```
+If it is not already present, add `"fastq": "agave://data-sd2e-community/sample/sailfish/test/read1.fastq"` to the "inputs" field of the job.json to run a test job.
 Here are the expected contents of `job.json`:
 ```
diff --git a/docs/04.abaco_custom_reactor.md b/docs/04.abaco_custom_reactor.md
new file mode 100644
index 0000000..3158c36
--- /dev/null
+++ b/docs/04.abaco_custom_reactor.md
@@ -0,0 +1,240 @@
+---
+layout: page
+title: Create a Custom Reactor
+tagline:
+---
+
+In this guide, we will demonstrate how to create an Abaco reactor from scratch.
+Abaco reactors react to events such as the successful upload of a file or the
+completion of an Agave job. Abaco reactors enable automation by triggering
+automatically, rather than manually, in response to defined events. If you are
+deciding between building an Agave app or an Abaco reactor, consider the
+following rule of thumb: If the computation requires less than one minute and
+minimal resources, use an Abaco reactor. If the computation requires more than
+one minute and / or considerable resources, use an Agave app.
+
+For an introduction to the Abaco CLI, please see
+[Abaco CLI Basics](04.abaco_cli.md).
+
+
+
+#### Requirements
+
+The Abaco command line tools needed to create a custom reactor are included with
+the [SD2E CLI](01.install_cli.md). To make sure you have the most current
+version, run:
+```
+% sd2e info --upgrade
+```
+
+The following are also requirements:
+
+1. [Docker CE](https://www.docker.com/community-edition)
+2. [Docker Hub Account](https://hub.docker.com/)
+3. [jq](https://stedolan.github.io/jq/)
+4. getopts (installed by default on macOS High Sierra)
+
+
+
+#### Initialize a new reactor
+
+To initialize a new Abaco reactor in the simplest form, run the following
+command:
+```
+% abaco init -n my-project
+```
+
+After executing this command, you will see a new directory with the project
+(reactor) name. This directory contains all the required files to deploy an
+Abaco reactor:
+```
+my-project/
+├── Dockerfile
+├── TEMPLATE.md
+├── config.yml
+├── message.json
+├── reactor.py
+├── reactor.rc
+├── requirements.txt
+└── secrets.json.sample
+```
+
+To build a functional reactor, modify the files as follows:
+
+
+##### `Dockerfile`
+The only mandatory line is the `FROM` statement, which, by default, is:
+```
+FROM sd2e/reactor-base:python2
+```
+This Docker image comes pre-loaded with the Python libraries that contain the
+reactor functions. If you need to add any additional code, dependencies, etc.,
+to your reactor, this is the place to do so. Other possible base images are
+described on the [base images page](06.base_images.md).
+
+
+##### `TEMPLATE.md`
+This is a short description of the files necessary for creating a reactor.
+
+
+##### `config.yml`
+This file contains configurations to be passed into the reactor. For example,
+if the file has the following contents:
+```
+---
+myVar: value1
+myCategory:
+  var1: value2
+```
+Then you will be able to retrieve values from an [`AttrDict`][] named `settings`
+in your `reactor.py`. For example, the above values could be retrieved as:
+```
+> import reactors as Reactor
+> Reactor.settings.get('myVar')
+value1
+> stuff = Reactor.settings.get('myCategory')
+> stuff.get('var1')
+value2
+```
+
+
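
To illustrate how a nested `config.yml` can end up behind both dictionary-style and attribute-style lookups, here is a minimal sketch. It is not the `reactors` library or the real `AttrDict` package: the class name `SettingsSketch` is hypothetical, and the parsed YAML is written out by hand as a plain dictionary so the example needs no YAML parser.

```python
class SettingsSketch(dict):
    """Hypothetical stand-in for an AttrDict-style settings object."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .get() keep working unchanged.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested mappings so chained attribute access also works.
        return SettingsSketch(value) if isinstance(value, dict) else value


# Hand-written equivalent of the config.yml example shown earlier.
settings = SettingsSketch({"myVar": "value1", "myCategory": {"var1": "value2"}})

print(settings.get("myVar"))               # dictionary-style lookup
print(settings.myCategory.var1)            # attribute-style lookup
print(settings.get("missing", "default"))  # fallback for absent keys
```

Using `.get()` with a default, as in the last line, avoids an exception when an optional setting is left out of the configuration.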
+##### `message.json`
+This is the message template. To activate automatic validation of incoming JSON
+messages to your reactor, add a valid [JSON schema][] (draft 4+) by extending
+the Dockerfile:
+```
+ADD message.json /
+```
+At present, the JSON schema *must* be named `AbacoMessage`.
+
+
+
+##### `reactor.py`
+This file is where the code for your main function can be found. If you need to
+add other python files located in this same directory, extend the `Dockerfile`
+with:
+```
+ADD mycode.py /
+```
+An example of a functional reactor could be:
+```python2
+import reactors as Reactor
+import json
+
+def main():
+    md = Reactor.context.message_dict
+    Reactor.logger.info(json.dumps(md))
+
+if __name__ == '__main__':
+    main()
+```
+This reactor simply logs the message that is sent to it. Examples of more
+interesting reactors with useful functions can be found [here](06.links.md).
+
+
+##### `reactor.rc`
+The `reactor.rc` file contains important deployment configurations. The Abaco
+CLI reads directly from this file when deploying a reactor to the Docker
+registry and to the Abaco API.
+```
+# Reactor mandatory settings
+REACTOR_NAME=my-project
+
+# Reactor optional settings
+# REACTOR_DESCRIPTION=
+# REACTOR_WORKERS=1
+# REACTOR_PRIVILEGED=0
+# REACTOR_STATELESS=0
+# REACTOR_USE_UID=0
+# REACTOR_ALIAS=aka_reactor_demo
+
+# Docker settings
+DOCKER_HUB_ORG=your_docker_registory_uname
+DOCKER_IMAGE_TAG=my-project
+DOCKER_IMAGE_VERSION=0.1
+```
+Ensure the `REACTOR_NAME` is set correctly to `my-project` (or whatever you
+would like to call it), and set `DOCKER_HUB_ORG`, near the bottom, to your
+Docker Hub username.
+
+
+##### `requirements.txt`
+This is a standard Python requirements file, and is empty by default. If you
+have additional Python dependencies beyond those that ship with the Reactors
+base image, add them here and they will be included and built (if possible) at
+`deploy` time or when you manually run `docker build`.
+
+
+##### `secrets.json`
+If you create a file called `secrets.json` (from the example file called
+`secrets.json.sample`) it will never be committed to a git
+repository or included in a Docker image build, but `abaco deploy` uses it to
+populate the default environment variables for the Reactor. Values placed in
+this file are *NOT THAT SECRET* since they can be discovered by other users if
+you choose to share your Reactor.
+
+
+##### Ignore files
+Also in this directory are preconfigured `.dockerignore` and `.gitignore` files
+that are tailored towards preventing you from including sensitive information
+and/or build cruft in the Docker images and git repositories used in creating
+Reactors.
+
+
+#### Deploy your reactor
+
+Before deploying your reactor, first make sure to refresh the Agave token:
+```
+% auth-tokens-refresh
+```
+
+Once you have added the necessary information to the files listed above, you can
+deploy your reactor by executing the `abaco deploy` command from the top level
+of your reactor project directory:
+```
+% abaco deploy
+Sending build context to Docker daemon 11.78kB
+Step 1/1 : FROM sd2e/reactor-base:python2
+# Executing 4 build triggers
+ ---> Using cache
+ ---> Using cache
+ ---> Using cache
+ ---> Using cache
+ ---> 1b7f8366e05f
+Successfully built 1b7f8366e05f
+Successfully tagged username/my-project:0.1
+The push refers to repository [docker.io/username/my-project]
+45080ac9c614: Layer already exists
+8a5132998025: Layer already exists
+...
+aca233ed29c3: Layer already exists
+e5d2f035d7a4: Layer already exists
+0.1: digest: sha256:997c7f59sa5aaaa07f2f4w1dwdbc1f0ar4frf3fbq7qb1d5dcdb0f2f42056d5f1e1a size: 6987
+[INFO] Pausing to let Docker Hub register that the repo has been pushed
+Successfully deployed Actor with ID: wmoxRE7rEqGPQ
+```
+
+This process pushes a Docker image to the location specified in your `reactor.rc`
+file, and it deploys an Abaco reactor object into the API. If all goes well, you
+will find a "Success" message at the end, along with the reactor ID (a
+13-character alphanumeric string). To run this particular reactor, send it a
+message with any contents:
+```
+% export MESSAGE='{ "myVar":"value1" }'
+% abaco submit -m "${MESSAGE}" wmoxRE7rEqGPQ
+kWmayogY84RPA
+{
+  "myVar": "value1"
+}
+% abaco logs wmoxRE7rEqGPQ kWmayogY84RPA
+Logs for execution kWmayogY84RPA:
+[INFO] 2018-05-22T03:06:00Z: Returning a logger set to level: INFO for module: wmoxRE7rEqGPQ
+[INFO] 2018-05-22T03:06:00Z: {"myVar": "value1"}
+```
+
+More detailed descriptions of the various Abaco commands can be found in the
+[Abaco CLI introduction](04.abaco_cli.md).
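
When the message passed to `abaco submit -m` is assembled by a script rather than typed by hand, quoting the JSON correctly for the shell is the step most likely to go wrong. The following is a minimal sketch that only builds and prints the command line; it does not call the Abaco CLI. The helper name `build_submit_command` is hypothetical, and the actor ID is the example ID from the deploy output above.

```python
import json
import shlex


def build_submit_command(actor_id, payload):
    """Return an `abaco submit` command line for the given message payload.

    Hypothetical helper for illustration; it only formats a string.
    """
    message = json.dumps(payload)
    # shlex.quote protects the JSON quotes and braces from the shell.
    return "abaco submit -m {} {}".format(
        shlex.quote(message), shlex.quote(actor_id)
    )


if __name__ == "__main__":
    print(build_submit_command("wmoxRE7rEqGPQ", {"myVar": "value1"}))
```

Serializing with `json.dumps` rather than hand-writing the string helps ensure the reactor receives valid JSON.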
+
+
+---
+Return to the [API Documentation Overview](../index.md)
diff --git a/index.md b/index.md
index 940ab4d..81a4fe8 100644
--- a/index.md
+++ b/index.md
@@ -51,7 +51,7 @@ the SD2E platform. Documentation for getting started with the SD2E API is below.
     3.3 [Modify an Existing Application](docs/03.modify_app.md)
-    3.4 [Continuous Integration with Jenkins](docs/03.apps_ci_jenkins.md) (*work in progress*)
+    3.4 [Continuous Integration with Jenkins](docs/03.apps_ci_jenkins.md)
     3.5 [Old App Building Process](docs/old/03.old_create_app.md) (*deprecated*)
@@ -60,7 +60,7 @@ the SD2E platform. Documentation for getting started with the SD2E API is below.
     4.1 [Abaco CLI Basics](docs/04.abaco_cli.md)
-    4.2 Create a Custom Reactor (*coming soon*)
+    4.2 [Create a Custom Reactor](docs/04.abaco_custom_reactor.md)
     4.3 Continuous Integration with Jenkins (*coming soon*)