Merge pull request #59 from MITLibraries/docs
Update docs and Makefile to reflect Fargate deploy
Mike Graves committed Jan 29, 2019
2 parents 0636bca + 884c7d8 commit fbbc80e
Showing 2 changed files with 80 additions and 40 deletions.
15 changes: 10 additions & 5 deletions Makefile
@@ -4,6 +4,7 @@ SHELL=/bin/bash
S3_BUCKET=carbon-deploy
ORACLE_ZIP=instantclient-basiclite-linux.x64-18.3.0.0.0dbru.zip
ECR_REGISTRY=672626379771.dkr.ecr.us-east-1.amazonaws.com
DATETIME:=$(shell date -u +%Y%m%dT%H%M%SZ)

help: ## Print this message
@awk 'BEGIN { FS = ":.*##"; print "Usage: make <target>\n\nTargets:" } \
@@ -23,15 +24,13 @@ wheel:
container:
docker build -t $(ECR_REGISTRY)/carbon-stage:latest \
-t $(ECR_REGISTRY)/carbon-stage:`git describe --always` \
-t $(ECR_REGISTRY)/carbon-prod:latest \
-t $(ECR_REGISTRY)/carbon-prod:`git describe --always` \
-t carbon-stage:latest .
-t carbon .

dist: deps wheel container ## Build docker image
@tput setaf 2
@tput bold
@echo "Finished building docker image. Try running:"
@echo " $$ docker run --rm carbon:latest"
@echo " $$ docker run --rm carbon"
@tput sgr0

clean: ## Remove build artifacts
@@ -55,5 +54,11 @@ publish: ## Push and tag the latest image (use `make dist && make publish`)
$$(aws ecr get-login --no-include-email --region us-east-1)
docker push $(ECR_REGISTRY)/carbon-stage:latest
docker push $(ECR_REGISTRY)/carbon-stage:`git describe --always`

promote: ## Promote the current staging build to production
$$(aws ecr get-login --no-include-email --region us-east-1)
docker pull $(ECR_REGISTRY)/carbon-stage:latest
docker tag $(ECR_REGISTRY)/carbon-stage:latest $(ECR_REGISTRY)/carbon-prod:latest
docker tag $(ECR_REGISTRY)/carbon-stage:latest $(ECR_REGISTRY)/carbon-prod:$(DATETIME)
docker push $(ECR_REGISTRY)/carbon-prod:latest
docker push $(ECR_REGISTRY)/carbon-prod:`git describe --always`
docker push $(ECR_REGISTRY)/carbon-prod:$(DATETIME)
105 changes: 70 additions & 35 deletions README.rst
@@ -1,7 +1,11 @@
carbon
======
Carbon
======

Carbon is a tool for generating a feed of people that can be loaded into Symplectic Elements. It is designed to be run as a container. This document contains general application information. Please refer to the `Terraform module <https://github.com/MITLibraries/mitlib-terraform/tree/master/apps/carbon>`_ for the deployment configuration.

Carbon is a tool for generating a feed of people that can be loaded into Symplectic Elements. It is designed to be run as a container.
.. contents:: Table of Contents
.. section-numbering::

Developing
----------
@@ -14,17 +18,33 @@ Use pipenv to install and manage dependencies::

Connecting to the data warehouse will require installing the ``cx_Oracle`` Python package. The good news is that this is now packaged as a wheel for most architectures, so no extra work is required to install it. If you don't need to actually connect to the data warehouse, you are done. Note that the test suite uses SQLite, so you can develop and test without connecting to the data warehouse.

If you do need to connect to the data warehouse, you will also need to install the Oracle client library. It seems that just installing the basic light package is now sufficient. In general, all you should need to do is extract the package and add the extracted directory to your ``LD_LIBRARY_PATH`` environment variable. If there is no ``libclntsh.so`` (``libclntsh.dylib`` for Mac) symlink in the extracted directory, you will need to create one. The process will look something like this::
If you do need to connect to the data warehouse, you have two options, one using Docker and one without.

Without Docker
^^^^^^^^^^^^^^

To connect without Docker you will need to install the `Oracle client library <https://www.oracle.com/technetwork/database/database-technologies/instant-client/overview/index.html>`_. It seems that just installing the basic light package is now sufficient. In general, all you should need to do is extract the package and add the extracted directory to your ``LD_LIBRARY_PATH`` environment variable. If there is no ``libclntsh.so`` (``libclntsh.dylib`` for Mac) symlink in the extracted directory, you will need to create one. The process will look something like this (adjusting paths/filenames as necessary)::

$ unzip instantclient-basiclite-linux.x64-18.3.0.0.0dbru.zip -d /usr/local/opt

# Add the following line to your .bash_profile (or equivalent) to make it permanent
$ export LD_LIBRARY_PATH=/usr/local/opt/instantclient_18_3:$LD_LIBRARY_PATH

# If the symlink doesn't already exist:
$ ln -rs /usr/local/opt/instantclient_18_3/libclntsh.so.18.1 \
/usr/local/opt/instantclient_18_3/libclntsh.so

On Linux, you will also need to make sure you have libaio installed. It should be available through your system's package manager; the package may be called ``libaio1``.
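
On Debian/Ubuntu, for example, that might look like::

    $ sudo apt-get install libaio1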

With Docker
^^^^^^^^^^^

Connecting with Docker should, in theory, be more straightforward: the idea is to test your changes in a container. As long as you aren't modifying the project dependencies, building the container is quick, so iterating shouldn't be terrible. You will, of course, need a working Docker installation, and you will also need the AWS CLI installed and configured. Your development process using this method looks like:

1. Make your changes.
2. Run ``make dist`` from project root.
3. Test your changes by running ``docker run --rm carbon <carbon args>``, with ``<carbon args>`` being whatever arguments you would normally use to run carbon. See the example below.
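
For example, a quick iteration might look like this (using ``--help`` as a stand-in for your usual arguments)::

    $ make dist
    $ docker run --rm carbon --help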

Building
--------

@@ -37,58 +57,73 @@ The build process downloads this file from S3 so you should have the AWS CLI ins
Deploying
---------

Deployment is currently being handled by Travis. When a PR is merged onto the master branch Travis will build a new container image, tag it both with ``latest`` and with the git short hash, and then push both tags to the ECR registry.
Staging
^^^^^^^

Staging builds are fully automated by Travis. When a PR is merged onto the master branch, Travis builds a new container image, tags it both with ``latest`` and with the git short hash, and pushes both tags to the staging ECR registry. A CloudWatch scheduled event periodically triggers the Fargate task to run; the task uses the latest image from the ECR registry. Lacking a fully automated deployment pipeline, the final step of deploying to production must be done manually (see Production below).

If you need to deploy a new image outside of Travis then do the following::

$ cd carbon
$ make clean
$ make dist && make publish

Production
^^^^^^^^^^

When you are ready to deploy the current staging build to production, simply run::

$ make promote

from the project root. This command effectively copies the latest staging image over to the latest production image. In addition, a new tag with the current ISO 8601-formatted UTC datetime is added to the production image and pushed as well. This is done to make rolling back changes easier: if a rollback is needed, find the datetime tag of the last working production image and retag it as ``latest``. This can be done with the following, replacing ``<datetime>`` with the datetime tag::

$ $(aws ecr get-login --no-include-email --region us-east-1)
$ docker pull 672626379771.dkr.ecr.us-east-1.amazonaws.com/carbon-prod:<datetime>
$ docker tag 672626379771.dkr.ecr.us-east-1.amazonaws.com/carbon-prod:<datetime> \
672626379771.dkr.ecr.us-east-1.amazonaws.com/carbon-prod:latest
$ docker push 672626379771.dkr.ecr.us-east-1.amazonaws.com/carbon-prod:latest

Configuration
^^^^^^^^^^^^^

In order for the Lambda to run, carbon needs a few environment variables set. These can either be set in the environment or passed to the Lambda function through the event JSON object. Variables set using the event object will overwrite those set in the environment.

+-----------+-------------------------------------------------------------+
| Variable | Description |
+===========+=============================================================+
| FTP_USER | FTP user to log in as |
+-----------+-------------------------------------------------------------+
| FTP_PASS | Password for FTP user (see SECRET_ID) |
+-----------+-------------------------------------------------------------+
| FTP_PATH | Name of remote file (with path) on FTP server |
+-----------+-------------------------------------------------------------+
| FTP_HOST | FTP server hostname |
+-----------+-------------------------------------------------------------+
| FTP_PORT | FTP server port |
+-----------+-------------------------------------------------------------+
| CARBON_DB | SQLAlchemy database connection string of the form: |
| | ``oracle://<username>:<password>@<server>:1521/<sid>`` |
| | (see SECRET_ID) |
+-----------+-------------------------------------------------------------+
| SECRET_ID | The ID for an AWS Secrets secret. Use either the Amazon |
| | Resource Name or the friendly name of the secret. See below |
| | for a description of this value. |
+-----------+-------------------------------------------------------------+

The ``FTP_PASS`` and ``CARBON_DB`` env vars should not be set as env vars in the Lambda function. Instead, create an AWS Secrets JSON object with these and set the ID of the secret as the ``SECRET_ID`` env var on the Lambda function. The JSON object should look like::
The Fargate task needs the following arguments passed in at runtime. These are set in the Terraform config.

+-------------+-------------------------------------------------------------+
| Argument | Description |
+=============+=============================================================+
| --ftp | |
+-------------+-------------------------------------------------------------+
| --ftp-host | FTP server hostname |
+-------------+-------------------------------------------------------------+
| --ftp-user | FTP user to log in as |
+-------------+-------------------------------------------------------------+
| --ftp-path | Name of remote file (with path) on FTP server |
+-------------+-------------------------------------------------------------+
| --secret-id | The ID for an AWS Secrets secret. Use either the Amazon |
| | Resource Name or the friendly name of the secret. See below |
| | for a description of this value. |
| | |
+-------------+-------------------------------------------------------------+
| --sns-topic | The ARN for the SNS topic. This is used to send an email |
| | notification. |
+-------------+-------------------------------------------------------------+
| <feed type> | The type of feed to run. This should be either ``people`` |
| | or ``articles``. |
+-------------+-------------------------------------------------------------+
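
For illustration, a full set of runtime arguments might look like this (all values are hypothetical)::

    carbon --ftp \
        --ftp-host ftp.example.com \
        --ftp-user elements \
        --ftp-path /feeds/people.xml \
        --secret-id carbon/connection-info \
        --sns-topic arn:aws:sns:us-east-1:123456789012:carbon-notify \
        people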

The ``--secret-id`` option should point to an AWS Secrets JSON object that looks like::

{
"FTP_PASS": <password>,
"CARBON_DB": <connection_string>
}
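
One way to create such a secret is with the AWS CLI (the secret name here is illustrative)::

    $ aws secretsmanager create-secret --name carbon/connection-info \
        --secret-string '{"FTP_PASS": "<password>", "CARBON_DB": "<connection_string>"}'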

The same Lambda function is used to generate both the HR and the AA feeds. Passing the feed type to the Lambda at runtime determines which feed gets generated. This should be handled by the CloudWatch event that triggers the Lambda execution. The event can be configured to pass a custom JSON object to the Lambda. Use the following JSON, selecting either ``people`` or ``articles`` for the feed you want to generate::

{
"feed_type": <people|articles>
}
``CARBON_DB`` should be an SQLAlchemy database connection string of the form ``oracle://<username>:<password>@<server>:1521/<sid>``.
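
For example (hypothetical values)::

    oracle://carbon:s3cret@warehouse.example.edu:1521/dwrhs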

Usage
-----

While this is intended to be run as a Lambda, the old CLI interface is still supported for ease of testing locally.
The CLI interface works the same whether running locally or as a container. When running as a container, however, remember that an output file (as opposed to stdout) is written to the container's filesystem, not to your host system.
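
If you do need the output file on your host, one workaround is to bind-mount a host directory into the container and have carbon write the file there (a sketch; replace ``<carbon args>`` with arguments that write the output under ``/out``)::

    $ mkdir -p out
    $ docker run --rm -v $(pwd)/out:/out carbon <carbon args>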

View the help menu for the ``carbon`` command::
