Replace secrets.py python modules with environment variables #159

ricardogsilva · 2017-09-01T16:12:43Z

Currently, DAGs retrieve sensitive information from several secrets.py files. This pattern is meant to keep sensitive data out of the git repository (but beware: there are some secrets lying around in the git history for this repo), but it is not very friendly for setting up new environments (dev or otherwise). For example: I'm creating a new dev environment locally and will have to create several of these secrets files with dummy values just to be able to get airflow up and running.

Also, this pattern of retrieving the secrets is not very dev friendly:

from myfile import secrets

var1 = secrets["secret1"]

This forces me to create the secrets.py file AND also create a secrets dictionary AND also create the secret1 key in the dict with some value. Otherwise I cannot get airflow to run. And I have to repeat this for all DAGs even if I'm only interested in working on a single DAG.

A more flexible strategy is proposed in the 12factor app's section on configuration. Basically it is recommended that this information be kept in the environment and not in custom python files.

In this case it would mean that, instead of several secrets.py files each holding a bunch of dictionaries with keys and strings as values, there would be several environment variables, one for each secret variable. Airflow even facilitates using this pattern for things like database connections via its connections feature.
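As a rough illustration of the connections point: Airflow can read a connection from an environment variable named AIRFLOW_CONN_ followed by the upper-cased connection id, with the connection expressed as a URI. The helper below is only a sketch that builds such a name/value pair; the function name, the connection id and the credentials are all made up for the example, and it assumes the Airflow version in use supports this convention.

```python
from urllib.parse import quote


def connection_env_var(conn_id, scheme, user, password, host, port, schema):
    """Build the (name, value) pair for an Airflow connection env variable.

    Hypothetical helper, not part of Airflow. The user and password are
    percent-encoded so that characters like '@' or '/' do not break the URI.
    """
    name = "AIRFLOW_CONN_" + conn_id.upper()
    uri = "{scheme}://{user}:{password}@{host}:{port}/{schema}".format(
        scheme=scheme,
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        schema=schema,
    )
    return name, uri


# Example with dummy local credentials (all values are placeholders)
name, uri = connection_env_var(
    "my_postgres", "postgres", "dev", "p@ss/word", "localhost", 5432, "airflow"
)
print("{}={}".format(name, uri))
```

The printed line has the same NAME=value shape as any other environment variable, so it could be exported alongside the other secrets.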

The pattern of retrieving the secrets can also be made more flexible:

import os

var1 = os.getenv("SECRET1")  # defaults to None if SECRET1 does not exist in environment

# optionally you can specify some sensible default too
# var1 = os.getenv("SECRET1", "some_default_value")

The snippet above allows me to define just the secrets that I want to use, and the code will not blow up (immediately, at least) if the other secrets are not defined.
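If some secrets really are mandatory for a given DAG, a small wrapper around os.environ could make the failure explicit at the point of use instead of passing None along. This is just a sketch; get_secret and MissingSecretError are hypothetical names, not something that exists in this repo.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required environment variable is not set."""


def get_secret(name, default=None, required=False):
    """Return the value of the environment variable `name`.

    Falls back to `default` when the variable is not set. When
    `required` is True and no value is available, raise an error
    that names the missing variable.
    """
    value = os.environ.get(name, default)
    if required and value is None:
        raise MissingSecretError(
            "Environment variable {!r} is not set".format(name)
        )
    return value
```

A DAG would then call get_secret("SECRET1", required=True) only for the secrets it actually needs, so the other DAGs' secrets can stay undefined in a dev environment.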

As for the definition of the environment variables, they can be kept in a single file, which can be specific to each env, for example dev.env, staging.env, production.env. This file can be something like:

# dev.env
SECRET1=my_secret
SECRET2=other_secret

The contents of the file can then be exported to the environment using:

set -o allexport
source dev.env
set +o allexport

Or, if using docker, the docker run command supports an --env-file argument where we can specify the file.
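When neither the shell nor docker is doing the loading (for example a test run started from an IDE), a file in the dev.env format shown above could also be read from Python. The function below is a minimal sketch with a hypothetical name; it only understands plain KEY=VALUE lines and comments, and does not handle quoting or variable expansion the way sourcing the file in a shell would.

```python
import os


def load_env_file(path, override=False):
    """Load KEY=VALUE lines from `path` into os.environ.

    Blank lines, lines starting with '#', and lines without '=' are
    skipped. Variables that are already set are kept unless `override`
    is True. Returns a dict with the effective value of every key
    found in the file.
    """
    loaded = {}
    with open(path) as env_file:
        for raw_line in env_file:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Split on the first '=' only, so values may contain '='
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip()
            if override or key not in os.environ:
                os.environ[key] = value
            loaded[key] = os.environ[key]
    return loaded
```

Keeping override off by default means that values already exported in the environment take precedence over the file, which seems the less surprising behaviour.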

These env files would not usually be kept in the code repository, with the possible exception of the dev file, which might make sense to keep in the repo if it makes dev setup easier and does not contain any truly sensitive information (for example, if it uses only local database credentials).

randomorder · 2017-09-28T15:14:19Z

See #118
We chose to use a different approach for configuration management and implemented it for S1 and S2.
Closing this for now

ricardogsilva mentioned this issue Sep 4, 2017

127 ingest more bands for l8 products #161

Merged

simboss added the Priority: High label Sep 5, 2017

ricardogsilva mentioned this issue Sep 5, 2017

Conventions for DAG names and Configuration Keys #118

Closed

randomorder added the ingestion label Sep 6, 2017

randomorder closed this as completed Sep 28, 2017
