Latest commit 4e55111 Dec 15, 2018

Airfield

Airfield is an open source tool for the DC/OS ecosystem that enables teams to easily collaborate with shared Zeppelin instances.

The application consists of a microservice written in Flask and a user interface written in Vue. It was developed and is maintained by MaibornWolff.

Deployment

Requirements

  • DC/OS 1.11 or later
  • Marathon-LB
  • A wildcard DNS entry pointing at the load balancer. Each Zeppelin instance is made available under a random name as a subdomain of your wildcard domain. The examples below use *.zeppelin.mycorp.
  • A key-value store to hold the list of existing Zeppelin instances. Consul and etcd are currently supported. If you have neither installed, we recommend our Consul package.
  • Enough available resources to run both Airfield and one Zeppelin instance (minimum: 3 cores, 10GB RAM).
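
To illustrate the wildcard DNS requirement, the following minimal sketch shows how a random instance name could be combined with the base host. The function name `instance_hostname`, the name length, and the character set are assumptions made for illustration; Airfield's actual naming scheme is not described here.

```python
import secrets
import string


def instance_hostname(base_host: str, name_length: int = 8) -> str:
    """Return '<randomname><base_host>' for a base host such as '.zeppelin.mycorp'.

    Hypothetical helper: the length and alphabet are illustrative choices,
    not Airfield's real naming rules.
    """
    if not base_host.startswith("."):
        raise ValueError("base host is expected to start with a dot, e.g. '.zeppelin.mycorp'")
    alphabet = string.ascii_lowercase + string.digits
    # Start with a letter to keep the DNS label conservative (an assumption).
    first = secrets.choice(string.ascii_lowercase)
    rest = "".join(secrets.choice(alphabet) for _ in range(name_length - 1))
    return first + rest + base_host


if __name__ == "__main__":
    # Prints something like 'k3v9q0ab.zeppelin.mycorp'; the name differs on every run.
    print(instance_hostname(".zeppelin.mycorp"))
```

Any hostname produced this way resolves to the load balancer only because of the wildcard entry, which is why the wildcard record is listed as a requirement.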

Airfield requires access to the Marathon API to manage Zeppelin instances. If you are running DC/OS Enterprise, you need to create a service account for Airfield:

dcos security org service-accounts keypair private-key.pem public-key.pem
dcos security org service-accounts create -p public-key.pem -d "Airfield service account" airfield-principal
dcos security secrets create-sa-secret --strict private-key.pem airfield-principal airfield/account-secret
dcos security org groups add_user superusers airfield-principal

Package / Universe

Airfield is available in the DC/OS Universe.

First create a file options.json. For DC/OS EE clusters you need at least the following (change values to fit your cluster):

{
  "service": {
    "marathon_lb_vhost": "airfield.mycorp",
    "service_account_secret": "airfield/account-secret"
  },
  "airfield": {
    "marathon_lb_base_host": ".zeppelin.mycorp",
    "consul_endpoint": "http://api.aconsul.l4lb.thisdcos.directory:8500/v1"
  }
}

For DC/OS Open Source you need at least the following (change values to fit your cluster):

{
  "service": {
    "marathon_lb_vhost": "airfield.mycorp"
  },
  "airfield": {
    "marathon_lb_base_host": ".zeppelin.mycorp",
    "consul_endpoint": "http://api.aconsul.l4lb.thisdcos.directory:8500/v1",
    "dcos_base_url": "http://leader.mesos"
  }
}
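
The two variants differ only in the service account secret (Enterprise) and the `dcos_base_url` entry (Open Source). As a rough sketch, a small script can generate either variant. The helper name `build_options` and its parameters are hypothetical; the keys and example values are the ones shown above.

```python
import json


def build_options(vhost, base_host, consul_endpoint, enterprise,
                  secret="airfield/account-secret"):
    """Build the minimal options for DC/OS Enterprise or DC/OS Open Source.

    Hypothetical helper for illustration; it only reproduces the two
    minimal configurations shown in this README.
    """
    service = {"marathon_lb_vhost": vhost}
    airfield = {
        "marathon_lb_base_host": base_host,
        "consul_endpoint": consul_endpoint,
    }
    if enterprise:
        # Enterprise clusters authenticate with the service account secret.
        service["service_account_secret"] = secret
    else:
        # Open Source clusters point Airfield at the cluster directly.
        airfield["dcos_base_url"] = "http://leader.mesos"
    return {"service": service, "airfield": airfield}


if __name__ == "__main__":
    options = build_options(
        vhost="airfield.mycorp",
        base_host=".zeppelin.mycorp",
        consul_endpoint="http://api.aconsul.l4lb.thisdcos.directory:8500/v1",
        enterprise=False,
    )
    # Redirect the output to options.json to use it with the package install.
    print(json.dumps(options, indent=2))
```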

The following config parameters are optional:

  • service.virtual_network_enabled and service.virtual_network_name if you want to run Airfield in a virtual network
  • airfield.etcd_endpoint if you want to use etcd instead of Consul
  • airfield.app_group if you want Airfield to put the Zeppelin instances into a different Marathon app group
  • airfield.config_base_key if you want Airfield to use a different key prefix for Consul/etcd

Then install Airfield with the following command:

dcos package install airfield --options=options.json

Wait for the installation to finish, then access Airfield via the vhost you provided (airfield.mycorp in the example).

Standalone Marathon App

We provide a Marathon app definition for easy deployment.

The following settings need to be specified (see TODOs in the app definition):

  • AIRFIELD_BASE_HOST: Base DNS name to use for Zeppelin instances (make sure its wildcard entry points to your load balancer). Example: if you set it to .zeppelin.mycorp, a Zeppelin instance will be reachable via <randomname>.zeppelin.mycorp.
  • Either AIRFIELD_CONSUL_ENDPOINT: HTTP v1 endpoint of your Consul instance (for example http://consul.marathon.l4lb.thisdcos.directory:8500/v1)
  • or AIRFIELD_ETCD_ENDPOINT: host:port of your etcd instance (for example etcd.marathon.l4lb.thisdcos.directory:2379).
  • If running DC/OS EE: DCOS_SERVICE_ACCOUNT_CREDENTIAL: authorizes Marathon access with the service account. Change it if you used a different secret.
  • If running DC/OS Open Source: DCOS_BASE_URL. Set it to http://leader.mesos
  • Label HAPROXY_0_VHOST: Hostname under which you want Airfield to be reachable (for example airfield.mycorp).
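
Before deploying, it can help to check that the required variables are present. The following is a minimal sketch, not part of Airfield: the function name `missing_settings` is hypothetical, and it assumes that at least one variable of each either/or pair must be set.

```python
import os
from typing import List, Mapping


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in the given environment mapping.

    Hypothetical pre-flight check based on the settings listed above.
    """
    problems = []
    if not env.get("AIRFIELD_BASE_HOST"):
        problems.append("AIRFIELD_BASE_HOST is not set")
    # One key-value store endpoint is needed: Consul or etcd.
    if not (env.get("AIRFIELD_CONSUL_ENDPOINT") or env.get("AIRFIELD_ETCD_ENDPOINT")):
        problems.append("set AIRFIELD_CONSUL_ENDPOINT or AIRFIELD_ETCD_ENDPOINT")
    # DC/OS EE uses the service account credential, Open Source the base URL.
    if not (env.get("DCOS_SERVICE_ACCOUNT_CREDENTIAL") or env.get("DCOS_BASE_URL")):
        problems.append("set DCOS_SERVICE_ACCOUNT_CREDENTIAL (EE) or DCOS_BASE_URL (Open Source)")
    return problems


if __name__ == "__main__":
    for problem in missing_settings(os.environ):
        print(problem)
```

The HAPROXY_0_VHOST label is not covered because it is a Marathon label in the app definition rather than an environment variable.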

There are a number of optional settings for Airfield that you can set using environment variables (see the config file for a complete list):

  • Airfield puts all Zeppelin instances into the Marathon app group airfield-zeppelin by default. Set AIRFIELD_MARATHON_APP_GROUP to override it. Set it to an empty string to make Airfield deploy all instances at the root level.
  • By default, all metadata is stored in etcd/Consul under the prefix airfield. You can override it by setting AIRFIELD_CONFIG_BASE_KEY.
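
The sketch below shows one way these defaults could be resolved and how the key prefix might map onto a Consul KV URL. It is an illustration under stated assumptions, not Airfield's config code: the function names and the key `instances` are hypothetical, while `/v1/kv/<key>` is Consul's standard KV HTTP path.

```python
from typing import Mapping


def optional_settings(env: Mapping[str, str]) -> dict:
    """Resolve the two optional settings with the defaults named above."""
    return {
        # A default passed to get() (rather than `or`) keeps an explicit
        # empty string, which means "deploy at the root level".
        "app_group": env.get("AIRFIELD_MARATHON_APP_GROUP", "airfield-zeppelin"),
        "config_base_key": env.get("AIRFIELD_CONFIG_BASE_KEY", "airfield"),
    }


def consul_kv_url(consul_endpoint: str, base_key: str, key: str) -> str:
    """Join a Consul v1 endpoint, the key prefix, and a key into a KV URL."""
    return "{}/kv/{}/{}".format(
        consul_endpoint.rstrip("/"), base_key.strip("/"), key.lstrip("/")
    )


if __name__ == "__main__":
    settings = optional_settings({})
    # 'instances' is a made-up key name used only for this example.
    print(consul_kv_url("http://localhost:8500/v1", settings["config_base_key"], "instances"))
```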

Once you have configured the desired settings, you can deploy the application with the DC/OS CLI:

dcos marathon app add marathon-deployment.json

Usage

Airfield has a simple user interface that lets you interact with existing Zeppelin instances or create new instances with custom options.

Create new Zeppelin Instance

Click the 'Add Instance' button on the main screen to reach the screen shown below.

[Screenshot: Airfield New Instance Screen]

Select the desired instance type to load its default configuration. You can edit the general settings and the Spark configuration, and specify additional packages to install.

Interact with a running Zeppelin Instance

[Screenshot: Airfield Main Screen]

Airfield lists all existing instances on the main screen. For each instance, you can start, stop, restart or delete it, and its URL is shown.

Further Development

Development Environment with docker-compose

This script uses docker-compose to set up a local development environment with Consul and Keycloak (OIDC) pre-configured. The default values of the environment variables are configured to use these endpoints.

You will still need a running DC/OS cluster to deploy your application for testing.

Local Backend

You need Python >= 3.5 and an installed, configured dcos-cli (Airfield uses the CLI to get your cluster URL and an authentication token).

cd airfield-microservice

# Optional: use virtualenv
mkvirtualenv airfield --python=/usr/bin/python3

# Install dependencies
pip install -r requirements.txt

# Set flask app location and debug mode
export FLASK_APP=app
export FLASK_ENV=development

# Set additional environment variables - see config.py
# Run locally for development
AIRFIELD_CONSUL_ENDPOINT=http://localhost:8500/v1 AIRFIELD_BASE_HOST=example.com flask run
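
The backend relies on the dcos-cli for the cluster URL and an authentication token. The following sketch shows one way to read both values by calling `dcos config show`. It is not Airfield's actual implementation: the function names are hypothetical, and the `run` parameter exists only so the lookup can be tested without a cluster.

```python
import subprocess
from typing import Callable, List, Tuple


def _run_dcos(args: List[str]) -> str:
    """Run the dcos CLI with the given arguments and return its trimmed output."""
    return subprocess.check_output(["dcos"] + list(args)).decode("utf-8").strip()


def cluster_credentials(run: Callable[[List[str]], str] = _run_dcos) -> Tuple[str, str]:
    """Return (cluster URL, auth token) as configured in the dcos CLI.

    Hypothetical helper; it assumes the CLI is attached to a cluster and
    that you are logged in, otherwise the token lookup fails.
    """
    url = run(["config", "show", "core.dcos_url"])
    token = run(["config", "show", "core.dcos_acs_token"])
    return url.rstrip("/"), token
```

If the token lookup fails, logging in again with `dcos auth login` usually refreshes it.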

Local Frontend

Install the latest version of Node.js.

cd airfield-frontend

# Install dependencies
npm i

# Run locally for development with mock server
npm run dev

# Build for production
npm run build

# Run ESLint on source files
npm run lint

Roadmap

The current release contains all basic functionality needed to collaborate with shared Zeppelin instances. The list below shows additions that will probably be included in a future release. Of course, we can't give any guarantees :-)

  • Securing the application with OIDC
  • Usability improvements (only show creatable instances, allow adding GPUs to the instance, etc.)
  • Adding notebook templates to be created automatically on instance start
  • Deployment as DC/OS package
  • Build PR for DC/OS universe
  • Check available resources in the cluster before trying to start a notebook, to prevent instances from getting stuck in staging
  • Allow integration with dynamic scaling of the DC/OS cluster
  • Protect Zeppelin instances with a username and password