OpenStack Apache Spark Terraform Module

Terraform module to deploy Apache Spark on OpenStack, with TensorFlow and NVIDIA support.

Table of contents

  • Prerequisites
  • Configuration
  • Deploy
  • Scale
  • Destroy

Prerequisites

On your workstation you need to:

  • Install Terraform
  • Set up the environment by sourcing the OpenStack RC file for your project
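
As a quick check, and assuming your RC file was saved as openstack-rc.sh (the actual filename varies by project), the setup could look like this:

source openstack-rc.sh # exports OS_AUTH_URL, OS_USERNAME, OS_PASSWORD, etc.
terraform version      # verify that Terraform is installed
env | grep OS_         # verify that the OpenStack credentials are exported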

On your OpenStack project you need to:

  • Import the CoreOS Container-Linux image (instructions here)
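
As a rough sketch, assuming the openstack CLI is installed and you have downloaded and decompressed the CoreOS Container-Linux OpenStack image (the file name and image name below are placeholders), the import could be done with:

openstack image create \
  --disk-format qcow2 \
  --container-format bare \
  --file coreos_production_openstack_image.img \
  "Container-Linux"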

Configuration

Start by creating a directory, changing into it, and creating the main Terraform configuration file:

mkdir deployment
cd deployment
touch main.tf

In main.tf paste and fill in the following configuration:

module "spark" {
  source  = "mcapuccini/spark/openstack"
  # Required variables
  public_key="" # Path to a public SSH key
  external_net_uuid="" # External network UUID
  floating_ip_pool="" # Floating IP pool name
  coreos_image_name="" # Name of a CoreOS Container-Linux image in your project
  master_flavor_name="" # Flavor name to be used for the master node
  worker_flavor_name="" # Flavor name to be user for the worker nodes
  worker_volume_size="" # Worker block storage volume size in GB (used as HDFS data directory)
  workers_count=3 # Number of worker nodes to deploy
}
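
For illustration, a filled-in configuration could look like the following (every value here is hypothetical and must be replaced with the names, flavors and UUIDs from your own OpenStack project):

module "spark" {
  source             = "mcapuccini/spark/openstack"
  public_key         = "~/.ssh/id_rsa.pub"
  external_net_uuid  = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" # hypothetical UUID
  floating_ip_pool   = "ext-net"         # hypothetical pool name
  coreos_image_name  = "Container-Linux" # hypothetical image name
  master_flavor_name = "m1.medium"       # hypothetical flavor
  worker_flavor_name = "m1.large"        # hypothetical flavor
  worker_volume_size = "50"
  workers_count      = 3
}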

Initialize the Terraform working directory by running:

terraform init

Deploy

To deploy the cluster, run:

terraform apply
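
Terraform prints an execution plan and asks for confirmation before creating any resources. To preview the plan without applying it, you can first run:

terraform plan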

Once the deployment is done, you can get the SSH tunnelling commands for the web interfaces by running:

terraform output -module=spark

Scale

To scale the cluster, increase or decrease the number of workers in main.tf and rerun terraform apply.
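
For example, to go from 3 to 5 workers, change workers_count in main.tf:

workers_count = 5 # was 3

Then rerun:

terraform apply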

Destroy

You can delete the cluster by running:

terraform destroy