Terraform-nomad-hive

This module is infrastructure as code (IaC) that contains a Nomad job for Hive.

Content

  1. Prerequisites
  2. Requirements
    1. Required modules
    2. Required software
  3. Compatibility
  4. Providers
  5. Usage
    1. Verifying setup
      1. Data example upload
  6. Intentions
  7. Inputs
  8. Outputs
  9. Modes
  10. Examples
  11. Contributors
  12. License
  13. References

Prerequisites

Please follow the corresponding section in the original template.

Requirements

Required modules

| Module | Version |
|--------|---------|
| terraform-nomad-minio | 0.3.0 or newer |
| terraform-nomad-postgres | 0.3.0 or newer |

Required software

Compatibility

| Software | OSS Version | Enterprise Version |
|----------|-------------|--------------------|
| Terraform | 0.13.1 or newer | |
| Consul | 1.8.3 or newer | 1.8.3 or newer |
| Vault | 1.5.2.1 or newer | 1.5.2.1 or newer |
| Nomad | 0.12.3 or newer | 0.12.3 or newer |
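
The Terraform lower bound from the compatibility table can be pinned in a root configuration so that older binaries fail early. This is a minimal sketch: only the Terraform version is constrained, because this README does not list provider versions, and no upper bound is assumed.

```hcl
terraform {
  # Lower bound taken from the compatibility table; adjust if your setup differs.
  required_version = ">= 0.13.1"
}
```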

Providers

Usage

The following command runs Hive from the example/standalone-vault-provided-creds folder.

make up

Verifying setup

You can verify the setup by connecting to Hive through the Nomad UI at localhost:4646. Follow the steps below.

  1. Locate and click the hive-metastore service.
  2. Click the exec button and connect to the metastoreserver task.
  3. Run beeline -u jdbc:hive2:// to connect to hive.
  4. Run SHOW DATABASES; and check that your output looks like this:
OK
+----------------+
| database_name  |
+----------------+
| default        |
+----------------+

Data example upload

Check example/README.md#data-example-upload

Intentions

The module is deployed with a service mesh approach using the consul-connect integration, where service-to-service communication is controlled by intentions. Intentions are required only when Consul ACLs are enabled and default_policy is deny.

In the examples, intentions are created in the Ansible playbook 00_create_intention.yml:

| Intention between | type |
|-------------------|------|
| mc => minio | allow |
| minio-local => minio | allow |
| hive-metastore => postgres | allow |

⚠️ Note that these intentions need to be created if you use this module from another module and Consul ACLs are enabled with default policy deny.
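
If you consume this module from your own Terraform code instead of running the Ansible playbook, the same three intentions could be declared with the Consul provider's consul_intention resource. This is a sketch under assumptions, not the method used by the examples: it assumes the hashicorp/consul provider is already configured with a token allowed to write intentions, and the local value name and resource label are hypothetical.

```hcl
locals {
  # Source/destination pairs copied from the intentions listed above.
  hive_intentions = {
    mc_to_minio          = { source = "mc", destination = "minio" }
    minio_local_to_minio = { source = "minio-local", destination = "minio" }
    hive_to_postgres     = { source = "hive-metastore", destination = "postgres" }
  }
}

# One "allow" intention per pair.
resource "consul_intention" "hive_stack" {
  for_each = local.hive_intentions

  source_name      = each.value.source
  destination_name = each.value.destination
  action           = "allow"
}
```

If you rename the services (for example through hive_service_name), the pairs in the map have to be changed to match.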

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| nomad_datacenters | Nomad data centers | list(string) | ["dc1"] | no |
| nomad_namespace | [Enterprise] Nomad namespace | string | "default" | no |
| local_docker_image | Switch for nomad job | bool | - | yes |
| use_canary | Uses canary deployment for Hive | bool | false | no |
| hive_service_name | Hive service name | string | "hive-metastore" | no |
| hive_container_port | Hive container port | number | 9083 | no |
| hive_docker_image | Hive container image | string | "fredrikhgrelland/hive:3.1.0" | no |
| hive_container_environment_variables | Hive environment variables | list(string) | [""] | no |
| resource | Resource allocations for cpu and memory | obj(number, number) | { cpu = 500, memory = 1024 } | no |
| resource_proxy | Resource allocations for proxy | obj(number, number) | { cpu = 200, memory = 128 } | no |
| hive_bucket | Hive requires minio buckets | obj(string, string) | { default = string, hive = string } | no |
| minio_service | Minio data-object contains service_name, port, access_key and secret_key | obj(string, number, string, string) | - | no |
| minio_vault_secret | Minio data-object contains vault related information to fetch credentials | obj(bool, string, string, string, string) | { use_vault_provider = false, vault_kv_policy_name = "kv-secret", vault_kv_path = "secret/path/to/minio/creds", vault_kv_field_access_key = "access_key", vault_kv_field_secret_key = "secret_key" } | no |
| postgres_service | Postgres data-object contains service_name, port, database_name, username and password | obj(string, number, string, string, string) | - | no |
| postgres_vault_secret | Postgres data-object contains vault related information to fetch credentials | obj(bool, string, string, string, string) | { use_vault_provider = false, vault_kv_policy_name = "kv-secret", vault_kv_path = "secret/path/to/postgres/creds", vault_kv_field_username = "username", vault_kv_field_password = "password" } | no |
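
The obj(...) notation in the Type column is shorthand, not Terraform syntax. As an illustration of how it maps to a Terraform object type, the minio_vault_secret input could be declared roughly as below. The field names, order, and defaults come from the table; the exact declaration in the module's own variables file is an assumption and may differ.

```hcl
# Sketch only: a plausible declaration for the minio_vault_secret input.
variable "minio_vault_secret" {
  description = "Minio data-object contains vault related information to fetch credentials"

  type = object({
    use_vault_provider        = bool
    vault_kv_policy_name      = string
    vault_kv_path             = string
    vault_kv_field_access_key = string
    vault_kv_field_secret_key = string
  })

  default = {
    use_vault_provider        = false
    vault_kv_policy_name      = "kv-secret"
    vault_kv_path             = "secret/path/to/minio/creds"
    vault_kv_field_access_key = "access_key"
    vault_kv_field_secret_key = "secret_key"
  }
}
```

With Terraform 0.13 object types, a caller that overrides such an input has to pass every attribute, which is why the example module blocks set all five fields even when use_vault_provider is false.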

Outputs

| Name | Description | Type |
|------|-------------|------|
| service_name | Hive service name | string |
| buckets | Minio buckets for hive | string |
| port | Hive service port | number |
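
The outputs can be re-exported from a calling configuration or passed on to other modules. The sketch below assumes the module is instantiated as module "hive", as in the Examples section; the output names on the left-hand side are hypothetical.

```hcl
# Re-export the Hive metastore service name and port from the calling configuration.
output "hive_metastore_service_name" {
  description = "Consul service name of the Hive metastore"
  value       = module.hive.service_name
}

output "hive_metastore_port" {
  description = "Port of the Hive metastore service"
  value       = module.hive.port
}
```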

Modes

Hive can be run in two modes:

NB! The current implementation supports only hivemetastore.

Examples

The example folder contains examples of module usage; refer to it for more details.

The example code below shows the minimum you need to do to set up this module.

module "minio" {
  source = "github.com/skatteetaten/terraform-nomad-minio.git?ref=0.4.0"

  # nomad
  nomad_datacenters = ["dc1"]
  nomad_namespace   = "default"
  nomad_host_volume = "persistence-minio"

  # minio
  service_name    = "minio"
  host            = "127.0.0.1"
  port            = 9000
  container_image = "minio/minio:latest" # todo: avoid using tag latest in future releases

  # user provided  credentials
  access_key = "minio"
  secret_key = "minio123"
  vault_secret = {
    use_vault_provider          = false,
    vault_kv_policy_name        = "",
    vault_kv_path               = "",
    vault_kv_field_access_key   = "",
    vault_kv_field_secret_key   = ""
  }

  data_dir                        = "/minio/data"
  buckets                         = ["default", "hive"]
  container_environment_variables = ["JUST_EXAMPLE_VAR1=some-value", "ANOTHER_EXAMPLE2=some-other-value"]
  use_host_volume                 = true
  use_canary                      = true

  # mc
  mc_service_name                    = "mc"
  mc_container_image                 = "minio/mc:latest" # todo: avoid using tag latest in future releases
  mc_container_environment_variables = ["JUST_EXAMPLE_VAR3=some-value", "ANOTHER_EXAMPLE4=some-other-value"]
}

module "postgres" {
  source = "github.com/skatteetaten/terraform-nomad-postgres.git?ref=0.4.0"

  # nomad
  nomad_datacenters = ["dc1"]
  nomad_namespace   = "default"
  nomad_host_volume = "persistence-postgres"

  # postgres
  service_name    = "postgres"
  container_image = "postgres:12-alpine"
  container_port  = 5432
  vault_secret = {
    use_vault_provider      = false,
    vault_kv_policy_name    = "",
    vault_kv_path           = "",
    vault_kv_field_username = "",
    vault_kv_field_password = ""
  }
  admin_user                      = "hive"
  admin_password                  = "hive"
  database                        = "metastore"
  volume_destination              = "/var/lib/postgresql/data"
  use_host_volume                 = true
  use_canary                      = true
  container_environment_variables = ["PGDATA=/var/lib/postgresql/data/"]
}

module "hive" {
  source = "../.."

  # nomad
  nomad_datacenters  = ["dc1"]
  nomad_namespace    = "default"
  local_docker_image = false

  # hive
  use_canary                           = true
  hive_service_name                    = "hive-metastore"
  hive_container_port                  = 9083
  hive_docker_image                    = "fredrikhgrelland/hive:3.1.0"
  hive_container_environment_variables = ["SOME_EXAMPLE=example-value"]
  
  resource = {
    cpu    = 500,
    memory = 1024
  }
  resource_proxy = {
    cpu    = 200,
    memory = 128
  }

  # hive - minio
  hive_bucket = {
    default = "default",
    hive    = "hive"
  }
  minio_service = {
    service_name = module.minio.minio_service_name,
    port         = module.minio.minio_port,
    access_key   = module.minio.minio_access_key,
    secret_key   = module.minio.minio_secret_key,
  }

  # hive - postgres
  postgres_service = {
    service_name  = module.postgres.service_name
    port          = module.postgres.port
    database_name = module.postgres.database_name
    username      = module.postgres.username
    password      = module.postgres.password
  }

  depends_on = [
    module.minio,
    module.postgres
  ]
}

Contributors

License

This work is licensed under the Apache 2 License. See LICENSE for full details.

References