Terraform-nomad-presto

Terraform module with example

Releases

Contents

  1. Prerequisites
  2. Compatibility
  3. Usage
    1. Requirements
      1. Required software
    2. Providers
  4. Inputs
  5. Outputs
  6. Examples
  7. Authors
  8. License

Prerequisites

Please follow the prerequisites section in the original template.

Compatibility

| Software  | OSS Version       | Enterprise Version |
|-----------|-------------------|--------------------|
| Terraform | 0.13.1 or newer   |                    |
| Consul    | 1.8.3 or newer    | 1.8.3 or newer     |
| Vault     | 1.5.2.1 or newer  | 1.5.2.1 or newer   |
| Nomad     | 0.12.3 or newer   | 0.12.3 or newer    |
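
The Terraform minimum from the compatibility list can be enforced in the calling configuration. A minimal sketch, assuming you want Terraform itself to reject older versions; the constraint mirrors the "0.13.1 or newer" entry and is not taken from the module's source:

```hcl
# Sketch: fail early when the Terraform CLI is older than the documented minimum.
terraform {
  required_version = ">= 0.13.1"
}
```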

Usage

make up

See the documentation of the terraform-nomad-presto example in the example/ directory.

Requirements

Required software

See the template README's prerequisites.

Local only, for verification and debugging:

  • consul binary available on PATH on the local machine
  • Java 11
  • presto.jar, version 340, in the repository root

Providers

This module uses the Nomad provider.
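
A minimal sketch of configuring the Nomad provider in the calling configuration. It assumes the provider is installed from the public registry as hashicorp/nomad and reuses the default address listed for nomad_provider_address; adjust both for your environment:

```hcl
# Sketch: declare and configure the Nomad provider (Terraform 0.13+ syntax).
terraform {
  required_providers {
    nomad = {
      source = "hashicorp/nomad" # assumption: provider pulled from the public registry
    }
  }
}

provider "nomad" {
  # Same value as the documented default of nomad_provider_address.
  address = "http://127.0.0.1:4646"
}
```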

Inputs

| Name                            | Description                  | Type         | Default                 | Required |
|---------------------------------|------------------------------|--------------|-------------------------|----------|
| nomad_provider_address          | Nomad provider address       | string       | "http://127.0.0.1:4646" | yes      |
| nomad_data_center               | Nomad data centers           | list(string) | ["dc1"]                 | yes      |
| nomad_namespace                 | [Enterprise] Nomad namespace | string       | "default"               | yes      |
| nomad_job_name                  | Nomad job name               | string       | "presto"                | yes      |
| service_name                    | Presto service name          | string       | "presto"                | yes      |
| port                            | Presto http port             | number       | 8080                    | yes      |
| docker_image                    | Presto docker image          | string       | "prestosql/presto:333"  | yes      |
| container_environment_variables | Presto environment variables | list(string) | [""]                    | no       |
| hivemetastore.service_name      | Hive metastore service name  | string       | "hive-metastore"        | yes      |
| hivemetastore.port              | Hive metastore port          | number       | 9083                    | yes      |
| minio.service_name              | minio service name           | string       |                         | yes      |
| minio.port                      | minio port                   | number       |                         | yes      |
| minio.access_key                | minio access key             | string       |                         | yes      |
| minio.secret_key                | minio secret key             | string       |                         | yes      |
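
The dotted names (hivemetastore.service_name, minio.port, and so on) correspond to attributes of the hivemetastore and minio inputs, which the example passes as maps. A minimal sketch of how such inputs could be declared, assuming object types; this is an illustration and not a copy of the module's variables.tf:

```hcl
# Sketch: object-typed inputs matching the dotted names in the table.
variable "hivemetastore" {
  description = "Hive metastore service name and port"
  type = object({
    service_name = string
    port         = number
  })
  default = {
    service_name = "hive-metastore"
    port         = 9083
  }
}

variable "minio" {
  description = "minio service name, port and credentials (no defaults documented)"
  type = object({
    service_name = string
    port         = number
    access_key   = string
    secret_key   = string
  })
}
```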

Outputs

| Name                | Description         | Type   |
|---------------------|---------------------|--------|
| presto_service_name | Presto service name | string |
| presto_port         | Presto port         | number |
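
A minimal sketch of consuming these outputs from the calling configuration. It assumes the module is instantiated as module "presto"; the output name presto_address is hypothetical, and the resulting string is a service name plus port, not necessarily a resolvable host:

```hcl
# Sketch: re-export the module outputs as a single "name:port" string.
output "presto_address" {
  description = "Presto service name and port, joined for convenience"
  value       = "${module.presto.presto_service_name}:${module.presto.presto_port}"
}
```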

Examples

module "presto" {
  depends_on = [
    module.minio,
    module.hive
  ]

  source = "github.com/fredrikhgrelland/terraform-nomad-presto.git?ref=0.0.1"

  nomad_job_name    = "presto"
  nomad_datacenters = ["dc1"]
  nomad_namespace   = "default"

  service_name = "presto"
  port         = 8080
  docker_image = "prestosql/presto:333"

  #hivemetastore
  hivemetastore = {
    service_name = module.hive.service_name
    port         = 9083
  }

  # minio
  minio = {
    service_name = module.minio.minio_service_name
    port         = 9000
    access_key   = module.minio.minio_access_key
    secret_key   = module.minio.minio_secret_key
  }
}

For detailed information, check the example/ directory.

Verifying setup

You can verify a successful run with the following steps:

Option 1 [hive-metastore and nomad]

# from metastore (loopback)
beeline -u jdbc:hive2://
  • Query existing tables (beeline-cli)
SHOW DATABASES;
SHOW TABLES IN <database-name>;
DROP DATABASE <database-name>;
SELECT * FROM <table_name>;

# examples
SHOW TABLES;
SELECT * FROM iris;
SELECT * FROM tweets;

Option 2 [presto and nomad]

presto
  • Query existing tables (presto-cli)
SHOW CATALOGS [ LIKE pattern ]
SHOW SCHEMAS [ FROM catalog ] [ LIKE pattern ]
SHOW TABLES [ FROM schema ] [ LIKE pattern ]

# examples
SHOW CATALOGS;
SHOW SCHEMAS IN hive;
SHOW TABLES IN hive.default;
SELECT * FROM hive.default.iris;

Option 3 [local presto-cli]

NB! Check required software section first.

  • Create a local proxy to the presto instance with the consul binary.
make proxy-presto
  • In another terminal, run a presto-cli session.
presto --server localhost:8080 --catalog hive --schema default --user presto
  • Query tables (3 tables should be available)
show tables;
select * from <table>;

To debug or continue developing, you can use the presto CLI locally. Some useful commands:

# manual table creation for different file types
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/csv_create_table.sql
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/json_create_table.sql
presto --server localhost:8080 --catalog hive --schema default --user presto --file ./example/resources/query/avro_tweets_create_table.sql

Authors

License


References

File types

CSV

CREATE TABLE iris (
  sepal_length varchar,
  sepal_width varchar,
  petal_length varchar,
  petal_width varchar,
  species varchar
)
WITH (
  format = 'CSV',
  external_location='s3a://hive/data/csv/',
  skip_header_line_count=1
);

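NB! The Presto table above declares every column as varchar because the Presto Hive connector only supports varchar columns for the CSV format. Hive itself supports typed columns (for example numeric types) for CSV data, so you can create a typed table for the CSV file format through hive-metastore: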

CREATE EXTERNAL TABLE iris (sepal_length DECIMAL, sepal_width DECIMAL,
petal_length DECIMAL, petal_width DECIMAL, species STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION 's3a://hive/data/csv/'
TBLPROPERTIES ("skip.header.line.count"="1");

JSON

CREATE TABLE somejson (
  description varchar,
  foo ROW (
    bar varchar,
    quux varchar,
    level1 ROW (
      l2string varchar,
      l2struct ROW (
        level3 varchar
      )
    )
  ),
  wibble varchar,
  wobble ARRAY (
    ROW (
      entry int,
      EntryDetails ROW (
        details varchar,
        details2 int
      )
    )
  )
)
WITH (
  format = 'JSON',
  external_location = 's3a://hive/data/json/'
);

AVRO

CREATE TABLE tweets (
  username varchar,
  tweet varchar,
  timestamp bigint
)
WITH (
  format = 'AVRO',
  external_location='s3a://hive/data/avro-tweet/'
);

PROTOBUF

Reference to using-protobuf-parquet

todo
