
How to use HDFS/Spark Workbench

To start an HDFS/Spark Workbench:

    ./start-hadoop-spark-workbench-with-Hive.sh

This will start the following services:

  • namenode
  • datanode1
  • datanode2
  • hive-metastore-postgresql (the PostgreSQL database backing the Hive metastore)
  • hive-server
  • hive-metastore
  • hue (shut this down if not required)
  • spark-master
  • spark-worker1 to spark-worker4
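To confirm that everything came up, you can list the running containers with the standard compose status command (run from the repository root, next to docker-compose-hive.yml):

    docker-compose -f docker-compose-hive.yml ps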

Starting the workbench for debugging

Run each of the following commands in a separate terminal:

    docker-compose -f docker-compose-hive.yml up namenode hive-metastore-postgresql
    docker-compose -f docker-compose-hive.yml up datanode hive-metastore
    docker-compose -f docker-compose-hive.yml up hive-server
    docker-compose -f docker-compose-hive.yml up spark-master spark-worker1 spark-worker2 spark-worker3 spark-worker4 hue
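To follow the logs of a single service while debugging, or to tear the whole stack down afterwards, the standard compose commands apply (namenode below is just one example service):

    docker-compose -f docker-compose-hive.yml logs -f namenode
    docker-compose -f docker-compose-hive.yml down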

Interfaces

The service web UIs (Namenode, Spark master, Hue, and so on) are exposed on the host at the ports mapped in docker-compose-hive.yml.

Hive test

Load data into Hive:
  $ docker-compose -f docker-compose-hive.yml exec hive-server bash
  # /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000
  > CREATE TABLE pokes (foo INT, bar STRING);
  > LOAD DATA LOCAL INPATH '/opt/hive/examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;
  > SELECT * FROM pokes;
  > DESCRIBE EXTENDED pokes;
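The same check can be scripted from the host without an interactive session, using beeline's -e flag to pass a query directly (a minimal sketch; it assumes the pokes table created above already exists):

    docker-compose -f docker-compose-hive.yml exec hive-server \
      /opt/hive/bin/beeline -u jdbc:hive2://localhost:10000 -e 'SELECT COUNT(*) FROM pokes;'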

Spark test

    make example
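If you prefer to run a Spark job against the cluster by hand rather than through make, a minimal sketch is to submit the bundled SparkPi example from inside the spark-master container. The /spark install path and the spark://spark-master:7077 master URL are assumptions based on the big-data-europe Spark images; adjust them to your setup:

    docker-compose -f docker-compose-hive.yml exec spark-master bash -c \
      '/spark/bin/spark-submit --master spark://spark-master:7077 \
        --class org.apache.spark.examples.SparkPi \
        /spark/examples/jars/spark-examples_*.jar 10'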


To remove the created data, run:

    make clean-example

Maintainer

  • priyanchandrapala at yahoo.co.uk
