gmouchakis edited this page Jun 13, 2016 · 5 revisions
Supported versions: 2.0.0
Current responsible(s): Yiannis Mouchakis @ NCSR-D
Docker image(s): bde2020/hive
More info

Short description

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.

Example usage

The Docker container for Apache Hive is based on the Hadoop container, so check there for Hadoop configurations. This container deploys Hive and starts hiveserver2 on port 10000. By default, metastore_db is located at /hive-metastore. All Hive configuration files are located in the conf directory.

First, clone the docker-hive repository.

To build docker-hive, go into the docker-hive directory and run:

docker build -t hive .

To run it, first deploy Hadoop. Then start hiveserver2 by running:

 docker run --name hive --net=hadoop -p 10000:10000 -p 10002:10002 -v <path/to/metastore_db/location>:/hive-metastore --env-file=./hadoop.env hive
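
The host side of the -v option has to be an absolute path, because Docker interprets a non-absolute source as a named volume rather than a directory. The following is a minimal sketch of a wrapper that creates the metastore directory and resolves it before calling the same docker run command. The function name start_hive and the default directory ./metastore_db are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: start_hive is a hypothetical helper, not part of docker-hive.
# It runs the docker command shown above with a resolved metastore path.
start_hive() {
    metastore_dir=${1:-./metastore_db}   # assumed default host location

    # Create the host directory if it does not exist yet.
    mkdir -p "$metastore_dir" || return 1

    # Resolve to an absolute path so Docker treats it as a bind mount.
    abs_dir=$(cd "$metastore_dir" && pwd) || return 1

    docker run --name hive --net=hadoop \
        -p 10000:10000 -p 10002:10002 \
        -v "$abs_dir":/hive-metastore \
        --env-file=./hadoop.env hive
}

# Usage (requires the hive image, the hadoop network and ./hadoop.env):
#   start_hive "$HOME/hive/metastore_db"
```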

Then you can access hiveserver2 at localhost:10000 and the hiveserver2 UI at localhost:10002.
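
To check the connection, one option is Beeline, the JDBC command line client that ships with Hive. The sketch below only builds the JDBC URL for the port published above. The helper name hive_jdbc_url is hypothetical, and the usage line assumes that beeline is installed on the machine you connect from.

```shell
#!/bin/sh
# Sketch only: hive_jdbc_url is a hypothetical helper.
# Arguments: host (default localhost), port (default 10000),
# database (default "default", the database Hive creates initially).
hive_jdbc_url() {
    host=${1:-localhost}
    port=${2:-10000}
    db=${3:-default}
    printf 'jdbc:hive2://%s:%s/%s\n' "$host" "$port" "$db"
}

# Usage, assuming beeline is available on the PATH:
#   beeline -u "$(hive_jdbc_url)" -e 'SHOW DATABASES;'
```

Any other JDBC client can use the same URL with the Hive JDBC driver.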

Deploy with docker compose

You can also deploy Hive together with Hadoop using Docker Compose. This sets up a Hadoop cluster with 3 datanodes and Hive with hiveserver2 running. All data are stored in ./data.

To do so, first create the hadoop network:

 docker network create hadoop

Then deploy the cluster with:

 docker-compose up
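
Running docker network create a second time fails because the network already exists, so a repeatable deployment may want to check first. The following is a minimal sketch of such a check. The function name ensure_network is hypothetical.

```shell
#!/bin/sh
# Sketch only: ensure_network is a hypothetical helper.
# It creates the Docker network only if "docker network inspect"
# reports that it does not exist yet.
ensure_network() {
    name=${1:-hadoop}
    docker network inspect "$name" >/dev/null 2>&1 ||
        docker network create "$name"
}

# Usage (run from the directory that contains docker-compose.yml):
#   ensure_network hadoop && docker-compose up
```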


To scale up Hive, you must add more Hadoop nodes. For more information on how to add more nodes, see the Hadoop container's documentation. You can also edit the docker-compose.yml file and add more nodes there.
