Skip to content

tarzzz/docker-spark-hive-ipython

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

docker-spark-hive-ipython

Introduction

  • Apache Hive (trunk version)
  • Apache Tez 0.5.2
  • Apache Hadoop 2.5.2
  • PostgreSQL 9.3 (Hive metastore backend)
  • Apach Spark 1.4.1
  • Jupyter

Install Docker

Attention for boot2docker user

The default memory of boot2docker is only 2G, and recommended system requirement is 4G. There may be some unexpection problem if memory is not enough. You can change the default memory setting in following way:

  • vim ~/.boot2docker/profile
  • add Memory = 4096 in this file.
  • Attention The following process will reset your boot2docker, it's mean your images and caontainers will be erased.
  • boot2docker stop
  • boot2docker destroy
  • boot2docker init
  • boot2docker start

Recommended System Requirement

  • CPU 4core
  • RAM 4G up
  • HDD 10G up (4G for Docker images)

Pull The Docker Image

  • docker pull bryanyang0528/docker-spark-hive-ipython

Build The Docker Image

  • Install git first
  • Enter a apropriate directory
  • git clone https://github.com/bryanyang0528/docker-spark-hive-ipython.git
  • cd docker-spark-hive-ipython
  • docker build .
  • docker images confirm the images id
  • docker tag <images id> docker-spark-hive-ipython:latest

Run the Docker Image

  • docker run -d -p 8888:8888 -p 4040:4040 --name pyspark bryanyang0528/docker-spark-hive-ipython

Launch the Jupyter

  • linux: Type the web address http://localhost:8888 in any browser. The link of Spark UI is http://localhost:4040.
  • Mac:(for boot2docker) In your terminal, press boot2docker ip confirm the ip address of boot2docker,the type http://<boot2docker ip>:8888in the browser.

SparkContext(as sc) and SqlContext(as sqlContext) will launch automatically when you open a notebook.

About

Spark + Jupyer + Hive + Livy

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 99.9%
  • Shell 0.1%