Skip to content

brentpayne/apache-drill-with-S3-docker

Repository files navigation

apache-drill-with-S3-docker

An apache drill setup in docker primed for using with S3.

This repo and walkthrough relies heavily on information found or copied from these sources:

Setup

Initial Repo setup

git clone https://github.com/brentpayne/apache-drill-with-S3.git
cd apache-drill-with-S3

Insert your project's info into these files

s3.sys.drill S3 drill storage definition

  • replace YOUR_S3_BUCKET with your S3 bucket's repo.
  • [OPTIONAL] it is always good to have you own workspace, so create a data folder in your S3 bucket or change the data workspace with the key-prefix/folder for your data.

core-site.xml Apache Drill configuration file that sets up AWS credentials. NOTE: the max S3 connections is set pretty high here.

  • Insert AWS credentials that have access to your S3 bucket for PLACE_YOUR_ACCESS_KEY_HERE and PLACE_YOUR_SECRET_KEY_HERE_IT_IS_LONGER_THAN_THE_ACCESS_KEY

Run drill with S3 connections

If you don't already have docker + docker-machine install skip to the reference section

To run the development docker with interative debugging

docker-machine start default
eval "$(docker-machine env default)"
docker build -t brentpayne/drillS3 .
docker run -it -p 8047:8047 brentpayne/drillS3 bash
sh /etc/bootstrap.sh

You can now run queries via the command line in the docker or on the apache drill webpage. You can broswe to the drill webpage via the virtual box's IP on port 8047.

Find the IP that your docker VM :

docker-machine ip default

Assume this is 192.168.99.100 . Then you can browse to 192.168.99.100:8047

Reference Section

Docker setup

This sections draws a lot from the Docker documents. You might be best suited by using it. The instruction below are for using docker-machine on Macs..

Download and setup docker and docker-machine.

Docker Setup and Development walkthrough

Install Docker

We need both docker and docker-machine installed.

Best to just install the Docker Toolbox. I don't use Kitematic, but it could be awesome:

https://www.docker.com/docker-toolbox

For reference:

Docker Toolbox Mac Installation instructions/tutorial: https://www.docker.com/docker-toolbox
Docker engine install page: https://docs.docker.com/engine/installation/mac/
Docker machine manual install: https://docs.docker.com/machine/install-machine/

I forget if you have to create the default docker VM or not. If you do, give it a enough memory and disk space.

docker-machine create -d virtualbox --virtualbox-memory 4096 --virtualbox-disk-size 20000 default

If not, you'll most likely need to increase these in your docker virtual machine: http://stackoverflow.com/questions/32834082/how-to-increase-docker-machine-memory-mac

emacs ~/.docker/machine/machines/default/config.json
docker-machine restart default

_ no comments necessary about how vim is better than emacs, just use what you want _

Running Development Docker

Startup default docker vm. I often do this after changing network, for example going from home to work. Otherwise, it seems the VM network is no longer properly connected.

docker-machine restart default

Setup docker environment variable to the default vm

eval "$(docker-machine env default)"

Build development Docker image, could take a while for first build

docker build -t brentpayne/drillS3 .

Breaking the above command down. brentpayne is my dockhub username, -t brentpayne/drillS3 specifies the docker image's tag. . is where the Dockerfile exists. The Dockerfile describes how to build a docker image.

For reference, this is the end of a successful build

...
 ---> 56dfe8a937cb
Step 16 : ADD s3.sys.drill /drill-storage/sys.storage_plugins/s3.sys.drill
 ---> Using cache
 ---> f8e89fde2185
Step 17 : CMD bash /etc/bootstrap.sh
 ---> Running in 09cd493bf0e3
 ---> 866324821d75
Removing intermediate container 09cd493bf0e3
Successfully built 866324821d75

This command lists current docker images

docker images

Now we can start of a container instance and connect to a bash terminal.

docker run -it -p 8047:8047 brentpayne/drillS3 bash
bash /etc/bootstrap.sh

Breaking this command down, the -it options are needed for an interactive terminal connection. -p 8047:8047 sets the docker VM's port 8047 to forward to the container's port 8047.

About

An apache drill setup in docker primed for using with S3

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages