This repository has been archived by the owner on Sep 20, 2022. It is now read-only.

Commit
Fixed userguide for Docker/Spark entry
myui committed May 18, 2017
1 parent 68f6b46 commit 10e7d45
Showing 3 changed files with 22 additions and 3 deletions.
19 changes: 19 additions & 0 deletions docs/gitbook/docker/getting_started.md
@@ -39,6 +39,9 @@ This page introduces how to run Hivemall on Docker.

`docker build -f resources/docker/Dockerfile .`

> #### Note
> You can [skip](./getting_started.html#running-pre-built-docker-image-in-dockerhub) building images by using existing Docker images.
# 2. Run container

## Run by docker-compose
@@ -52,11 +55,27 @@ This page introduces how to run Hivemall on Docker.
2. Run `docker run -it ${docker_image_id}`.
Refer to the [Docker run reference](https://docs.docker.com/engine/reference/run/) for details on the command.
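The two steps above can be sketched as a single shell snippet. This is a minimal sketch, not part of the repository: `image_id` is a placeholder variable, and the guard makes the snippet a safe no-op on machines without Docker or without the repository checked out.

```shell
# Build the image and start an interactive container from it.
# Guarded: only runs when Docker and the Dockerfile are actually present.
if command -v docker >/dev/null 2>&1 && [ -f resources/docker/Dockerfile ]; then
  # -q makes `docker build` print only the final image ID, which we
  # capture so we don't have to copy it from the build output by hand.
  image_id=$(docker build -q -f resources/docker/Dockerfile .)
  docker run -it "${image_id}"
fi
```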

## Running pre-built Docker image in Dockerhub

1. Check [the latest tag](https://hub.docker.com/r/hivemall/latest/tags/) first.
2. Pull a pre-built Docker image from Docker Hub: `docker pull hivemall/latest:20170517`
3. `docker run -p 8088:8088 -p 50070:50070 -p 19888:19888 -it hivemall/latest:20170517`

You can find pre-built Hivemall Docker images in [this repository](https://hub.docker.com/r/hivemall/latest/).
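The pull-and-run steps above can be sketched as follows. This is a hedged sketch: the tag `20170517` is the example from the time of writing (check Docker Hub for the latest), and the `RUN_DOCKER` opt-in variable is a convenience guard added here, not part of the documented workflow.

```shell
# Tag is an example; check https://hub.docker.com/r/hivemall/latest/tags/
# for the current one before pulling.
tag="20170517"
image="hivemall/latest:${tag}"

# Publish YARN (8088), HDFS NameNode (50070), and MR job history (19888)
# ports so the Hadoop web UIs are reachable from the host.
ports="-p 8088:8088 -p 50070:50070 -p 19888:19888"

# Opt-in guard: set RUN_DOCKER=1 to actually pull and run the image.
if [ "${RUN_DOCKER:-0}" = "1" ] && command -v docker >/dev/null 2>&1; then
  docker pull "${image}"
  docker run ${ports} -it "${image}"
fi
```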

# 3. Run Hivemall on Docker

1. Type `hive` to start the Hive CLI (`.hiverc` automatically loads the Hivemall functions)
2. Try your Hivemall queries!
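As a minimal smoke test, you can check that the Hivemall functions were loaded by querying the library version. This is a sketch assuming the container's `hive` CLI is on the `PATH`; the guard makes it a no-op elsewhere. `hivemall_version()` is a Hivemall UDF that returns the installed library version.

```shell
query='SELECT hivemall_version();'

# Run a one-off query without entering the interactive shell.
# Guarded: only runs where the Hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -e "${query}"
fi
```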

## Accessing Hadoop management GUIs

* YARN http://localhost:8088/
* HDFS http://localhost:50070/
* MR jobhistory server http://localhost:19888/

Note that you need to expose the local ports, e.g., by passing `-p 8088:8088 -p 50070:50070 -p 19888:19888` when running the Docker image.

## Load data into HDFS (optional)

You can find an example script to load data into HDFS in `./bin/prepare_iris.sh`.
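For illustration, a data-loading script of this kind typically creates an HDFS directory and copies a local file into it. This is a hypothetical sketch, not the contents of `./bin/prepare_iris.sh`: the `dataset_dir` path and `iris.csv` filename are assumptions, and the guard keeps it a no-op where Hadoop is not installed.

```shell
# Hypothetical example of loading a local file into HDFS.
dataset_dir="/dataset/iris"

if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -mkdir -p "${dataset_dir}"      # create the target directory
  hadoop fs -put -f iris.csv "${dataset_dir}/"  # -f overwrites if present
fi
```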
2 changes: 1 addition & 1 deletion docs/gitbook/spark/binaryclass/a9a_df.md
@@ -50,7 +50,7 @@ val testDf = spark.read.format("libsvm").load("a9a.t")
.select($"rowid", $"label".as("target"), $"feature", $"weight".as("value"))
.cache

-scala> df.printSchema
+scala> testDf.printSchema
root
|-- rowid: string (nullable = true)
|-- target: float (nullable = true)
4 changes: 2 additions & 2 deletions docs/gitbook/spark/getting_started/installation.md
@@ -43,7 +43,7 @@ $ ./bin/spark-shell --jars hivemall-spark-xxx-with-dependencies.jar
Then, load the scripts that define the Hivemall functions.

```
-scala> :load define-all.spark
-scala> :load import-packages.spark
+scala> :load resources/ddl/define-all.spark
+scala> :load resources/ddl/import-packages.spark
```

