ZEPPELIN-925 Merge HiveInterpreter into JDBCInterpreter #943

Closed · wants to merge 9 commits
3 changes: 1 addition & 2 deletions README.md
@@ -162,8 +162,7 @@ enable 3rd party vendor repository (cloudera)
##### `-Pmapr[version]` (optional)

For the MapR Hadoop Distribution, these profiles will handle the Hadoop version. As MapR allows different versions of Spark to be installed, you should specify which version of Spark is installed on the cluster by adding a Spark profile (`-Pspark-1.2`, `-Pspark-1.3`, etc.) as needed.
-For Hive, check the hive/pom.xml and adjust the version installed as well. The correct Maven
-artifacts can be found for every version of MapR at http://doc.mapr.com
+The correct Maven artifacts can be found for every version of MapR at http://doc.mapr.com

Available profiles are

2 changes: 1 addition & 1 deletion conf/zeppelin-site.xml.template
@@ -178,7 +178,7 @@

<property>
<name>zeppelin.interpreters</name>
-<value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.rinterpreter.RRepl,org.apache.zeppelin.rinterpreter.KnitR,org.apache.zeppelin.spark.SparkRInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,org.apache.zeppelin.tajo.TajoInterpreter,org.apache.zeppelin.file.HDFSFileInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,,org.apache.zeppelin.python.PythonInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.jdbc.JDBCInterpreter,org.apache.zeppelin.phoenix.PhoenixInterpreter,org.apache.zeppelin.kylin.KylinInterpreter,org.apache.zeppelin.elasticsearch.ElasticsearchInterpreter,org.apache.zeppelin.scalding.ScaldingInterpreter,org.apache.zeppelin.alluxio.AlluxioInterpreter,org.apache.zeppelin.hbase.HbaseInterpreter,org.apache.zeppelin.livy.LivySparkInterpreter,org.apache.zeppelin.livy.LivyPySparkInterpreter,org.apache.zeppelin.livy.LivySparkRInterpreter,org.apache.zeppelin.livy.LivySparkSQLInterpreter</value>
+<value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.rinterpreter.RRepl,org.apache.zeppelin.rinterpreter.KnitR,org.apache.zeppelin.spark.SparkRInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.tajo.TajoInterpreter,org.apache.zeppelin.file.HDFSFileInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,,org.apache.zeppelin.python.PythonInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.jdbc.JDBCInterpreter,org.apache.zeppelin.phoenix.PhoenixInterpreter,org.apache.zeppelin.kylin.KylinInterpreter,org.apache.zeppelin.elasticsearch.ElasticsearchInterpreter,org.apache.zeppelin.scalding.ScaldingInterpreter,org.apache.zeppelin.alluxio.AlluxioInterpreter,org.apache.zeppelin.hbase.HbaseInterpreter,org.apache.zeppelin.livy.LivySparkInterpreter,org.apache.zeppelin.livy.LivyPySparkInterpreter,org.apache.zeppelin.livy.LivySparkRInterpreter,org.apache.zeppelin.livy.LivySparkSQLInterpreter</value>
<description>Comma separated interpreter configurations. First interpreter become a default</description>
</property>

2 changes: 1 addition & 1 deletion docs/development/writingzeppelininterpreter.md
@@ -199,7 +199,7 @@ Checkout some interpreters released with Zeppelin by default.
- [spark](https://github.com/apache/incubator-zeppelin/tree/master/spark)
- [markdown](https://github.com/apache/incubator-zeppelin/tree/master/markdown)
- [shell](https://github.com/apache/incubator-zeppelin/tree/master/shell)
-- [hive](https://github.com/apache/incubator-zeppelin/tree/master/hive)
+- [jdbc](https://github.com/apache/incubator-zeppelin/tree/master/jdbc)

### Contributing a new Interpreter to Zeppelin releases

2 changes: 1 addition & 1 deletion docs/index.md
@@ -41,7 +41,7 @@ limitations under the License.
### Multiple language backend

Zeppelin interpreter concept allows any language/data-processing-backend to be plugged into Zeppelin.
-Currently Zeppelin supports many interpreters such as Scala(with Apache Spark), Python(with Apache Spark), SparkSQL, Hive, Markdown and Shell.
+Currently Zeppelin supports many interpreters such as Scala(with Apache Spark), Python(with Apache Spark), SparkSQL, JDBC, Markdown and Shell.

<img class="img-responsive" src="/assets/themes/zeppelin/img/screenshots/multiple_language_backend.png" />

2 changes: 1 addition & 1 deletion docs/install/install.md
@@ -232,7 +232,7 @@ You can configure Zeppelin with both **environment variables** in `conf/zeppelin
<td>ZEPPELIN_INTERPRETERS</td>
<td>zeppelin.interpreters</td>
<description></description>
-<td>org.apache.zeppelin.spark.SparkInterpreter,<br />org.apache.zeppelin.spark.PySparkInterpreter,<br />org.apache.zeppelin.spark.SparkSqlInterpreter,<br />org.apache.zeppelin.spark.DepInterpreter,<br />org.apache.zeppelin.markdown.Markdown,<br />org.apache.zeppelin.shell.ShellInterpreter,<br />org.apache.zeppelin.hive.HiveInterpreter<br />
+<td>org.apache.zeppelin.spark.SparkInterpreter,<br />org.apache.zeppelin.spark.PySparkInterpreter,<br />org.apache.zeppelin.spark.SparkSqlInterpreter,<br />org.apache.zeppelin.spark.DepInterpreter,<br />org.apache.zeppelin.markdown.Markdown,<br />org.apache.zeppelin.shell.ShellInterpreter,<br />
...
</td>
<td>Comma separated interpreter configurations [Class] <br /> The first interpreter will be a default value. <br /> It means only the first interpreter in this list can be available without <code>%interpreter_name</code> annotation in Zeppelin notebook paragraph. </td>
12 changes: 4 additions & 8 deletions docs/install/yarn_install.md
@@ -20,7 +20,7 @@ limitations under the License.
{% include JB/setup %}

## Introduction
-This page describes how to pre-configure a bare metal node, configure Zeppelin and connect it to existing YARN cluster running Hortonworks flavour of Hadoop. It also describes steps to configure Spark & Hive interpreter of Zeppelin.
+This page describes how to pre-configure a bare metal node, configure Zeppelin, and connect it to an existing YARN cluster running the Hortonworks flavour of Hadoop. It also describes the steps to configure the Spark interpreter of Zeppelin.

## Prepare Node

@@ -118,16 +118,12 @@ bin/zeppelin-daemon.sh stop
```

## Interpreter
-Zeppelin provides various distributed processing frameworks to process data that ranges from Spark, Hive, Tajo, Ignite and Lens to name a few. This document describes to configure Hive & Spark interpreters.
+Zeppelin supports various distributed data-processing frameworks, including Spark, JDBC, Tajo, Ignite and Lens, to name a few. This document describes how to configure the JDBC & Spark interpreters.

### Hive
-Zeppelin supports Hive interpreter and hence copy hive-site.xml that should be present at /etc/hive/conf to the configuration folder of Zeppelin. Once Zeppelin is built it will have conf folder under /home/zeppelin/incubator-zeppelin.
+Zeppelin supports Hive through the JDBC interpreter. The connection details you need to use Hive can be found in your hive-site.xml.

-```bash
-cp /etc/hive/conf/hive-site.xml /home/zeppelin/incubator-zeppelin/conf
-```

-Once Zeppelin server has started successfully, visit http://[zeppelin-server-host-name]:8080 with your web browser. Click on Interpreter tab next to Notebook dropdown. Look for Hive configurations and set them appropriately. By default hive.hiveserver2.url will be pointing to localhost and hive.hiveserver2.password/hive.hiveserver2.user are set to hive/hive. Set them as per Hive installation on YARN cluster.
+Once the Zeppelin server has started successfully, visit http://[zeppelin-server-host-name]:8080 with your web browser. Click on the Interpreter tab next to the Notebook dropdown, look for the Hive configurations, and set them as per the Hive installation on your YARN cluster.
Click on Save button. Once these configurations are updated, Zeppelin will prompt you to restart the interpreter. Accept the prompt and the interpreter will reload the configurations.
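The connection details mentioned above live in hive-site.xml as `<name>`/`<value>` pairs. The following is a minimal sketch of assembling a `jdbc:hive2://` URL from those values; the file path, host, and port here are a stand-in sample, not your cluster's real configuration:

```shell
# Sample stand-in for /etc/hive/conf/hive-site.xml on a real cluster.
cat > /tmp/hive-site.xml <<'EOF'
<configuration>
  <property><name>hive.server2.thrift.bind.host</name><value>hs2.example.com</value></property>
  <property><name>hive.server2.thrift.port</name><value>10000</value></property>
</configuration>
EOF

# Extract the HiveServer2 host and port, then assemble the JDBC URL
# that the interpreter setting expects.
host=$(sed -n 's:.*<name>hive.server2.thrift.bind.host</name><value>\(.*\)</value>.*:\1:p' /tmp/hive-site.xml)
port=$(sed -n 's:.*<name>hive.server2.thrift.port</name><value>\(.*\)</value>.*:\1:p' /tmp/hive-site.xml)
echo "jdbc:hive2://${host}:${port}"
```

On a real node you would point the `sed` commands at /etc/hive/conf/hive-site.xml instead of the sample file.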

### Spark
45 changes: 45 additions & 0 deletions docs/interpreter/hive.md
@@ -9,6 +9,51 @@ group: manual
## Hive Interpreter for Apache Zeppelin
The [Apache Hive](https://hive.apache.org/) ™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.

## Important Notice
The Hive Interpreter is being deprecated and merged into the JDBC Interpreter. You can keep the same functionality by using the JDBC Interpreter in place of the Hive Interpreter. See the example settings and dependencies below.

### Properties
<table class="table-configuration">
<tr>
<th>Property</th>
<th>Value</th>
</tr>
<tr>
<td>hive.driver</td>
<td>org.apache.hive.jdbc.HiveDriver</td>
</tr>
<tr>
<td>hive.url</td>
<td>jdbc:hive2://localhost:10000</td>
</tr>
<tr>
<td>hive.user</td>
<td>hiveUser</td>
</tr>
<tr>
<td>hive.password</td>
<td>hivePassword</td>
</tr>
</table>

### Dependencies
<table class="table-configuration">
<tr>
<th>Artifact</th>
<th>Exclude</th>
</tr>
<tr>
<td>org.apache.hive:hive-jdbc:0.14.0</td>
<td></td>
</tr>
<tr>
<td>org.apache.hadoop:hadoop-common:2.6.0</td>
<td></td>
</tr>
</table>
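With the properties and dependencies above in place, a Hive statement can be run from a notebook paragraph through the JDBC interpreter, using the property prefix (`hive`) to select the connection. A hypothetical paragraph sketch, assuming the JDBC interpreter is bound to the note as `jdbc`:

```sql
%jdbc(hive)
SHOW TABLES
```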

----

### Configuration
<table class="table-configuration">
<tr>
41 changes: 41 additions & 0 deletions docs/interpreter/jdbc.md
@@ -195,6 +195,47 @@ To develop this functionality use this [method](http://docs.oracle.com/javase/7/
</tr>
</table>

### Examples
#### Hive
##### Properties
<table class="table-configuration">
<tr>
<th>Name</th>
<th>Value</th>
</tr>
<tr>
<td>hive.driver</td>
<td>org.apache.hive.jdbc.HiveDriver</td>
</tr>
<tr>
<td>hive.url</td>
<td>jdbc:hive2://localhost:10000</td>
</tr>
<tr>
<td>hive.user</td>
<td>hive_user</td>
</tr>
<tr>
<td>hive.password</td>
<td>hive_password</td>
</tr>
</table>
##### Dependencies
<table class="table-configuration">
<tr>
<th>Artifact</th>
<th>Excludes</th>
</tr>
<tr>
<td>org.apache.hive:hive-jdbc:0.14.0</td>
<td></td>
</tr>
<tr>
<td>org.apache.hadoop:hadoop-common:2.6.0</td>
<td></td>
</tr>
</table>

### How to use

#### Reference in paragraph
2 changes: 1 addition & 1 deletion docs/manual/interpreters.md
@@ -22,7 +22,7 @@ limitations under the License.
## Interpreters in Zeppelin
In this section, we will explain about the role of interpreters, interpreters group and interpreter settings in Zeppelin.
The concept of Zeppelin interpreter allows any language/data-processing-backend to be plugged into Zeppelin.
-Currently, Zeppelin supports many interpreters such as Scala ( with Apache Spark ), Python ( with Apache Spark ), SparkSQL, Hive, Markdown, Shell and so on.
+Currently, Zeppelin supports many interpreters such as Scala ( with Apache Spark ), Python ( with Apache Spark ), SparkSQL, JDBC, Markdown, Shell and so on.

## What is Zeppelin interpreter?
Zeppelin Interpreter is a plug-in which enables Zeppelin users to use a specific language/data-processing-backend. For example, to use Scala code in Zeppelin, you need `%spark` interpreter.
2 changes: 1 addition & 1 deletion docs/security/interpreter_authorization.md
@@ -27,7 +27,7 @@ Interpreter authorization involves permissions like creating an interpreter and

Data source authorization involves authenticating to the data source like a Mysql database and letting it determine user permissions.

-For the Hive interpreter, we need to maintain per-user connection pools.
+For the JDBC interpreter, we need to maintain per-user connection pools.
The interpret method takes the user string as parameter and executes the jdbc call using a connection in the user's connection pool.
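The per-user pooling just described can be sketched as follows. All names here are hypothetical and only illustrate the idea of keying connection reuse on the user string; Zeppelin's actual implementation manages real JDBC connections:

```python
class PerUserConnectionPools:
    """Sketch: one pool of reusable connections per user (hypothetical API)."""

    def __init__(self, connection_factory):
        self._factory = connection_factory  # callable: user -> new connection
        self._idle = {}                     # user -> list of idle connections

    def acquire(self, user):
        """Reuse an idle connection belonging to this user, or open a new one."""
        pool = self._idle.setdefault(user, [])
        return pool.pop() if pool else self._factory(user)

    def release(self, user, conn):
        """Return a connection to the owning user's pool for later reuse."""
        self._idle.setdefault(user, []).append(conn)

    def interpret(self, user, sql, execute):
        """Run one statement with a connection drawn from the user's pool."""
        conn = self.acquire(user)
        try:
            return execute(conn, sql)
        finally:
            self.release(user, conn)
```

Because each user's connections are isolated, credentials and session state never leak between users sharing the same interpreter process.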

In case of Presto, we don't need password if the Presto DB server runs backend code using HDFS authorization for the user.
Expand Down
165 changes: 0 additions & 165 deletions hive/pom.xml

This file was deleted.
