From a994f4ecf913129c37fd8cbabf7cfbb1264b8add Mon Sep 17 00:00:00 2001 From: Alex Ott Date: Fri, 1 Jun 2018 11:42:19 +0200 Subject: [PATCH 1/7] improve formatting for Cassandra interpreter docs --- docs/interpreter/cassandra.md | 159 +++++++++++++++++----------------- 1 file changed, 78 insertions(+), 81 deletions(-) diff --git a/docs/interpreter/cassandra.md b/docs/interpreter/cassandra.md index e91d995093b..444fddcf112 100644 --- a/docs/interpreter/cassandra.md +++ b/docs/interpreter/cassandra.md @@ -69,27 +69,27 @@ The **Cassandra** interpreter accepts the following commands Help command - HELP + `HELP` Display the interactive help menu Schema commands - DESCRIBE KEYSPACE, DESCRIBE CLUSTER, DESCRIBE TABLES ... + `DESCRIBE KEYSPACE`, `DESCRIBE CLUSTER`, `DESCRIBE TABLES` ... Custom commands to describe the Cassandra schema Option commands - @consistency, @retryPolicy, @fetchSize ... + `@consistency`, `@retryPolicy`, `@fetchSize` ... Inject runtime options to all statements in the paragraph Prepared statement commands - @prepare, @bind, @remove_prepared + `@prepare`, `@bind`, `@remove_prepared` Let you register a prepared command and re-use it later by injecting bound values Native CQL statements - All CQL-compatible statements (SELECT, INSERT, CREATE ...) + All CQL-compatible statements (`SELECT`, `INSERT`, `CREATE`, ...) All CQL statements are executed directly against the Cassandra server @@ -107,15 +107,15 @@ SELECT * FROM users WHERE login='jdoe'; Each statement should be separated by a semi-colon ( **;** ) except the special commands below: -1. @prepare -2. @bind -3. @remove_prepare -4. @consistency -5. @serialConsistency -6. @timestamp -7. @retryPolicy -8. @fetchSize -9. @requestTimeOut +1. `@prepare` +2. `@bind` +3. `@remove_prepare` +4. `@consistency` +5. `@serialConsistency` +6. `@timestamp` +7. `@retryPolicy` +8. `@fetchSize` +9. `@requestTimeOut` Multi-line statements as well as multiple statements on the same line are also supported as long as they are separated by a semi-colon. Ex: @@ -130,7 +130,7 @@ FROM artists WHERE login='jlennon'; ``` -Batch statements are supported and can span multiple lines, as well as DDL(CREATE/ALTER/DROP) statements: +Batch statements are supported and can span multiple lines, as well as DDL (`CREATE`/`ALTER`/`DROP`) statements: ```sql @@ -429,7 +429,7 @@ Some remarks about query parameters: > 1. **many** query parameters can be set in the same paragraph > 2. if the **same** query parameter is set many time with different values, the interpreter only take into account the first value -> 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the USING clause) +> 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the `USING` clause) > 4. the order of each query parameter with regard to CQL statement does not matter ## Support for Prepared Statements @@ -463,7 +463,7 @@ saves the generated prepared statement in an **internal hash map**, using the pr > Please note that this internal prepared statement map is shared with **all notebooks** and **all paragraphs** because there is only one instance of the interpreter for Cassandra -> If the interpreter encounters **many** @prepare for the **same _statement-name_ (key)**, only the **first** statement will be taken into account. 
+> If the interpreter encounters **many** `@prepare` for the **same _statement-name_ (key)**, only the **first** statement will be taken into account.

Example:

@@ -474,7 +474,7 @@ Example:
```

For the above example, the prepared statement is `SELECT * FROM spark_demo.albums LIMIT ?`.
-`SELECT * FROM spark_demo.artists LIMIT ? is ignored because an entry already exists in the prepared statements map with the key select.
+`SELECT * FROM spark_demo.artists LIMIT ?` is ignored because an entry already exists in the prepared statements map with the key _select_.

In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular intervals, thus it is necessary to **avoid re-preparing the same statement many times (considered an anti-pattern)**.

@@ -488,18 +488,18 @@ Once the statement is prepared (possibly in a separate notebook/paragraph). You

Bound values are not mandatory for the **@bind** statement. However, if you provide bound values, they need to comply with the following syntax:

-* String values should be enclosed between simple quotes ( ‘ )
-* Date values should be enclosed between simple quotes ( ‘ ) and respect the formats:
+* String values should be enclosed between single quotes (**'**)
+* Date values should be enclosed between single quotes (**'**) and respect the formats (the full list is in the [documentation](https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timestamp_type_r.html)):
  1. yyyy-MM-dd HH:MM:ss
  2. yyyy-MM-dd HH:MM:ss.SSS
* **null** is parsed as-is
-* **boolean** (true|false) are parsed as-is
+* **boolean** values (`true`|`false`) are parsed as-is
* collection values must follow the **[standard CQL syntax]**:
-  * list: [‘list_item1’, ’list_item2’, ...]
-  * set: {‘set_item1’, ‘set_item2’, …}
-  * map: {‘key1’: ‘val1’, ‘key2’: ‘val2’, …}
-* **tuple** values should be enclosed between parenthesis (see **[Tuple CQL syntax]**): (‘text’, 123, true)
-* **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {stree_name: ‘Beverly Hills’, number: 104, zip_code: 90020, state: ‘California’, …}
+  * list: ['list_item1', 'list_item2', ...]
+  * set: {'set_item1', 'set_item2', …}
+  * map: {'key1': 'val1', 'key2': 'val2', …}
+* **tuple** values should be enclosed between parentheses (see **[Tuple CQL syntax]**): ('text', 123, true)
+* **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {street_name: 'Beverly Hills', number: 104, zip_code: 90020, state: 'California', …}

> It is possible to use the `@bind` statement inside a batch:
>
@@ -540,8 +540,7 @@ Example:
      AND styles CONTAINS '${style=Rock}';
{% endraw %}

-
-In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_.
+In the above example, the first CQL query will be executed for `performer='Sheryl Crow' AND style='Rock'`.
For subsequent queries, you can change the value directly using the form.

> Please note that we enclosed the **$\{ \}** block between single quotes ( **'** ) because Cassandra expects a String here.
@@ -550,14 +549,12 @@ For subsequent queries, you can change the value directly using the form.

It is also possible to use dynamic forms for **prepared statements**:

{% raw %}
-
@bind[select]='${performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}', '${style=Rock}'
-
{% endraw %}

## Shared states

-It is possible to execute many paragraphs in parallel. However, at the back-end side, we’re still using synchronous queries.
+It is possible to execute many paragraphs in parallel.
However, at the back-end side, we're still using synchronous queries.
_Asynchronous execution_ is only possible when a `Future` value can be returned in the `InterpreterResult`. It may be an interesting proposal for the **Zeppelin** project.

@@ -570,7 +567,7 @@ Long story short, you have 3 available bindings:
- **isolated**: _different JVM_ running a _single Interpreter instance_, one JVM for each note

Using the **shared** binding, the same `com.datastax.driver.core.Session` object is used for **all** notes and paragraphs.
-Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for
+Consequently, if you use the `USE keyspace_name;` statement to log into a keyspace, it will change the keyspace for
**all current users** of the **Cassandra** interpreter because we only create 1 `com.datastax.driver.core.Session` object
per instance of **Cassandra** interpreter.

@@ -597,41 +594,41 @@ Below are the configuration parameters and their default value.

Default Value

- cassandra.cluster
+ `cassandra.cluster`
Name of the Cassandra cluster to connect to
Test Cluster

- cassandra.compression.protocol
- On wire compression. Possible values are: NONE, SNAPPY, LZ4
- NONE
+ `cassandra.compression.protocol`
+ On wire compression. Possible values are: `NONE`, `SNAPPY`, `LZ4`
+ `NONE`

- cassandra.credentials.username
+ `cassandra.credentials.username`
If security is enabled, provide the login
none

- cassandra.credentials.password
+ `cassandra.credentials.password`
If security is enabled, provide the password
none

- cassandra.hosts
+ `cassandra.hosts`
Comma separated Cassandra hosts (DNS name or IP address).
- Ex: '192.168.0.12,node2,node3' + Ex: `192.168.0.12,node2,node3` - localhost + `localhost` - cassandra.interpreter.parallelism + `cassandra.interpreter.parallelism` Number of concurrent paragraphs(queries block) that can be executed 10 - cassandra.keyspace + `cassandra.keyspace` Default keyspace to connect to. @@ -640,80 +637,80 @@ Below are the configuration parameters and their default value. in all of your queries - system + `system` - cassandra.load.balancing.policy + `cassandra.load.balancing.policy` - Load balancing policy. Default = new TokenAwarePolicy(new DCAwareRoundRobinPolicy()) - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + Load balancing policy. Default = `new TokenAwarePolicy(new DCAwareRoundRobinPolicy())` + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) DEFAULT - cassandra.max.schema.agreement.wait.second + `cassandra.max.schema.agreement.wait.second` Cassandra max schema agreement wait in second 10 - cassandra.pooling.core.connection.per.host.local + `cassandra.pooling.core.connection.per.host.local` Protocol V2 and below default = 2. Protocol V3 and above default = 1 2 - cassandra.pooling.core.connection.per.host.remote + `cassandra.pooling.core.connection.per.host.remote` Protocol V2 and below default = 1. Protocol V3 and above default = 1 1 - cassandra.pooling.heartbeat.interval.seconds + `cassandra.pooling.heartbeat.interval.seconds` Cassandra pool heartbeat interval in secs 30 - cassandra.pooling.idle.timeout.seconds + `cassandra.pooling.idle.timeout.seconds` Cassandra idle time out in seconds 120 - cassandra.pooling.max.connection.per.host.local + `cassandra.pooling.max.connection.per.host.local` Protocol V2 and below default = 8. Protocol V3 and above default = 1 8 - cassandra.pooling.max.connection.per.host.remote + `cassandra.pooling.max.connection.per.host.remote` Protocol V2 and below default = 2. Protocol V3 and above default = 1 2 - cassandra.pooling.max.request.per.connection.local + `cassandra.pooling.max.request.per.connection.local` Protocol V2 and below default = 128. Protocol V3 and above default = 1024 128 - cassandra.pooling.max.request.per.connection.remote + `cassandra.pooling.max.request.per.connection.remote` Protocol V2 and below default = 128. Protocol V3 and above default = 256 128 - cassandra.pooling.new.connection.threshold.local + `cassandra.pooling.new.connection.threshold.local` Protocol V2 and below default = 100. Protocol V3 and above default = 800 100 - cassandra.pooling.new.connection.threshold.remote + `cassandra.pooling.new.connection.threshold.remote` Protocol V2 and below default = 100. Protocol V3 and above default = 200 100 - cassandra.pooling.pool.timeout.millisecs + `cassandra.pooling.pool.timeout.millisecs` Cassandra pool time out in millisecs 5000 - cassandra.protocol.version + `cassandra.protocol.version` Cassandra binary protocol version 4 @@ -722,74 +719,74 @@ Below are the configuration parameters and their default value. Cassandra query default consistency level
- Available values: ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL + Available values: `ONE`, `TWO`, `THREE`, `QUORUM`, `LOCAL_ONE`, `LOCAL_QUORUM`, `EACH_QUORUM`, `ALL` - ONE + `ONE` - cassandra.query.default.fetchSize + `cassandra.query.default.fetchSize` Cassandra query default fetch size 5000 - cassandra.query.default.serial.consistency + `cassandra.query.default.serial.consistency` Cassandra query default serial consistency level
- Available values: SERIAL, LOCAL_SERIAL + Available values: `SERIAL`, `LOCAL_SERIAL` - SERIAL + `SERIAL` - cassandra.reconnection.policy + `cassandra.reconnection.policy` Cassandra Reconnection Policy. - Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000) - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + Default = `new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000)` + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) DEFAULT - cassandra.retry.policy + `cassandra.retry.policy` Cassandra Retry Policy. - Default = DefaultRetryPolicy.INSTANCE - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + Default = `DefaultRetryPolicy.INSTANCE` + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) DEFAULT - cassandra.socket.connection.timeout.millisecs + `cassandra.socket.connection.timeout.millisecs` Cassandra socket default connection timeout in millisecs 500 - cassandra.socket.read.timeout.millisecs + `cassandra.socket.read.timeout.millisecs` Cassandra socket read timeout in millisecs 12000 - cassandra.socket.tcp.no_delay + `cassandra.socket.tcp.no_delay` Cassandra socket TCP no delay true - cassandra.speculative.execution.policy + `cassandra.speculative.execution.policy` Cassandra Speculative Execution Policy. - Default = NoSpeculativeExecutionPolicy.INSTANCE - To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. + Default = `NoSpeculativeExecutionPolicy.INSTANCE` + To Specify your own policy, provide the fully qualify class name (FQCN) of your policy. At runtime the interpreter will instantiate the policy using Class.forName(FQCN) DEFAULT - cassandra.ssl.enabled + `cassandra.ssl.enabled` Enable support for connecting to the Cassandra configured with SSL. To connect to Cassandra configured with SSL use true @@ -798,14 +795,14 @@ Below are the configuration parameters and their default value. false - cassandra.ssl.truststore.path + `cassandra.ssl.truststore.path` Filepath for the truststore file to use for connection to Cassandra with SSL. - cassandra.ssl.truststore.password + `cassandra.ssl.truststore.password` Password for the truststore file to use for connection to Cassandra with SSL. From c90b61f1114c32c529dc4c5665026e31975e0ba0 Mon Sep 17 00:00:00 2001 From: Alex Ott Date: Fri, 1 Jun 2018 11:45:47 +0200 Subject: [PATCH 2/7] use same capitalization in all interpreter names --- docs/index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/index.md b/docs/index.md index 788a979c2ab..a040077778f 100644 --- a/docs/index.md +++ b/docs/index.md @@ -134,7 +134,7 @@ limitations under the License. * [BigQuery](./interpreter/bigquery.html) * [Cassandra](./interpreter/cassandra.html) * [Elasticsearch](./interpreter/elasticsearch.html) - * [flink](./interpreter/flink.html) + * [Flink](./interpreter/flink.html) * [Geode](./interpreter/geode.html) * [Groovy](./interpreter/groovy.html) * [HBase](./interpreter/hbase.html) @@ -145,7 +145,7 @@ limitations under the License. 
* [Kylin](./interpreter/kylin.html)
  * [Lens](./interpreter/lens.html)
  * [Livy](./interpreter/livy.html)
-  * [markdown](./interpreter/markdown.html)
+  * [Markdown](./interpreter/markdown.html)
  * [Neo4j](./interpreter/neo4j.html)
  * [Pig](./interpreter/pig.html)
  * [Postgresql, HAWQ](./interpreter/postgresql.html)

From bb26a2954b6979808d90ef047a5fd376d05fa357 Mon Sep 17 00:00:00 2001
From: Alex Ott
Date: Fri, 1 Jun 2018 11:46:06 +0200
Subject: [PATCH 3/7] use same formatting for parser name

---
 docs/interpreter/markdown.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/interpreter/markdown.md b/docs/interpreter/markdown.md
index d5581d9b10d..609f20495ee 100644
--- a/docs/interpreter/markdown.md
+++ b/docs/interpreter/markdown.md
@@ -71,7 +71,6 @@ For more information, please see [Mathematical Expression](../usage/display_syst

### Markdown4j Parser

-Since pegdown parser is more accurate and provides much more markdown syntax
-`markdown4j` option might be removed later. But keep this parser for the backward compatibility.
+Since the `pegdown` parser is more accurate and supports much more markdown syntax, the `markdown4j` option might be removed later. But this parser is kept for backward compatibility.

From 5a7950e79c40b61d872be34000a561004eebe469 Mon Sep 17 00:00:00 2001
From: Alex Ott
Date: Fri, 1 Jun 2018 12:43:21 +0200
Subject: [PATCH 4/7] add missing language spec for syntax highlighting

---
 .../contribution/how_to_contribute_code.md    | 39 ++++++----
 .../contribution/how_to_contribute_website.md |  2 +-
 .../contribution/useful_developer_tools.md    | 17 +++--
 .../development/helium/writing_application.md |  6 +-
 .../writing_zeppelin_interpreter.md           | 14 ++--
 docs/interpreter/ignite.md                    |  8 +-
 docs/interpreter/jdbc.md                      |  8 +-
 docs/interpreter/kylin.md                     |  2 +-
 docs/interpreter/lens.md                      |  6 +-
 docs/interpreter/livy.md                      |  9 ++-
 docs/interpreter/neo4j.md                     |  6 +-
 docs/interpreter/python.md                    |  3 +-
 docs/interpreter/sap.md                       |  4 +-
 docs/interpreter/scalding.md                  |  8 +-
 docs/interpreter/shell.md                     |  9 ++-
 docs/setup/basics/how_to_build.md             | 15 ++--
 docs/setup/deployment/cdh.md                  |  6 +-
 docs/setup/deployment/docker.md               |  6 +-
 .../deployment/flink_and_spark_cluster.md     | 46 +++++------
 docs/setup/deployment/spark_cluster_mode.md   | 24 +++---
 docs/setup/deployment/virtual_machine.md      |  8 +-
 docs/setup/deployment/yarn_install.md         |  4 +-
 docs/setup/operation/configuration.md         | 15 ++--
 docs/setup/security/authentication_nginx.md   | 14 ++--
 docs/setup/security/http_security_headers.md  |  8 +-
 docs/setup/security/notebook_authorization.md |  4 +-
 docs/setup/security/shiro_authentication.md   |  4 +-
 docs/setup/storage/storage.md                 | 76 +++++++++----------
 docs/usage/display_system/basic.md            |  6 +-
 docs/usage/interpreter/dynamic_loading.md     |  2 +-
 docs/usage/interpreter/installation.md        | 14 ++--
 docs/usage/interpreter/overview.md            |  4 +-
 docs/usage/interpreter/user_impersonation.md  |  8 +-
 .../other_features/customizing_homepage.md    |  2 +-
 docs/usage/other_features/zeppelin_context.md | 10 ++-
 35 files changed, 221 insertions(+), 196 deletions(-)

diff --git a/docs/development/contribution/how_to_contribute_code.md b/docs/development/contribution/how_to_contribute_code.md
index 92b69b5c267..290c8d1a5b7 100644
--- a/docs/development/contribution/how_to_contribute_code.md
+++ b/docs/development/contribution/how_to_contribute_code.md
@@ -51,13 +51,13 @@ First of all, you need Zeppelin source code. The official location of Zeppelin i

Get the source code on your development machine using git.
-```
+```bash
git clone git://git.apache.org/zeppelin.git zeppelin
```

You may also want to develop against a specific branch. For example, for branch-0.5.6

-```
+```bash
git clone -b branch-0.5.6 git://git.apache.org/zeppelin.git zeppelin
```

@@ -69,19 +69,19 @@ Before making a pull request, please take a look [Contribution Guidelines](http:

### Build

-```
+```bash
mvn install
```

To skip tests

-```
+```bash
mvn install -DskipTests
```

To build with a specific spark / hadoop version

-```
+```bash
mvn install -Dspark.version=x.x.x -Dhadoop.version=x.x.x
```

@@ -93,18 +93,26 @@ For the further

1. Copy the `conf/zeppelin-site.xml.template` to `zeppelin-server/src/main/resources/zeppelin-site.xml` and change the configurations in this file if required
2. Run the following command

-```
+
+```bash
 cd zeppelin-server
-HADOOP_HOME=YOUR_HADOOP_HOME JAVA_HOME=YOUR_JAVA_HOME mvn exec:java -Dexec.mainClass="org.apache.zeppelin.server.ZeppelinServer" -Dexec.args=""
+HADOOP_HOME=YOUR_HADOOP_HOME JAVA_HOME=YOUR_JAVA_HOME \
+mvn exec:java -Dexec.mainClass="org.apache.zeppelin.server.ZeppelinServer" -Dexec.args=""
```

#### Option 2 - Daemon Script

-> **Note:** Make sure you first run ```mvn clean install -DskipTests``` on your zeppelin root directory, otherwise your server build will fail to find the required dependencies in the local repro.
+> **Note:** Make sure you first run
+
+```bash
+mvn clean install -DskipTests
+```
+
+in your zeppelin root directory, otherwise your server build will fail to find the required dependencies in the local repo.

or use daemon script

-```
+```bash
bin/zeppelin-daemon start
```

@@ -122,8 +130,7 @@ Some portions of the Zeppelin code are generated by [Thrift](http://thrift.apach

To regenerate the code, install **thrift-0.9.2** and then run the following command to generate thrift code.

-
-```
+```bash
cd /zeppelin-interpreter/src/main/thrift
./genthrift.sh
```

@@ -132,14 +139,16 @@
Zeppelin has a [set of integration tests](https://github.com/apache/zeppelin/tree/master/zeppelin-server/src/test/java/org/apache/zeppelin/integration) using Selenium. To run these tests, first build and run Zeppelin and make sure Zeppelin is running on port 8080. Then you can run a test using the following command

-```
-TEST_SELENIUM=true mvn test -Dtest=[TEST_NAME] -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server'
+```bash
+TEST_SELENIUM=true mvn test -Dtest=[TEST_NAME] -DfailIfNoTests=false \
+-pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server'
```

For example, to run [ParagraphActionIT](https://github.com/apache/zeppelin/blob/master/zeppelin-server/src/test/java/org/apache/zeppelin/integration/ParagraphActionsIT.java),

-```
-TEST_SELENIUM=true mvn test -Dtest=ParagraphActionsIT -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server'
+```bash
+TEST_SELENIUM=true mvn test -Dtest=ParagraphActionsIT -DfailIfNoTests=false \
+-pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server'
```

You'll need the Firefox web browser installed in your development environment. While the CI server uses [Firefox 31.0](https://ftp.mozilla.org/pub/firefox/releases/31.0/) to run the selenium tests, it is a good idea to install the same version (disable auto-update to keep the version).
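Since the integration tests assume a server already answering on port 8080, it can be convenient to script the wait for startup before launching the Selenium suite. A minimal sketch, assuming the default port and the `/api/version` REST endpoint (part of Zeppelin's REST API in recent releases):

```bash
# Poll the REST API until the server answers, then run the Selenium suite.
# Assumes the default port 8080 and the /api/version endpoint.
until curl -sf http://localhost:8080/api/version > /dev/null; do
  echo "Waiting for Zeppelin to start..."
  sleep 5
done
echo "Zeppelin is up; the integration tests can be started."
```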
diff --git a/docs/development/contribution/how_to_contribute_website.md b/docs/development/contribution/how_to_contribute_website.md index d5d3b5a28c5..1b7c2d93678 100644 --- a/docs/development/contribution/how_to_contribute_website.md +++ b/docs/development/contribution/how_to_contribute_website.md @@ -39,7 +39,7 @@ Documentation website is hosted in 'master' branch under `/docs/` dir. First of all, you need the website source code. The official location of mirror for Zeppelin is [http://git.apache.org/zeppelin.git](http://git.apache.org/zeppelin.git). Get the source code on your development machine using git. -``` +```bash git clone git://git.apache.org/zeppelin.git cd docs ``` diff --git a/docs/development/contribution/useful_developer_tools.md b/docs/development/contribution/useful_developer_tools.md index 326986afd46..17ca40307f5 100644 --- a/docs/development/contribution/useful_developer_tools.md +++ b/docs/development/contribution/useful_developer_tools.md @@ -37,7 +37,7 @@ Check [zeppelin-web: Local Development](https://github.com/apache/zeppelin/tree/ this script would be helpful when changing JDK version frequently. -``` +```bash function setjdk() { if [ $# -ne 0 ]; then # written based on OSX. @@ -59,7 +59,7 @@ you can use this function like `setjdk 1.8` / `setjdk 1.7` ### Building Submodules Selectively -``` +```bash # build `zeppelin-web` only mvn clean -pl 'zeppelin-web' package -DskipTests; @@ -71,7 +71,8 @@ mvn clean package -pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTest # build spark related modules with profiles: scala 2.11, spark 2.1 hadoop 2.7 ./dev/change_scala_version.sh 2.11 -mvn clean package -Pspark-2.1 -Phadoop-2.7 -Pscala-2.11 -pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTests +mvn clean package -Pspark-2.1 -Phadoop-2.7 -Pscala-2.11 \ +-pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTests # build `zeppelin-server` and `markdown` with dependencies mvn clean package -pl 'markdown,zeppelin-server' --am -DskipTests @@ -79,7 +80,7 @@ mvn clean package -pl 'markdown,zeppelin-server' --am -DskipTests ### Running Individual Tests -``` +```bash # run the `HeliumBundleFactoryTest` test class mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=HeliumBundleFactoryTest ``` @@ -88,13 +89,15 @@ mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=HeliumBundleFac Make sure that Zeppelin instance is started to execute integration tests (= selenium tests). -``` +```bash # run the `SparkParagraphIT` test class -TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=SparkParagraphIT +TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am \ +-DfailIfNoTests=false -Dtest=SparkParagraphIT # run the `testSqlSpark` test function only in the `SparkParagraphIT` class # but note that, some test might be dependent on the previous tests -TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=SparkParagraphIT#testSqlSpark +TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am \ +-DfailIfNoTests=false -Dtest=SparkParagraphIT#testSqlSpark ``` diff --git a/docs/development/helium/writing_application.md b/docs/development/helium/writing_application.md index 366d3e74aea..d128671e23b 100644 --- a/docs/development/helium/writing_application.md +++ b/docs/development/helium/writing_application.md @@ -147,7 +147,7 @@ Resouce name is a string which will be compared with the name of objects in the Application may require two or more resources. 
Required resources can be listed inside of the json array. For example, if the application requires object "name1", "name2" and "className1" type of object to run, resources field can be -``` +```json resources: [ [ "name1", "name2", ":className1", ...] ] @@ -155,7 +155,7 @@ resources: [ If Application can handle alternative combination of required resources, alternative set can be listed as below. -``` +```json resources: [ [ "name", ":className"], [ "altName", ":altClassName1"], @@ -165,7 +165,7 @@ resources: [ Easier way to understand this scheme is -``` +```json resources: [ [ 'resource' AND 'resource' AND ... ] OR [ 'resource' AND 'resource' AND ... ] OR diff --git a/docs/development/writing_zeppelin_interpreter.md b/docs/development/writing_zeppelin_interpreter.md index acb5d8613a4..a3fc6aac23e 100644 --- a/docs/development/writing_zeppelin_interpreter.md +++ b/docs/development/writing_zeppelin_interpreter.md @@ -42,7 +42,7 @@ In 'Separate Interpreter(scoped / isolated) for each note' mode which you can se Creating a new interpreter is quite simple. Just extend [org.apache.zeppelin.interpreter](https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/Interpreter.java) abstract class and implement some methods. For your interpreter project, you need to make `interpreter-parent` as your parent project and use plugin `maven-enforcer-plugin`, `maven-dependency-plugin` and `maven-resources-plugin`. Here's one sample pom.xml -``` +```xml 4.0.0 @@ -128,7 +128,7 @@ Here is an example of `interpreter-setting.json` on your own interpreter. Finally, Zeppelin uses static initialization with the following: -``` +```java static { Interpreter.register("MyInterpreterName", MyClassName.class.getName()); } @@ -157,7 +157,7 @@ If you want to add a new set of syntax highlighting, 1. Add the `mode-*.js` file to [zeppelin-web/bower.json](https://github.com/apache/zeppelin/blob/master/zeppelin-web/bower.json) (when built, [zeppelin-web/src/index.html](https://github.com/apache/zeppelin/blob/master/zeppelin-web/src/index.html) will be changed automatically). 2. Add `language` field to `editor` object. Note that if you don't specify language field, your interpreter will use plain text mode for syntax highlighting. Let's say you want to set your language to `java`, then add: - ``` + ```json "editor": { "language": "java" } @@ -166,7 +166,7 @@ If you want to add a new set of syntax highlighting, ### Edit on double click If your interpreter uses mark-up language such as markdown or HTML, set `editOnDblClick` to `true` so that text editor opens on pargraph double click and closes on paragraph run. Otherwise set it to `false`. -``` +```json "editor": { "editOnDblClick": false } @@ -177,7 +177,7 @@ By default, `Ctrl+dot(.)` brings autocompletion list in the editor. Through `completionKey`, each interpreter can configure autocompletion key. Currently `TAB` is only available option. -``` +```json "editor": { "completionKey": "TAB" } @@ -201,7 +201,7 @@ To configure your interpreter you need to follow these steps: Property value is comma separated [INTERPRETER\_CLASS\_NAME]. 
For example, - ``` + ```xml zeppelin.interpreters org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter @@ -225,7 +225,7 @@ Note that the first interpreter configuration in zeppelin.interpreters will be t For example, -``` +```scala %myintp val a = "My interpreter" diff --git a/docs/interpreter/ignite.md b/docs/interpreter/ignite.md index 0b4e27b2720..49e432f3622 100644 --- a/docs/interpreter/ignite.md +++ b/docs/interpreter/ignite.md @@ -42,8 +42,8 @@ In order to use Ignite interpreters, you may install Apache Ignite in some simpl > **Tip. If you want to run Ignite examples on the cli not IDE, you can export executable Jar file from IDE. Then run it by using below command.** -``` -$ nohup java -jar +```bash +nohup java -jar ``` ## Configuring Ignite Interpreter @@ -96,7 +96,7 @@ In order to execute SQL query, use ` %ignite.ignitesql ` prefix.
Supposing you are running `org.apache.ignite.examples.streaming.wordcount.StreamWords`, then you can use "words" cache( Of course you have to specify this cache name to the Ignite interpreter setting section `ignite.jdbc.url` of Zeppelin ). For example, you can select top 10 words in the words cache using the following query -``` +```sql %ignite.ignitesql select _val, count(_val) as cnt from String group by _val order by cnt desc limit 10 ``` @@ -105,7 +105,7 @@ select _val, count(_val) as cnt from String group by _val order by cnt desc limi As long as your Ignite version and Zeppelin Ignite version is same, you can also use scala code. Please check the Zeppelin Ignite version before you download your own Ignite. -``` +```scala %ignite import org.apache.ignite._ import org.apache.ignite.cache.affinity._ diff --git a/docs/interpreter/jdbc.md b/docs/interpreter/jdbc.md index aee9c4ec1f7..5a8ffc99ded 100644 --- a/docs/interpreter/jdbc.md +++ b/docs/interpreter/jdbc.md @@ -738,16 +738,18 @@ The JDBC interpreter also supports interpolation of `ZeppelinContext` objects in The following example shows one use of this facility: ####In Scala cell: -``` + +```scala z.put("country_code", "KR") // ... ``` ####In later JDBC cell: + ```sql %jdbc_interpreter_name - select * from patents_list where - priority_country = '{country_code}' and filing_date like '2015-%' +select * from patents_list where +priority_country = '{country_code}' and filing_date like '2015-%' ``` Object interpolation is disabled by default, and can be enabled for all instances of the JDBC interpreter by diff --git a/docs/interpreter/kylin.md b/docs/interpreter/kylin.md index e1d27d9907b..1f2b0f3ab44 100644 --- a/docs/interpreter/kylin.md +++ b/docs/interpreter/kylin.md @@ -75,7 +75,7 @@ To get start with Apache Kylin, please see [Apache Kylin Quickstart](https://kyl ## Using the Apache Kylin Interpreter In a paragraph, use `%kylin(project_name)` to select the **kylin** interpreter, **project name** and then input **sql**. If no project name defined, will use the default project name from the above configuration. -``` +```sql %kylin(learn_project) select count(*) from kylin_sales group by part_dt ``` diff --git a/docs/interpreter/lens.md b/docs/interpreter/lens.md index 4f07c71d9f7..e41920f8366 100644 --- a/docs/interpreter/lens.md +++ b/docs/interpreter/lens.md @@ -35,8 +35,8 @@ In order to use Lens interpreters, you may install Apache Lens in some simple st 2. Before running Lens, you have to set HIVE_HOME and HADOOP_HOME. If you want to get more information about this, please refer to [here](http://lens.apache.org/lenshome/install-and-run.html#Installation). Lens also provides Pseudo Distributed mode. [Lens pseudo-distributed setup](http://lens.apache.org/lenshome/pseudo-distributed-setup.html) is done by using [docker](https://www.docker.com/). Hive server and hadoop daemons are run as separate processes in lens pseudo-distributed setup. 3. Now, you can start lens server (or stop). -``` -./bin/lens-ctl start (or stop) +```bash +./bin/lens-ctl start # (or stop) ``` ## Configuring Lens Interpreter @@ -106,7 +106,7 @@ As you can see in this video, they are using Lens Client Shell(./bin/lens-cli.sh
  • Create and Use(Switch) Databases. -``` +```sql create database newDb ``` diff --git a/docs/interpreter/livy.md b/docs/interpreter/livy.md index e4784d4513d..954eb8cfe02 100644 --- a/docs/interpreter/livy.md +++ b/docs/interpreter/livy.md @@ -177,7 +177,7 @@ Basically, you can use **spark** -``` +```scala %livy.spark sc.version ``` @@ -185,14 +185,14 @@ sc.version **pyspark** -``` +```python %livy.pyspark print "1" ``` **sparkR** -``` +```r %livy.sparkr hello <- function( name ) { sprintf( "Hello, %s", name ); @@ -209,7 +209,8 @@ This is particularly useful when multi users are sharing a Notebook server. ## Apply Zeppelin Dynamic Forms You can leverage [Zeppelin Dynamic Form](../usage/dynamic_form/intro.html). Form templates is only avalible for livy sql interpreter. -``` + +```sql %livy.sql select * from products where ${product_id=1} ``` diff --git a/docs/interpreter/neo4j.md b/docs/interpreter/neo4j.md index 37f1f8c935d..1b14127d523 100644 --- a/docs/interpreter/neo4j.md +++ b/docs/interpreter/neo4j.md @@ -75,7 +75,7 @@ In a notebook, to enable the **Neo4j** interpreter, click the **Gear** icon and In a paragraph, use `%neo4j` to select the Neo4j interpreter and then input the Cypher commands. For list of Cypher commands please refer to the official [Cyper Refcard](http://neo4j.com/docs/cypher-refcard/current/) -```bash +``` %neo4j //Sample the TrumpWorld dataset WITH @@ -92,7 +92,7 @@ The Neo4j interpreter leverages the [Network display system](../usage/display_sy This query: -```bash +``` %neo4j MATCH (vp:Person {name:"VLADIMIR PUTIN"}), (dt:Person {name:"DONALD J. TRUMP"}) MATCH path = allShortestPaths( (vp)-[*]-(dt) ) @@ -104,7 +104,7 @@ produces the following result_ ### Apply Zeppelin Dynamic Forms You can leverage [Zeppelin Dynamic Form](../usage/dynamic_form/intro.html) inside your queries. This query: -```bash +``` %neo4j MATCH (o:Organization)-[r]-() RETURN o.name, count(*), collect(distinct type(r)) AS types diff --git a/docs/interpreter/python.md b/docs/interpreter/python.md index 1965fc95697..615d82e126e 100644 --- a/docs/interpreter/python.md +++ b/docs/interpreter/python.md @@ -171,7 +171,8 @@ If Zeppelin cannot find the matplotlib backend files (which should usually be fo then the backend will automatically be set to agg, and the (otherwise deprecated) instructions below can be used for more limited inline plotting. If you are unable to load the inline backend, use `z.show(plt)`: - ```python + +```python %python import matplotlib.pyplot as plt plt.figure() diff --git a/docs/interpreter/sap.md b/docs/interpreter/sap.md index be05aee8f26..0447958f58a 100644 --- a/docs/interpreter/sap.md +++ b/docs/interpreter/sap.md @@ -98,7 +98,7 @@ If generated query contains promtps, then promtps will appear as dynamic form af Example query -``` +```sql %sap universe [Universe Name]; @@ -120,4 +120,4 @@ where and [Folder1].[Dimension4] is not null and [Folder1].[Dimension5] in ('Value1', 'Value2'); -``` \ No newline at end of file +``` diff --git a/docs/interpreter/scalding.md b/docs/interpreter/scalding.md index f2e3461d88c..b63065c4243 100644 --- a/docs/interpreter/scalding.md +++ b/docs/interpreter/scalding.md @@ -28,7 +28,7 @@ limitations under the License. ## Building the Scalding Interpreter You have to first build the Scalding interpreter by enable the **scalding** profile as follows: -``` +```bash mvn clean package -Pscalding -DskipTests ``` @@ -88,7 +88,7 @@ option and set max.open.instances argument. 
In example, by using the [Alice in Wonderland](https://gist.github.com/johnynek/a47699caa62f4f38a3e2) tutorial, we will count words (of course!), and plot a graph of the top 10 words in the book. -``` +```scala %scalding import scala.io.Source @@ -144,7 +144,7 @@ res4: com.twitter.scalding.Mode = Hdfs(true,Configuration: core-default.xml, cor **Test HDFS read** -``` +```scala val testfile = TypedPipe.from(TextLine("/user/x/testfile")) testfile.dump ``` @@ -153,7 +153,7 @@ This command should print the contents of the hdfs file /user/x/testfile. **Test map-reduce job** -``` +```scala val testfile = TypedPipe.from(TextLine("/user/x/testfile")) val a = testfile.groupAll.size.values a.toList diff --git a/docs/interpreter/shell.md b/docs/interpreter/shell.md index 9ab4036be4c..d44a42559c8 100644 --- a/docs/interpreter/shell.md +++ b/docs/interpreter/shell.md @@ -93,7 +93,8 @@ The shell interpreter also supports interpolation of `ZeppelinContext` objects i The following example shows one use of this facility: ####In Scala cell: -``` + +```scala z.put("dataFileName", "members-list-003.parquet") // ... val members = spark.read.parquet(z.get("dataFileName")) @@ -101,8 +102,10 @@ val members = spark.read.parquet(z.get("dataFileName")) ``` ####In later Shell cell: -``` -%sh rm -rf {dataFileName} + +```bash +%sh +rm -rf {dataFileName} ``` Object interpolation is disabled by default, and can be enabled (for the Shell interpreter) by diff --git a/docs/setup/basics/how_to_build.md b/docs/setup/basics/how_to_build.md index f78c631ab04..85a59ca91aa 100644 --- a/docs/setup/basics/how_to_build.md +++ b/docs/setup/basics/how_to_build.md @@ -51,7 +51,7 @@ If you haven't installed Git and Maven yet, check the [Build requirements](#buil #### 1. Clone the Apache Zeppelin repository -``` +```bash git clone https://github.com/apache/zeppelin.git ``` @@ -60,7 +60,7 @@ git clone https://github.com/apache/zeppelin.git You can build Zeppelin with following maven command: -``` +```bash mvn clean package -DskipTests [Options] ``` @@ -248,7 +248,7 @@ plugin.frontend.yarnDownloadRoot # default https://github.com/yarnpkg/yarn/relea If you don't have requirements prepared, install it. (The installation method may vary according to your environment, example is for Ubuntu.) -``` +```bash sudo apt-get update sudo apt-get install git sudo apt-get install openjdk-7-jdk @@ -261,7 +261,8 @@ sudo apt-get install r-cran-evaluate ### Install maven -``` + +```bash wget http://www.eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz sudo tar -zxf apache-maven-3.3.9-bin.tar.gz -C /usr/local/ sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/local/bin/mvn @@ -280,7 +281,7 @@ If you're behind the proxy, you'll need to configure maven and npm to pass throu First of all, configure maven in your `~/.m2/settings.xml`. -``` +```xml @@ -309,7 +310,7 @@ First of all, configure maven in your `~/.m2/settings.xml`. Then, next commands will configure npm. 
-``` +```bash npm config set proxy http://localhost:3128 npm config set https-proxy http://localhost:3128 npm config set registry "http://registry.npmjs.org/" @@ -318,7 +319,7 @@ npm config set strict-ssl false Configure git as well -``` +```bash git config --global http.proxy http://localhost:3128 git config --global https.proxy http://localhost:3128 git config --global url."http://".insteadOf git:// diff --git a/docs/setup/deployment/cdh.md b/docs/setup/deployment/cdh.md index 9fb508fddb2..d35292e2a92 100644 --- a/docs/setup/deployment/cdh.md +++ b/docs/setup/deployment/cdh.md @@ -29,14 +29,14 @@ limitations under the License. You can import the Docker image by pulling it from Cloudera Docker Hub. -``` +```bash docker pull cloudera/quickstart:latest ``` ### 2. Run docker -``` +```bash docker run -it \ -p 80:80 \ -p 4040:4040 \ @@ -75,7 +75,7 @@ To verify the application is running well, check the web UI for HDFS on `http:// ### 4. Configure Spark interpreter in Zeppelin Set following configurations to `conf/zeppelin-env.sh`. -``` +```bash export MASTER=yarn-client export HADOOP_CONF_DIR=[your_hadoop_conf_path] export SPARK_HOME=[your_spark_home_path] diff --git a/docs/setup/deployment/docker.md b/docs/setup/deployment/docker.md index c0cdb6966d7..746986d6080 100644 --- a/docs/setup/deployment/docker.md +++ b/docs/setup/deployment/docker.md @@ -33,7 +33,7 @@ You need to [install docker](https://docs.docker.com/engine/installation/) on yo ### Running docker image -``` +```bash docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin: ``` @@ -41,7 +41,7 @@ docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin: If you want to specify `logs` and `notebook` dir, -``` +```bash docker run -p 8080:8080 --rm \ -v $PWD/logs:/logs \ -v $PWD/notebook:/notebook \ @@ -52,7 +52,7 @@ docker run -p 8080:8080 --rm \ ### Building dockerfile locally -``` +```bash cd $ZEPPELIN_HOME cd scripts/docker/zeppelin/bin diff --git a/docs/setup/deployment/flink_and_spark_cluster.md b/docs/setup/deployment/flink_and_spark_cluster.md index 11188a494f1..09fe5164bec 100644 --- a/docs/setup/deployment/flink_and_spark_cluster.md +++ b/docs/setup/deployment/flink_and_spark_cluster.md @@ -48,24 +48,24 @@ For git, openssh-server, and OpenJDK 7 we will be using the apt package manager. ##### git From the command prompt: -``` +```bash sudo apt-get install git ``` ##### openssh-server -``` +```bash sudo apt-get install openssh-server ``` ##### OpenJDK 7 -``` +```bash sudo apt-get install openjdk-7-jdk openjdk-7-jre-lib ``` *A note for those using Ubuntu 16.04*: To install `openjdk-7` on Ubuntu 16.04, one must add a repository. [Source](http://askubuntu.com/questions/761127/ubuntu-16-04-and-openjdk-7) -``` bash +```bash sudo add-apt-repository ppa:openjdk-r/ppa sudo apt-get update sudo apt-get install openjdk-7-jdk openjdk-7-jre-lib @@ -76,26 +76,26 @@ Zeppelin requires maven version 3.x. The version available in the repositories Purge any existing versions of maven. -``` +```bash sudo apt-get purge maven maven2 ``` Download the maven 3.3.9 binary. -``` +```bash wget "http://www.us.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz" ``` Unarchive the binary and move to the `/usr/local` directory. -``` +```bash tar -zxvf apache-maven-3.3.9-bin.tar.gz sudo mv ./apache-maven-3.3.9 /usr/local ``` Create symbolic links in `/usr/bin`. 
-``` +```bash sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/bin/mvn ``` @@ -105,19 +105,19 @@ This provides a quick overview of Zeppelin installation from source, however the From the command prompt: Clone Zeppelin. -``` +```bash git clone https://github.com/apache/zeppelin.git ``` Enter the Zeppelin root directory. -``` +```bash cd zeppelin ``` Package Zeppelin. -``` +```bash mvn clean package -DskipTests -Pspark-1.6 -Dflink.version=1.1.3 -Pscala-2.10 ``` @@ -145,7 +145,7 @@ As long as you didn't edit any code, it is unlikely the build is failing because Start the Zeppelin daemon. -``` +```bash bin/zeppelin-daemon.sh start ``` @@ -238,7 +238,7 @@ Run the code to make sure the built-in Zeppelin Flink interpreter is working pro Finally, stop the Zeppelin daemon. From the command prompt run: -``` +```bash bin/zeppelin-daemon.sh stop ``` @@ -273,7 +273,7 @@ See the [Flink Installation guide](https://github.com/apache/flink/blob/master/R Return to the directory where you have been downloading, this tutorial assumes that is `$HOME`. Clone Flink, check out release-1.1.3-rc2, and build. -``` +```bash cd $HOME git clone https://github.com/apache/flink.git cd flink @@ -283,7 +283,7 @@ mvn clean install -DskipTests Start the Flink Cluster in stand-alone mode -``` +```bash build-target/bin/start-cluster.sh ``` @@ -297,14 +297,16 @@ In a browser, navigate to http://`yourip`:8082 to see the Flink Web-UI. Click o If no task managers are present, restart the Flink cluster with the following commands: (if binaries) -``` + +```bash flink-1.1.3/bin/stop-cluster.sh flink-1.1.3/bin/start-cluster.sh ``` (if built from source) -``` + +```bash build-target/bin/stop-cluster.sh build-target/bin/start-cluster.sh ``` @@ -339,13 +341,13 @@ Return to the directory where you have been downloading, this tutorial assumes t the time of writing. You are free to check out other version, just make sure you build Zeppelin against the correct version of Spark. However if you use Spark 2.0, the word count example will need to be changed as Spark 2.0 is not compatible with the following examples. -``` +```bash cd $HOME ``` Clone, check out, and build Spark version 1.6.x. -``` +```bash git clone https://github.com/apache/spark.git cd spark git checkout branch-1.6 @@ -362,7 +364,7 @@ cd $HOME Start the Spark cluster in stand alone mode, specifying the webui-port as some port other than 8080 (the webui-port of Zeppelin). -``` +```bash spark/sbin/start-master.sh --webui-port 8082 ``` **Note:** Why `--webui-port 8082`? There is a digression toward the end of this document that explains this. @@ -375,13 +377,13 @@ Toward the top of the page there will be a *URL*: spark://`yourhost`:7077. Note Start the slave using the URI from the Spark master WebUI: -``` +```bash spark/sbin/start-slave.sh spark://yourhostname:7077 ``` Return to the root directory and start the Zeppelin daemon. -``` +```bash cd $HOME zeppelin/bin/zeppelin-daemon.sh start diff --git a/docs/setup/deployment/spark_cluster_mode.md b/docs/setup/deployment/spark_cluster_mode.md index 7abaecdd1da..94102bf0abe 100644 --- a/docs/setup/deployment/spark_cluster_mode.md +++ b/docs/setup/deployment/spark_cluster_mode.md @@ -38,14 +38,14 @@ You can simply set up Spark standalone environment with below steps. ### 1. Build Docker file You can find docker script files under `scripts/docker/spark-cluster-managers`. -``` +```bash cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_standalone docker build -t "spark_standalone" . ``` ### 2. 
Run docker -``` +```bash docker run -it \ -p 8080:8080 \ -p 7077:7077 \ @@ -70,7 +70,7 @@ After running single paragraph with Spark interpreter in Zeppelin, browse `https You can also simply verify that Spark is running well in Docker with below command. -``` +```bash ps -ef | grep spark ``` @@ -83,14 +83,14 @@ You can simply set up [Spark on YARN](http://spark.apache.org/docs/latest/runnin ### 1. Build Docker file You can find docker script files under `scripts/docker/spark-cluster-managers`. -``` +```bash cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_yarn_cluster docker build -t "spark_yarn" . ``` ### 2. Run docker -``` +```bash docker run -it \ -p 5000:5000 \ -p 9000:9000 \ @@ -120,7 +120,7 @@ Note that `sparkmaster` hostname used here to run docker container should be def You can simply verify the processes of Spark and YARN are running well in Docker with below command. -``` +```bash ps -ef ``` @@ -129,7 +129,7 @@ You can also check each application web UI for HDFS on `http://:50070/ ### 4. Configure Spark interpreter in Zeppelin Set following configurations to `conf/zeppelin-env.sh`. -``` +```bash export MASTER=yarn-client export HADOOP_CONF_DIR=[your_hadoop_conf_path] export SPARK_HOME=[your_spark_home_path] @@ -154,7 +154,7 @@ You can simply set up [Spark on Mesos](http://spark.apache.org/docs/latest/runni ### 1. Build Docker file -``` +```bash cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_mesos docker build -t "spark_mesos" . ``` @@ -162,7 +162,7 @@ docker build -t "spark_mesos" . ### 2. Run docker -``` +```bash docker run --net=host -it \ -p 8080:8080 \ -p 7077:7077 \ @@ -183,7 +183,7 @@ Note that `sparkmaster` hostname used here to run docker container should be def You can simply verify the processes of Spark and Mesos are running well in Docker with below command. -``` +```bash ps -ef ``` @@ -192,7 +192,7 @@ You can also check each application web UI for Mesos on `http://:5050/ ### 4. Configure Spark interpreter in Zeppelin -``` +```bash export MASTER=mesos://127.0.1.1:5050 export MESOS_NATIVE_JAVA_LIBRARY=[PATH OF libmesos.so] export SPARK_HOME=[PATH OF SPARK HOME] @@ -234,4 +234,4 @@ W0103 20:17:24.040252 339 sched.cpp:736] Ignoring framework registered message W0103 20:17:26.150250 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom 'master@127.0.0.1:5050' instead of the leading master 'master@127.0.1.1:5050' W0103 20:17:26.737604 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom 'master@127.0.0.1:5050' instead of the leading master 'master@127.0.1.1:5050' W0103 20:17:35.241714 336 sched.cpp:736] Ignoring framework registered message because it was sentfrom 'master@127.0.0.1:5050' instead of the leading master 'master@127.0.1.1:5050' -``` \ No newline at end of file +``` diff --git a/docs/setup/deployment/virtual_machine.md b/docs/setup/deployment/virtual_machine.md index 21beba6420e..11a36f8024b 100644 --- a/docs/setup/deployment/virtual_machine.md +++ b/docs/setup/deployment/virtual_machine.md @@ -44,7 +44,7 @@ If you are running Windows and don't yet have python installed, [install Python 1. Download and Install Vagrant: [Vagrant Downloads](http://www.vagrantup.com/downloads.html) 2. Install Ansible: [Ansible Python pip install](http://docs.ansible.com/ansible/intro_installation.html#latest-releases-via-pip) - ``` + ```bash sudo easy_install pip sudo pip install ansible ansible --version @@ -58,7 +58,7 @@ Thats it ! 
You can now run `vagrant ssh` and this will place you into the guest If you don't wish to build Zeppelin from scratch, run the z-manager installer script while running in the guest VM: -``` +```bash curl -fsSL https://raw.githubusercontent.com/NFLabs/z-manager/master/zeppelin-installer.sh | bash ``` @@ -67,7 +67,7 @@ curl -fsSL https://raw.githubusercontent.com/NFLabs/z-manager/master/zeppelin-in You can now -``` +```bash git clone git://git.apache.org/zeppelin.git ``` @@ -108,7 +108,7 @@ The virtual machine consists of: This assumes you've already cloned the project either on the host machine in the zeppelin-dev directory (to be shared with the guest machine) or cloned directly into a directory while running inside the guest machine. The following build steps will also include Python and R support via PySpark and SparkR: -``` +```bash cd /zeppelin mvn clean package -Pspark-1.6 -Phadoop-2.4 -DskipTests ./bin/zeppelin-daemon.sh start diff --git a/docs/setup/deployment/yarn_install.md b/docs/setup/deployment/yarn_install.md index fc46bc2cb3d..b5967992a4a 100644 --- a/docs/setup/deployment/yarn_install.md +++ b/docs/setup/deployment/yarn_install.md @@ -105,7 +105,7 @@ hdp-select status hadoop-client | sed 's/hadoop-client - \(.*\)/\1/' ## Start/Stop ### Start Zeppelin -``` +```bash cd /home/zeppelin/zeppelin bin/zeppelin-daemon.sh start ``` @@ -113,7 +113,7 @@ After successful start, visit http://[zeppelin-server-host-name]:8080 with your ### Stop Zeppelin -``` +```bash bin/zeppelin-daemon.sh stop ``` diff --git a/docs/setup/operation/configuration.md b/docs/setup/operation/configuration.md index ed4e1f26ac0..f2e356ddc7a 100644 --- a/docs/setup/operation/configuration.md +++ b/docs/setup/operation/configuration.md @@ -368,8 +368,9 @@ A condensed example can be found in the top answer to this [StackOverflow post]( The keystore holds the private key and certificate on the server end. The trustore holds the trusted client certificates. Be sure that the path and password for these two stores are correctly configured in the password fields below. They can be obfuscated using the Jetty password tool. After Maven pulls in all the dependency to build Zeppelin, one of the Jetty jars contain the Password tool. Invoke this command from the Zeppelin home build directory with the appropriate version, user, and password. -``` -java -cp ./zeppelin-server/target/lib/jetty-all-server-.jar org.eclipse.jetty.util.security.Password +```bash +java -cp ./zeppelin-server/target/lib/jetty-all-server-.jar \ +org.eclipse.jetty.util.security.Password ``` If you are using a self-signed, a certificate signed by an untrusted CA, or if client authentication is enabled, then the client must have a browser create exceptions for both the normal HTTPS port and WebSocket port. This can by done by trying to establish an HTTPS connection to both ports in a browser (e.g. if the ports are 443 and 8443, then visit https://127.0.0.1:443 and https://127.0.0.1:8443). This step can be skipped if the server certificate is signed by a trusted CA and client auth is disabled. @@ -378,7 +379,7 @@ If you are using a self-signed, a certificate signed by an untrusted CA, or if c The following properties needs to be updated in the `zeppelin-site.xml` in order to enable server side SSL. 
-``` +```xml zeppelin.server.ssl.port 8443 @@ -421,7 +422,7 @@ The following properties needs to be updated in the `zeppelin-site.xml` in order The following properties needs to be updated in the `zeppelin-site.xml` in order to enable client side certificate authentication. -``` +```xml zeppelin.server.ssl.port 8443 @@ -461,7 +462,7 @@ Please notice that passwords will be stored in *plain text* by default. To encry You can generate an appropriate encryption key any way you'd like - for instance, by using the openssl tool: -``` +```bash openssl enc -aes-128-cbc -k secret -P -md sha1 ``` @@ -476,7 +477,7 @@ The Password tool documentation can be found [here](http://www.eclipse.org/jetty After using the tool: -``` +```bash java -cp $ZEPPELIN_HOME/zeppelin-server/target/lib/jetty-util-9.2.15.v20160210.jar \ org.eclipse.jetty.util.security.Password \ password @@ -489,7 +490,7 @@ MD5:5f4dcc3b5aa765d61d8327deb882cf99 update your configuration with the obfuscated password : -``` +```xml zeppelin.ssl.keystore.password OBF:1v2j1uum1xtv1zej1zer1xtn1uvk1v1v diff --git a/docs/setup/security/authentication_nginx.md b/docs/setup/security/authentication_nginx.md index be4875a43d1..705a21d251f 100644 --- a/docs/setup/security/authentication_nginx.md +++ b/docs/setup/security/authentication_nginx.md @@ -38,7 +38,7 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c You can install NGINX server with same box where zeppelin installed or separate box where it is dedicated to serve as proxy server. - ``` + ```bash $ apt-get install nginx ``` > **NOTE :** On pre 1.3.13 version of NGINX, Proxy for Websocket may not fully works. Please use latest version of NGINX. See: [NGINX documentation](https://www.nginx.com/blog/websocket-nginx/). @@ -47,7 +47,7 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c In most cases, NGINX configuration located under `/etc/nginx/sites-available`. Create your own configuration or add your existing configuration at `/etc/nginx/sites-available`. - ``` + ```bash $ cd /etc/nginx/sites-available $ touch my-zeppelin-auth-setting ``` @@ -95,7 +95,7 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c Then make a symbolic link to this file from `/etc/nginx/sites-enabled/` to enable configuration above when NGINX reloads. - ``` + ```bash $ ln -s /etc/nginx/sites-enabled/my-zeppelin-auth-setting /etc/nginx/sites-available/my-zeppelin-auth-setting ``` @@ -103,17 +103,17 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c Now you need to setup `.htpasswd` file to serve list of authenticated user credentials for NGINX server. - ``` + ```bash $ cd /etc/nginx $ htpasswd -c htpasswd [YOUR-ID] - $ NEW passwd: [YOUR-PASSWORD] - $ RE-type new passwd: [YOUR-PASSWORD-AGAIN] + NEW passwd: [YOUR-PASSWORD] + RE-type new passwd: [YOUR-PASSWORD-AGAIN] ``` Or you can use your own apache `.htpasswd` files in other location for setting up property: `auth_basic_user_file` Restart NGINX server. - ``` + ```bash $ service nginx restart ``` Then check HTTP Basic Authentication works in browser. If you can see regular basic auth popup and then able to login with credential you entered into `.htpasswd` you are good to go. 
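The same check can also be scripted instead of using a browser; a small sketch, assuming NGINX listens on port 80 of the local machine:

```bash
# Without credentials the proxy should answer 401 (Unauthorized)...
curl -s -o /dev/null -w "%{http_code}\n" http://localhost/
# ...and with a valid entry from .htpasswd it should answer 200.
curl -s -o /dev/null -w "%{http_code}\n" -u YOUR-ID:YOUR-PASSWORD http://localhost/
```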
diff --git a/docs/setup/security/http_security_headers.md b/docs/setup/security/http_security_headers.md index 1c55d18e184..b9f9668ad0a 100644 --- a/docs/setup/security/http_security_headers.md +++ b/docs/setup/security/http_security_headers.md @@ -32,7 +32,7 @@ It also prevents MITM attack by not allowing User to override the invalid certif The following property needs to be updated in the zeppelin-site.xml in order to enable HSTS. You can choose appropriate value for "max-age". -``` +```xml zeppelin.server.strict.transport max-age=631138519 @@ -55,7 +55,7 @@ The HTTP X-XSS-Protection response header is a feature of Internet Explorer, Chr The following property needs to be updated in the zeppelin-site.xml in order to set X-XSS-PROTECTION header. -``` +```xml zeppelin.server.xxss.protection 1; mode=block @@ -78,7 +78,7 @@ The X-Frame-Options HTTP response header can indicate browser to avoid clickjack The following property needs to be updated in the zeppelin-site.xml in order to set X-Frame-Options header. -``` +```xml zeppelin.server.xframe.options SAMEORIGIN @@ -99,7 +99,7 @@ Security conscious organisations does not want to reveal the Application Server The following property needs to be updated in the zeppelin-site.xml in order to set Server header. -``` +```xml zeppelin.server.jetty.name Jetty(7.6.0.v20120127) diff --git a/docs/setup/security/notebook_authorization.md b/docs/setup/security/notebook_authorization.md index fe0e27a37b7..6410fe97a34 100644 --- a/docs/setup/security/notebook_authorization.md +++ b/docs/setup/security/notebook_authorization.md @@ -53,13 +53,13 @@ By default, owners and writers have **write** permission, owners, writers and ru ## Separate notebook workspaces (public vs. private) By default, the authorization rights allow other users to see the newly created note, meaning the workspace is `public`. This behavior is controllable and can be set through either `ZEPPELIN_NOTEBOOK_PUBLIC` variable in `conf/zeppelin-env.sh`, or through `zeppelin.notebook.public` property in `conf/zeppelin-site.xml`. Thus, in order to make newly created note appear only in your `private` workspace by default, you can set either `ZEPPELIN_NOTEBOOK_PUBLIC` to `false` in your `conf/zeppelin-env.sh` as follows: -``` +```bash export ZEPPELIN_NOTEBOOK_PUBLIC="false" ``` or set `zeppelin.notebook.public` property to `false` in `conf/zeppelin-site.xml` as follows: -``` +```xml zeppelin.notebook.public false diff --git a/docs/setup/security/shiro_authentication.md b/docs/setup/security/shiro_authentication.md index a51f77e7098..2785bfd5407 100644 --- a/docs/setup/security/shiro_authentication.md +++ b/docs/setup/security/shiro_authentication.md @@ -46,8 +46,8 @@ Set to property **zeppelin.anonymous.allowed** to **false** in `conf/zeppelin-si ### 3. Start Zeppelin -``` -bin/zeppelin-daemon.sh start (or restart) +```bash +bin/zeppelin-daemon.sh start #(or restart) ``` Then you can browse Zeppelin at [http://localhost:8080](http://localhost:8080). 
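The Shiro setup can also be verified from the command line through the login REST endpoint; a hedged sketch, assuming the default port and a `user1`/`password2` pair defined in the `[users]` section of `conf/shiro.ini`:

```bash
# POST the credentials to Zeppelin's login endpoint; an HTTP 200 response
# with a JSON body indicates that Shiro accepted the user.
curl -i -X POST -d 'userName=user1&password=password2' http://localhost:8080/api/login
```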
diff --git a/docs/setup/storage/storage.md b/docs/setup/storage/storage.md
index 6f2ace43de5..4d45b657671 100644
--- a/docs/setup/storage/storage.md
+++ b/docs/setup/storage/storage.md
@@ -46,7 +46,7 @@ By default, only first two of them will be automatically kept in sync by Zeppeli

To enable versioning for all your local notebooks through a standard Git repository, uncomment the next property in `zeppelin-site.xml` in order to use the GitNotebookRepo class:

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
 </property>

@@ -61,7 +61,7 @@ To enable versioning for all your local notebooks though a standard Git reposito

Notes may be stored in a Hadoop-compatible file system such as HDFS, so that multiple Zeppelin instances can share the same notes. All Hadoop 2.x versions are supported. If you use `FileSystemNotebookRepo`, then `zeppelin.notebook.dir` is the path on the Hadoop-compatible file system, and you need to specify `HADOOP_CONF_DIR` in `zeppelin-env.sh` so that Zeppelin can find the right Hadoop configuration files. If your Hadoop cluster is Kerberized, you also need to specify `zeppelin.server.kerberos.keytab` and `zeppelin.server.kerberos.principal`.

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo</value>
 </property>

@@ -90,14 +90,14 @@ s3://bucket_name/username/notebook-id/

Configure by setting environment variables in the file **zeppelin-env.sh**:

-```
-export ZEPPELIN_NOTEBOOK_S3_BUCKET = bucket_name
-export ZEPPELIN_NOTEBOOK_S3_USER = username
+```bash
+export ZEPPELIN_NOTEBOOK_S3_BUCKET=bucket_name
+export ZEPPELIN_NOTEBOOK_S3_USER=username
```

Or, in the file **zeppelin-site.xml**, uncomment and complete the S3 settings:

-```
+```xml
 <property>
   <name>zeppelin.notebook.s3.bucket</name>
   <value>bucket_name</value>
 </property>

@@ -112,7 +112,7 @@ Or using the file **zeppelin-site.xml** uncomment and complete the S3 settings:

Uncomment the next property to use the S3NotebookRepo class:

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.S3NotebookRepo</value>
 </property>

@@ -122,7 +122,7 @@ Uncomment the next property for use S3NotebookRepo class:

Comment out the next property to disable local Git notebook storage (the default):

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
 </property>

@@ -136,13 +136,13 @@ Comment out the next property to disable local git notebook storage (the default

To use an [AWS KMS](https://aws.amazon.com/kms/) encryption key to encrypt notebooks, set the following environment variable in the file **zeppelin-env.sh**:

-```
-export ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID = kms-key-id
+```bash
+export ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID=kms-key-id
```

Or use the following setting in **zeppelin-site.xml**:

-```
+```xml
 <property>
   <name>zeppelin.notebook.s3.kmsKeyID</name>
   <value>AWS-KMS-Key-UUID</value>
 </property>

@@ -152,13 +152,13 @@ Or using the following setting in **zeppelin-site.xml**:

In order to set a custom KMS key region, set the following environment variable in the file **zeppelin-env.sh**:

-```
-export ZEPPELIN_NOTEBOOK_S3_KMS_KEY_REGION = kms-key-region
+```bash
+export ZEPPELIN_NOTEBOOK_S3_KMS_KEY_REGION=kms-key-region
```

Or use the following setting in **zeppelin-site.xml**:

-```
+```xml
 <property>
   <name>zeppelin.notebook.s3.kmsKeyRegion</name>
   <value>target-region</value>
 </property>

@@ -172,13 +172,13 @@ Format of `target-region` is described in more details [here](http://docs.aws.am

You may use a custom [``EncryptionMaterialsProvider``](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/EncryptionMaterialsProvider.html) class as long as it is available in the classpath and able to
initialize itself from system properties or another mechanism. To use this, set the following environment variable in the file **zeppelin-env.sh**:

-```
-export ZEPPELIN_NOTEBOOK_S3_EMP = class-name
+```bash
+export ZEPPELIN_NOTEBOOK_S3_EMP=class-name
```

Or use the following setting in **zeppelin-site.xml**:

-```
+```xml
 <property>
   <name>zeppelin.notebook.s3.encryptionMaterialsProvider</name>
   <value>provider implementation class name</value>
 </property>

@@ -189,13 +189,13 @@ Or using the following setting in **zeppelin-site.xml**:

To request server-side encryption of notebooks, set the following environment variable in the file **zeppelin-env.sh**:

-```
-export ZEPPELIN_NOTEBOOK_S3_SSE = true
+```bash
+export ZEPPELIN_NOTEBOOK_S3_SSE=true
```

Or use the following setting in **zeppelin-site.xml**:

-```
+```xml
 <property>
   <name>zeppelin.notebook.s3.sse</name>
   <value>true</value>
 </property>

@@ -210,7 +210,7 @@ Using `AzureNotebookRepo` you can connect your Zeppelin with your Azure account

First of all, input your `AccountName`, `AccountKey`, and `Share Name` in the file **zeppelin-site.xml** by uncommenting and completing the next properties:

-```
+```xml
 <property>
   <name>zeppelin.notebook.azure.connectionString</name>
   <value>DefaultEndpointsProtocol=https;AccountName=<accountName>;AccountKey=<accountKey></value>
 </property>

@@ -226,7 +226,7 @@ First of all, input your `AccountName`, `AccountKey`, and `Share Name` in the fi

Secondly, you can initialize the `AzureNotebookRepo` class in the file **zeppelin-site.xml** by commenting out the next property:

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
 </property>

@@ -236,7 +236,7 @@ Secondly, you can initialize `AzureNotebookRepo` class in the file **zeppelin-si

and uncommenting:

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.AzureNotebookRepo</value>
 </property>

@@ -246,7 +246,7 @@ and commenting out:

In case you want to use your local Git storage simultaneously with Azure storage, use the following property instead:

- ```
+ ```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo, org.apache.zeppelin.notebook.repo.AzureNotebookRepo</value>
 </property>

@@ -256,7 +256,7 @@ In case you want to use simultaneously your local git storage with Azure storage

Optionally, you can specify the Azure folder structure name in the file **zeppelin-site.xml** by uncommenting the next property:

- ```
+ ```xml
 <property>
   <name>zeppelin.notebook.azure.user</name>
   <value>user</value>
 </property>

@@ -271,7 +271,7 @@ Using `GCSNotebookRepo` you can connect Zeppelin with Google Cloud Storage using

First, choose a GCS path under which to store notebooks.

-```
+```xml
 <property>
   <name>zeppelin.notebook.gcs.dir</name>
   <value></value>
 </property>

@@ -284,7 +284,7 @@ First, choose a GCS path under which to store notebooks.

Then, initialize the `GCSNotebookRepo` class in the file **zeppelin-site.xml** by commenting out the next property:

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
 </property>

@@ -294,7 +294,7 @@ Then, initialize the `GCSNotebookRepo` class in the file **zeppelin-site.xml** b

and uncommenting:

-```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GCSNotebookRepo</value>
 </property>

@@ -304,7 +304,7 @@ and commenting out:

Or, if you want to use your local Git storage simultaneously with GCS, use the following property instead:

- ```
+```xml
 <property>
   <name>zeppelin.notebook.storage</name>
   <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo,org.apache.zeppelin.notebook.repo.GCSNotebookRepo</value>
 </property>

@@ -360,7 +360,7 @@ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/my/key.json

The ZeppelinHub storage layer allows out-of-the-box connection of your Zeppelin instance with your ZeppelinHub account.
First of all, you need to either comment out the following property in **zeppelin-site.xml**: -``` +```xml {% include JB/setup %} -# Trouble Shooting +# Troubleshooting
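
For reference, the same repository selection can usually be made without editing `zeppelin-site.xml` at all. A hedged sketch, assuming the `ZEPPELIN_NOTEBOOK_STORAGE` variable from `conf/zeppelin-env.sh.template`:

```bash
# Environment-variable form of the notebook storage setting; a
# comma-separated list enables several repositories at once.
export ZEPPELIN_NOTEBOOK_STORAGE="org.apache.zeppelin.notebook.repo.GitNotebookRepo"
```
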
diff --git a/docs/setup/security/http_security_headers.md b/docs/setup/security/http_security_headers.md
index b9f9668ad0a..ad4aeef2336 100644
--- a/docs/setup/security/http_security_headers.md
+++ b/docs/setup/security/http_security_headers.md
@@ -89,9 +89,9 @@ The following property needs to be updated in the zeppelin-site.xml in order to

You can choose an appropriate value from below.

-* DENY
-* SAMEORIGIN
-* ALLOW-FROM _uri_
+* `DENY`
+* `SAMEORIGIN`
+* `ALLOW-FROM uri`

## Setting up Server Header

diff --git a/docs/usage/display_system/angular_backend.md b/docs/usage/display_system/angular_backend.md
index ff29102ac1d..2b9b094650f 100644
--- a/docs/usage/display_system/angular_backend.md
+++ b/docs/usage/display_system/angular_backend.md
@@ -99,6 +99,7 @@ In this section, we will introduce a simpler and more intuitive way of using **A

Here are some usage examples.

### Import
+
```scala
// In notebook scope
import org.apache.zeppelin.display.angular.notebookscope._
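import AngularElem._

// The examples below are a hedged sketch of the notebook-scope DSL rather
// than part of the original hunk; the .display and .onClick methods are
// assumed from the AngularElem API that the imports above bring into scope.

// Render a plain element through the AngularJS display system.
<div style="color:blue">Hello AngularJS</div>.display

// Attach a click handler before displaying; the callback runs on the
// interpreter side each time the element is clicked in the note.
<div class="btn btn-primary">Click me</div>.onClick(() => println("clicked")).display
```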