In [None]:
import os
os.environ['JDBC_HOST'] = 'jrtest01-splice-hregion'

In [None]:
# Setup: create the Spark session and the Splice Machine context
import os
import pyspark
from splicemachine.spark.context import PySpliceContext
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession

# Make sure PySpark tells workers to use Python 3, not Python 2, if both are installed
os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3'
jdbc_host = os.environ['JDBC_HOST']

conf = SparkConf()
sc = pyspark.SparkContext(conf=conf)

# getOrCreate() reuses the SparkContext created above
spark = SparkSession.builder.config(conf=conf).getOrCreate()

# JDBC URL for the Splice Machine database
splicejdbc = f"jdbc:splice://{jdbc_host}:1527/splicedb;user=splice;password=admin"

splice = PySpliceContext(spark, splicejdbc)
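
The JDBC URL in the setup cell is assembled from a host name, a port, a database name, and credentials. As a minimal sketch (not part of the Splice Machine API), the same string could be built with a small helper. The function name `build_splice_jdbc_url` and the `localhost` fallback are assumptions made for illustration; the port, database, user, and password defaults are copied from the setup cell.

```python
import os


def build_splice_jdbc_url(host, port=1527, database="splicedb",
                          user="splice", password="admin"):
    """Assemble a JDBC URL in the same shape as the one in the setup cell.

    Hypothetical helper for illustration; the defaults mirror the setup cell.
    """
    return f"jdbc:splice://{host}:{port}/{database};user={user};password={password}"


# Fall back to a placeholder host when JDBC_HOST is not set
# (the "localhost" default is an assumption, not a documented value).
jdbc_host = os.environ.get("JDBC_HOST", "localhost")
splicejdbc = build_splice_jdbc_url(jdbc_host)
print(splicejdbc)
```

Reading the host with `os.environ.get` avoids a `KeyError` when the first cell has not been run, at the cost of silently using the placeholder host.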


<link rel="stylesheet" href="https://doc.splicemachine.com/zeppelin/css/zepstyles.css" />

# Welcome to Splice Machine!
This README helps you get started. It contains these sections:

* *Release Notes* lists the known assumptions and limitations in the current version of our Database Service

* *About Zeppelin* provides a very quick overview of Zeppelin

* *Tutorial Notebooks* introduces the set of tutorials that we've created to help you get started with Splice Machine


## Release Notes

The following are known assumptions and limitations to the service at this time:

1. Clusters are currently created only in the us-east-1 region. We will add support for more regions later

2. For JDBC connections, individual queries or actions time out after one hour. Run long-running queries within a Zeppelin notebook instead

3. TLS for JDBC has not yet been enabled

4. Usage graphs for clusters (CPU, Memory, and Disk) are currently intermittent

5. VPC Settings are not yet enabled but will be in a near-future release

6. The timestamps you see in Zeppelin may differ from the timestamps you see in the Splice Spark UI, depending on your time zone

7. At this time, cancelling queries (through Zeppelin or JDBC tools) does not work. Spark queries can be killed through the Spark UI

8. Reminder: although Splice Machine backs up your database regularly, it does not back up your Notebook changes. Export your Notebooks regularly if you make changes
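
To illustrate the time zone note in the list above, here is a minimal sketch of how one instant can appear as two different wall-clock times. It assumes, purely for illustration, that one display uses UTC and the other uses a fixed UTC-5 offset; the release notes do not say which tool uses which zone, and the helper name `show_in_offset` and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def show_in_offset(ts, offset_hours):
    """Return the same instant expressed at a fixed UTC offset."""
    return ts.astimezone(timezone(timedelta(hours=offset_hours)))


# Hypothetical job start time, recorded in UTC.
job_start_utc = datetime(2019, 1, 15, 18, 30, tzinfo=timezone.utc)

# The same instant as it would appear on a display set to UTC-5.
job_start_local = show_in_offset(job_start_utc, -5)

print(job_start_utc.isoformat())         # 2019-01-15T18:30:00+00:00
print(job_start_local.isoformat())       # 2019-01-15T13:30:00-05:00
print(job_start_utc == job_start_local)  # True: same instant, different display
```

When you compare entries between the two UIs, match them by instant rather than by the displayed clock time.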




## About Zeppelin

Apache Zeppelin is a web-based notebook that enables interactive data analytics. With Zeppelin, you can easily develop data-driven, interactive and collaborative documents with a rich set of pre-built language backends (or interpreters) including Scala, SQL, Markdown, Angular, and many others.

We strongly encourage you to visit the <a href="https://zeppelin.apache.org/docs/" target="_blank">Zeppelin documentation site</a> to learn about creating, modifying, and running your own Zeppelin notebooks.


## Tutorial Notebooks

We've created a set of Notebooks to help you get up and running with your Splice Machine database and Zeppelin:

<table class="splicezep">
    <thead>
        <tr>
            <th>Section</th>
            <th>Notebook</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td rowspan="9">Getting Started Tutorial</td>
            <td><a href="/#/notebook/2CTKW7A6U">Notebook Basics</a></td>
            <td>Introduces using Splice Machine and Zeppelin together</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CUA8V2RK">Copying Data to S3 for easy import</a></td>
            <td>How to copy data to S3 for easy access from Splice Machine</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CSKWGZ8P">Importing Data into Your Database</a></td>
            <td>How to import data into your Database</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CU1SNACA">Running Queries</a></td>
            <td>Running Splice Machine database queries in Zeppelin and applying visualizations to the results</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CS27TE2A">Tuning Queries for Performance</a></td>
            <td>Easy Splice Machine query optimization techniques</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CVPHXRZN">Using the DB Console UI</a></td>
            <td>Introduces the Spark DB Console, which you can use to monitor queries</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CU4BJJDS">Using Explain and Hints</a></td>
            <td>Shows you how to use Splice Machine's Explain Plan and Hints to tune up your queries</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CFY9Q6NS">Running the TPCH-1 Benchmark Queries</a></td>
            <td>Walks you through importing the TPCH-1 data and running the benchmark queries</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CFMYDAYJ">Common Utilities</a></td>
            <td>A collection of Splice Machine tools and techniques to help simplify development</td>
        </tr> 
        <tr>
            <td rowspan="9">Deep Dive</td>
            <td><a href="/#/notebook/2D76NPKV6">Introduction</a></td>
            <td>This notebook introduces Splice Machine, with a brief overview of its architecture, technology stack, and SQL coverage.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D6ZFHE9U">The Life of a Query</a></td>
            <td>Walks you through importing and running the TPC-H benchmark data by creating a database in Splice Machine, importing the TPC-H dataset, running queries, and improving performance by indexing the data.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D93Y56QJ">Monitoring Queries with the Database Console</a></td>
            <td>Introduces you to the Splice Machine Database Console, which you can use to monitor and control your currently running queries.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D9M6ZUAW">Using Zeppelin for Graphing and Filtering</a></td>
            <td>Shows you how to use Apache Zeppelin notebooks for graphing and filtering the data in your Splice Machine databases.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D89MQA89">Splice Transactions with Spark and JDBC</a></td>
            <td>Introduces you to the transactional nature of Splice Machine and using JDBC to program transactions.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D8PMSAPJ">Creating Applications with Splice Machine</a></td>
            <td>Shows you how you can easily create applications with Splice Machine.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D96GBY2P">Using the Splice Machine Spark Adapter</a></td>
            <td>Introduces you to using the Splice Machine Spark Adapter for working with your database.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2DFY9ZKYB">Machine Learning with Spark MLlib using Python</a></td>
            <td>Presents an example of using the Spark Machine Learning Library (MLlib) with Splice Machine and Python.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2D7P8BMMS">Machine Learning with Spark MLlib using Scala</a></td>
            <td>Presents an example of using the Spark Machine Learning Library (MLlib) with Splice Machine and Scala.</td>
        </tr>
        <tr>
            <td>Data Engineering</td>
            <td><a href="/#/notebook/2CD73PAF8">ETL Pipeline</a></td>
            <td>A simple notebook that creates a Splice Machine ETL Pipeline</td>
        </tr>
        <tr>
            <td rowspan="2">Reference Applications</td>
            <td><a href="/#/notebook/2CKKJKSK8">Supply Chain</a></td>
            <td>Our Supply Chain application Notebook demonstrates a supply chain tool that predicts shortages.</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2CH931JT4">IOT</a></td>
            <td>Our Internet of Things Notebook demonstrates an app that tracks the movement of items from warehouses to stores in real time</td>
        </tr>
        <tr>
            <td>Benchmarks</td>
            <td><a href="/#/notebook/2CSGDX1CW">TPCH-100 Benchmark</a></td>
            <td>Walks you through importing the TPCH-100 data and running the benchmark queries</td>
        </tr>
        <tr>
            <td rowspan="3">Zeppelin Tutorial</td>
            <td><a href="/#/notebook/2A94M5J1Z">Basic Features</a></td>
            <td>A tutorial written elsewhere to introduce you to Zeppelin</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2BWJFTXKJ">R (SparkR)</a></td>
            <td>An example of using SparkR in a Zeppelin notebook</td>
        </tr>
        <tr>
            <td><a href="/#/notebook/2BYEZ5EVK">Using Mahout</a></td>
            <td>How to use Mahout with Zeppelin</td>
        </tr>
    </tbody>
</table>
 
We strongly recommend that you take the time to go through all of these Tutorial Notebooks, which will address many of your initial questions and guide you to your next steps.

<p class="noteIcon">We recommend going through the Notebooks in our <em>Getting Started Tutorial</em> in sequence, starting with <a href="./2.1%20Notebook%20Basics.ipynb">Notebook Basics</a>; these Notebooks build on results generated by previous steps to guide you through importing data, making queries, and tuning those queries for better performance.</p> 

After you've completed the Tutorials, you can explore our other Notebooks, which illustrate the database's capabilities and walk you through reference applications that use Splice Machine along with other tools, in areas including streaming, supply chain management, and machine learning.
