In [None]:
# Setup: imports and the connection to Splice Machine
import os
import pyspark
from splicemachine.spark.context import PySpliceContext
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession

# make sure pyspark tells workers to use python3 not 2 if both are installed
os.environ['PYSPARK_PYTHON'] = '/usr/bin/python3'
jdbc_host = os.environ['JDBC_HOST']

conf = pyspark.SparkConf()
sc = pyspark.SparkContext(conf=conf)

spark = SparkSession.builder.config(conf=conf).getOrCreate()

splicejdbc = f"jdbc:splice://{jdbc_host}:1527/splicedb;user=splice;password=admin"

splice = PySpliceContext(spark, splicejdbc)


<link rel="stylesheet" href="https://doc.splicemachine.com/zeppelin/css/zepstyles.css" />

# Running Queries in Splice Machine

Now that we have data imported into our database, we can run some simple queries in our notebook.

## A Simple SQL SELECT statement

Splice Machine supports ANSI SQL. Our example query uses an SQL `SELECT` statement to select records from a table. This query makes use of the sample data that we imported in the previous tutorial, *Importing Data*. 

This query selects all records in the `import_example` table that have `100` as the value of column `i`. Try it by clicking the <img class="inline" src="https://doc.splicemachine.com/zeppelin/images/zepPlayIcon.png" alt="Run Zep Paragraph Icon"> *Run* button in the next paragraph.


In [None]:
%%sql 

select * from import_example
where i = 100

## EXPLAINing Queries

If you have a query that is not performing as expected, you can run the `explain` command to display the execution plan for the query.

To do so, put `EXPLAIN` in front of the query and run it. This generates the plan without actually running the query. Try it by clicking the <img class="inline" src="https://doc.splicemachine.com/zeppelin/images/zepPlayIcon.png" alt="Run Zep Paragraph Icon"> *Run* button in the next paragraph.

In [None]:
%%sql 

explain select * from import_example a, import_example b
where a.i = 100

<br />
To see the flow of the execution of the query, look at the generated plan from the *bottom up.*  The very first steps of the query are at the bottom, then each step follows above. You can see the costs and row count estimates for each step.

In the *explain* example that we just ran, we can see that we are scanning the `import_example` table twice, then joining the two scans with a particular strategy; in this case, the strategy is a nested loop join.
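
The bottom-up reading order can be sketched in Python by reversing the lines of a plan. The plan text below is a simplified, hypothetical stand-in (operator names only, with costs and row counts omitted), not actual Splice Machine output, and `execution_order` is an illustrative helper rather than part of any Splice Machine API.

```python
# SIMPLIFIED_PLAN is a hypothetical, cut-down plan for illustration only;
# real EXPLAIN output also carries cost and row-count details on each line.
SIMPLIFIED_PLAN = """\
Cursor
  ->  ScrollInsensitive
    ->  NestedLoopJoin
      ->  TableScan[IMPORT_EXAMPLE]
      ->  TableScan[IMPORT_EXAMPLE]
"""


def execution_order(plan_text):
    """Return the plan's steps in execution order (bottom line first)."""
    steps = [
        # Strip indentation and the leading "->" marker used for display.
        line.strip().lstrip("->").strip()
        for line in plan_text.splitlines()
        if line.strip()
    ]
    return list(reversed(steps))


for number, step in enumerate(execution_order(SIMPLIFIED_PLAN), start=1):
    print(number, step)
```

In this sketch the two table scans come first, followed by the join and then the closing steps, which matches the bottom-up reading described above.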

The final steps, `Scroll Insensitive` and `Cursor`, are typical end steps of query execution. There is one __very important__ piece of information shown on the `Cursor` line at the end:

    Cursor(n=5,rows=360,updateMode=, engine=control)

The `engine` parameter on this line shows which *engine* Splice Machine plans to use for the query.

<div class="noteIcon">
<p>As you may know, Splice Machine is a dual-engine database:</p>
<ul style="margin-bottom:0; padding-bottom:0">
<li>Fast-running queries (e.g. those only processing a few rows) typically get executed on the <code>control</code> side, directly in HBase.</li>
<li>Longer-running queries or queries that process a lot of data go through <code>Spark</code>.</li>
</ul>
</div>
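
If you want to check the planned engine programmatically, a minimal sketch is to pull the `engine=` value out of the `Cursor` line with a regular expression. The helper name `planned_engine` is hypothetical, and the sketch assumes you already have the `Cursor` line as a plain string; how you capture the plan text from your notebook is left open.

```python
import re


def planned_engine(cursor_line):
    """Return the engine named on a Cursor line, or None if none is present."""
    match = re.search(r"engine=(\w+)", cursor_line)
    return match.group(1) if match else None


# The Cursor line shown in the example above.
cursor_line = "Cursor(n=5,rows=360,updateMode=, engine=control)"
print(planned_engine(cursor_line))  # prints: control
```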

We'll cover more about the engines, and the Spark engine in particular, in a later Tutorial.


### Where to Go Next

By now, you've probably jumped ahead and run queries against your own data. It's possible that some of those queries did not run as quickly as you expected. Our next Tutorial, <a href="./2.5%20Tuning%20for%20Performance%20Tutorial.ipynb">Tuning Queries for Performance</a>, introduces the important elements required to make queries fast. 

You'll see noticeable performance improvements by tuning your queries using the simple methods explained there.