In [None]:
import os
os.environ['JDBC_HOST'] = 'jrtest01-splice-hregion'

<link rel="stylesheet" href="https://doc.splicemachine.com/zeppelin/css/zepstyles.css" />

# Using the Database Console User Interface

You may recall that Splice Machine has a dual-engine architecture that can run statements and queries either directly in HBase (the `control` side) or in Apache Spark. You can see which engine is used (`control` or `Spark`) by examining the top line of the `explain` output for a query. Fast queries that run in milliseconds are sent directly to the control engine, while larger queries that process more data go to the Spark engine.

We'll now dig into the DB Console UI, which monitors queries sent to the Spark engine. You can use it to follow the progress of queries running in Spark, monitor GC usage, and terminate queries when necessary.

## Accessing the Console UI

When you click on your cluster from the Splice Management screen, you'll see the *DB Console* button in the upper right-hand corner.
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepAccessConsoleUI.png" alt="Accessing the Console UI from Your Dashboard">

Click that button to access the DB Console ( *Spark* ) UI:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepSparkJobs.png" alt="Console UI Top-Level Display">


## Basic UI Features

Before we use the console to examine a specific query, let's go over a few interesting notes about the DB Console:

* Queries are reported as *Jobs* in the Spark UI
* Each Job will have *Stages*
* Each Stage will have *Tasks*
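
The nesting of Jobs, Stages, and Tasks listed above can be pictured as a small data model. The following is only an illustrative sketch: the class names, field names, and sample values are hypothetical and are not taken from the Spark or Splice Machine APIs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """Smallest unit of work; the fields are hypothetical, for illustration."""
    task_id: int
    duration_ms: int


@dataclass
class Stage:
    """A Stage groups the Tasks that carry out one step of a query."""
    stage_id: int
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Job:
    """A query sent to the Spark engine is reported as one Job."""
    job_id: int
    description: str
    stages: List[Stage] = field(default_factory=list)

    def task_count(self) -> int:
        # Total number of Tasks across every Stage in this Job.
        return sum(len(stage.tasks) for stage in self.stages)


# Made-up example: one Job with two Stages, each running a single Task.
job = Job(
    job_id=1,
    description="example query",
    stages=[
        Stage(stage_id=1, tasks=[Task(task_id=1, duration_ms=120)]),
        Stage(stage_id=2, tasks=[Task(task_id=2, duration_ms=15)]),
    ],
)
print(job.task_count())
```

Drilling down in the console follows the same path: from a Job to its Stages, and from a Stage to its Tasks.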

#### Drilling Down

In general, you can click anything that displays as a <span class="ConsoleLink">blue link</span> to drill down into a more detailed view. For example, if you were looking at the following entry in the completed jobs table, you could click <span class="ConsoleLink">Explain</span> in its description to drill down into the job details for *Job 113*:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepConsoleDrillDown.png" alt="Drilling Down in the Console UI">

You can continue to drill down from there to reveal increasing levels of detail. In the next section of this Tutorial, we will view job details and then drill down for an example query.

#### Switching Views

You can quickly switch to a different view by clicking a tab in the tab bar at the top of the console screen. The *Jobs* tab is selected in this screenshot:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepConsoleTabs.png" alt="Console UI Main Tabs display">

#### Hovering

You can hover the cursor over interface element links, like the <span class="ConsoleLink">Event Timeline</span> drop-down in the following image, to display a screen tip for the item:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepConsoleHover.png" alt="Console UI Event Timeline drop-down">

Similarly, you can hover over the ? to display the definition for a term, like the definition of a job:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepConsoleHover2.png" alt="Hovering to display term definition in the Console UI">

And you can hover over an event in the timeline display to see summary information; for example:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepConsoleTimelineHover.png" alt="Hovering over the Console UI Timeline Display">


## Running a Basic Query

Let's run a simple query that we set up in the previous Notebook.

Before we run the query, let's first generate the `EXPLAIN` plan for it by pressing *Shift + Enter* in the following cell:



In [None]:
%%sql 

explain select count(*) from index_example

<br />

Notice the `engine=Spark` on the top line, which indicates that this query will be processed by the Spark engine.
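
If you want to check the engine programmatically rather than by eye, a minimal sketch might look like the following. It assumes only what is described above, namely that the top line of the plan contains an `engine=` token; the helper name `engine_from_explain` and the sample plan text are hypothetical, and the real plan text on your cluster will differ.

```python
import re
from typing import Optional


def engine_from_explain(plan_text: str) -> Optional[str]:
    """Return the engine named on the first non-empty plan line, or None."""
    for line in plan_text.splitlines():
        if line.strip():
            # Only the top line is inspected, matching the description above.
            match = re.search(r"engine=(\w+)", line)
            return match.group(1) if match else None
    return None


# Illustrative plan text only; it is not copied from a real cluster.
sample_plan = "Cursor(n=5,rows=1,engine=Spark)\n  ->  ScrollInsensitive(n=4,rows=1)"
print(engine_from_explain(sample_plan))
```

A result of `Spark` means the query should appear in the DB Console; a result of `control` means it runs directly in HBase and will not be listed there.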

Now let's actually run the query:

In [None]:
%%sql 

select count(*) from index_example

<br />

Since we have run a Spark query, we can now use the DB Console ( *Spark* ) UI to view the query.  You should see something like this:
<img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepSparkJob2.png" alt="Viewing a query in the console UI">

If you open the DB Console quickly enough after running the query, the query may still appear as an *Active Job* rather than a *Completed Job*.

<p class="noteNote">The Spark DB Console is not accessible until you've run at least one Spark query on your cluster.</p>


## Drilling Down into Our Results

Let's examine the Stages of the Job we just ran by starting on the Jobs page and clicking <span class="ConsoleLink">Produce Result Set</span> for the above query. You'll see the *Job Detail* display for the query:

  <img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepJobDetail1.png" alt="Job Detail in the Console UI">

Note that the detail includes this information:

* This job has two Stages
* Each Stage has a duration
* Each Stage in this Job ran one Task

### Viewing Job Details Graphically

You can see a graphical representation of the actual activity performed within the Job's Stages by clicking the <span class="ConsoleLink">DAG Visualization</span> link above the *Completed Stages* section of the Job Details display. Here's what that looks like for our example query:

  <img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepJobDetailDag.png" alt="Directed Graph Visualization in the Console UI">

Note that this is essentially another view of the EXPLAIN plan for this query, with the execution flow depicted by the arrows.


### Viewing Stage Details

To drill down into the detail of the first Stage of our query, click anywhere in the box representing the Stage (`Stage 135` in this context) in the DAG visualization. The Console displays the details of that Stage:

  <img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepStageDetailDag.png" alt="Viewing details of a stage in the Console UI">

The DAG Visualization for the Stage is shown at the top of this view; you can hide the DAG by clicking the <span class="ConsoleLink">DAG Visualization</span> link, or you can scroll down below the graph to see the *Summary Metrics* for the Stage:

  <img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepStageDetails.png" alt="Stage Details ">

At the very bottom of this view, we see *Tasks.*  These are the most basic work units in the Spark Engine. For each task you will see:

* a duration
* garbage collection time
* other information relevant to the task activity

In the above example, we see that this Task:

* performed a TableScan
* read 1310720 rows
* wrote out 71 bytes of records for the next processing Stage



### Viewing the Event Timeline

You can get another view of the current Stage by clicking the <span class="ConsoleLink">Event Timeline</span> link; the Console then displays all tasks in this stage on a timeline:
  <img class="fitwidth" src="https://doc.splicemachine.com/zeppelin/images/zepStageTimeline.png" alt="Viewing the event timeline for a stage in the Console UI">

This view is especially useful when a Stage has many tasks and you want to see how many executors are in use and how much parallelism is being achieved for this stage of the query. More on this in a moment.
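
To make "how much parallelism is being achieved" concrete, here is a small sketch that computes the peak number of overlapping tasks from their start and end times, which is what the timeline lets you read off visually. The function name and the task timings are hypothetical; the console does not expose this helper.

```python
from typing import List, Tuple


def peak_parallelism(intervals: List[Tuple[int, int]]) -> int:
    """Return the largest number of tasks running at the same instant.

    Each interval is (start_ms, end_ms). A task that ends at time t is
    treated as finished before another task that starts at t.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a task starts
        events.append((end, -1))    # a task ends
    # Sort by time; at equal times, ends (-1) sort before starts (+1).
    events.sort(key=lambda event: (event[0], event[1]))
    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical task start and end times, in milliseconds.
tasks = [(0, 400), (0, 350), (100, 500), (400, 700)]
print(peak_parallelism(tasks))
```

A peak well below the number of task slots available on your cluster suggests that the Stage is not using all of the parallel resources it could.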

### Parallelism and Spark

The power of Splice Machine in performing large analytic queries quickly lies in its ability to run those queries with parallel resources. Spark can run a number of Jobs, Stages, and Tasks in parallel. How much parallelism you see, and where, depends on the following:

<table class="splicezep">
    <col />
    <col />
    <thead>
        <tr>
            <th>Parallelism Factor</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Active executors</td>
            <td><p>How many executors are running your query or queries?</p>
                <p>The executor count starts at 4 and is equal to your selected number of <em>OLAP Splice Units</em>; each executor can typically run 4 tasks in parallel, which means that a cluster configured for 4 <em>OLAP Splice Units</em> can run up to 16 tasks in parallel.</p></td>
        </tr>
        <tr>
            <td>Tasks per stage in one query</td>
            <td><p>Our example data set contains only about 1.3 million rows; as a result, our example query had only 1 Task per Stage. With more data in your tables, you will see more tasks in parallel in a given Stage.</p>
                <p>Splice Machine will dynamically split up the workload across many tasks for large data sets.</p></td>
        </tr>
        <tr>
            <td># of Queries being run simultaneously</td>
            <td>Spark can run queries simultaneously with available resources.</td>
        </tr>
    </tbody>
</table>
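
The executor arithmetic in the table can be written out as a short sketch. The function name is hypothetical, and the default of 4 tasks per executor is the "typical" figure quoted above, not a guaranteed setting.

```python
def max_parallel_tasks(olap_splice_units: int, tasks_per_executor: int = 4) -> int:
    """Upper bound on concurrently running Tasks: executors x tasks per executor."""
    if olap_splice_units < 1 or tasks_per_executor < 1:
        raise ValueError("both values must be positive")
    # One executor per OLAP Splice Unit, as described in the table above.
    return olap_splice_units * tasks_per_executor


print(max_parallel_tasks(4))  # 4 executors x 4 tasks each
```

This is only an upper bound: a Stage with a single Task, like the one in our example query, uses one slot no matter how many are available.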



### Summary and Where to Go Next

As you've seen in this Tutorial, the Database Console UI gives you a detailed view of how well your queries are being processed. Once you have your data loaded at or near target scale, if you are not seeing good throughput (task activity, etc.), contact <a href="mailto:support@splicemachine.com">support@splicemachine.com</a>.

Please proceed to the next Tutorial Notebook, <a href="./2.7%20Explaining%20and%20Hinting%20Tutorial.ipynb">Explaining and Hints</a>, to discover more about Splice Machine query tuning.