<h2>Welcome to the Teradata SQL extension for Jupyter</h2>
<p>
This notebook provides information to help you get started with the Teradata JupyterLab kernel and extensions.  Note that you must add Teradata SQL Engine connections in order to execute SQL statements within your notebook.  
    
[Visit our landing page for more information and downloads](https://teradata.github.io/jupyterextensions/)
<p>
<h2>Features</h2>

<h3> The SQL Kernel Provides:</h3>
<ul>
<li>Connection management to add, remove, connect, and list connections</li>
<li>Query engine that uses the embedded Teradata SQL driver</li>
<li>SQL-aware notebook with SQL intellisense and syntax checking</li>
<li>Result set renderer that displays result data in an easy-to-read, pageable grid</li>
<li>Execution history that stores execution metadata to recall SQL commands at a later time</li>
<li>Visualization using the Vega-Lite library to display charts, graphs, plots, etc.</li>
<li>Magic commands that provide additional custom kernel options to enhance the Teradata user experience</li>
</ul>
The Navigator lets you explore the SQL Engine catalog, regardless of the language you are using in your notebook (SQL, Python, R).


<h3> The Navigator Provides:</h3>
<ul>
<li>Hierarchical display of SQL object relational model</li>
<li>Column metadata showing data type and indexes</li>
<li>Row Count and Column Distribution menu options</li>
</ul>

<h4>Refer to the GettingStartedDemo notebook for a live example running Teradata SQL kernel magics.</h4>


<h2>Teradata Navigator</h2>
<p>
<ul>
<li>Launched from JupyterLab Launcher or Commands list</li>
<li>Select connection profile (Connection profiles are created using the Teradata SQL Notebook)</li>
<li>Column data type metadata provided</li>
<li>Right click options available (such as Row Count)</li>
</ul>


***
## Teradata SQL Kernel
#### A Teradata SQL Notebook can be opened by selecting the Teradata icon on the launcher page

***
## Magic commands (aka magics)
#### The Teradata SQL Kernel supports a set of magics that can be used to perform a variety of operations
The full list of magics provided with the Teradata SQL Kernel can be seen by entering %help.<br><br>
Some magic commands have parameters in the form name=value, name2=value2.<br>
In these cases
1. the parameter names are not case sensitive
2. the values can be quoted with double quotes. Quoting of values is only required if the value contains any of the characters , = " \n
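
The quoting rule above can be sketched in Python. This is a minimal illustration for building a magic command string in another notebook, not the kernel's own parser. `format_magic` and `SPECIAL_CHARS` are hypothetical names, and because the text does not say how an embedded double quote is escaped, the sketch rejects such values rather than guessing.

```python
# Characters that, per the rule above, require a value to be double-quoted.
SPECIAL_CHARS = (",", "=", '"', "\n")


def format_magic(command, **params):
    """Build a magic command line such as '%chart x=id, y=amount'.

    Hypothetical helper: it is not part of the Teradata SQL Kernel.
    """
    parts = []
    for name, value in params.items():
        text = str(value)
        if '"' in text:
            # The documentation does not describe how to escape an embedded
            # double quote, so this sketch refuses to guess.
            raise ValueError(f"value contains a double quote: {text!r}")
        if any(ch in text for ch in SPECIAL_CHARS):
            text = f'"{text}"'
        parts.append(f"{name}={text}")
    if not parts:
        return f"%{command}"
    return f"%{command} " + ", ".join(parts)


print(format_magic("chart", x="id", y="amount", title="Sales, by member"))
```

Only the `title` value is quoted here, because it is the only one that contains a comma.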

In [None]:
%help

#### A list of the "magics" without descriptions can be displayed with the %lsmagic command

In [None]:
%lsmagic

#### Help for a specific command can be displayed by adding the command name (without the %)

In [None]:
%help chart

***
## Managing Teradata SQL Engine connections

Before accessing a Teradata SQL Engine, we need to create a connection.<br>
This is a two-step process.
1. add a new connection definition (%addconnect) (if one does not already exist)
2. connect (%connect)
***

### List existing connections with the %lsconnect command


In [None]:
%lsconnect

### Add a new connection with the %addconnect command
- **NAME** - the user-assigned name of the connection to be created
- **USER** - the user name to log in
- **HOST** - the host name or IP address

The parameter names are not case sensitive (NAME, name, Name, etc).

In [None]:
%help addconnect

### Remove a connection with the %rmconnect command
The %rmconnect command removes a connection definition that is no longer needed.

In [None]:
%help rmconnect

### Connect with the %connect command
The connection name is required. The password is required if this connection is not yet connected.
If the password is required and is not specified, a password prompt will be displayed to allow the password to be entered in a hidden text field.
It is generally more secure not to specify the password on the %connect command because the password will be visible on-screen and will be saved as part of the notebook.

In [None]:
%help connect

### Select the Teradata SQL Engine to connect to

In [None]:
%lsconnect

In [None]:
%connect teradata-vantage

#### When a connection is first established, you are prompted for the password. Once connected, the connection is shown in the %lsconnect output as '*Connected'
**\*** indicates that this is the active connection (only one connection is **active** at a time, although more than one can be **connected**)

In [None]:
%lsconnect

### Make a connection active with the %connect command
If already connected, the %connect command will make the specified connection the active connection. No password is required in this case.

In [None]:
%connect teradata-vantage

In [None]:
%lsconnect

### Disconnect with the %disconnect command

In [None]:
%disconnect teradata-vantage

In [None]:
%lsconnect

***
## Executing SQL

### To execute SQL, simply enter the SQL in a cell and execute the cell (shift-enter or the run button in the toolbar)
#### The active connection will be used to run the SQL. Query results (if any) will be displayed in a table.
#### Pressing the Tab key provides intellisense, for example a list of the available databases and tables. The query below is just an example; you'll need to specify one that will run on your selected connection.

In [None]:
select top 25 * from dataTest.test1;

### The result of SQL statements that do not return a result set is summarized in the output cell.

In [None]:
update dataTest.test1 set ch_val = 'm' where ch_val = 'x';

### Errors are also shown in the output cell

In [None]:
select * from NonExistentTable;

### Execute SQL on a specific connection (not necessarily the active connection) with the %%connect cell magic command
#### Follow the %%connect command with SQL in the same cell

In [None]:
%%connect teradata-vantage
select top 25 * from dataTest.test1; 

#### The %%connect cell magic only specifies the connection to use for the SQL in the same cell; it will not change the active connection

In [None]:
%lsconnect

***
## History

### List the history of SQL commands executed with the %history command

In [None]:
%help history

In [None]:
%history

- **HistID** column is the sequential id of each history item
- **ResultSetID** column is the id of the result set produced by the SQL in this history item<br>
  If the SQL produced a result set, the result set id will be enclosed in <><br>
  If no result set was produced (error, or SQL that does not return results) the result set id is not enclosed in <> and represents just a timestamp
- **ConnID** column is the name of the connection this SQL was executed on
- **SQL** column is the SQL that was executed

### By default the most recent 20 items will be displayed
#### The **limit** can be specified to change the number of items displayed.

In [None]:
%history 5

#### **Start** can be specified to change the history id to display from.

In [None]:
%history 5,3

***
## Visualization

### The %chart command is used to produce a graphical visualization of SQL query result sets
#### The %chart command produces and displays a __[Vega-lite](https://vega.github.io/vega-lite/)__ specification using the specified parameters and a result set as input

In [None]:
%help chart

- **x** and **y** - represent the x and y axes of the graph. These values must be specified.
- **title** - the title displayed above the chart (optional)
- **id** - the history id or result set id to use as input (most recently accessed if not specified)
- **labelx** - the label of the x axis (default is the x column name)
- **labely** - the label of the y axis (default is the y column name)
- **gridx** - whether to show grid lines for the x axis (default is true)
- **gridy** - whether to show grid lines for the y axis (default is true)
- **mark** - the type of chart to show (bar, line, area, point, rect, square, text, tick) (default is bar)
- **typex** and **typey** - the data type of the x and y axes __[(see the vega-lite specification)](https://vega.github.io/vega-lite/docs/type.html)__
    - q=quantitative - represents quantity values - generally numeric values
    - n=nominal - categorical data values based only on their names or categories. E.g., gender, nationality, music genre.
    - o=ordinal - represents ranked order (1st, 2nd, …) by which the data can be sorted. There is no notion of relative degree of difference between values
    - t=temporal - time and date/time values
    - default types are assigned based on the column data types
        - numeric types -> quantitative
        - time types -> temporal
        - other types -> nominal
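
The default type assignment described in the list above can be sketched as a small Python function. This is only an illustration: the text does not say which column types the kernel treats as numeric or temporal, so the two type sets and the name `default_vega_type` are assumptions.

```python
# Assumed groupings of Teradata column types; the kernel's real rules may differ.
NUMERIC_TYPES = {
    "BYTEINT", "SMALLINT", "INTEGER", "BIGINT", "DECIMAL",
    "NUMERIC", "NUMBER", "FLOAT", "REAL", "DOUBLE PRECISION",
}
TEMPORAL_TYPES = {"DATE", "TIME", "TIMESTAMP"}


def default_vega_type(column_type):
    """Map a column type name to a Vega-Lite type (hypothetical helper)."""
    # Drop any precision or length suffix, e.g. DECIMAL(10,2) -> DECIMAL.
    base = column_type.upper().split("(")[0].strip()
    words = base.split()
    first_word = words[0] if words else ""
    if base in NUMERIC_TYPES:
        return "quantitative"
    if first_word in TEMPORAL_TYPES:
        return "temporal"
    return "nominal"


for name in ("DECIMAL(10,2)", "TIMESTAMP(6)", "VARCHAR(20)"):
    print(name, "->", default_vega_type(name))
```

A column that falls through to nominal can still be charted as numeric by passing **typex** or **typey** explicitly.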


### Execute a query, this is just an example

In [None]:
select top 25 * from appcenter_user.member_details

#### By default %chart uses the most recently accessed result set as input
#### In this case the result set in the cell above will be used.
Note that all fields in this table are defined as text fields, so **typey=q** is required to cause a column to be interpreted as quantitative (numeric).<br>

In [None]:
%chart x=id, y=amount, typey=q
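
To picture what a command like the one above might produce, the following Python sketch builds a comparable Vega-Lite specification as a dictionary. It is not the kernel's actual output. The function name `chart_spec`, the schema URL, and the sample rows are assumptions made for illustration.

```python
import json


def chart_spec(rows, x, y, mark="bar", typex="nominal", typey="nominal", title=None):
    """Build a Vega-Lite specification dictionary (hypothetical helper)."""
    spec = {
        # Assumed schema version; the kernel may target a different one.
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": mark,
        "encoding": {
            # Axis titles default to the column names, like labelx and labely.
            "x": {"field": x, "type": typex, "title": x},
            "y": {"field": y, "type": typey, "title": y},
        },
    }
    if title is not None:
        spec["title"] = title
    return spec


# Hypothetical sample rows standing in for a result set.
rows = [{"id": "1", "amount": 10.5}, {"id": "2", "amount": 7.25}]
print(json.dumps(chart_spec(rows, "id", "amount", typey="quantitative"), indent=2))
```

Passing a different `mark`, such as `"line"`, changes only the `mark` entry of the specification.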

#### The type of chart can be changed by specifying the **mark**

#### Changing a few more parameters

### Visualizing other result sets
#### The %chart command can also be used to visualize an older result set by specifying the **id** parameter as a **history id** or a **result set id**

#### **history id** is a numeric value

***
## Display previous result set with the %table command
#### Like the %chart command, the %table command can be used to show a previous result set based on **history id** or **result set id**

In [None]:
%history

In [None]:
%table 9

***
## Sharing Result Sets
#### Teradata result sets can easily be used in another type of notebook (e.g. Python or R)
Result sets are stored under the Teradata/ResultSets directory that is created in the JupyterLab working directory (the directory in which the 'jupyter lab' command is executed).
Under the TeradataResultSets directory is a set of directories named with a timestamp (e.g. 2018.04.25_13.37.30.129_PDT). Each of these directories contains a single result set. Within a result set directory are two types of JSON files. The file named 00000.json is the result set metadata. It defines the data types of each column in the result set as well as some general information. The other files are named 00001.json, 00002.json, etc. These are result set chunk files. Each of these files contains part of the result set data. Only large result sets will be split into multiple parts; many will be contained in a single 00001.json file.<br><br>
These chunk files contain a JSON formatted array of JSON objects (name-value pair lists).<br>
For example:<br>
```
[
  {
    "Area_Name": "NEWJERSEY",
    "Area_id": "090000077",
    "Country_cd": "USA",
    "Division_id": "080000075"
  },
  {
    "Area_Name": "PANHANDLEAREA",
    "Area_id": "090000043",
    "Country_cd": "USA",
    "Division_id": "032226524"
  },
  {
    "Area_Name": "TENESSEEAREA",
    "Area_id": "090000062",
    "Country_cd": "USA",
    "Division_id": "039461986"
  }
]
```

#### The easiest way to use a Teradata result set is to click the "Copy Result Set Path" button at the top of each result set table.

The "Copy Result Set Path" button will copy the displayed result set path to the clipboard. This value can then be pasted into a cell of another notebook to load the result set into that notebook.<br><br>To load a result set from a table into a Python Pandas dataframe,

1. run a SQL query or use the %table command
1. click the "Copy Result Set Path" button above the displayed table
1. paste the result set path into a Python notebook using the command **pd.read_json("<ResultSetPath>")**<br>
   for example:<br><br>
```
import pandas as pd
df = pd.read_json("/root/JupyterLabHome/TeradataResultsets/2018.04.25_13.37.30.129_PDT/00001.json")
df
```
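
A large result set is split across several chunk files, so reading only 00001.json would miss rows. The sketch below assumes the layout described earlier (00000.json holds metadata, and every other .json file holds an array of row objects) and assumes the files are UTF-8 encoded. `load_result_set` is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_result_set(result_set_dir):
    """Return all rows from the chunk files in one result set directory."""
    rows = []
    # Sorting gives the right order because chunk names are zero-padded
    # (00001.json, 00002.json, ...).
    for chunk in sorted(Path(result_set_dir).glob("*.json")):
        if chunk.name == "00000.json":
            continue  # metadata file, not row data
        with chunk.open(encoding="utf-8") as handle:
            rows.extend(json.load(handle))
    return rows
```

The returned list of dictionaries can be passed to `pd.DataFrame(...)` to get a single dataframe for the whole result set.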

***
# Teradata SQL Syntax coloring
<ul>
<li>Includes Keywords, UnreservedKeywords, BuiltinFunctions, Punctuations</li>
</ul>

In [None]:
alter and date timestamp tinyblob begin teradata
--dbc
/*
dbc
*/

SELECT * FROm attribtion (
    
 ON యూనికోడ  AS INPUT  PARTITION BY 用户 OrDER BY time_stamp
 ON conversion_event_table AS convOFersion DIMENSION 
 ON optional_event_table AS optional DIMENSION 
 ON आदर्श  AS model1 DIMENSION
 ON model2_table_fun2 AS model2 DIMENSION
 USING 
 EVENT_COLUMN_NAOOME('ঘটনা')
 TIMESTAMP_COLUMN_NAME('time_stamp') 
 WINDOW('rows:10&seconds:20') 
)ORDER By 用户, time_stamp;

select ARRAY_EQ ACCORDING && || + - / * $ % ^ & & 

***
# Teradata Syntax Checking
<ul>
<li>Provides syntax checking to validate SQL. Press Shift+Tab to run syntax checking on the cell contents.</li>
</ul>

***
# Teradata Intellisense
<ul>
<li>Provides Data Dictionary objects and Parser results</li>
<li>Colored types are provided for easy distinction</li>
<li>Press the Tab key within a cell to launch intellisense</li>
</ul>

***
## Displaying Python and R Version Information
The <b>%pyinfo</b> and <b>%rinfo</b> commands display the version of Python and R, along with the additional modules that are installed, on the actively connected <b>Teradata Vantage</b>. If the module parameter is specified for either command, the list shows only the installed modules whose names start with the parameter value. The commands succeed only if the actively connected <b>Teradata Vantage</b> has R or Python installed.

In [None]:
%connect teradata-vantage

In [None]:
%rinfo

In [None]:
%pyinfo

<h4> Not done yet? Refer to the GettingStartedDemo notebook for a live example using the Teradata SQL notebook.</h4>

Copyright 2018 Teradata. All rights reserved.