# TechBytes: Using Python with Teradata Vantage
## Part 1: Introduction - Connecting to Vantage

The contents of this file are Teradata Public Content and have been released to the Public Domain.
Please see _license.txt_ file in the package for more information.

Alexander Kolovos and Tim Miller - May 2021 - v.2.0 \
Copyright (c) 2021 by Teradata \
Licensed under BSD

This TechByte demonstrates some basic operations to get started with teradataml; namely, how to
* connect from a client machine to a Vantage Advanced SQL Engine Database.
* create Database tables from your data and create pandas-like teradataml DataFrames from tables.
* list and drop tables in a Vantage connection.

Contributions by:
- Alexander Kolovos, Sr Staff Software Architect, Teradata Product Engineering / Vantage Cloud and Applications.
- Tim Miller, Principal Software Architect, Teradata Product Management / Advanced Analytics.

### 1. Load libraries and create a Vantage connection

In [1]:
# Load teradataml and dependency packages.
#
import os
import getpass as gp
from teradataml import create_context, remove_context, get_context
from teradataml import DataFrame, copy_to_sql, in_schema
from teradataml import db_list_tables, db_drop_table
import pandas as pd

In [2]:
# Specify a Teradata Vantage server to connect to. In the statement below,
# replace the placeholder argument values with strings as follows:
# <HOST>   : Specify your target Vantage system hostname (or IP address).
# <UID>    : Specify your Database username.
# <PWD>    : Specify your password. You can also use encrypted passwords via
#            the Stored Password Protection feature.
# Notes: 
# 1. teradataml will operate inside your default database, unless you wish to
#    specify the "database" argument to use a different database <DB_Name>.
# 2. Specify the argument "temp_database_name" with a <Temp_DB_Name> for 
#    teradataml to use a different database for creation of temporary objects.
# 3. teradataml supports a variety of logon mechanisms. Simply use the
#    argument "logmech" with any of the values 'TD2', 'TDNEGO', 'LDAP', 'KRB5',
#    or 'JWT'.
#
#con = create_context(host = <HOST>, username = <UID>, password = <PWD>, 
#                     database = <DB_Name>, temp_database_name = <Temp_DB_Name>)
#
con = create_context(host = "tdprd.td.teradata.com", username = "ak186064",
                            password = gp.getpass(prompt='Password:'), 
                            logmech = "LDAP", database = "TRNG_TECHBYTES",
                            temp_database_name = "ak186064")

Password: ··············
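
The connection arguments above are hard-coded in the cell. As a minimal sketch of one alternative, the non-secret arguments could be collected from environment variables and passed to `create_context()` as keyword arguments. The environment variable names (`TD_HOST`, `TD_USER`, and so on) and the helper `build_context_kwargs()` are hypothetical and are not part of teradataml; the host and username values are placeholders.

```python
import os

# Hypothetical environment variable names; they are not defined by teradataml.
# Each one maps to a create_context() argument used earlier in this notebook.
ENV_TO_ARG = {
    "TD_HOST": "host",
    "TD_USER": "username",
    "TD_LOGMECH": "logmech",
    "TD_DATABASE": "database",
    "TD_TEMP_DATABASE": "temp_database_name",
}


def build_context_kwargs(env=None):
    """Collect non-secret create_context() arguments from environment variables."""
    env = os.environ if env is None else env
    # Keep only the settings that are present and non-empty.
    kwargs = {arg: env[name] for name, arg in ENV_TO_ARG.items() if env.get(name)}
    # Host and username are always needed to open a connection.
    missing = [arg for arg in ("host", "username") if arg not in kwargs]
    if missing:
        raise KeyError("Missing required connection settings: " + ", ".join(missing))
    return kwargs


# Demonstration with an explicit mapping instead of the real environment.
example = build_context_kwargs(
    {"TD_HOST": "vantage.example.com", "TD_USER": "demo_user", "TD_LOGMECH": "LDAP"}
)
print(example)
```

In this notebook, the result could then be used as `con = create_context(password = gp.getpass(prompt='Password:'), **build_context_kwargs())`, so that the password is still prompted for interactively rather than stored.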


### 2. Interacting with Vantage tables
Creating a Vantage table: Common ways to create Database tables from data on your client include:
- the teradataml copy_to_sql() function.
- the teradataml fastload() function (not demonstrated here; recommended when sending 100,000 rows or more).
- the FastLoad command-line utility (not demonstrated here; the utility is available in the TTU package for Windows or Linux at downloads.teradata.com).

All of the above tasks involve at least partial data transfer from your source to the target Vantage server.
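
The row-count guidance above can be sketched as a small helper that picks between the two teradataml functions. This is only an illustration under stated assumptions: the names `choose_loader()` and `upload()` are hypothetical, the threshold simply restates the 100,000-row recommendation, and `upload()` needs an active teradataml context to run.

```python
# Threshold taken from the recommendation above; adjust for your environment.
FASTLOAD_ROW_THRESHOLD = 100_000


def choose_loader(n_rows, threshold=FASTLOAD_ROW_THRESHOLD):
    """Return the name of the teradataml function suggested for n_rows rows."""
    return "fastload" if n_rows >= threshold else "copy_to_sql"


def upload(df, table_name, if_exists="replace"):
    """Persist a pandas DataFrame as a Vantage table with the suggested loader.

    Requires teradataml and an existing connection created with create_context().
    """
    # Imported lazily so that choose_loader() can be used without teradataml.
    from teradataml import copy_to_sql, fastload

    if choose_loader(len(df)) == "fastload":
        return fastload(df, table_name=table_name, if_exists=if_exists)
    return copy_to_sql(df, table_name=table_name, if_exists=if_exists)


print(choose_loader(5_000))
print(choose_loader(250_000))
```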

In [3]:
# Read your dataset file into a pandas DataFrame, then use the teradataml
# copy_to_sql() function to persist the DataFrame as a Vantage table.
# The table is created in the context's specified database TRNG_TECHBYTES.
#
CustomerInputDF = pd.read_csv("./Inputs/Data/Customer.csv")
copy_to_sql(CustomerInputDF, table_name = "Customer", if_exists = "replace")
CustomerInputDF.head(5)

,cust_id,income,age,years_with_bank,nbr_children,gender,marital_status,postal_code,state_code
0,20443995,29734.5,60,7,3,M,2,97237,OR
1,13628990,43715.0,34,4,3,F,2,60634,IL
2,25898368,1424.0,93,10,0,M,2,79931,TX
3,25900040,3524.1,43,10,0,F,1,90024,CA
4,17724954,28632.5,40,6,3,F,2,97203,OR


In [4]:
# Let us sample this table by creating a teradataml DataFrame from Customer.
#
tdCustomer = DataFrame("Customer")
# Using to_pandas() for a cleaner display format; involves data transfer.
tdCustomer.to_pandas().head(5)

,cust_id,income,age,years_with_bank,nbr_children,gender,marital_status,postal_code,state_code
0,20443995,29734.5,60,7,3,M,2,97237,OR
1,13628990,43715.0,34,4,3,F,2,60634,IL
2,25898368,1424.0,93,10,0,M,2,79931,TX
3,25900040,3524.1,43,10,0,F,1,90024,CA
4,17724954,28632.5,40,6,3,F,2,97203,OR


In [5]:
# Independently, assume a table Customer2 exists in database ak186064 on the
# target server. Let us create a teradataml DataFrame from Customer2, too.
#
tdCustomer2 = DataFrame(in_schema("ak186064", "Customer2"))
# Using to_pandas() for a cleaner display format; involves data transfer.
tdCustomer2.to_pandas().head(5)

cust_id,income,age,years_with_bank,nbr_children,gender,marital_status,postal_code,state_code
28617834,2553.2,36,6,4,F,4,92804,CA
21803664,0.0,13,8,1,M,1,10104,NY
31346907,226701.8,43,3,1,F,3,60650,IL
25897836,3020.0,51,8,1,F,2,20126,DC
27258100,23226.0,39,7,2,F,2,93702,CA


In [6]:
# We can use copy_to_sql() to persist the tdCustomer2 DataFrame as a table 
# Customer2 in the TRNG_TECHBYTES database, too.
#
copy_to_sql(tdCustomer2, table_name = "Customer2", if_exists = "replace")

_Notes on uploading data:_

No single method is preferred for uploading your data to an Advanced SQL Engine server. However, keep in mind that:
1. When you `copy_to_sql()` a pandas DataFrame as a Vantage table, data in the index columns of the pandas DataFrame are excluded from the Vantage table by default.
2. When you use the Teradata FastLoad utility to create a table, any primary indices you specify (or the default primary index from the first table column) will be interpreted as index columns in the teradataml DataFrame. In this scenario, use `to_pandas().reset_index()` with a teradataml DataFrame if you have tasks that require the index column data (for example, when plotting these data).
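
Both notes concern pandas index columns, so a small pandas-only illustration may help. The two-row DataFrame below is made up from values shown earlier in this notebook; it shows that an index column is not among the regular columns until `reset_index()` is applied.

```python
import pandas as pd

# A small made-up DataFrame with cust_id as the pandas index.
df = pd.DataFrame(
    {"cust_id": [20443995, 13628990], "income": [29734.5, 43715.0]}
).set_index("cust_id")

# The index is not one of the regular columns.
print(list(df.columns))    # ['income']

# reset_index() turns the index back into an ordinary column.
flat = df.reset_index()
print(list(flat.columns))  # ['cust_id', 'income']
```

Passing `flat` rather than `df` to `copy_to_sql()` would keep `cust_id` in the resulting table. To our understanding, `copy_to_sql()` also accepts an `index` argument for this purpose; check the teradataml documentation for your version before relying on it.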

In [7]:
# At this point, TRNG_TECHBYTES contains tables "Customer" and "Customer2".
# Let us verify this by listing tables that begin with the string "custom".
#
# By default, the list operation takes place in the teradataml context
# "database"; therefore, in the present example, tables in TRNG_TECHBYTES
# are listed. The search is case-insensitive.
# Note: To list tables in a different database, use the db_list_tables()
#       "schema_name" argument to specify the desired database name.
#
db_list_tables(object_name = "custom%")

,TableName
0,Customer
1,Customer2
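
The `"custom%"` argument is a SQL LIKE-style pattern in which `%` stands for any run of characters. The sketch below imitates that matching locally in plain Python to show why both tables are returned. It is an illustration of the pattern semantics described in the cell above, not the way teradataml or the Database evaluates the pattern; the helper `like_match()` and the table name `Accounts` are hypothetical.

```python
import re


def like_match(pattern, name):
    """Case-insensitive match of name against a SQL LIKE-style pattern."""
    parts = []
    for ch in pattern:
        if ch == "%":      # % matches any sequence of characters
            parts.append(".*")
        elif ch == "_":    # _ matches exactly one character
            parts.append(".")
        else:              # everything else is matched literally
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), name, flags=re.IGNORECASE) is not None


tables = ["Customer", "Customer2", "Accounts"]
matches = [t for t in tables if like_match("custom%", t)]
print(matches)  # ['Customer', 'Customer2']
```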


In [8]:
# Entirely remove a table from the Database.
#
# By default, the drop operation will take place in the teradataml context
# "database"; therefore in the present example, TRNG_TECHBYTES.Customer2
# will be dropped.
# Note: To drop a table in a different database, use the db_drop_table()
#       "schema_name" argument to specify the desired database name.
#
db_drop_table("Customer2")
db_list_tables(object_name = "custom%")

,TableName
0,Customer


In [None]:
# Details about an object are available through the Python help() function.
# Example: help(DataFrame)
# An object's methods can be queried with the Python dir() function.
# Example: dir(DataFrame)

### End of session

In [9]:
# Remove the context of the present teradataml session before terminating the
# Python session. It is recommended to call the remove_context() function for
# session cleanup. Temporary objects are removed at the end of the session.
#
remove_context()

True
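
Because cleanup is recommended, one possible pattern is to wrap the session in a context manager so that the cleanup call runs even when a cell raises an error. The sketch below is generic: `vantage_session()` is a hypothetical helper, and the connect and disconnect callables are stand-ins so that the example runs without a Vantage system.

```python
from contextlib import contextmanager


@contextmanager
def vantage_session(connect, disconnect):
    """Open a connection with connect() and always call disconnect() afterwards."""
    con = connect()
    try:
        yield con
    finally:
        # Runs on normal exit and when the body raises an exception.
        disconnect()


# Stand-in callables that only record what happened.
events = []

with vantage_session(lambda: events.append("connect") or "connection",
                     lambda: events.append("disconnect")) as con:
    events.append("work with " + con)

print(events)
```

With teradataml, the same helper could be called as `vantage_session(lambda: create_context(...), remove_context)`, using the connection arguments from the start of this notebook.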