### Disclaimer
Please note, the Vantage Functions via SQLAlchemy feature is a preview/beta code release with limited functionality (the “Code”). As such, you acknowledge that the Code is experimental in nature and that the Code is provided “AS IS” and may not be functional on any machine or in any environment. TERADATA DISCLAIMS ALL WARRANTIES RELATING TO THE CODE, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES AGAINST INFRINGEMENT OF THIRD-PARTY RIGHTS, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

TERADATA SHALL NOT BE RESPONSIBLE OR LIABLE WITH RESPECT TO ANY SUBJECT MATTER OF THE CODE UNDER ANY CONTRACT, NEGLIGENCE, STRICT LIABILITY OR OTHER THEORY 
    (A) FOR LOSS OR INACCURACY OF DATA OR COST OF PROCUREMENT OF SUBSTITUTE GOODS, SERVICES OR TECHNOLOGY, OR 
    (B) FOR ANY INDIRECT, INCIDENTAL OR CONSEQUENTIAL DAMAGES INCLUDING, BUT NOT LIMITED TO LOSS OF REVENUES AND LOSS OF PROFITS. TERADATA SHALL NOT BE RESPONSIBLE FOR ANY MATTER BEYOND ITS REASONABLE CONTROL.

Notwithstanding anything to the contrary: 
    (a) Teradata will have no obligation of any kind with respect to any Code-related comments, suggestions, design changes or improvements that you elect to provide to Teradata in either verbal or written form (collectively, “Feedback”), and 
    (b) Teradata and its affiliates are hereby free to use any ideas, concepts, know-how or techniques, in whole or in part, contained in Feedback: 
        (i) for any purpose whatsoever, including developing, manufacturing, and/or marketing products and/or services incorporating Feedback in whole or in part, and 
        (ii) without any restrictions or limitations, including requiring the payment of any license fees, royalties, or other consideration. 

In [1]:
# Connect to Vantage using create_context().
from teradataml import *
import getpass
td_context = create_context(
    host=getpass.getpass("Hostname: "),
    username=getpass.getpass("Username: "),
    password=getpass.getpass("Password: "),
)

# Load the example dataset.
load_example_data("GLM", ["admissions_train"])

Hostname: ········
Username: ········
Password: ········


In [2]:
# Create a teradataml DataFrame on the 'admissions_train' table.
admissions_train = DataFrame("admissions_train")
admissions_train

   masters   gpa     stats programming  admitted
id                                              
15     yes  4.00  Advanced    Advanced         1
7      yes  2.33    Novice      Novice         1
22     yes  3.46    Novice    Beginner         0
17      no  3.83  Advanced    Advanced         1
13      no  4.00  Advanced      Novice         1
38     yes  2.65  Advanced    Beginner         1
26     yes  3.57  Advanced    Advanced         1
5       no  3.44    Novice      Novice         0
34     yes  3.85  Advanced    Beginner         0
40     yes  3.95    Novice    Beginner         0

In [3]:
# Helper function that prints the equivalent SQL query and the contents of a DataFrame.
def print_dataframe(df):
    print("Equivalent SQL: {}".format(df.show_query()))
    print("\n")
    print(" ************************* DataFrame ********************* ")
    print(df)
    print("\n\n")

In [4]:
# Before moving on to the examples, note how a teradataml DataFrame and
# its columns are used to create a SQLAlchemy ClauseElement/expression.

# The examples below often use something like 'admissions_train.admitted.expression'.
# In that expression:
#    'admissions_train' is a teradataml DataFrame.
#    'admitted' is the name of a column in the teradataml DataFrame 'admissions_train'.
#    Thus,
#        'admissions_train.admitted' forms a ColumnExpression.
#    The 'expression' attribute lets a teradataml ColumnExpression be treated as a SQLAlchemy expression.
#    Thus,
#        'admissions_train.admitted.expression' gives an expression that can be used with SQLAlchemy ClauseElements.

## Using SQLAlchemy 'case' to filter rows in teradataml

In [5]:
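# Using a SQLAlchemy case expression.
# Form a CASE object that returns 1 where the masters value is "yes", and 0 otherwise.
# Note: the list form below is the pre-1.4 SQLAlchemy style; SQLAlchemy 1.4 and later
# expect the WHEN tuples as positional arguments, e.g. case((condition, 1), else_=0).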
from sqlalchemy import case

caseobj = case(
    [
        (admissions_train.masters.expression == "yes", 1)
    ],
    else_=0
)
type(caseobj)

sqlalchemy.sql.elements.Case

In [6]:
# Keep the rows where admitted equals the value of the CASE object created above, that is,
# rows where masters is 'yes' and admitted is 1, or masters is not 'yes' and admitted is 0.
df = admissions_train[admissions_train.admitted == caseobj]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where admitted = CASE WHEN (masters = 'yes') THEN 1 ELSE 0 END


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
31     yes  3.50  Advanced    Beginner         1
23     yes  3.59  Advanced      Novice         1
7      yes  2.33    Novice      Novice         1
4      yes  3.50  Beginner      Novice         1
26     yes  3.57  Advanced    Advanced         1
5       no  3.44    Novice      Novice         0
20     yes  3.90  Advanced    Advanced         1
18     yes  3.81  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1
6      yes  3.50  Beginner    Advanced         1
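
The output above includes id 5 (masters 'no', admitted 0) because the predicate compares `admitted` with the CASE value rather than requiring masters to be 'yes'. The following is a minimal sketch of that behaviour, assuming Python's built-in `sqlite3` as a stand-in for Vantage; the table `admissions_demo` and its four rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Vantage here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions_demo (id INTEGER, masters TEXT, admitted INTEGER)")
conn.executemany(
    "INSERT INTO admissions_demo VALUES (?, ?, ?)",
    [(1, "yes", 1), (2, "yes", 0), (3, "no", 1), (4, "no", 0)],
)

# Same predicate shape as the 'Equivalent SQL' printed by the notebook.
query = (
    "SELECT id FROM admissions_demo "
    "WHERE admitted = CASE WHEN (masters = 'yes') THEN 1 ELSE 0 END "
    "ORDER BY id"
)
matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)  # [1, 4]
conn.close()
```

Rows 1 and 4 match: in both, `admitted` agrees with what the CASE expression yields for that row's `masters` value.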





## Using SQLAlchemy 'between' to filter rows in teradataml

In [7]:
# Example of using between to filter the rows where gpa is between 3 and 4, inclusive.
from sqlalchemy import between
between_expr = between(admissions_train.gpa.expression, 3, 4)
type(between_expr)

sqlalchemy.sql.elements.BinaryExpression

In [8]:
# Use SQLAlchemy BinaryExpression for 'between' created above to filter rows.
df = admissions_train[between_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where gpa BETWEEN 3 AND 4


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
39     yes  3.75  Advanced    Beginner         0
15     yes  4.00  Advanced    Advanced         1
30     yes  3.79  Advanced      Novice         0
26     yes  3.57  Advanced    Advanced         1
3       no  3.70    Novice    Beginner         1
17      no  3.83  Advanced    Advanced         1
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
5       no  3.44    Novice      Novice         0
36      no  3.00  Advanced      Novice         0
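
The rows with gpa 3.00 and 4.00 in the output above show that BETWEEN includes both bounds. As a small sketch of this, assuming Python's built-in `sqlite3` in place of Vantage and a hypothetical `gpa_demo` table, BETWEEN can be compared with the explicit `>=` / `<=` form:

```python
import sqlite3

# In-memory SQLite stands in for Vantage; the gpa values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gpa_demo (gpa REAL)")
conn.executemany(
    "INSERT INTO gpa_demo VALUES (?)",
    [(2.99,), (3.0,), (3.5,), (4.0,), (4.01,)],
)

# BETWEEN keeps both boundary values.
between_rows = [
    row[0]
    for row in conn.execute("SELECT gpa FROM gpa_demo WHERE gpa BETWEEN 3 AND 4 ORDER BY gpa")
]
# The explicit range comparison selects the same rows.
range_rows = [
    row[0]
    for row in conn.execute("SELECT gpa FROM gpa_demo WHERE gpa >= 3 AND gpa <= 4 ORDER BY gpa")
]
print(between_rows)  # [3.0, 3.5, 4.0]
conn.close()
```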





## Using SQLAlchemy 'IN' and 'NOT IN' to filter rows in teradataml

In [9]:
# Filter rows to keep those where stats is either 'Novice' or 'Beginner'.
in_expr = admissions_train.stats.expression.in_(["Novice", "Beginner"])
type(in_expr)

sqlalchemy.sql.elements.BinaryExpression

In [10]:
# Use SQLAlchemy BinaryExpression for 'IN' created above to filter rows.
df = admissions_train[in_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats IN ('Novice', 'Beginner')


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
21      no  3.87    Novice    Beginner         1
5       no  3.44    Novice      Novice         0
3       no  3.70    Novice    Beginner         1
1      yes  3.95  Beginner    Beginner         0
7      yes  2.33    Novice      Novice         1
22     yes  3.46    Novice    Beginner         0
40     yes  3.95    Novice    Beginner         0
33      no  3.55    Novice      Novice         1
6      yes  3.50  Beginner    Advanced         1
29     yes  4.00    Novice    Beginner         0





In [11]:
# Filter rows to keep those where stats is neither 'Novice' nor 'Beginner'.
notin_expr = admissions_train.stats.expression.notin_(["Novice", "Beginner"])
type(notin_expr)

sqlalchemy.sql.elements.BinaryExpression

In [12]:
# Use SQLAlchemy BinaryExpression for 'NOT IN' created above to filter rows.
df = admissions_train[notin_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats NOT IN ('Novice', 'Beginner')


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
19     yes  1.98  Advanced    Advanced         0
15     yes  4.00  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1
26     yes  3.57  Advanced    Advanced         1
17      no  3.83  Advanced    Advanced         1
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
24      no  1.87  Advanced      Novice         1
36      no  3.00  Advanced      Novice         0
27     yes  3.96  Advanced    Advanced         0
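
In `admissions_train` every row has a stats value, so the IN and NOT IN filters together cover the whole table. Under standard SQL semantics that is not guaranteed when the column contains NULLs: a NULL value satisfies neither predicate. The sketch below illustrates this, assuming Python's built-in `sqlite3` as a stand-in for Vantage; the table `stats_demo`, its rows, and the helper `ids_where` are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Vantage; the rows (including one NULL) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats_demo (id INTEGER, stats TEXT)")
conn.executemany(
    "INSERT INTO stats_demo VALUES (?, ?)",
    [(1, "Novice"), (2, "Beginner"), (3, "Advanced"), (4, None)],
)

def ids_where(predicate):
    """Return the ids of the rows satisfying the given WHERE predicate (illustrative helper)."""
    sql = "SELECT id FROM stats_demo WHERE {} ORDER BY id".format(predicate)
    return [row[0] for row in conn.execute(sql)]

in_ids = ids_where("stats IN ('Novice', 'Beginner')")
not_in_ids = ids_where("stats NOT IN ('Novice', 'Beginner')")
print(in_ids)      # [1, 2]
print(not_in_ids)  # [3]
conn.close()
```

Row 4, whose stats is NULL, appears in neither result.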





## Filtering using LIKE 

In [13]:
# Retrieve rows where stats matches the pattern "%gin%", i.e. contains the substring "gin".
like_expr = admissions_train.stats.expression.like("%gin%")
type(like_expr)

sqlalchemy.sql.elements.BinaryExpression

In [14]:
# Use SQLAlchemy BinaryExpression for 'LIKE' created above to filter rows.
df = admissions_train[like_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats LIKE '%gin%'


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
8       no  3.60  Beginner    Advanced         1
6      yes  3.50  Beginner    Advanced         1
2      yes  3.76  Beginner    Beginner         0
1      yes  3.95  Beginner    Beginner         0
4      yes  3.50  Beginner      Novice         1





In [15]:
# Example of using NOT LIKE: retrieve rows where stats does not contain the string 'ce'.
nlike_expr = admissions_train.stats.expression.notlike("%ce%")
type(nlike_expr)

sqlalchemy.sql.elements.BinaryExpression

In [16]:
# Use SQLAlchemy BinaryExpression for 'NOT LIKE' created above to filter rows.
df = admissions_train[nlike_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats NOT LIKE '%ce%'


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
8       no  3.60  Beginner    Advanced         1
6      yes  3.50  Beginner    Advanced         1
2      yes  3.76  Beginner    Beginner         0
1      yes  3.95  Beginner    Beginner         0
4      yes  3.50  Beginner      Novice         1





In [17]:
# Example of using ILIKE, a case-insensitive LIKE.
ilike_expr = admissions_train.stats.expression.ilike("be%NN%r")
type(ilike_expr)

sqlalchemy.sql.elements.BinaryExpression

In [18]:
# Use SQLAlchemy BinaryExpression for 'Case Insensitive LIKE' created above to filter rows.
df = admissions_train[ilike_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where lower(stats) LIKE lower('be%NN%r')


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
8       no  3.60  Beginner    Advanced         1
6      yes  3.50  Beginner    Advanced         1
2      yes  3.76  Beginner    Beginner         0
1      yes  3.95  Beginner    Beginner         0
4      yes  3.50  Beginner      Novice         1





In [19]:
# Filter rows where stats starts with "Beg".
sw_expr = admissions_train.stats.expression.startswith("Beg")
type(sw_expr)

sqlalchemy.sql.elements.BinaryExpression

In [20]:
# Use SQLAlchemy BinaryExpression for 'Starts With' created above to filter rows.
df = admissions_train[sw_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats LIKE 'Beg' || '%'


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
8       no  3.60  Beginner    Advanced         1
6      yes  3.50  Beginner    Advanced         1
2      yes  3.76  Beginner    Beginner         0
1      yes  3.95  Beginner    Beginner         0
4      yes  3.50  Beginner      Novice         1
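
All four filters in this section (LIKE, NOT LIKE, ILIKE and startswith) return the same 'Beginner' rows. The pure-Python sketch below shows why, by translating each pattern into a regular expression and applying it to the three stats values that appear in the dataset. The helpers `like_to_regex` and `matches` are hypothetical and not part of teradataml or SQLAlchemy; the sketch assumes case-sensitive matching for plain LIKE and does not handle ESCAPE clauses.

```python
import re

def like_to_regex(pattern, case_insensitive=False):
    """Translate a SQL LIKE pattern into a compiled regex (illustrative, no ESCAPE support)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any sequence of characters, including none
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    flags = re.DOTALL | (re.IGNORECASE if case_insensitive else 0)
    return re.compile("".join(parts), flags)

# The three distinct values of the stats column seen in the notebook output.
stats_values = ["Advanced", "Novice", "Beginner"]

def matches(pattern, case_insensitive=False):
    """Return the stats values that match the whole LIKE pattern."""
    regex = like_to_regex(pattern, case_insensitive)
    return [value for value in stats_values if regex.fullmatch(value)]

like_gin = matches("%gin%")                                        # LIKE '%gin%'
not_like_ce = [v for v in stats_values if v not in matches("%ce%")]  # NOT LIKE '%ce%'
ilike_result = matches("be%NN%r", case_insensitive=True)           # ILIKE 'be%NN%r'
starts_with_beg = matches("Beg%")                                  # startswith("Beg")

print(like_gin, not_like_ce, ilike_result, starts_with_beg)
```

Both 'Advanced' and 'Novice' contain "ce", so NOT LIKE '%ce%' leaves only 'Beginner', the same value the other three patterns select.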





## Using IS NULL or IS NOT NULL

In [21]:
# Example of using IS NULL: select the rows where stats is NULL.
isnull_expr = admissions_train.stats.expression.is_(None)
type(isnull_expr)

sqlalchemy.sql.elements.BinaryExpression

In [22]:
# Use SQLAlchemy BinaryExpression for 'IS NULL' created above to filter rows.
df = admissions_train[isnull_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats IS NULL


 ************************* DataFrame ********************* 
Empty DataFrame
Columns: [masters, gpa, stats, programming, admitted]
Index: []





In [23]:
# Example of using IS NOT NULL: select the rows where stats is not NULL.
isnotnull_expr = admissions_train.stats.expression.isnot(None)
type(isnotnull_expr)

sqlalchemy.sql.elements.BinaryExpression

In [24]:
# Use SQLAlchemy BinaryExpression for 'IS NOT NULL' created above to filter rows.
df = admissions_train[isnotnull_expr]
print_dataframe(df)

Equivalent SQL: select * from "admissions_train" where stats IS NOT NULL


 ************************* DataFrame ********************* 
   masters   gpa     stats programming  admitted
id                                              
15     yes  4.00  Advanced    Advanced         1
7      yes  2.33    Novice      Novice         1
22     yes  3.46    Novice    Beginner         0
17      no  3.83  Advanced    Advanced         1
13      no  4.00  Advanced      Novice         1
38     yes  2.65  Advanced    Beginner         1
26     yes  3.57  Advanced    Advanced         1
5       no  3.44    Novice      Novice         0
34     yes  3.85  Advanced    Beginner         0
40     yes  3.95    Novice    Beginner         0
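
The IS NULL filter returned an empty DataFrame earlier because no row of `admissions_train` has a NULL in stats, and the IS NOT NULL filter correspondingly returns rows from the full table. To see both predicates select something, here is a minimal sketch assuming Python's built-in `sqlite3` in place of Vantage, with a hypothetical `null_demo` table that contains one NULL:

```python
import sqlite3

# In-memory SQLite stands in for Vantage; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE null_demo (id INTEGER, stats TEXT)")
conn.executemany(
    "INSERT INTO null_demo VALUES (?, ?)",
    [(1, "Novice"), (2, None), (3, "Advanced")],
)

null_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM null_demo WHERE stats IS NULL ORDER BY id")
]
not_null_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM null_demo WHERE stats IS NOT NULL ORDER BY id")
]
print(null_ids)      # [2]
print(not_null_ids)  # [1, 3]
conn.close()
```

Every row lands in exactly one of the two results, so the two filters split the table between them.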





In [25]:
# Run remove_context() to close the connection and garbage-collect internally generated objects.
remove_context()

True