<header style="padding:1px;background:#f9f9f9;border-top:3px solid #00b2b1"><img id="Teradata-logo" src="https://www.teradata.com/Teradata/Images/Rebrand/Teradata_logo-two_color.png" alt="Teradata" width="220" align="right" />

<b style = 'font-size:28px;font-family:Arial;color:#E37C4D'>Train Delay Prediction</b>
</header>

<p style = 'font-size:18px;font-family:Arial;color:#E37C4D'><b>Introduction:</b></p>
<p style = 'font-size:16px;font-family:Arial'>
Train delays affect both the operational efficiency of railway companies and the experience of passengers.
Analyzing the causes of delays provides information that can be used to improve train operations and reduce disruptions.
Predictive models can anticipate potential delays and enable proactive planning, so that resources are allocated where they are needed.</p>
<center><img src="images/introduction.png"/></center>

<p style = 'font-size:16px;font-family:Arial'>In this demo we use synthetic data describing train travel from one station to another. Events are recorded during each travel, and this notebook illustrates how to use Vantage to extract valuable insights from the resulting event table.</p>


<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Import python packages and connect to Vantage</b></h1>

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from teradataml import *

# Modify the following to match the specific client environment settings
display.max_rows = 5


<p style = 'font-size:18px;font-family:Arial;color:#E37C4D'> <b>Let's start by connecting to the Teradata system </b></p>
<p style = 'font-size:16px;font-family:Arial'>You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.</p>

In [None]:
%run -i ../startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)
eng.execute('''SET query_band='DEMO=Train_Delay.ipynb;' UPDATE FOR SESSION; ''')

<p style = 'font-size:16px;font-family:Arial'>Begin running steps with Shift + Enter keys. </p>

<p style = 'font-size:20px;font-family:Arial;color:#E37C4D'><b>Getting Data for This Demo</b></p>
<p style = 'font-size:16px;font-family:Arial'>We have provided data for this demo on cloud storage. You can either run the demo using foreign tables, which access the data without using any storage on your environment, or download the data to local storage, which may yield faster execution but requires available storage space. The following cell contains two statements, one of which is commented out. To switch modes, change which statement is commented out.</p>

In [None]:
#%run -i ../run_procedure.py "call get_data('DEMO_TrainDelay_cloud');"        # Takes 30 seconds
%run -i ../run_procedure.py "call get_data('DEMO_TrainDelay_local');"        # Takes 1 minute

<p style = 'font-size:16px;font-family:Arial'>Next is an optional step – if you want to see the status of databases/tables created and space used.</p>

In [None]:
%run -i ../run_procedure.py "call space_report();"        # Takes 10 seconds

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Data Exploration</b></h1>
<p style = 'font-size:16px;font-family:Arial'>Create a "Virtual DataFrame" that points to the data set in Vantage. Then check the shape of the DataFrame as well as the data types of all its columns.</p>


In [None]:
mydata = DataFrame(in_schema("DEMO_TrainDelay" ,"Train_Dataset"))
mydata

<p style = 'font-size:16px;font-family:Arial'>
Two major functions from teradataml are highlighted here:
<ul style = 'font-size:16px;font-family:Arial'>    
<li> DataFrame, the key object, which points to the Teradata tables of interest without exporting data to the client machine</li>
<li> in_schema, which specifies the schema/database that holds the tables</li>
</ul>  
<p style = 'font-size:16px;font-family:Arial'><i>mydata</i> is a teradataml DataFrame object. However, it shares several attributes and methods with NumPy arrays and pandas DataFrames, such as:
<ul style = 'font-size:16px;font-family:Arial'> 
    <li> shape to get the number of rows and columns</li>
    <li> dtypes to get the data type of each column</li>
    <li> groupby, select, agg, ... to compute and manipulate aggregations</li>
    <li> iloc, loc to filter rows and columns</li>
    <li> columns to get the column names</li>
    </ul>
    </p>

In [None]:
%%time
type(mydata)

In [None]:
mydata.shape

<p style = 'font-size:16px;font-family:Arial'>The <i>mydata</i> DataFrame contains 54616 rows and 3 columns.</p>

In [None]:
mydata.dtypes

<p style = 'font-size:16px;font-family:Arial'>The DataFrame has three columns:</p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li> TravelID as int </li>
    <li> events as string</li>
    <li> datetime as datetime</li>
</ul>

<p style = 'font-size:16px;font-family:Arial'>As an example, we can list all the distinct events contained in the dataset:</p>

In [None]:
mydata.groupby(['Events']).agg('count')

<p style = 'font-size:16px;font-family:Arial'>The aggregated data can be plotted by calling the to_pandas() method to collect the result on the client and then using matplotlib.</p>

In [None]:
df4plot = mydata.groupby(['Events']).agg('count').to_pandas()
df4plot

<p style = 'font-size:16px;font-family:Arial'>Here we see that there are as many departures as arrivals, which is expected. The most frequent events are <i>Door light failure</i> and <i>Normal stop</i>.</p>

In [None]:
plt.rcParams['figure.figsize'] = [15, 10]
plt.rc('ytick', labelsize=20)
df4plot.plot.barh(x='Events',y='count_TravelID')

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Advanced Data Exploration : Path Analysis </b></h1>
<p style = 'font-size:16px;font-family:Arial'>The <i>NPATH</i> function of Teradata Advanced SQL directly queries specific paths contained in an event table.<br>
In this example, we want to build the path of events that each travel passes through, meaning:</p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li> for each travel</li>
    <li> get the sequence of events</li>
</ul>
<p style = 'font-size:16px;font-family:Arial'>A travel can be modelled as a sequence of events starting with the <i>departure</i> event and ending with the <i>arrival</i> event.</p>
<center><img src="images/npath_sankey.png"/></center>
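<p style = 'font-size:16px;font-family:Arial'>To make the pattern concrete, here is a minimal client-side sketch in plain Python of what the path query does: partition by travel, order by time, and keep the travels whose first event is a departure and whose last event is an arrival. The rows and the <i>build_paths</i> helper are hypothetical and only illustrate the logic; the in-database NPATH function is what the notebook actually uses.</p>

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical event rows (TravelID, events, datetime); not taken from the demo data.
rows = [
    (1, "departure", datetime(2023, 1, 1, 8, 0, 0)),
    (1, "Normal stop", datetime(2023, 1, 1, 8, 10, 0)),
    (1, "arrival", datetime(2023, 1, 1, 8, 30, 0)),
    (2, "departure", datetime(2023, 1, 1, 9, 0, 0)),
    (2, "Door failure", datetime(2023, 1, 1, 9, 5, 0)),
    (2, "Normal stop", datetime(2023, 1, 1, 9, 12, 0)),
    (2, "arrival", datetime(2023, 1, 1, 9, 40, 0)),
    (3, "departure", datetime(2023, 1, 1, 10, 0, 0)),  # no arrival: no match
]


def build_paths(rows):
    """Return one record per travel whose ordered events match Dep.Other*.Arr."""
    paths = []
    # Equivalent of PARTITION BY TravelID ORDER BY datetime
    ordered = sorted(rows, key=itemgetter(0, 2))
    for travel_id, group in groupby(ordered, key=itemgetter(0)):
        events = list(group)
        names = [event[1] for event in events]
        # Pattern ^Dep.Other*.Arr$: starts with a departure, ends with an arrival,
        # with any number of other events in between.
        if len(names) >= 2 and names[0] == "departure" and names[-1] == "arrival":
            paths.append({
                "travelid": travel_id,
                "mypath": names,
                "departure_time": events[0][2],
                "arrival_time": events[-1][2],
            })
    return paths


paths = build_paths(rows)
for path in paths:
    print(path["travelid"], path["mypath"])
```

<p style = 'font-size:16px;font-family:Arial'>Travel 3 is dropped because it never reaches an arrival, which is the same filtering effect the anchored pattern has on incomplete travels.</p>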

In [None]:
myPathAnalysis = NPath(data1     = mydata,
               data1_partition_column = 'TravelID',
               data1_order_column     = 'Datetime',
               result                 = ['FIRST (TravelID OF any (Dep,Arr)) AS TravelID',
                                         'ACCUMULATE (cast(events as VARCHAR(50) CHARACTER SET UNICODE NOT CASESPECIFIC)OF any(Other,Dep,Arr)) AS MyPath',
                                         'first(Datetime of Dep) AS departure_time',
                                         'last(Datetime of Arr) As arrival_time'
                                        ],
               mode                   = 'nonoverlapping',
               pattern                = '^Dep.Other*.Arr$',
               symbols                = ["events='departure' AS Dep",
                                         "events='arrival' AS Arr",
                                         "true as Other"
                                        ],
        )

<p style = 'font-size:16px;font-family:Arial'>The results of NPath can be customized. Here we return the path (the <i>mypath</i> column) as well as the departure and arrival times of each travel.</p>

In [None]:
myPathAnalysis.result

<p style = 'font-size:16px;font-family:Arial'>To visualize the distribution of the different paths of events, we typically use a Sankey diagram built from the path counts aggregated over the NPATH output.</p>
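<p style = 'font-size:16px;font-family:Arial'>A Sankey diagram is drawn from link counts, that is, how many travels move from one event to the next at each step. The sketch below shows one way such counts could be derived from a list of paths. The data and the <i>sankey_links</i> helper are hypothetical, and this is not necessarily how the plotting package computes its links.</p>

```python
from collections import Counter

# Hypothetical paths, one per travel.
travel_paths = [
    ["departure", "Normal stop", "arrival"],
    ["departure", "Normal stop", "arrival"],
    ["departure", "Door failure", "Normal stop", "arrival"],
]


def sankey_links(paths):
    """Count the travels going through each (step, source, target) transition."""
    links = Counter()
    for path in paths:
        # zip(path, path[1:]) yields consecutive event pairs
        for step, (source, target) in enumerate(zip(path, path[1:])):
            links[(step, source, target)] += 1
    return links


links = sankey_links(travel_paths)
for (step, source, target), count in sorted(links.items()):
    print(step, source, "->", target, count)
```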

In [None]:
from tdnpathviz.visualizations import plot_first_main_paths

In [None]:
%%time
plot_first_main_paths(myPathAnalysis.result,path_column='mypath',id_column='travelid')

<p style = 'font-size:16px;font-family:Arial'>
To check the details of any path or node, move the mouse pointer over it. The number on a path is the count of travelids that follow that path, and source and target give the incoming and outgoing events.<br>
When the pointer is over a node, for example the long purple <i>arrival</i> node at the top right, it shows incoming flow count: 4 and outgoing flow count: 0. This means that 4 different events lead to this node; similarly, the outgoing flow count gives the number of events that follow it.<br>
<br>
For the sake of clarity, it is important to focus on the paths that matter most from a business viewpoint. Here we decided to look at the most frequent ones, i.e. those with a frequency of at least 20.</p>

In [None]:
nPathdf_group=myPathAnalysis.result.groupby("mypath")\
                .count()\
                .sort('count_travelid',ascending=False)
nPathdf_group

In [None]:
count_travel=nPathdf_group.count_travelid
nPathdf_group_plot=nPathdf_group[count_travel >= 20]

In [None]:
%%time
plot_first_main_paths(nPathdf_group_plot,path_column='mypath',id_column='count_travelid')

<p style = 'font-size:16px;font-family:Arial'>Visualizing the paths in an event table is critical to designing the best modeling strategy. For instance, the business may decide to ignore some events because their meaning is in doubt, after rapidly assessing their importance in the entire dataset.</p>

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Data Preparation using the Massive Parallel Processing of Teradata</b></h1>
<p style = 'font-size:16px;font-family:Arial'>
In this example, we want to predict the delay induced by each event, assuming that the delays add up independently of each other. For this purpose, we will use a Machine Learning algorithm to predict the travel duration from the frequency of each event.
<center><img src="images/data_science_model.png"/></center>
<p style = 'font-size:16px;font-family:Arial'>
It is good practice to implement the data preparation as a table or a view. This ensures that the data preparation leverages the Massive Parallel Processing of Teradata. Moreover, the data preparation can be shared across the enterprise, which supports the operationalization of the solution.<br>In this example, we decide to use a view, named <i>usecase_dataset</i>. Doing so always provides an up-to-date dataset containing the latest data. This view can later be used to historize as many datasets as needed for training and testing.<br>
To do so, we can push a SQL query that builds this view in a data lab space in Teradata. Note that this view relies on the NPATH Teradata function and on timestamp manipulation to create the target feature, which is the travel duration in seconds (<i>travel_duration_sec</i>).</p>
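<p style = 'font-size:16px;font-family:Arial'>The target is built by decomposing the interval between departure and arrival into hours, minutes and seconds and recombining them as a single number of seconds. The plain-Python sketch below reproduces that arithmetic on a hypothetical pair of timestamps. The helper name is an assumption, and the decomposition is only meaningful for durations shorter than the interval's hour range.</p>

```python
from datetime import datetime


def travel_duration_sec(departure_time, arrival_time):
    """Duration in seconds, recombined as hours*3600 + minutes*60 + seconds."""
    total = (arrival_time - departure_time).total_seconds()
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Same recombination as EXTRACT(HOUR)*3600 + EXTRACT(MINUTE)*60 + EXTRACT(SECOND)
    return hours * 3600 + minutes * 60 + seconds


# Hypothetical travel lasting 1 hour, 32 minutes and 15 seconds
dep = datetime(2023, 1, 1, 8, 0, 0)
arr = datetime(2023, 1, 1, 9, 32, 15)
print(travel_duration_sec(dep, arr))
```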

In [None]:
myquery = """REPLACE VIEW demo_user.usecase_dataset (TravelID,travel_duration_sec, travel)
 AS
 SELECT TravelID,travel_duration_sec, travel  FROM (
  SELECT 
        TravelID AS TravelID,
        departure_time AS departure_time,
        arrival_time AS arrival_time,
        (arrival_time - departure_time) HOUR TO SECOND(4) as travel_duration,
        INTERVAL(PERIOD(departure_time,arrival_time)) MINUTE(3) as travel_duration_min,
        EXTRACT(HOUR FROM travel_duration)*3600 + EXTRACT(MINUTE FROM travel_duration)*60 + EXTRACT(SECOND FROM travel_duration) as travel_duration_sec,
        travel as travel
  FROM NPATH (
  ON (
        SELECT TravelID, events, datetime
        FROM DEMO_TrainDelay.train_dataset
     ) 
    PARTITION BY TravelID ORDER BY datetime
    USING MODE (NONOVERLAPPING)
    Pattern ('^Dep.Other*.Arr$')
    Symbols (
        events='departure' AS Dep,
        events='arrival' AS Arr,
        events not in ('departure','arrival') as Other
            )
    Result (accumulate(cast(events as VARCHAR(50) CHARACTER SET UNICODE NOT CASESPECIFIC) OF ANY(Other)) AS travel,
            first(datetime of Dep) AS departure_time,
            last(datetime of Arr) As arrival_time,
            first(TravelID of ANY(Dep,Arr)) as TravelID)
) as dt
) A;
"""

In [None]:
eng.execute(myquery)

In [None]:
df_mydata = DataFrame(in_schema("demo_user","usecase_dataset"))
df_mydata

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Model development : prepare the data for the Machine Learning Algorithm</b></h1>
<p style = 'font-size:16px;font-family:Arial'>
The dataset built in Teradata contains all the information needed to address the business question. However, the Data Scientist defines how to expose these data to a machine learning algorithm to get the insights they are looking for.<br>
In this example, the strategy proposed by the data scientist consists of splitting the paths and counting the frequency of each event in them.
<center><img src="images/model_strategy.png"/></center>

<p style = 'font-size:16px;font-family:Arial'>We use the <i>NGramSplitter</i> function to process the paths of each travel. The function will split the corpus of texts into "terms" (grams) of selected size.
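<p style = 'font-size:16px;font-family:Arial'>With a gram size of 1, splitting a path amounts to counting how often each event occurs in it. The sketch below illustrates this on the client, assuming the path is a bracketed, comma-separated string. The input string and the <i>unigram_frequencies</i> helper are hypothetical, and the exact normalization applied by the in-database function may differ.</p>

```python
from collections import Counter


def unigram_frequencies(travel):
    """Split an accumulated path string into events and count each of them."""
    cleaned = travel.strip("[]")  # drop the surrounding brackets, if any
    events = [part.strip().lower() for part in cleaned.split(",") if part.strip()]
    return Counter(events)


# Hypothetical path of one travel
freq = unigram_frequencies("[Normal stop, Door failure, Normal stop]")
print(freq)
```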

In [None]:
ngrams = NGramSplitter(data=df_mydata,
                          text_column='travel',
                          delimiter = ",",
                          grams = "1",
                          overlapping=False,
                          to_lower_case=True,
                          total_gram_count=True,
                          punctuation = "[\\]\\\\[\\`]"
              )

In [None]:
ngrams.result

<p style = 'font-size:16px;font-family:Arial'>The NGramSplitter function adds new columns (and rows). We will use two of them:
<ul style = 'font-size:16px;font-family:Arial'>
    <li> ngram : the event found in the travel</li>
    <li> frequency : the frequency of this event in the path</li>
 </ul>   
<p style = 'font-size:16px;font-family:Arial'>We need the list of distinct ngrams, together with their total frequency: 

In [None]:
keys = (ngrams.result).select(['ngram','frequency']).groupby(['ngram']).sum().to_pandas()
keys

<p style = 'font-size:16px;font-family:Arial'>We can visualize again the distribution of events in the dataset:

In [None]:
import matplotlib.pyplot as plt
keys.sort_values('sum_frequency',ascending=True).plot.barh(x='ngram',figsize=(10,5),fontsize=20,legend=False)
plt.ylabel('events',fontsize=20)
plt.xlabel('frequency',fontsize=20)

<p style = 'font-size:16px;font-family:Arial'>In order to make the dataset ready for the Machine Learning algorithm, we need to pivot the data and fill missing values with 0.<br>
For this purpose, we use two functions:
<ul style = 'font-size:16px;font-family:Arial'>
  <li>pivot, to pivot the data and generate as many columns as there are event types. When an event does not occur during a travel, pivot sets its frequency to NULL or NaN</li>
    <li>assign, used here to fill the missing values using the <i>isna</i> function</li>
</ul>
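<p style = 'font-size:16px;font-family:Arial'>The intended result of these two steps is one row per travel and one column per event type, with 0 where an event never occurred. The sketch below shows that target shape in plain Python on hypothetical long-format rows. The <i>pivot_fill_zero</i> helper is an assumption for illustration and does not use the teradataml API.</p>

```python
from collections import defaultdict

# Hypothetical long-format rows: (travelid, ngram, frequency)
long_rows = [
    (1, "normal stop", 2),
    (1, "door failure", 1),
    (2, "normal stop", 1),
]


def pivot_fill_zero(rows):
    """Return {travelid: {event: frequency}} with 0 for events that did not occur."""
    event_types = sorted({ngram for _, ngram, _ in rows})
    # Every travel starts with all event types set to 0
    wide = defaultdict(lambda: dict.fromkeys(event_types, 0))
    for travel_id, ngram, frequency in rows:
        wide[travel_id][ngram] += frequency
    return dict(wide), event_types


wide, event_types = pivot_fill_zero(long_rows)
for travel_id, frequencies in wide.items():
    print(travel_id, frequencies)
```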

In [None]:
df_ngram = ngrams.result

In [None]:
df_ngram

In [None]:
df_ngram.shape

<p style = 'font-size:16px;font-family:Arial'><i>* below command is pivoting the data and takes approx 1min 30sec to execute </i>

In [None]:
%%time
pivot = df_ngram.pivot(columns=df_ngram.ngram, aggfuncs=df_ngram.frequency.sum())

In [None]:
dataset = pivot.assign(drop_columns                 = True,
           travelid                              = pivot.TravelID,
           travel                                = pivot.travel,
           travel_duration_sec                   = pivot.travel_duration_sec, 
           frequency_abnormal_weather_condition  = pivot['sum_frequency_abnormalweathercondition']  if not pivot['sum_frequency_abnormalweathercondition'].isna() else (1.-pivot['sum_frequency_abnormalweathercondition'].isna()),
           frequency_accident_involving_person   = pivot['sum_frequency_accidentinvolvingperson']  if not pivot['sum_frequency_accidentinvolvingperson'].isna() else (1.-pivot['sum_frequency_accidentinvolvingperson'].isna()),
           frequency_body_on_track               = pivot['sum_frequency_bodyontrack']  if not pivot['sum_frequency_bodyontrack'].isna() else (1.-pivot['sum_frequency_bodyontrack'].isna()),          
           frequency_crowded_stop                = pivot['sum_frequency_crowdedstop']  if not pivot['sum_frequency_crowdedstop'].isna() else (1.-pivot['sum_frequency_crowdedstop'].isna()),
           frequency_door_failure                = pivot['sum_frequency_doorfailure']  if not pivot['sum_frequency_doorfailure'].isna() else (1.-pivot['sum_frequency_doorfailure'].isna()),          
           frequency_door_light_failure          = pivot['sum_frequency_doorlightfailure']  if not pivot['sum_frequency_doorlightfailure'].isna() else (1.-pivot['sum_frequency_doorlightfailure'].isna()),
           frequency_electrical_failure          = pivot['sum_frequency_electricalfailure']  if not pivot['sum_frequency_electricalfailure'].isna() else (1.-pivot['sum_frequency_electricalfailure'].isna()),          
           frequency_engine_failure              = pivot['sum_frequency_enginefailure']  if not pivot['sum_frequency_enginefailure'].isna() else (1.-pivot['sum_frequency_enginefailure'].isna()),
           frequency_normal_stop                 = pivot['sum_frequency_normalstop']  if not pivot['sum_frequency_normalstop'].isna() else (1.-pivot['sum_frequency_normalstop'].isna()),          
           frequency_road_work                   = pivot['sum_frequency_roadwork'] if not pivot['sum_frequency_roadwork'].isna() else (1.-pivot['sum_frequency_roadwork'].isna()),
           frequency_stop_sign_failure           = pivot['sum_frequency_stopsignfailure'] if not pivot['sum_frequency_stopsignfailure'].isna() else (1.-pivot['sum_frequency_stopsignfailure'].isna()),          
           frequency_unexpected_stop             = pivot['sum_frequency_unexpectedstop'] if not pivot['sum_frequency_unexpectedstop'].isna() else (1.-pivot['sum_frequency_unexpectedstop'].isna())
          )

<p style = 'font-size:16px;font-family:Arial'>Here we decide to create a table from the dataset in order to test different machine learning algorithms.</p>

In [None]:
copy_to_sql(dataset,table_name='my_dataset',schema_name= 'demo_user',if_exists='replace')

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Model development : apply Machine Learning Algorithm</b></h1>
<p style = 'font-size:16px;font-family:Arial'>
In our case, using a Generalized Linear Model answers the following business questions:
<ul style = 'font-size:16px;font-family:Arial'>   
    <li>What is the travel duration when no event occurs (even if no such travel exists)? => the answer is the intercept</li>
    <li>What is the delay induced by each event type (under the assumption that there is no interaction between events)? => the answers are the coefficients of the model</li>
    <li>Can I simulate a new scenario? => this is addressed by scoring new data, which can be done with any trained Machine Learning model</li>
 </ul>   
<center><img src="images/GLM.png"/></center>
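<p style = 'font-size:16px;font-family:Arial'>With an identity link, the prediction is simply the intercept plus the sum of each coefficient multiplied by the frequency of its event. The sketch below simulates a scenario this way. The intercept, coefficients and scenario are made-up numbers for illustration only and are not results of the model trained in this notebook.</p>

```python
def predict_duration(intercept, coefficients, frequencies):
    """Linear prediction: intercept + sum(coefficient * event frequency)."""
    return intercept + sum(
        coefficient * frequencies.get(name, 0)
        for name, coefficient in coefficients.items()
    )


# Hypothetical model: values chosen for illustration, not fitted on the demo data
intercept = 1800.0  # travel duration in seconds when no event occurs
coefficients = {
    "frequency_normal_stop": 60.0,    # seconds added per normal stop
    "frequency_door_failure": 300.0,  # seconds added per door failure
}

# Hypothetical scenario: three normal stops and one door failure
scenario = {"frequency_normal_stop": 3, "frequency_door_failure": 1}
print(predict_duration(intercept, coefficients, scenario))
```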


In [None]:
dataset_num = DataFrame(in_schema('demo_user','my_dataset'))

In [None]:
dataset_num

In [None]:
dataset_num.loc[:,['travelid','travel_duration_sec']].describe()

<p style = 'font-size:16px;font-family:Arial'>Let's make a train/test split using the <i>travelid</i>:
    <ul style = 'font-size:16px;font-family:Arial'>
    <li>training set : <i>travelid &lt; 5479</i> with 75% of the data (5028 rows)</li>   
    <li>testing set  : <i>travelid &gt; 5478</i> with 25% of the data (1676 rows)</li> 
</ul>
<p style = 'font-size:16px;font-family:Arial'>We assume that all the events are present in both datasets, although this should be verified.</p>
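<p style = 'font-size:16px;font-family:Arial'>One way to verify that assumption is to look for event columns that are zero in every row of a split. The sketch below does this in plain Python on hypothetical rows. The helper names and the data are assumptions for illustration, and the same check could be expressed in-database with an aggregation.</p>

```python
def split_by_id(rows, first_test_id):
    """Rows with travelid below first_test_id go to training, the rest to testing."""
    train = [row for row in rows if row["travelid"] < first_test_id]
    test = [row for row in rows if row["travelid"] >= first_test_id]
    return train, test


def events_missing_from(rows, feature_names):
    """Return the event columns that are zero in every row of the split."""
    return [name for name in feature_names if all(row[name] == 0 for row in rows)]


# Hypothetical dataset with two event columns
features = ["frequency_normal_stop", "frequency_door_failure"]
rows = [
    {"travelid": 1, "frequency_normal_stop": 2, "frequency_door_failure": 0},
    {"travelid": 2, "frequency_normal_stop": 1, "frequency_door_failure": 1},
    {"travelid": 3, "frequency_normal_stop": 1, "frequency_door_failure": 0},
    {"travelid": 4, "frequency_normal_stop": 3, "frequency_door_failure": 0},
]

train, test = split_by_id(rows, first_test_id=3)
print("missing in training:", events_missing_from(train, features))
print("missing in testing:", events_missing_from(test, features))
```

<p style = 'font-size:16px;font-family:Arial'>In this made-up example the door failure event never occurs in the testing rows, which is the kind of gap the check is meant to reveal.</p>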

In [None]:
dataset_training = dataset_num.loc[dataset_num.travelid < 5479,:]
dataset_testing  = dataset_num.loc[dataset_num.travelid > 5478,:]

In [None]:
dataset_training.shape

In [None]:
dataset_testing.shape

<p style = 'font-size:16px;font-family:Arial'>We want to predict the travel_duration_sec using the frequencies of all events: we define the formula accordingly.
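<p style = 'font-size:16px;font-family:Arial'>As an illustration, the formula string can also be assembled by selecting the predictor columns by name. The column list and the <i>build_formula</i> helper below are hypothetical; the sketch only shows the string construction.</p>

```python
def build_formula(columns, target, prefix="frequency_"):
    """Build 'target ~ x1 + x2 + ...' from the columns starting with prefix."""
    predictors = [column for column in columns if column.startswith(prefix)]
    return f"{target} ~ " + " + ".join(predictors)


# Hypothetical column list
columns = [
    "travelid",
    "travel",
    "travel_duration_sec",
    "frequency_normal_stop",
    "frequency_door_failure",
]
formula_example = build_formula(columns, "travel_duration_sec")
print(formula_example)
```

<p style = 'font-size:16px;font-family:Arial'>Selecting by prefix does not depend on the column order, whereas a positional slice that ends at -1 leaves out the final column.</p>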

In [None]:
formula = 'travel_duration_sec ~ '+' + '.join(dataset_num.columns[3:-1])
formula

In [None]:
from teradataml import GLM, TDGLMPredict
glm_out = GLM(     formula      = formula,
                   linkfunction = 'IDENTITY',
                   family       = "GAUSSIAN",
                   data         = dataset_training,
                   threshold    = 0.001,
                   iter_max=300,
                   tolerance=0.001,
                   momentum=0.1,
                   nesterov=True,
                   learning_rate='CONSTANT'
                   )

In [None]:
glm_out.result

In [None]:
model_coefficients = glm_out.result.to_pandas().reset_index()
feat_imp = model_coefficients[model_coefficients['attribute'] > 0].sort_values(by = 'estimate', ascending = False)

# Specify figure size
fig, ax = plt.subplots(figsize=(10, 8))

# Use ax.barh() for horizontal bar chart
ax.barh(feat_imp['predictor'], feat_imp['estimate'], edgecolor='red')

# Add text labels on right of the bars
for x, y in zip(feat_imp['estimate'], feat_imp['predictor']):
    ax.text(x, y, str(round(x, 2)), ha='left', va='center')

# Set x-axis label
ax.set_xlabel('Estimate')

plt.title('Feature importance')

plt.show()

<br>
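<p style = 'font-size:16px;font-family:Arial'>The figure above displays the feature importance, expressed here as the estimated coefficient of each event. Under the model's assumptions, each coefficient estimates the change in the target variable, travel_duration_sec, associated with one additional occurrence of the corresponding event.</p>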

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Model Performances</b></h1>
<p style = 'font-size:16px;font-family:Arial'>The model accuracy is evaluated on the testing dataset (dataset_testing) using the TDGLMPredict function.</p>

In [None]:
predictions = TDGLMPredict(object=glm_out.result,
                                        newdata=dataset_testing,
                                        accumulate="travel_duration_sec",
                                        id_column="travelid")

In [None]:
predictions.result

<p style = 'font-size:16px;font-family:Arial'>The TD_RegressionEvaluator function computes metrics to evaluate and compare multiple models and summarizes how close predictions are to their expected values.

In [None]:
from teradataml import RegressionEvaluator
RegressionEvaluator_out = RegressionEvaluator(data = predictions.result,
                                                      observation_column = "travel_duration_sec",
                                                      prediction_column = "prediction",
                                                      freedom_degrees = [1, 2],
                                                      metrics = ['RMSE','R2','FSTAT'])

In [None]:
RegressionEvaluator_out

<p style = 'font-size:16px;font-family:Arial'>The output of the regression evaluator contains the RMSE, R2 and FSTAT metrics requested in the <i>metrics</i> argument.<br>The regression evaluator is used to evaluate and compare models. </p>  

<p style = 'font-size:16px;font-family:Arial'><b>Root mean squared error (RMSE):</b> RMSE is one of the most common metrics for evaluating regression model performance. It measures how far the model’s predictions are from the actual observed values, in the units of the target (here, seconds). A high RMSE is “bad” and a low RMSE is “good”.</p>

<p style = 'font-size:16px;font-family:Arial'>The coefficient of determination, more commonly known as R², measures the proportion of the variance of the response that the model explains. For a least-squares linear fit with an intercept, evaluated on its training data, it equals the square of the correlation coefficient R between observed and predicted values and lies in the range 0.0–1.0. On held-out data it can be negative when the model predicts worse than the mean of the observations. Higher values of R² are better.</p>

<p style = 'font-size:16px;font-family:Arial'> F-statistics (FSTAT) conducts an F-test. An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis.
<ul style = 'font-size:16px;font-family:Arial'>
    <li>F_score = F_score value from the F-test.</li>
<li>F_Critcialvalue = F critical value from the F-test.</li>
<li>p_value = Probability value associated with the F_score value.</li>
<li>F_conclusion = F-test result, either 'reject null hypothesis' or 'fail to reject null hypothesis'. If F_score > F_Critcialvalue, the result is 'reject null hypothesis'; otherwise it is 'fail to reject null hypothesis'.</li>
</ul>
</p>
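<p style = 'font-size:16px;font-family:Arial'>For intuition, RMSE and R² can be computed by hand from observed and predicted values using their standard textbook definitions. The sketch below does so on made-up numbers. It assumes the evaluator follows these definitions and is not a substitute for the in-database function.</p>

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical travel durations in seconds
observed = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 123.0, 127.0]

print("RMSE:", round(rmse(observed, predicted), 4))
print("R2:", round(r_squared(observed, predicted), 4))
```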

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Conclusion</b></h1>
<p style = 'font-size:16px;font-family:Arial'>In this notebook we have seen end-to-end model creation using the Teradata Vantage In-Database functions. We built a basic model; you can experiment by adjusting the model parameters to observe their impact on predictions and evaluation metrics.</p>

<h1 style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Cleanup</b></h1>
<p style = 'font-size:16px;font-family:Arial;color:#E37C4D'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial;'>
Clean up work tables to prevent errors next time.

In [None]:
eng.execute('DROP TABLE my_dataset;')

In [None]:
eng.execute('DROP VIEW usecase_dataset;')

<p style = 'font-size:16px;font-family:Arial;color:#E37C4D'><b>Databases and Tables</b></p>
<p style = 'font-size:16px;font-family:Arial'>The following code will clean up tables and databases created above.</p>

In [None]:
%run -i ../run_procedure.py "call remove_data('DEMO_TrainDelay');" 
#Takes 10 seconds

In [None]:
remove_context()

<p style = 'font-size:20px;font-family:Arial;color:#E37C4D'><b>Reference Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'> 
       <li>Teradata Vantage™ - Analytics Database Analytic Functions - 17.20: <a href = 'https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-VantageTM-Analytics-Database-Analytic-Functions-17.20/Introduction-to-Analytics-Database-Analytic-Functions '>https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-VantageTM-Analytics-Database-Analytic-Functions-17.20/Introduction-to-Analytics-Database-Analytic-Functions </a></li>    
  <li>Teradata® Package for Python User Guide - 17.20: <a href = 'https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-Package-for-Python-User-Guide-17.20/Introduction-to-Teradata-Package-for-Python'>https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-Package-for-Python-User-Guide-17.20/Introduction-to-Teradata-Package-for-Python</a></li>
  <li>Teradata® Package for Python Function Reference - 17.20: <a href = 'https://docs.teradata.com/r/Enterprise/Teradata-Package-for-Python-Function-Reference-17.20/Teradata-Package-for-Python-Function-Reference'>https://docs.teradata.com/r/Enterprise/Teradata-Package-for-Python-Function-Reference-17.20/Teradata-Package-for-Python-Function-Reference</a></li>      
</ul>

<footer style="padding:10px;background:#f9f9f9;border-bottom:3px solid #394851">©2023 Teradata. All Rights Reserved</footer>