<h2>Basic Analytic Functions Operations</h2>
<p>
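This R Notebook walks through a basic workflow with the Teradata analytic functions library: connecting to Vantage, creating tables, training and scoring a Naïve Bayes model, and evaluating the predictions.</p>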

<i>NOTE: You must have a connection to Teradata Vantage that has the Teradata analytic functions installed.</i>
    


<h3> Get the list of installed packages </h3>

In [None]:
installed.packages()
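
The full listing from installed.packages() can be long. A minimal sketch follows, assuming the packages of interest are the ones loaded later in this notebook (tdplyr, dplyr, dbplyr, DBI and MASS). It reports only whether each of them is available. The helper name check_packages is hypothetical and not part of tdplyr.

```r
# Hypothetical helper: report which of the given packages are installed.
check_packages <- function(pkgs) {
  # requireNamespace() returns TRUE/FALSE without attaching the package
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

# Packages loaded later in this notebook
needed <- c("tdplyr", "dplyr", "dbplyr", "DBI", "MASS")
status <- check_packages(needed)
print(status)

# Names of any packages that still need to be installed
missing_pkgs <- names(status)[!status]
print(missing_pkgs)
```

If missing_pkgs is not empty, install those packages before continuing with the rest of the notebook.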

<h3> Show help for the Teradata tdplyr package </h3>

In [None]:
help(package=tdplyr)

<h3> Show help for specific functions in Teradata tdplyr package </h3>

In [None]:
help(td_create_context, package = tdplyr)

In [None]:
help(td_naivebayes_mle, package = tdplyr)

<h3> Include the tdplyr library </h3>

In [None]:
library(tdplyr)

<h3> Create a connection using the native driver</h3>

In [None]:
# Replace the placeholders with your cluster's user, password, and host
user <- "xxxxx"
passwd <- "xxxxx"
host <- "xxxxx"
con <- td_create_context(host = host, uid = user, pwd = passwd, dType = "native")
con
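
Typing credentials into a notebook cell makes them easy to leak when the notebook is shared. One possible alternative is to read them from environment variables. This is a sketch only: the helper name get_td_credentials and the variable names TD_HOST, TD_USER and TD_PASSWORD are assumptions made for this example, not a tdplyr convention.

```r
# Hypothetical helper: read connection settings from environment variables.
# The TD_* variable names are assumptions, not something tdplyr defines.
get_td_credentials <- function() {
  creds <- list(
    host = Sys.getenv("TD_HOST"),
    uid  = Sys.getenv("TD_USER"),
    pwd  = Sys.getenv("TD_PASSWORD")
  )
  # Sys.getenv() returns "" for variables that are not set
  unset <- names(creds)[!nzchar(unlist(creds))]
  if (length(unset) > 0) {
    stop("Missing connection settings: ", paste(unset, collapse = ", "))
  }
  creds
}

# Usage, once the variables are set (for example in ~/.Renviron):
# creds <- get_td_credentials()
# con <- td_create_context(host = creds$host, uid = creds$uid,
#                          pwd = creds$pwd, dType = "native")
```

The helper stops with an error that names the missing settings, so a connection is never attempted with empty values.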

<h3>Creating tables and data frames </h3>

<h4> Include dplyr and dbplyr libraries </h4>

In [None]:
library(dplyr)
library(dbplyr)

<h4>Create a table iris_flowers from R's built-in dataset iris</h4>

In [None]:
copy_to(con, iris, name="iris_flowers", overwrite=FALSE)

In [None]:
class(iris)

<h4>Create a tibble from a table</h4>

In [None]:
tddf_iris <- tbl(con, "iris_flowers")

In [None]:
tddf_iris

<h4> Create a data frame from a tibble </h4>

In [None]:
# n = 20 limits the local copy to at most 20 rows
df_iris <- as.data.frame(tddf_iris, n = 20)

In [None]:
tddf_iris

In [None]:
df_iris

<h3> Using Naïve Bayes Model </h3>

<h4> Include additional libraries DBI and MASS </h4>

In [None]:
library(DBI)
library(MASS)

<h4> Prepare the Pima training and test datasets from the "MASS" package </h4>

In [None]:
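# Copy the training data, add a row identifier, and lower-case the class label
PimaTr <- Pima.tr
PimaTr$rowID <- seq.int(nrow(Pima.tr))
PimaTr$type <- tolower(PimaTr$type)

# Apply the same preparation to the test data
PimaTe <- Pima.te
PimaTe$rowID <- seq.int(nrow(Pima.te))
PimaTe$type <- tolower(PimaTe$type)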
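
Before copying the data to Vantage, it can help to check locally what the preparation did. The sketch below repeats the training-set preparation so that it runs on its own, then inspects the result. The lower-casing matters because the prediction step later in this notebook passes the responses as "yes" and "no".

```r
library(MASS)

# Same preparation as above, repeated so this block runs on its own
PimaTr <- Pima.tr
PimaTr$rowID <- seq.int(nrow(PimaTr))
PimaTr$type <- tolower(PimaTr$type)

# Row count and the range of the generated identifier
print(nrow(PimaTr))
print(range(PimaTr$rowID))

# tolower() turns the factor into a lower-case character vector
print(class(Pima.tr$type))
print(class(PimaTr$type))

# Class distribution of the prepared training data
print(table(PimaTr$type))
```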

<h4> Create tables in Vantage to hold the data </h4>

In [None]:
copy_to(con, PimaTr, name="Pima_train", overwrite=FALSE)

copy_to(con, PimaTe, name="Pima_test", overwrite=FALSE)

<h4> Create R tables from the Vantage tables created in previous step </h4>

In [None]:
tddf_Pima.tr <- tbl(con, "Pima_train")

tddf_Pima.te <- tbl(con, "Pima_test")

<h4> Create the Naïve Bayes model from the training dataset using the td_naivebayes_mle() tdplyr analytic function </h4>

In [None]:
nbmodel <- td_naivebayes_mle(
  formula = (type ~ npreg + glu + bp + skin + bmi + ped + age),
  data = tddf_Pima.tr
)
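
To give a feel for what a Naïve Bayes classifier computes, here is a small local sketch of Gaussian Naïve Bayes in base R on the same Pima data. It is not the Vantage implementation and makes no claim about how td_naivebayes_mle() works internally. It assumes numeric predictors with normal class-conditional densities, and the function names fit_gnb and predict_gnb are hypothetical.

```r
library(MASS)

predictors <- c("npreg", "glu", "bp", "skin", "bmi", "ped", "age")

# Fit: for each class, store the prior and the per-predictor mean and sd
fit_gnb <- function(x, y) {
  lapply(split(x, y), function(rows) {
    list(
      prior = nrow(rows) / nrow(x),
      mean  = colMeans(rows),
      sd    = apply(rows, 2, sd)
    )
  })
}

# Predict: pick the class with the highest log-posterior score
predict_gnb <- function(model, x) {
  x <- as.matrix(x)
  scores <- sapply(model, function(m) {
    # t(x) has one row per predictor, so mean and sd recycle per predictor
    log_lik <- dnorm(t(x), mean = m$mean, sd = m$sd, log = TRUE)
    log(m$prior) + colSums(log_lik)
  })
  # Keep a matrix shape even when x has a single row
  scores <- matrix(scores, nrow = nrow(x))
  names(model)[max.col(scores, ties.method = "first")]
}

model <- fit_gnb(Pima.tr[, predictors], tolower(Pima.tr$type))
local_pred <- predict_gnb(model, Pima.te[, predictors])
print(head(local_pred))

# Share of test rows where the local prediction matches the observed type
print(mean(local_pred == tolower(Pima.te$type)))
```

The local result is only a reference point. The in-database function may treat the predictors differently, so its predictions need not match this sketch.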

<h4> Run the model on the test dataset using the td_naivebayes_predict_sqle() tdplyr analytic function </h4>

In [None]:
pred <- td_naivebayes_predict_sqle(
  formula = (type ~ npreg + glu + bp + skin + bmi + ped + age),
  modeldata = nbmodel,
  newdata = tddf_Pima.te,
  id.col = "rowID",
  responses = c("yes", "no")
)

<h3> Assess the model predictions with a confusion matrix </h3>

<h4> Store the observed response and the predicted values in a data frame </h4>

In [None]:
df <- inner_join(pred$result, tddf_Pima.te, by="rowID") %>% dplyr::select(prediction, response = type)

<h4> Create the "confusionMatrix_tbl" table in Vantage to hold the predicted and observed values used as input for the confusion matrix </h4>

In [None]:
copy_to(con, df, name="confusionMatrix_tbl")

<h4> Create an R table from the existing Vantage table with the tbl() function </h4>

In [None]:
tddf_confusionMatrix_tbl <- tbl(con, "confusionMatrix_tbl")

<h4> Invoke the td_confusion_matrix_mle() tdplyr analytic function to analyze the performance of the model </h4>

In [None]:
cmResult <- td_confusion_matrix_mle(
  data = tddf_confusionMatrix_tbl,
  reference = "response",
  prediction = "prediction"
)

The confusion matrix analysis creates three output tables in Vantage and a fourth table that declares whether the analysis has run successfully.

These tables are stored by the td_confusion_matrix_mle() function in a named list as tibble objects.
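
As a rough local illustration of what a confusion matrix summarizes, the sketch below cross-tabulates predicted and observed labels with base R. The label vectors are toy values made up for this example, and the code does not reproduce the layout or columns of the tables returned by td_confusion_matrix_mle().

```r
# Toy observed and predicted labels, made up for illustration only
response   <- c("yes", "no", "no", "yes", "no", "yes", "no", "no")
prediction <- c("yes", "no", "yes", "yes", "no", "no", "no", "no")

# Use a shared level set so the table is always square
lv <- c("no", "yes")
counts <- table(
  prediction = factor(prediction, levels = lv),
  reference  = factor(response, levels = lv)
)
print(counts)

# Accuracy: correctly classified cases over all cases
accuracy <- sum(diag(counts)) / sum(counts)
print(accuracy)

# Sensitivity for "yes": correctly predicted "yes" over all observed "yes"
sensitivity_yes <- counts["yes", "yes"] / sum(counts[, "yes"])
print(sensitivity_yes)
```

Rows hold the predicted class and columns the observed class, so the diagonal counts the correct predictions.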

<h3> Examine the results by invoking the output tibbles </h3>

In [None]:
print( cmResult$counttable )

print( cmResult$stattable )

print( cmResult$accuracytable )

print( cmResult$output )

<h3> Remove the tables created by this example and close the connection </h3>

In [None]:
dbRemoveTable(con,"iris_flowers")

In [None]:
dbRemoveTable(con,"Pima_test")

In [None]:
dbRemoveTable(con,"Pima_train")

In [None]:
dbRemoveTable(con,"confusionMatrix_tbl")

In [None]:
td_remove_context()

<span style="font-size:16px;">For more information on the Teradata analytic functions, refer to the [Teradata Documentation](https://docs.teradata.com/) and search for Teradata R Package.</span>

Copyright 2019-2020 Teradata. All rights reserved.