<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Load data to Vantage in R
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction:</b></p>
<p style = 'font-size:16px;font-family:Arial'>Welcome to this introductory guide. It walks you through the steps needed to load data into Teradata Vantage using R.</p>

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>1. Configuring the Environment</b>
<p style = 'font-size:16px;font-family:Arial'>Here, we import the required libraries, set environment variables and environment paths (if required).</p>

In [None]:
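suppressMessages({
    library(tdplyr, quietly = TRUE)   # Teradata Package for R
    library(dbplyr, quietly = TRUE)   # in_schema() and the dplyr database backend
    library(dplyr, quietly = TRUE)    # tbl(), copy_to()
    library(DBI, quietly = TRUE)      # dbExecute()
    library(readr, quietly = TRUE)    # read_csv()
})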

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>2. Connect to Vantage</b>
<p style = 'font-size:16px;font-family:Arial'>You will be prompted for the password. Enter your password, press Enter, then use the down arrow to move to the next cell.</p>

In [None]:
td_create_context(
    host = 'host.docker.internal',
    uid = "demo_user",
    pwd = getPass::getPass("Enter your password: "),
    dType = "NATIVE",
    logmech = "TD2"
)

In [None]:
eng <- td_get_context()$connection
eng

In [None]:
dbExecute(eng, "SET query_band='DEMO=PP_Data_Loading_R.ipynb;' UPDATE FOR SESSION;") 

In [None]:
# display_analytic_functions()

<hr style='height:2px;border:none;'>
<p style = 'font-size:20px;font-family:Arial'><b>3. Load data from a CSV file</b></p>

<p style = 'font-size:16px;font-family:Arial'>
    The data we will be loading for this example is in CSV format. The following is a sample of the header followed by the first 5 rows:
</p>

<p style='font-size:12px;font-family:Courier;'>
    InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country<br>
    536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-01-12 08:26:00,2.55,17850.0,United Kingdom<br>
    536365,71053,WHITE METAL LANTERN,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom<br>
    536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-01-12 08:26:00,2.75,17850.0,United Kingdom<br>
    536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom<br>
    536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom<br>
</p>

<p style = 'font-size:16px;font-family:Arial'>
    To load data into Teradata using R, we first read the file into an R data frame with the <code>read.csv()</code> function. By default, <code>read.csv()</code> treats the first row as column names.
</p>

In [None]:
df <- read.csv('./Retail_Data_sample.csv')
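
<p style = 'font-size:16px;font-family:Arial'>The header handling described above can be checked without the data file. The following is a minimal sketch that passes the header and the first two sample rows shown earlier to <code>read.csv()</code> through its <code>text</code> argument. The names <code>csv_text</code> and <code>sample_df</code> are illustrative only, and <code>stringsAsFactors = FALSE</code> is set explicitly so that text columns stay character in R versions older than 4.0 as well.</p>

```r
# Header plus the first two sample rows, held in a string instead of a file.
csv_text <- "InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-01-12 08:26:00,2.55,17850.0,United Kingdom
536365,71053,WHITE METAL LANTERN,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom"

# header = TRUE (the default) turns the first line into column names.
sample_df <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

names(sample_df)  # the eight column names taken from the header line
nrow(sample_df)   # the header is not counted as a data row
```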

<p style = 'font-size:16px;font-family:Arial'>The <code>dim()</code> function retrieves the dimensions of an array, matrix, or data frame. For a data frame, it returns the number of rows followed by the number of columns.</p>

In [None]:
dim(df)

<p style = 'font-size:16px;font-family:Arial'>The <code>sapply()</code> function is part of the apply family and applies a function to each element of a list, vector, or data frame. Here, it applies <code>class()</code> to each column to show the column's data type.</p>

In [None]:
sapply(df, class)

In [None]:
head(df, n = 5)

In [None]:
class(df)
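
<p style = 'font-size:16px;font-family:Arial'>As a small illustration of these inspection functions, the sketch below builds a three-column data frame by hand (the name <code>sample_df</code> and the choice of columns are illustrative, not part of the dataset load) and applies <code>dim()</code> and <code>sapply()</code> to it. Note that a timestamp stored as text, such as <code>InvoiceDate</code>, is reported as <code>character</code>.</p>

```r
# A hand-built data frame with three of the sample columns.
sample_df <- data.frame(
  InvoiceNo   = c(536365L, 536365L),
  InvoiceDate = c("2010-01-12 08:26:00", "2010-01-12 08:26:00"),
  UnitPrice   = c(2.55, 3.39),
  stringsAsFactors = FALSE
)

# Number of rows, then number of columns.
dims <- dim(sample_df)
dims

# One class name per column, returned as a named character vector.
col_classes <- sapply(sample_df, class)
col_classes
```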

<p style = 'font-size:16px;font-family:Arial'>Once the data is loaded into an R data frame, we can copy it to Vantage using the <code>copy_to()</code> function. With <code>overwrite = TRUE</code>, an existing table with the same name is replaced.</p>

In [None]:
copy_to(
    eng,
    df,
    name = 'Retail_Data',
    overwrite = TRUE
)

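<p style = 'font-size:16px;font-family:Arial'>Let us check the table that was created. <code>tbl()</code> returns a reference to the table in Vantage rather than a local copy, so <code>head()</code> fetches only the first rows.</p>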

In [None]:
tdf <- tbl(eng, in_schema("demo_user", "Retail_Data"))
head(tdf, n = 5)

In [None]:
class(tdf)

<hr style='height:2px;border:none;'>
<p style = 'font-size:20px;font-family:Arial'> <b>4. Load data from a zip file</b></p>

<p style = 'font-size:16px;font-family:Arial;'>We can load a zipped CSV file into an R data frame using the <code>read_csv()</code> function from the <code>readr</code> package, which decompresses <code>.zip</code> files automatically.</p>

In [None]:
data <- read_csv("Retail_Data_sample.zip", show_col_types = FALSE)
head(data, n = 5)

In [None]:
copy_to(
    eng,
    data,
    name = 'Retail_Data_zip',
    overwrite = TRUE
)

In [None]:
tdf_zip <- tbl(eng, in_schema("demo_user", "Retail_Data_zip"))
head(tdf_zip, n = 5)
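
<p style = 'font-size:16px;font-family:Arial'>As an alternative to <code>read_csv()</code>, base R can also read a CSV file directly from a zip archive. The following is a minimal sketch, not part of the original workflow: it builds a tiny archive in a temporary directory so that it is self-contained, which assumes a system <code>zip</code> program is available to <code>utils::zip()</code>. The file names and the two-column content are made up for illustration.</p>

```r
# Build a tiny zip archive in a temporary directory (illustrative data only).
tmp_dir <- tempfile("zip_demo_")
dir.create(tmp_dir)
csv_path <- file.path(tmp_dir, "sample.csv")
writeLines(c("InvoiceNo,Quantity", "536365,6", "536365,8"), csv_path)
zip_path <- file.path(tmp_dir, "sample.zip")
# -j stores the file without its directory path, -q keeps the command quiet.
utils::zip(zip_path, files = csv_path, flags = "-jq")

# List the archive members without extracting them.
members <- unzip(zip_path, list = TRUE)$Name

# Read the first member straight from the archive through a connection.
zip_df <- read.csv(unz(zip_path, members[1]))
zip_df
```

<p style = 'font-size:16px;font-family:Arial'>Listing the members first is useful when an archive holds more than one file, because the member to read must then be named explicitly.</p>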

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>5. Cleanup</b>

<p style = 'font-size:18px;font-family:Arial'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial'>Clean up the work tables to prevent errors the next time the notebook is run.</p>

In [None]:
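tables <- c('Retail_Data', 'Retail_Data_zip')

# Loop through the list of tables and drop each one.
# The error handler reports a failed drop (for example, a table that does not exist)
# and lets the loop continue with the remaining tables.
for (table in tables) {
    tryCatch(
        db_drop_table(eng, table),
        error = function(e) message("Could not drop ", table, ": ", conditionMessage(e))
    )
}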
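
<p style = 'font-size:16px;font-family:Arial'>Note that <code>tryCatch()</code> only intercepts an error when it is given an <code>error</code> handler. The sketch below shows the pattern without a database connection: <code>drop_table_stub()</code> is a hypothetical stand-in for the real drop call and fails for any table it does not know about.</p>

```r
# Hypothetical stand-in for a drop-table call; it fails for unknown tables.
drop_table_stub <- function(name, existing = c("Retail_Data")) {
  if (!name %in% existing) stop("table '", name, "' does not exist")
  invisible(TRUE)
}

# Try to drop each table; record the outcome instead of stopping on an error.
results <- vapply(c("Retail_Data", "Retail_Data_zip"), function(tbl_name) {
  tryCatch({
    drop_table_stub(tbl_name)
    "dropped"
  }, error = function(e) {
    message("Skipping ", tbl_name, ": ", conditionMessage(e))
    "skipped"
  })
}, character(1))

results
```

<p style = 'font-size:16px;font-family:Arial'>The first table is reported as dropped and the second as skipped, and the loop completes in both cases.</p>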

<p style = 'font-size:16px;font-family:Arial'>The following code removes the context created by <code>td_create_context()</code>.</p>

In [None]:
td_remove_context()

<hr style="height:2px;border:none;">
<p style = 'font-size:16px;font-family:Arial'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradata® Package for R Function Reference: <a href = 'https://docs.teradata.com/search/all?query=Teradata+Package+for+R+Function+Reference&content-lang=en-US'>here</a></li>
</ul>

<footer style="padding-bottom:35px; border-bottom:3px solid #91A0Ab">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2024. All Rights Reserved
        </div>
    </div>
</footer>