<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Load data to Vantage in R
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Introduction:</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Welcome to this introductory guide. It walks you through the steps needed to load data into Teradata Vantage using R.</p>

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>1. Configuring the Environment</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here, we import the required libraries, set environment variables and environment paths (if required).</p>

In [1]:
suppressMessages({
    library(tdplyr, quietly = TRUE)
    library(dbplyr, quietly = TRUE)
    library(dplyr, quietly = TRUE)
    library(DBI, quietly = TRUE)
    library(readr, quietly = TRUE)
    library(getPass, quietly = TRUE)  # provides getPass() for the password prompt
})
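
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>If a <code>library()</code> call fails, it can help to see which packages are missing before going further. The following is a minimal sketch using base R's <code>requireNamespace()</code>; the package list is an assumption based on the libraries and functions used in this guide.</p>

```r
# Packages this guide relies on (assumed list; adjust to your environment)
required <- c("tdplyr", "dbplyr", "dplyr", "DBI", "readr", "getPass")

# requireNamespace() returns TRUE/FALSE instead of raising an error
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
    message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
    message("All required packages are available.")
}
```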

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>2. Connect to Vantage</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>You will be prompted for your password. Type it, press Enter, then use the down arrow to move to the next cell.</p>

In [2]:
td_create_context(
    host = 'host.docker.internal',
    uid = "demo_user",
    pwd = getPass("Enter your password: "),
    dType = "NATIVE",
    logmech = "TD2"
)

Enter your password:  ········


<Teradata Native Driver Connection>
  DEMO_USER@host.docker.internal
  Database: DEMO_USER
  Teradata Version: 17.20.03.26
<TeradataConnection Driver=20.0.0.15 Database=17.20.03.26 Host=host.docker.internal uConnHandle=1>
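
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>When the notebook runs non-interactively, a prompt is not practical. One possible approach, sketched here, is to read the password from an environment variable and prompt only when it is not set. The helper <code>get_td_password()</code> and the variable name <code>TD_PASSWORD</code> are hypothetical; they are not part of tdplyr.</p>

```r
# Hypothetical helper: TD_PASSWORD is an assumed variable name, not a tdplyr convention
get_td_password <- function(env_var = "TD_PASSWORD") {
    pw <- Sys.getenv(env_var, unset = "")
    if (nzchar(pw)) {
        return(pw)
    }
    # Fall back to an interactive prompt when the variable is not set
    if (!requireNamespace("getPass", quietly = TRUE)) {
        stop("Set ", env_var, " or install the getPass package.")
    }
    getPass::getPass("Enter your password: ")
}
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The result could then be passed as <code>pwd = get_td_password()</code> in the <code>td_create_context()</code> call.</p>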

In [3]:
eng = td_get_context()$connection
eng

<Teradata Native Driver Connection>
  DEMO_USER@host.docker.internal
  Database: DEMO_USER
  Teradata Version: 17.20.03.26
<TeradataConnection Driver=20.0.0.15 Database=17.20.03.26 Host=host.docker.internal uConnHandle=1>

In [4]:
dbExecute(eng, "SET query_band='DEMO=PP_Data_Loading_R.ipynb;' UPDATE FOR SESSION;") 

In [5]:
# display_analytic_functions()

<hr style='height:2px;border:none;background-color:#00233C;'>
<p style = 'font-size:20px;font-family:Arial;color:#00233c'><b>3. Load data from csv file</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
    The data we will be loading for this example is in CSV format. The following is a sample of the header followed by the first 5 rows:
</p>

<p style='font-size:12px;font-family:Courier;color:#00233C'>
    InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country<br>
    536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-01-12 08:26:00,2.55,17850.0,United Kingdom<br>
    536365,71053,WHITE METAL LANTERN,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom<br>
    536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-01-12 08:26:00,2.75,17850.0,United Kingdom<br>
    536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom<br>
    536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-01-12 08:26:00,3.39,17850.0,United Kingdom<br>
</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
    To load data into Teradata using R, we first read the CSV file into an R data frame with the base R <code>read.csv()</code> function. By default, <code>read.csv()</code> treats the first row as the column names.
</p>

In [6]:
df = read.csv('./Retail_Data_sample.csv')
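
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>To see the header handling in isolation, the sketch below reads a few inline rows through the <code>text</code> argument instead of a file. The shortened column set and the use of <code>colClasses</code> to keep <code>InvoiceNo</code> as text are illustrative assumptions, not part of the original notebook.</p>

```r
# A small inline sample; the first line is the header row
csv_text <- "InvoiceNo,StockCode,Quantity,UnitPrice
536365,85123A,6,2.55
536365,71053,6,3.39"

# colClasses (named) forces InvoiceNo to stay character rather than integer
sample_df <- read.csv(text = csv_text, colClasses = c(InvoiceNo = "character"))

names(sample_df)  # column names taken from the first row
nrow(sample_df)   # the header row is not counted as data
```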

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The <code>dim()</code> function in R is used to retrieve the dimensions of an array, matrix, or data frame.</p>

In [7]:
dim(df)

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The <code>sapply()</code> function in R is part of the apply family and allows you to apply a function to each element of a list, vector, or data frame.</p>

In [8]:
sapply(df, class)
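
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As a small self-contained illustration of both calls, the sketch below builds a two-row data frame by hand (the values are taken from the sample rows above) and inspects its shape and column classes.</p>

```r
sample_df <- data.frame(
    StockCode = c("85123A", "71053"),
    Quantity  = c(6L, 6L),
    UnitPrice = c(2.55, 3.39),
    stringsAsFactors = FALSE
)

# dim() returns the number of rows, then the number of columns
dim(sample_df)

# sapply() applies class() to each column and returns a named character vector
col_types <- sapply(sample_df, class)
col_types
```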

In [9]:
head(df, n = 5)

,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
,<chr>,<chr>,<chr>,<int>,<chr>,<dbl>,<dbl>,<chr>
1,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-01-12 08:26:00,2.55,17850,United Kingdom
2,536365,71053,WHITE METAL LANTERN,6,2010-01-12 08:26:00,3.39,17850,United Kingdom
3,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-01-12 08:26:00,2.75,17850,United Kingdom
4,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-01-12 08:26:00,3.39,17850,United Kingdom
5,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-01-12 08:26:00,3.39,17850,United Kingdom


In [10]:
class(df)
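
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Note that <code>read.csv()</code> reads <code>InvoiceDate</code> as character. If a date-time column is preferred, the values can be parsed before loading. This is a minimal sketch on a hand-built sample; the <code>UTC</code> time zone is an assumption, and how the resulting type is mapped in Vantage is not covered here.</p>

```r
sample_df <- data.frame(
    InvoiceNo   = c("536365", "536365"),
    InvoiceDate = c("2010-01-12 08:26:00", "2010-01-12 08:26:00"),
    stringsAsFactors = FALSE
)

# Parse the text timestamps into POSIXct (time zone assumed to be UTC)
sample_df$InvoiceDate <- as.POSIXct(
    sample_df$InvoiceDate,
    format = "%Y-%m-%d %H:%M:%S",
    tz = "UTC"
)

class(sample_df$InvoiceDate)
```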

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Once the data is loaded into an R data frame, we can copy it to Vantage using the <code>copy_to()</code> function.</p>

In [11]:
copy_to(
    eng,
    df,
    name = 'Retail_Data',
    overwrite = TRUE
)

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Let us check the table that was created.</p>

In [12]:
tdf <- tbl(eng, in_schema("demo_user", "Retail_Data"))
head(tdf, n = 5)

# Source:   SQL [5 x 8]
# Database: Teradata
  InvoiceNo StockCode Description      Quantity InvoiceDate UnitPrice CustomerID
  <chr>     <chr>     <chr>               <int> <chr>           <dbl>      <dbl>
1 536365    84406B    CREAM CUPID HEA…        8 2010-01-12…      2.75      17850
2 536365    84029E    RED WOOLLY HOTT…        6 2010-01-12…      3.39      17850
3 536365    84029G    KNITTED UNION F…        6 2010-01-12…      3.39      17850
4 536365    71053     WHITE METAL LAN…        6 2010-01-12…      3.39      17850
5 536365    85123A    WHITE HANGING H…        6 2010-01-12…      2.55      17850
# ℹ 1 more variable: Country <chr>

In [13]:
class(tdf)

<hr style='height:2px;border:none;background-color:#00233C;'>
<p style = 'font-size:20px;font-family:Arial;color:#00233c'> <b>4. Load data from zip file</b></p>

<p style = 'font-size:16px;font-family:Arial;'>We can load a zipped CSV file into an R data frame using <code>read_csv()</code> from the <code>readr</code> package, which decompresses <code>.zip</code> files automatically.</p>

In [14]:
data <- read_csv("Retail_Data_sample.zip", show_col_types = FALSE)
head(data, n = 5)

InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
<chr>,<chr>,<chr>,<dbl>,<dttm>,<dbl>,<dbl>,<chr>
536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-01-12 08:26:00,2.55,17850,United Kingdom
536365,71053,WHITE METAL LANTERN,6,2010-01-12 08:26:00,3.39,17850,United Kingdom
536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-01-12 08:26:00,2.75,17850,United Kingdom
536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-01-12 08:26:00,3.39,17850,United Kingdom
536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-01-12 08:26:00,3.39,17850,United Kingdom


In [15]:
copy_to(
    eng,
    data,
    name = 'Retail_Data_zip',
    overwrite = TRUE
)

In [16]:
tdf_zip <- tbl(eng, in_schema("demo_user", "Retail_Data_zip"))
head(tdf_zip, n = 5)

# Source:   SQL [5 x 8]
# Database: Teradata
  InvoiceNo StockCode Description         Quantity InvoiceDate         UnitPrice
  <chr>     <chr>     <chr>                  <dbl> <dttm>                  <dbl>
1 536365    84406B    CREAM CUPID HEARTS…        8 2010-01-12 08:26:00      2.75
2 536365    84029E    RED WOOLLY HOTTIE …        6 2010-01-12 08:26:00      3.39
3 536365    84029G    KNITTED UNION FLAG…        6 2010-01-12 08:26:00      3.39
4 536365    71053     WHITE METAL LANTERN        6 2010-01-12 08:26:00      3.39
5 536365    85123A    WHITE HANGING HEAR…        6 2010-01-12 08:26:00      2.55
# ℹ 2 more variables: CustomerID <dbl>, Country <chr>

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>5. Cleanup</b>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Clean up the work tables to prevent errors the next time this notebook runs.</p>

In [17]:
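tables <- c('Retail_Data', 'Retail_Data_zip')

# Loop through the tables and drop each one.
# The error handler reports a failed drop instead of stopping the loop.
for (table in tables) {
    tryCatch(
        db_drop_table(eng, table),
        error = function(e) message("Could not drop ", table, ": ", conditionMessage(e))
    )
}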
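
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Keep in mind that <code>tryCatch()</code> only suppresses an error when it is given an <code>error</code> handler; without one, the error propagates and the loop stops at the first failure. The sketch below shows the pattern without a database connection. The functions <code>drop_quietly()</code> and <code>fake_drop()</code> are hypothetical stand-ins for illustration only.</p>

```r
# Hypothetical wrapper: returns TRUE on success, FALSE if the drop raised an error
drop_quietly <- function(drop_fn, table) {
    tryCatch(
        {
            drop_fn(table)
            TRUE
        },
        error = function(e) {
            message("Could not drop ", table, ": ", conditionMessage(e))
            FALSE
        }
    )
}

# Stand-in for a real drop call; fails for one specific table name
fake_drop <- function(table) {
    if (table == "missing_table") stop("table does not exist")
    invisible(NULL)
}

# The loop continues past the failing table
results <- vapply(
    c("Retail_Data", "missing_table"),
    function(t) drop_quietly(fake_drop, t),
    logical(1)
)
results
```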

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following code removes the connection context.</p>

In [18]:
td_remove_context()

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradata® Package for R Function Reference: <a href = 'https://docs.teradata.com/search/all?query=Teradata+Package+for+R+Function+Reference&content-lang=en-US'>here</a></li>
</ul>

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2024. All Rights Reserved
        </div>
    </div>
</footer>