<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Sales Forecasting: SAS and Vantage Comparison
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Introduction</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
This demo walks through how a typical SAS user would build a simple sales forecasting model from sales data, and then shows how to achieve the same result with Vantage in-database analytics.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
Customers are finding their analytical environments difficult to manage and are looking for ways to make these environments more streamlined while adapting to more contemporary technologies. Our open source analytical ecosystem can be leveraged to simplify and apply more governance to the data flows in your analytical environment, enabling you to increase efficiency of computation, reduce cost of ownership and take advantage of any analytical tool of choice.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Benefits our customers have received:
    <img src = 'images/SASIntro.png' width = '400' align='right' padding='padding:50px'>

<li style = 'font-size:16px;font-family:Arial'>Reduced the time for a large data extract from 16 hours 28 minutes to 28 seconds.</li>
    <br>
<li style = 'font-size:16px;font-family:Arial'>Reduced analytic processing from 30 days to a single day.</li>
    <br>
<li style = 'font-size:16px;font-family:Arial'>Eliminated over 75% of redundant data in SAS environments and reduced data latency.</li> <br>
<li style = 'font-size:16px;font-family:Arial'>Pushed analytic processing into the Teradata database, which helped eliminate 6 TB of redundant data and improved the quality of analytics.</li>

</p>

<br>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'> This overview shows how to build the foundation of an analytical model with ClearScape Analytics, using data from a variety of sources. Teradata Vantage™ enables enterprises to automate and publish timely model outputs for use in downstream business processes.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In this demo we will show 
    <div class="alert alert-block alert-warning">  
<p style = 'font-size:16px;font-family:Arial;color:#000000'><i>the SAS code in a tan-colored box; this code will not be executed.</i>    
</div>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>And follow that with </p> 
    <code style = 'font-size:16px;'><i>Vantage code that we will run in executable cells</i></code>
</p>

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>1. Connect to Vantage</b></p>


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In this section, we import the required libraries and set environment variables and environment paths (if required).</p>

In [None]:
import warnings
warnings.filterwarnings('ignore')
import time
from teradataml import * 

import getpass

display.max_rows=5

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>You will be prompted to provide the password. Enter your password, press the Enter key, then use down arrow to go to next cell. Begin running steps with Shift + Enter keys.</p>

In [None]:
%run -i ../startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=Sales_Forecasting_SAS_Vantage_PY_SQL.ipynb;' UPDATE FOR SESSION; ''')

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>2. Getting Data for This Demo</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We have provided data for this demo on cloud storage. You can either run the demo using foreign tables, which access the data without using any storage in your environment, or download the data to local storage, which may yield faster execution but depends on how much storage you have available. The following cell contains two statements, one of which is commented out. To switch between the modes, move the comment marker to the other statement.</p>

In [None]:
%run -i ../run_procedure.py "call get_data('DEMO_SlsForecast_SAS_cloud');"
 # Takes about 45 seconds
#%run -i ../run_procedure.py "call get_data('DEMO_SlsForecast_SAS_local');"
 # Takes about 70 seconds

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Next is an optional step – if you want to see status of databases/tables created and space used.</p>

In [None]:
%run -i ../run_procedure.py "call space_report();"

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>3. Explore the dataset</b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Transfer and explore data in SAS </b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>If we were to execute the SAS code, the first step would be to import all the data from Vantage into SAS. The first data step creates a new dataset named "sales_temp_data_1" in the "work" library. It selects all the data from the Teradata table "sales_ts_data_1_54M", which is reached through the libname connection teralib.</p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'> We replicate the same process with the second data step. We fetch all the data from the table "sales_ts_data_2_54M" and store it in "sales_temp_data_2" in the "work" library. After running the second data step we’ll now see store_id, day of sale, transaction_id, product sku id, and transaction quantity and transaction weight within the SAS library.</p>

<code>

<div class="alert alert-block alert-warning">  
<p style = 'font-size:18px;font-family:Arial;color:#000000'><b>Equivalent SAS Code</b>    
<p style = 'font-size:16px;color:#000000'> 
/* Setting up a libname for the connection with Teradata Database */
libname teralib teradata server=barbera user=tahaw pw=tahaw database=tahaw;
options SASTRACE=',,,ds' SASTRACELOC=SASLOG nostsuffix;
<p style = 'font-size:16px;color:#000000'>
/* The first Data step is to fetch all the rows from the Teradata table and create an SAS dataset in the work library */
%let start_time = %sysfunc(datetime());
data work.sales_temp_data_1;
set TERALIB.sales_ts_data_1_54M;
run;
<p style = 'font-size:16px;color:#000000'>
/* The second Data step is to fetch all the rows from the Teradata table for the remaining attributes */
data work.sales_temp_data_2;
set TERALIB.sales_ts_data_2_54M;
run;
    </p>
</div>
</code>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Explore data in Vantage </b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As the data is already in Vantage, the data transfer step is <b>NOT</b> required. We can go straight to exploring the data in both tables.</p> 

In [None]:
df_qty = DataFrame(in_schema('DEMO_SlsForecast_SAS','Store_sales_Qty'))
df_qty

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>This data set contains store_id, day of sale, transaction_id, product sku id, transaction quantity, and transaction weight.</p> 


In [None]:
df_amt = DataFrame(in_schema('DEMO_SlsForecast_SAS','Store_Sales_Amt'))
df_amt

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>This data set contains store_id, day of sale, transaction_id, product sku id, and transaction amount.</p> 


<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>4. Aggregating the Data</b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Aggregate data in SAS</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As a next step in SAS, we need to prepare the data as a single ADS for forecasting sales. We start by aggregating the totals by store, day of sale, product, and transaction id.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>PROC SQL, which uses SAS’s native SQL processing capabilities, is used for the aggregation. It sums the transaction amount, grouping by store_id, day_of_sale, product_sku_id, and transaction_id. Running it creates the dataset sales_aggregated_data_1 in the work library. This new dataset has around 52.5 million records.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The same aggregation is applied to the second dataset, sales_temp_data_2, to sum the transaction quantity and weight, grouping by store_id, day_of_sale, product_sku_id, and transaction_id. After execution, it creates a new dataset named sales_aggregated_data_2 in the work library. This new dataset also has around 52.5 million records.</p>

<code>

<div class="alert alert-block alert-warning">  
<p style = 'font-size:18px;font-family:Arial;color:#000000'><b>Equivalent SAS Code</b>    
<p style = 'font-size:16px;color:#000000'> 
/* Aggregating the amount to calculate the total sales by each store and product on a particular day */
proc sql;
  create table work.sales_aggregated_data_1 as
  select 
    store_id, day_of_sale, product_sku_id, transaction_id,
    sum(transaction_amount) as total_sales
  from work.sales_temp_data_1
  group by store_id, day_of_sale, product_sku_id, transaction_id;
quit;
<p style = 'font-size:16px;color:#000000'> 
/* Aggregating the weight and quantity to calculate the total weight 
and total quantity by each store and product on a particular day */
proc sql;
  create table work.sales_aggregated_data_2 as
  select 
    store_id, day_of_sale, product_sku_id, transaction_id,
    sum(transaction_quantity) as total_quantity,
    sum(transaction_weight) as total_weight
  from work.sales_temp_data_2
  group by store_id, day_of_sale, product_sku_id, transaction_id;
quit;
    </p>
    </div>
</code>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Aggregate data in Vantage </b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>First we sum the amount using group by store_id, day_of_sale, product_sku_id, transaction_id.</p> 

In [None]:
query='''create multiset table sales_aggregated_data_amt as
(select 
    store_id, day_of_sale, product_sku_id, transaction_id,
    sum(transaction_amount) as total_sales
from DEMO_SlsForecast_SAS.Store_Sales_Amt
group by store_id, day_of_sale, product_sku_id, transaction_id)with data Primary index(transaction_id);'''

try:
    execute_sql(query)
except Exception:  # most likely the table is left over from an earlier run
    db_drop_table('sales_aggregated_data_amt')
    execute_sql(query)
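
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>To make the GROUP BY in the cell above concrete, here is a minimal pure-Python sketch of the same sum over the four grouping columns. The rows are made up for illustration and are not taken from the demo tables; in Vantage the aggregation runs in-database on the full table.</p>

```python
from collections import defaultdict

# Made-up rows (not demo data):
# (store_id, day_of_sale, product_sku_id, transaction_id, transaction_amount)
rows = [
    (1, "2021-01-05", 101, 9001, 10.0),
    (1, "2021-01-05", 101, 9001, 5.5),
    (1, "2021-01-05", 102, 9001, 3.0),
    (2, "2021-01-05", 101, 9002, 7.0),
]

# Equivalent of: sum(transaction_amount) ... group by the four key columns.
total_sales = defaultdict(float)
for store_id, day_of_sale, product_sku_id, transaction_id, amount in rows:
    key = (store_id, day_of_sale, product_sku_id, transaction_id)
    total_sales[key] += amount

for key, total in sorted(total_sales.items()):
    print(key, total)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Rows that share all four key values collapse into one output row, which is why the aggregated table can be smaller than its source.</p>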

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Then, we sum the quantity and weight using group by store_id, day_of_sale, product_sku_id, transaction_id.</p>

In [None]:
query='''create multiset table sales_aggregated_data_qty as
(select 
    store_id, day_of_sale, product_sku_id, transaction_id,
    sum(transaction_quantity) as total_quantity,
    sum(transaction_weight) as total_weight
from DEMO_SlsForecast_SAS.Store_Sales_Qty
group by store_id, day_of_sale, product_sku_id, transaction_id)with data Primary index(transaction_id);'''

try:
    execute_sql(query)
except Exception:  # most likely the table is left over from an earlier run
    db_drop_table('sales_aggregated_data_qty')
    execute_sql(query)
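
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The create, drop, and retry pattern used in the cells above recurs for every work table in this demo. A minimal sketch of a reusable helper follows. The name <code>create_fresh_table</code> and its callable parameters are assumptions made for illustration and are not part of teradataml; in this notebook one would pass <code>execute_sql</code> and <code>db_drop_table</code>. Stand-in functions are used here so that the sketch runs without a Vantage connection.</p>

```python
def create_fresh_table(query, table_name, run_sql, drop_table):
    """Run a CREATE TABLE statement, replacing a leftover table if needed.

    run_sql and drop_table are injected callables (hypothetical parameters);
    in this notebook they would be execute_sql and db_drop_table.
    """
    try:
        run_sql(query)
    except Exception:
        # Assume the failure means the table survived an earlier run:
        # drop it and try exactly once more.
        drop_table(table_name)
        run_sql(query)


# Stand-ins that mimic a database already holding one leftover table.
existing_tables = {"sales_aggregated_data_amt"}
log = []


def fake_run_sql(query):
    name = query.split()[2]  # "create table <name> ..."
    if name in existing_tables:
        raise RuntimeError(f"Table {name} already exists")
    existing_tables.add(name)
    log.append(f"created {name}")


def fake_drop_table(name):
    existing_tables.discard(name)
    log.append(f"dropped {name}")


create_fresh_table("create table sales_aggregated_data_amt as (...)",
                   "sales_aggregated_data_amt", fake_run_sql, fake_drop_table)
print(log)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Passing the two functions in as arguments keeps the helper testable offline; whether to adopt it is a matter of taste, since the inline version is equally valid.</p>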

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>5. Merging the Data</b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Merging data in SAS</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As a next step in preparing a single ADS for forecasting sales, the two aggregated datasets are combined, and the total sales are then aggregated by day.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The two aggregated datasets are merged into a single table that contains store_id, day_of_sale, product_sku_id, transaction_id, and the summed transaction amount, quantity, and weight. The SAS MERGE statement combines the two datasets, with store_id, day_of_sale, product_sku_id, and transaction_id as the BY (join) columns. After merging, all the required columns are in one dataset, which contains around 52.5 million rows.</p>

<code>

<div class="alert alert-block alert-warning">  
<p style = 'font-size:18px;font-family:Arial;color:#000000'><b>Equivalent SAS Code</b>    
<p style = 'font-size:16px;color:#000000'>   
/* Match-merging two datasets on their key columns and creating another dataset in work library with all the required attributes */
data work.merged_sales_data_c;
merge  work.sales_aggregated_data_1
       work.sales_aggregated_data_2;
       by store_id day_of_sale product_sku_id transaction_id;
       run;
<p style = 'font-size:16px;color:#000000'> 
/* Aggregating the amount to calculate total sales in a particular day */
proc sql;
  create table work.aggregated_data as
  select 
    day_of_sale,
    sum(total_sales) as total_sales
  from work.merged_sales_data_c
  group by day_of_sale;
    quit;</p>
    </div>
    </code>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Merging data in Vantage </b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We merge the 2 datasets in Vantage.</p> 

In [None]:
query='''create MULTISET table merged_sales_data as (
      SELECT A.store_id,A.day_of_sale,A.transaction_id,A.product_sku_id,A.total_sales,B.total_quantity,B.total_weight
        FROM (SELECT store_id,day_of_sale,product_sku_id,transaction_id,total_sales
            FROM sales_aggregated_data_amt ) AS A
        INNER JOIN 
        (SELECT store_id,day_of_sale,product_sku_id,transaction_id,total_quantity,total_weight
            FROM sales_aggregated_data_qty ) AS B
        ON A.day_of_sale=B.day_of_sale AND A.product_sku_id=B.product_sku_id AND A.store_id=B.store_id 
        AND A.transaction_id=B.transaction_id) WITH DATA PRIMARY INDEX(transaction_id);'''

try:
    execute_sql(query)
except Exception:  # most likely the table is left over from an earlier run
    db_drop_table('merged_sales_data')
    execute_sql(query)
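
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>One difference between the two approaches is worth keeping in mind: a SAS MERGE with a BY statement keeps observations that have no match in the other dataset, while the INNER JOIN above keeps only keys present in both tables. The results coincide when both aggregated tables hold the same keys. The following sketch contrasts the two behaviors on made-up rows; the dictionaries are illustrative stand-ins, not demo data.</p>

```python
# Made-up aggregated rows keyed by
# (store_id, day_of_sale, product_sku_id, transaction_id).
amt = {
    (1, "2021-01-05", 101, 9001): {"total_sales": 15.5},
    (2, "2021-01-05", 101, 9002): {"total_sales": 7.0},
}
qty = {
    (1, "2021-01-05", 101, 9001): {"total_quantity": 3, "total_weight": 1.2},
    (3, "2021-01-05", 103, 9003): {"total_quantity": 1, "total_weight": 0.4},
}

# INNER JOIN: only keys present in both tables survive.
inner = {key: {**amt[key], **qty[key]} for key in amt.keys() & qty.keys()}

# Match-merge without IN= filtering: every key from either side survives,
# with None standing in for SAS missing values.
columns = ("total_sales", "total_quantity", "total_weight")
merged = {}
for key in sorted(amt.keys() | qty.keys()):
    combined = {**amt.get(key, {}), **qty.get(key, {})}
    merged[key] = {col: combined.get(col) for col in columns}

print(len(inner), len(merged))
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>If unmatched keys had to be preserved in Vantage as well, a FULL OUTER JOIN would be the closer counterpart to the SAS merge.</p>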

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We do a final aggregation to get the total sales by day in Vantage.</p> 

In [None]:
query='''create multiset table aggregated_sales_td as
  (select 
    rank() over(partition by day_of_sale order by day_of_sale) as SeriesId, -- constant series id (always 1 here), used as the series identifier in ARIMA
    cast(day_of_sale as timestamp(0)) as day_of_sale,
    sum(total_sales) as total_sales
  from merged_sales_data 
  group by day_of_sale) with data;'''
    
try:
    execute_sql(query)
except Exception:  # most likely the table is left over from an earlier run
    db_drop_table('aggregated_sales_td')
    execute_sql(query)    

In [None]:
df_agg_sales = DataFrame('aggregated_sales_td')
df_agg_sales

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We plot the total sales by day of sales to check the series data.</p> 

In [None]:
figure = Figure(width=1000, height=400, heading="Daily Sales")

plot = df_agg_sales.plot(
    x=df_agg_sales.day_of_sale,
    y=df_agg_sales.total_sales,
    xlabel='Day of Sale',
    ylabel='Total Sales',
    color='carolina blue',
    figure=figure
)

plot.show()

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In the plot we can see that sales vary from September 2019 until November 2019 and then stay flat until November 2020. Sales vary again from November 2020 until January 2021. After that there is a steep drop, and sales remain below 1000 between January 2021 and March 2021.</p> 

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>6. Using ARIMA (AutoRegressive Integrated Moving Average) model to forecast Sales</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
ARIMA functions in Vantage run in the following order:
<br>
<li style = 'font-size:16px;font-family:Arial'>Run the ArimaEstimate() function to get the coefficients for the ARIMA model.</li>
<li style = 'font-size:16px;font-family:Arial'>[Optional] Run the ArimaValidate() function to validate the 'goodness of fit' of the ARIMA model when the "fit_percentage" argument of the ArimaEstimate() function is not 100.</li>
<li style = 'font-size:16px;font-family:Arial'>Run the ArimaForecast() function with the output of the first or the second step to forecast the periods beyond the last observed period.</li>
</p>


<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>6.1 Estimation step</b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Estimation step in SAS</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The final step is to fit the ARIMA model. In the first PROC ARIMA block, the identify statement takes the first difference of the total_sales variable, written total_sales(1), and computes its autocorrelations up to a maximum lag of 30 (nlag=30). The estimate statement then fits a model with one moving-average term (q=1). The parameter estimates are saved in the arima_params dataset in the work library.</p>

<code>

<div class="alert alert-block alert-warning">  
<p style = 'font-size:18px;font-family:Arial;color:#000000'><b>Equivalent SAS Code</b>    
<p style = 'font-size:16px;color:#000000'> 
/* Fit ARIMA model and calculating its coefficients */
proc arima data=work.aggregated_data;
  identify var=total_sales(1) nlag=30;
  estimate q=1 outest=arima_params;
    run;</p>
    </div>
</code>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Estimation step in Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The ArimaEstimate() function estimates the coefficients of an ARIMA (AutoRegressive Integrated Moving Average) model and can also fit a series with an existing ARIMA model. The function can additionally provide the "goodness of fit" and the residuals of the fitting operation. It generates a model layer that is used as input for the ArimaValidate() and ArimaForecast() functions. This function is for univariate series.</p>
<br>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here, the model orders p (autoregressive lags), d (differencing order), and q (moving-average lags) must be supplied to the function. In the SQL syntax they are passed as MODEL_ORDER(2, 1, 8); in the teradataml call in the next code cells, the same values are passed through the nonseasonal_model_order argument.
</p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
The fit percentage is set to 100, which means that the ARIMA model is trained on 100% of the available data.</p>
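
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The differencing order d = 1 in the model order (2, 1, 8) means that the model is fitted to day-over-day changes rather than to the sales levels themselves, and that forecasts of those changes are turned back into levels by cumulative summation. The sketch below illustrates that idea on made-up numbers; the function names are hypothetical, and this is not a description of how Vantage implements the step internally.</p>

```python
def difference(series, d=1):
    """Apply first differencing d times: y[t] - y[t-1]."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series


def undifference(last_level, changes):
    """Rebuild levels from forecast changes by cumulative summation (d = 1)."""
    levels = []
    level = last_level
    for change in changes:
        level += change
        levels.append(level)
    return levels


total_sales = [950, 980, 1010, 990, 1000]  # made-up daily totals
changes = difference(total_sales, d=1)     # what the model sees when d = 1
print(changes)

forecast_changes = [5, -10, 0]             # made-up forecasts on the differenced scale
forecast_levels = undifference(total_sales[-1], forecast_changes)
print(forecast_levels)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Each round of differencing shortens the series by one observation, which is one reason to keep d small.</p>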

In [None]:
data_series_df = TDSeries(data=df_agg_sales,
                              id="SeriesId",
                              row_index=("day_of_sale"),
                              row_index_style= "TIMECODE",
                              payload_field="total_sales",
                              payload_content="REAL")

In [None]:
from teradataml import ArimaEstimate
# Execute ArimaEstimate function.
arima_est_out = ArimaEstimate(data1=data_series_df,
                            nonseasonal_model_order=[2,1,8],
                            constant=False,
                            algorithm="CSS_MLE",
                            coeff_stats=True,
                            fit_metrics=True,
                            residuals=True,
                            fit_percentage=100)

In [None]:
est_result=arima_est_out.fitresiduals
est_result

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We plot the Actual Value of Sales vs the Calculated Value of the ArimaEstimate function. </p>

In [None]:
figure = Figure(width=1000, height=500, heading="Comparison of Actual vs Estimated Sales")

plot = est_result.plot(
    x=est_result.ROW_I,
    y=[est_result.ACTUAL_VALUE, est_result.CALC_VALUE],
    xlabel='Sales Date',
    ylabel='Total Sales',
    color=['carolina blue','orange'],
    figure=figure,
    grid_linestyle='-',
    grid_linewidth=0.5,
    legend=['Actual Value','Calculated Value'],
    legend_style='upper right',
)

plot.show()
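
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Alongside the plot, the gap between the ACTUAL_VALUE and CALC_VALUE columns can be summarized numerically. The following sketch computes the mean absolute error and root mean squared error on made-up values; the helper name <code>fit_errors</code> is hypothetical. In this notebook the two columns could be pulled into Python lists with <code>est_result.to_pandas()</code>.</p>

```python
import math


def fit_errors(actual, calculated):
    """Return (MAE, RMSE) between observed and fitted values."""
    residuals = [a - c for a, c in zip(actual, calculated)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return mae, rmse


actual_value = [1000.0, 1020.0, 990.0, 1010.0]  # made-up ACTUAL_VALUE column
calc_value = [1004.0, 1017.0, 990.0, 1005.0]    # made-up CALC_VALUE column
mae, rmse = fit_errors(actual_value, calc_value)
print(f"MAE={mae:.2f}, RMSE={rmse:.2f}")
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>RMSE penalizes large misses more heavily than MAE, so a wide gap between the two hints at a few badly fitted days.</p>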

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>6.2 Forecast step</b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Forecast step in SAS</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In the second PROC ARIMA block, the same ARIMA model is identified and estimated as in the first block, and forecasts are additionally generated for the next 30 time periods (lead=30) and stored in the forecasted_sales dataset in the work library. If we run this, the log indicates that it took around 2 seconds to fit the ARIMA model on the aggregated data. Once the small aggregated dataset is available to the SAS procedure, it executes relatively fast.</p>

<code>

<div class="alert alert-block alert-warning">  
<p style = 'font-size:18px;font-family:Arial;color:#000000'><b>Equivalent SAS Code</b>    
<p style = 'font-size:16px;color:#000000'> 
/* Forecasting future 30 values */
proc arima data=work.aggregated_data;
  identify var=total_sales(1) nlag=30;
  estimate q=1 outest=arima_params;
  forecast lead=30 out=forecasted_sales;
    run;</p>
    </div>    
</code>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Forecast Step in Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The ArimaForecast() function is used to forecast a user-defined number of periods based on models fitted from the ArimaEstimate() function.</p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here in the next cell, we use the estimated model to forecast the sales for the subsequent 30 periods, i.e., next 30 days.</p>

In [None]:
from teradataml import ArimaForecast

# Create a teradataml TDAnalyticResult object over the result attribute of 'arima_est_out'
data_art_df = TDAnalyticResult(data=arima_est_out.result)
 
arima_forecast_out = ArimaForecast(data=data_art_df, forecast_periods=30)
forecast_result=arima_forecast_out.result
forecast_result

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We plot the Forecasted Value of Sales for the defined number of periods. </p>

In [None]:
import matplotlib.pyplot as plt
# import matplotlib.patches as patches
import seaborn as sns
forecast_result_plot=forecast_result.to_pandas().reset_index()
plt.figure(figsize=(20,8))
sns.lineplot(data= forecast_result_plot ,x="ROW_I",y="FORECAST_VALUE",ci=None)
plt.fill_between(forecast_result_plot.ROW_I, forecast_result_plot.LO_80, forecast_result_plot.HI_80, color='lightblue', alpha=0.5)
# plt.legend(['Actual Value', 'Predicted Value'], loc='best', fontsize=16)
plt.title('Forecast of Sales ', fontsize=20)
plt.ylabel('Sales', fontsize=16)
plt.xlabel('Sales Date', fontsize=16)
plt.show()


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The dark blue line is the forecasted sales for the next 30 days, and the light blue band spans the lower and upper bounds of the 80% confidence interval. As seen in the original sales plot, sales dropped below 1000 in the latest period. The forecast stays at a similar level, varying around 1000.</p>
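
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As a rough guide to reading the LO_80 and HI_80 bounds, the sketch below shows how an 80% interval is commonly built around a point forecast when forecast errors are assumed to be normally distributed. That assumption, the helper name <code>prediction_interval</code>, and the numbers are illustrative; the source does not state how ArimaForecast() derives its bounds.</p>

```python
from statistics import NormalDist


def prediction_interval(forecast, std_error, level=0.80):
    """Two-sided normal interval: forecast +/- z * std_error."""
    # For level=0.80 the interval leaves 10% in each tail,
    # so z is the 90th percentile of the standard normal.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return forecast - z * std_error, forecast + z * std_error


# Made-up point forecast and standard error.
lo_80, hi_80 = prediction_interval(forecast=1000.0, std_error=50.0)
print(round(lo_80, 1), round(hi_80, 1))
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Under this construction the band widens with the standard error, which typically grows the further ahead the forecast looks.</p>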
<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>7. Conclusion</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>
Moving large amounts of data between Vantage and SAS is usually the main cause of slow-running jobs and complex analytics pipelines, and it amplifies the governance issues that come from orphaned and exposed data in SAS environments. By executing the complete flow inside Vantage using ClearScape Analytics, we reduce complexity and achieve greater efficiency. </p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>One of our customers cited the following benefits:</p>
<table style = 'border:1px solid black;width:100%;'>
         <tr style = 'font-size:16px;font-family:Arial;border:1px solid black;color:#00233C'>
             <th style = 'border:1px solid black;'><b>Past Approach using SAS Alone</b></th>
             <th style = 'border:1px solid black;'><b>In-Database Approach</b></th>             
         </tr>
         <tr style = 'font-size:14px;font-family:Arial;border:1px solid black;color:#00233C'>
            <td style = 'border:1px solid black;'>Daily process begins with flat file creation at 6:30am – SLA delivered at ~9:30am.</td>
            <td style = 'border:1px solid black;'>Daily process begins at 4:00am with EDW load.</td>
         </tr>
         <tr style = 'font-size:14px;font-family:Arial;border:1px solid black;color:#00233C'>
            <td style = 'border:1px solid black;'>File transferred to SQL Server,<b> limited to ~350K customer records based on  specific criteria.</b></td> 
            <td style = 'border:1px solid black;'>All operational data loaded directly to EDW.  No flat file or intermediate processing is needed.</td>
         </tr>
         <tr style = 'font-size:14px;font-family:Arial;border:1px solid black;color:#00233C'>
             <td style = 'border:1px solid black;'><b>300 step process</b> to support data mining life cycle. <b>30 MINUTES TO SCORE ~350K customers</b></td> 
             <td style = 'border:1px solid black;'><b>10 step process</b> - Scoring and customer selection done in-database against ALL customer rows. <b>4 MINUTES TO SCORE ~40M customers</b></td>
         </tr> 
         <tr style = 'font-size:14px;font-family:Arial;border:1px solid black;color:#00233C'>
             <td style = 'border:1px solid black;'><b>Runs in ~ 3 HOURS</b></td> 
             <td style = 'border:1px solid black;'><b>Runs in 12 MINUTES with 114x the number of customers.</b></td>
         </tr>
         <tr style = 'font-size:14px;font-family:Arial;border:1px solid black;color:#00233C'>
            <td style = 'border:1px solid black;'><b>Cost for data mart infrastructure and support, plus lost opportunity to process all customers</b></td> 
            <td style = 'border:1px solid black;'><b>Uses available capacity outside of work hours with far less complexity and support cost. By processing all 40M customers vs 350K subset, increased collections $1M - $3M/month.</b></td>
         </tr> 
</table>

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>8. Cleanup</b></p>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Clean up the work tables to prevent errors the next time the demo runs. This section drops all the tables created during the demonstration.</p>

In [None]:
tables = ['sales_aggregated_data_amt','sales_aggregated_data_qty','merged_sales_data','aggregated_sales_td']

# Loop through the list of tables and execute the drop table command for each table
for table in tables:
    try:
        db_drop_table(table_name=table)
    except:
        pass
      

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Databases and Tables</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following code will clean up tables and databases created above.</p>

In [None]:
%run -i ../run_procedure.py "call remove_data('DEMO_SlsForecast_SAS');" 
#Takes 45 seconds

In [None]:
remove_context()

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>UAF(Unbounded Array Framework) Documentation: <a href = 'https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-VantageTM-Unbounded-Array-Framework-Time-Series-Reference-17.20/Unbounded-Array-Framework'>https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Teradata-VantageTM-Unbounded-Array-Framework-Time-Series-Reference-17.20/Unbounded-Array-Framework</a></li>
</ul>

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2023,2024. All Rights Reserved
        </div>
    </div>
</footer>