# Import the Data Set

## The dataset structure 

#### **Cash Flows**

The Cash Flows file (`CashFlows.txt`) presents daily measures of cash flows from January 2, 1998 to September 30, 1998. Each observation is characterized by 25 variables described in the following table.

| Variable                                                                        | Description                                                        | Example of values
:---------------------------------------------------------------------------------|--------------------------------------------------------------------|--------------------------
| <nobr>`Date`</nobr>                                                             | Day, month and year of the readings                                | A date
| <nobr>`Cash`</nobr>                                                             | Cash flow                                                          | A numerical value with n decimals
| <nobr>`BeforeLastMonday`</nobr> <br><nobr>`LastMonday`</nobr> <br><nobr>`BeforeLastTuesday`</nobr> <br><nobr>`LastTuesday`</nobr> <br><nobr>`BeforeLastWednesday`</nobr> <br><nobr>`LastWednesday`</nobr> <br><nobr>`BeforeLastThursday`</nobr> <br><nobr>`LastThursday`</nobr> <br><nobr>`BeforeLastFriday`</nobr> <br><nobr>`LastFriday`</nobr> | Boolean variables that indicate if the information is true or false   | 1 if the information is true.
| <nobr>`Last5WDays`</nobr><br><nobr>`Last4WDays`</nobr>                         | Boolean variables that indicate if the date is among the last 5 or last 4 working days of the month | 1 if the information is true.
| <nobr>`LastWMonth`</nobr><br><nobr>`BeforeLastWMonth`</nobr>                   | Boolean variables that indicate if the information is true or false | 1 if the information is true.
| <nobr>`WorkingDaysIndices`</nobr><br><nobr>`ReverseWorkingDaysIndices`</nobr>  | Indices or reverse indices of the working days | An integer value
| <nobr>`MondayMonthInd`</nobr><br><nobr>`TuesdayMonthInd`</nobr><br><nobr>`WednesdayMonthInd`</nobr><br><nobr>`ThursdayMonthInd`</nobr><br><nobr>`FridayMonthInd`</nobr> | Indices of the week days in the month | An integer value
| <nobr>`Last5WDaysInd`</nobr><br><nobr>`Last4WDaysInd`</nobr>                   | Indices of the 5 or 4 last working days of the month | An integer value
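
Calendar variables like the ones above can be derived from the date alone. The sketch below only illustrates the idea. It assumes that working days are Monday to Friday with no holiday calendar and that indices start at 0, and the function and key names are hypothetical. The file does not document how its values were actually computed, so the results may differ from the content of `CashFlows.txt`.

```python
from datetime import date, timedelta


def working_days(year, month):
    """Return the Monday-to-Friday dates of a month (assumption: no holidays)."""
    day = date(year, month, 1)
    days = []
    while day.month == month:
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days.append(day)
        day += timedelta(days=1)
    return days


def calendar_features(day):
    """Illustrative features for one working day (hypothetical names).

    Expects a working day: a weekend date raises ValueError.
    """
    days = working_days(day.year, day.month)
    position = days.index(day)            # 0-based rank within the month
    remaining = len(days) - 1 - position  # 0 for the last working day
    return {
        "working_day_index": position,
        "reverse_working_day_index": remaining,
        "last_5_working_days": int(remaining < 5),
        "last_4_working_days": int(remaining < 4),
    }


print(calendar_features(date(1998, 9, 30)))
```

The reverse index is what makes the "last N working days" flags cheap to compute: a date belongs to the last 5 working days exactly when fewer than 5 working days follow it in the month.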

#### **Los Angeles Ozone**

The Los Angeles Ozone file (`R_ozone-la.txt`) presents monthly averages of hourly ozone (O3) readings in downtown Los Angeles from 1955 to 1972.

Each observation is characterized by 2 variables described in the following table:

| Variable        | Description                                  | Example of values
:-----------------|----------------------------------------------|--------------------------
| `Time`          | Month and year of the readings               | A date
| `R_ozone-la`    | Average of the hourly readings for the month | A numerical value

#### **Lag 1 And Cycles** & **Trend And Cyclic**

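These files (`Lag1AndCycles.txt`, `Lag1AndCyclesAndWn.txt`, `TrendAndCyclic.txt`, `TrendAndCyclicAndWn.txt` and `TrendAndCyclicAnd_4Wn.txt`) can be used to observe and analyze the impact of specific signal phenomena.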

Each observation is characterized by 2 variables described in the following table:

| Variable    | Description               | Example of values
:-------------|---------------------------|--------------------------
| `TIME`      | The date of the readings  | A date
| `Signal`    | The signal value          | A numerical value

### **Set the SQLAlchemy import along with IPython-SQL magic**

In [1]:
import sqlalchemy, os
from sqlalchemy import create_engine

%reload_ext sql
%config SqlMagic.displaylimit = 5
%config SqlMagic.feedback = False
%config SqlMagic.autopandas = True

### **Define the connection string and the target schema**

In [2]:
hxe_connection = 'hana://ML_USER:Welcome18@hxehost:39015'
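
The connection string has the form `hana://user:password@host:port`. As a minimal sketch, it could also be assembled from environment variables so that the credentials do not have to be typed into the notebook. The variable names `HXE_USER`, `HXE_PASSWORD`, `HXE_HOST` and `HXE_PORT` and the helper `build_hana_url` are hypothetical. The defaults are the values used in this notebook.

```python
import os


def build_hana_url(user, password, host, port):
    """Assemble a connection string of the form hana://user:password@host:port."""
    return "hana://{}:{}@{}:{}".format(user, password, host, port)


# Hypothetical environment variable names; each falls back to the notebook value.
hxe_connection = build_hana_url(
    os.environ.get("HXE_USER", "ML_USER"),
    os.environ.get("HXE_PASSWORD", "Welcome18"),
    os.environ.get("HXE_HOST", "hxehost"),
    os.environ.get("HXE_PORT", "39015"),
)
```

The helper does no escaping, so a password containing characters such as `@` or `:` would need to be URL-encoded first.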

### **Initialize the connection and set the schema**

In [3]:
%sql $hxe_connection

u'Connected: ML_USER@None'

### **Drop the tables if they exist**

In [4]:
%%sql 
drop table forecast_cashflow;
drop table forecast_ozone;
drop table forecast_lag_1_and_cycles;
drop table forecast_lag_1_and_cycles_and_wn;
drop table forecast_trend_and_cyclic;
drop table forecast_trend_and_cyclic_and_wn;
drop table forecast_trend_and_cyclic_and_4wn;

 * hana://ML_USER:***@hxehost:39015
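
A `drop table` statement raises an error when the table does not exist, which is the case on a first run. One way to tolerate this is to issue the statements one by one from Python and record which ones failed. The sketch below assumes a callable `execute` that runs a single SQL statement and raises on failure. How that callable is built (for example around a SQLAlchemy connection) is left open, and `drop_tables` and `FORECAST_TABLES` are hypothetical names.

```python
FORECAST_TABLES = [
    "forecast_cashflow",
    "forecast_ozone",
    "forecast_lag_1_and_cycles",
    "forecast_lag_1_and_cycles_and_wn",
    "forecast_trend_and_cyclic",
    "forecast_trend_and_cyclic_and_wn",
    "forecast_trend_and_cyclic_and_4wn",
]


def drop_tables(execute, tables):
    """Try to drop each table and return the (dropped, skipped) table names.

    `execute` is any callable that runs one SQL statement and raises on failure.
    """
    dropped, skipped = [], []
    for table in tables:
        try:
            execute("drop table {}".format(table))
            dropped.append(table)
        except Exception:  # broad on purpose: no driver error class is assumed
            skipped.append(table)
    return dropped, skipped
```

Catching every exception keeps the sketch independent of the database driver, but it also hides unrelated failures such as a lost connection, so the skipped list is worth checking.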


### **Create the forecast dataset tables**

In [5]:
%%sql 
create table forecast_cashflow (
    cashdate                    daydate,
    workingdaysindices          integer,
    reverseworkingdaysindices   integer,
    mondaymonthind              integer,
    tuesdaymonthind             integer,
    wednesdaymonthind           integer,
    thursdaymonthind            integer,
    fridaymonthind              integer,
    beforelastmonday            integer,
    lastmonday                  integer,
    beforelasttuesday           integer,
    lasttuesday                 integer,
    beforelastwednesday         integer,
    lastwednesday               integer,
    beforelastthursday          integer,
    lastthursday                integer,
    beforelastfriday            integer,
    lastfriday                  integer,
    last5wdaysind               integer,
    last5wdays                  integer,
    last4wdaysind               integer,
    last4wdays                  integer,
    lastwmonth                  integer,
    beforelastwmonth            integer,
    cash                        double,
    primary key (cashdate)
);
create table forecast_ozone (
    time                 daydate,
    reading              double,
    primary key (time)
);
create table forecast_lag_1_and_cycles (
    time                 daydate,
    signal               double,
    primary key (time)
);
create table forecast_lag_1_and_cycles_and_wn (
    time                 daydate,
    signal               double,
    primary key (time)
);
create table forecast_trend_and_cyclic (
    time                 daydate,
    signal               double,
    primary key (time)
);
create table forecast_trend_and_cyclic_and_wn (
    time                 daydate,
    signal               double,
    primary key (time)
);
create table forecast_trend_and_cyclic_and_4wn (
    time                 daydate,
    signal               double,
    primary key (time)
);

 * hana://ML_USER:***@hxehost:39015


### **Import the data into the forecast dataset tables**

In [6]:
%%sql 
import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/CashFlows.txt' into forecast_cashflow
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/CashFlows.txt.err'
;
import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/R_ozone-la.txt' into forecast_ozone
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/R_ozone-la.txt.err'
;
import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/Lag1AndCycles.txt' into forecast_lag_1_and_cycles
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/Lag1AndCycles.txt.err'
;
import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/Lag1AndCyclesAndWn.txt' into forecast_lag_1_and_cycles_and_wn
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/Lag1AndCyclesAndWn.txt.err'
;
import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/TrendAndCyclic.txt' into forecast_trend_and_cyclic
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/TrendAndCyclic.txt.err'
;
import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/TrendAndCyclicAndWn.txt' into forecast_trend_and_cyclic_and_wn
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/TrendAndCyclicAndWn.txt.err'
;

import from csv file '/usr/sap/HXE/HDB90/work/data/forecast/TrendAndCyclicAnd_4Wn.txt' into forecast_trend_and_cyclic_and_4wn
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '/home/jupyteradm/log/TrendAndCyclicAnd_4Wn.txt.err'
;

 * hana://ML_USER:***@hxehost:39015
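
The seven import statements above differ only in the file name and the target table. As a sketch, they could be generated from a single template, which makes the shared options easier to change in one place. The names `DATA_DIR`, `LOG_DIR`, `SOURCES` and `import_statement` are hypothetical. The paths and options are the ones used in this notebook.

```python
DATA_DIR = "/usr/sap/HXE/HDB90/work/data/forecast"
LOG_DIR = "/home/jupyteradm/log"

# (file name, target table) pairs, in the order used above
SOURCES = [
    ("CashFlows.txt", "forecast_cashflow"),
    ("R_ozone-la.txt", "forecast_ozone"),
    ("Lag1AndCycles.txt", "forecast_lag_1_and_cycles"),
    ("Lag1AndCyclesAndWn.txt", "forecast_lag_1_and_cycles_and_wn"),
    ("TrendAndCyclic.txt", "forecast_trend_and_cyclic"),
    ("TrendAndCyclicAndWn.txt", "forecast_trend_and_cyclic_and_wn"),
    ("TrendAndCyclicAnd_4Wn.txt", "forecast_trend_and_cyclic_and_4wn"),
]

# Raw string: the \n and \t sequences must reach the database unchanged.
TEMPLATE = r"""import from csv file '{data_dir}/{file_name}' into {table}
with
   record delimited by '\n'
   field delimited by '\t'
   optionally enclosed by '"'
   skip first 1 row
   fail on invalid data
   error log '{log_dir}/{file_name}.err'
"""


def import_statement(file_name, table):
    """Return the import statement for one tab-delimited file."""
    return TEMPLATE.format(
        data_dir=DATA_DIR, log_dir=LOG_DIR, file_name=file_name, table=table
    )


for file_name, table in SOURCES:
    print(import_statement(file_name, table))
```

The printed statements match the ones in the cell above and can be run in the same way.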


### **Count the number of rows loaded**

In [7]:
%%sql 
select 'cashflow'                 as "TABLE", count(1) as "COUNT" FROM forecast_cashflow
union all
select 'ozone'                    as "TABLE", count(1) as "COUNT" FROM forecast_ozone
union all
select 'lag_1_and_cycles'         as "TABLE", count(1) as "COUNT" FROM forecast_lag_1_and_cycles
union all
select 'lag_1_and_cycles_and_wn'  as "TABLE", count(1) as "COUNT" FROM forecast_lag_1_and_cycles_and_wn
union all
select 'trend_and_cyclic'         as "TABLE", count(1) as "COUNT" FROM forecast_trend_and_cyclic
union all
select 'trend_and_cyclic_and_wn'  as "TABLE", count(1) as "COUNT" FROM forecast_trend_and_cyclic_and_wn
union all
select 'trend_and_cyclic_and_4wn' as "TABLE", count(1) as "COUNT" FROM forecast_trend_and_cyclic_and_4wn
;

 * hana://ML_USER:***@hxehost:39015


| Table                      | Count
:----------------------------|-------
| `cashflow`                 | 272
| `ozone`                    | 204
| `lag_1_and_cycles`         | 499
| `lag_1_and_cycles_and_wn`  | 499
| `trend_and_cyclic`         | 500
| `trend_and_cyclic_and_wn`  | 500
| `trend_and_cyclic_and_4wn` | 500
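
To make the check repeatable, the counts returned by the query can be compared with the counts shown above. This is a minimal sketch: `EXPECTED_COUNTS` and `count_mismatches` are hypothetical names, and `rows` is assumed to be an iterable of `(table, count)` pairs. Since `SqlMagic.autopandas` is enabled, such pairs could be taken from the returned DataFrame, for example with `zip(df['table'], df['count'])`.

```python
# Row counts shown in the output above
EXPECTED_COUNTS = {
    "cashflow": 272,
    "ozone": 204,
    "lag_1_and_cycles": 499,
    "lag_1_and_cycles_and_wn": 499,
    "trend_and_cyclic": 500,
    "trend_and_cyclic_and_wn": 500,
    "trend_and_cyclic_and_4wn": 500,
}


def count_mismatches(rows, expected=EXPECTED_COUNTS):
    """Return {table: (expected, actual)} for every table whose count differs.

    A table missing from `rows` is reported with an actual count of None.
    """
    actual = dict(rows)
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

An empty result means every file was loaded completely. With `fail on invalid data` set, a lower count more likely points to a truncated or different source file than to rejected rows.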
