
This is the first in a series of examples that demonstrate how to use SAS Visual Data
Mining and Machine Learning actions to compose a program that follows a standard machine learning process:
- loading data,
- preparing the data,
- building models, and
- assessing and comparing those models.

The programs are written to execute in the CAS in-memory distributed computing engine in the SAS Viya environment.

This first example shows how to load local data into CAS.

### Import packages

In [1]:
from swat import CAS

### CAS Server connection details

In [4]:
cashost='localhost'
casport=5570
casauth='~/.authinfo'

### Start CAS session

In [6]:
sess = CAS(cashost, casport, authinfo=casauth, caslib="casuser")

### Details for local data to be loaded into CAS

In [7]:
indata_dir="/opt/sasinside/DemoData"
indata="bank_raw"
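The directory and data set name above are later joined into a full file path. A minimal sketch of that step, assuming the `.sas7bdat` extension used later in this example; the helper name `local_data_path` is hypothetical and not part of SWAT:

```python
from pathlib import PurePosixPath


def local_data_path(directory, name, suffix=".sas7bdat"):
    """Join a directory and a data set name into a full file path."""
    # PurePosixPath only manipulates the path string; it does not touch the disk.
    return PurePosixPath(directory) / f"{name}{suffix}"


indata_dir = "/opt/sasinside/DemoData"
indata = "bank_raw"

path = local_data_path(indata_dir, indata)
print(path)  # /opt/sasinside/DemoData/bank_raw.sas7bdat
```

Building the path in one place avoids repeating the string concatenation wherever the file is referenced.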

### Import table action set

In [8]:
sess.loadactionset(actionset="table")

NOTE: Added action set 'table'.


### Load data into CAS

The data set used for this workflow is anonymized bank data consisting of observations taken
on a large financial services firm's accounts. Accounts in the data represent consumers of
home equity lines of credit, automobile loans, and other types of short- to medium-term credit
instruments. A campaign interval for the bank runs for half a year and covers all marketing
efforts that inform customers about, and motivate the purchase of, the bank's financial services
products.

- The bank_raw data set is the original data in its raw form.
- The bank data set is the result of applying appropriate data cleansing to the raw data.

The target variable "b_tgt" indicates account responses over the current campaign season (1
for at least one purchase, 0 for no purchases). A description of all variables can be found in
the data dictionary for this data set, available in "BankData" in your File Shortcuts.
                                                                          
For execution in the CAS engine, data must be loaded from the local data set into a CAS table.
This code first checks whether the specified CAS table already exists. If it does not, the local
data set is uploaded; otherwise, the existing table is referenced, so that tbl is defined in either case.

In [9]:
if not sess.table.tableExists(table=indata).exists:
    tbl = sess.upload_file(indata_dir+"/"+indata+".sas7bdat", casout={"name":indata})
else:
    tbl = sess.CASTable(indata)

NOTE: Cloud Analytic Services made the uploaded file available as table BANK_RAW in caslib CASUSER(viyauser).
NOTE: The table BANK_RAW has been created in caslib CASUSER(viyauser) from binary data uploaded to Cloud Analytic Services.
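The check-then-load step can also be wrapped in a small reusable function. This is a sketch under assumptions: `ensure_table` is a hypothetical helper name, and it passes the table name through the `name` parameter of `table.tableExists`, which may differ from the call used in this notebook. It relies only on the session's `upload_file` and `CASTable` methods.

```python
def ensure_table(sess, name, path):
    """Return a handle to CAS table `name`, uploading `path` only if needed."""
    # A zero `exists` value means the table is absent; any nonzero value
    # means it is already available in the active caslib.
    if sess.table.tableExists(name=name).exists:
        return sess.CASTable(name)
    # Upload the client-side file and name the resulting CAS table.
    return sess.upload_file(path, casout={"name": name})


# Example call (requires a live CAS session, so it is left commented out):
# tbl = ensure_table(sess, indata, indata_dir + "/" + indata + ".sas7bdat")
```

Returning a table handle from both branches means later cells can use the result regardless of whether the upload actually ran.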


In [10]:
tbl.head()

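| | b_tgt | cat_input1 | cat_input2 | cnt_tgt | demog_age | demog_ho | demog_pr | int_tgt | r_demog_homeval | r_demog_inc | ... | rfm6 | rfm7 | rfm8 | rfm9 | rfm10 | rfm11 | rfm12 | demog_genf | demog_genm | account |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | X | A | | | 0.0 | 24.0 | 7000.0 | 57600.0 | 52106.0 | ... | 22.0 | 4.0 | 6.0 | 5.0 | 20.0 | 9.0 | 92.0 | 0.0 | 1.0 | 100000001 |
| 1 | 1.0 | X | A | 2.0 | | 0.0 | 24.0 | 7000.0 | 57587.0 | 52106.0 | ... | 22.0 | 4.0 | 6.0 | 5.0 | 20.0 | 9.0 | 92.0 | 0.0 | 1.0 | 100000002 |
| 2 | 1.0 | X | A | 2.0 | | 0.0 | 0.0 | 15000.0 | 44167.0 | 42422.0 | ... | 16.0 | 3.0 | 8.0 | 16.0 | 27.0 | 11.0 | 91.0 | 0.0 | 1.0 | 100000003 |
| 3 | 0.0 | X | A | 0.0 | 68.0 | 0.0 | 32.0 | | 90587.0 | 59785.0 | ... | 21.0 | 2.0 | 7.0 | 15.0 | 19.0 | 9.0 | 123.0 | 1.0 | 0.0 | 100000004 |
| 4 | 0.0 | X | A | 0.0 | | 0.0 | 0.0 | | 100313.0 | | ... | 38.0 | 5.0 | 19.0 | 24.0 | 13.0 | 6.0 | 128.0 | 1.0 | 0.0 | 100000005 |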


### End CAS session

In [8]:
sess.close()
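If a cell raises an error before the final `sess.close()` runs, the session stays open. One way to guard against that, sketched here with the standard library's `contextlib.closing`, is to tie the session's lifetime to a `with` block. The helper `run_with_session` and its two callable arguments are hypothetical; the only assumption about the session object is that it has a `close()` method, as used above.

```python
from contextlib import closing


def run_with_session(connect, work):
    """Open a session with `connect()`, run `work(sess)`, and always close it."""
    # closing() calls sess.close() on exit, even if work() raises.
    with closing(connect()) as sess:
        return work(sess)


# Example call (requires a live CAS server, so it is left commented out):
# run_with_session(
#     lambda: CAS(cashost, casport, authinfo=casauth, caslib="casuser"),
#     lambda sess: sess.loadactionset(actionset="table"),
# )
```

In an interactive notebook an explicit `sess.close()` cell is often more convenient; the wrapper is mainly useful when the same steps run as a script.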