# Chapter 4 - Managing Your Data in CAS

## Getting Started with CASLibs and CAS Tables

Create a connection to CAS and list the CASLibs using the **caslibinfo** action.

In [3]:
import swat

conn = swat.CAS('server-name.mycompany.com', 5570, 'username', 'password')

In [4]:
conn.caslibinfo()

Unnamed: 0,Name,Type,Description,Path,Definition,Subdirs,Local,Active,Personal,Hidden
0,CASUSER(username),PATH,Personal File System Caslib,/u/username/,,1.0,0.0,1.0,1.0,0.0


List the items at a path relative to the given CASLib using the **fileinfo** action.

In [5]:
conn.fileinfo('data', caslib='casuser')

Unnamed: 0,Permission,Owner,Group,Name,Size,Encryption,Time
0,-rw-r--r--,username,users,iris.csv,3716,,18Feb2016:17:25:43
1,-rw-r--r--,username,users,cars.csv,42177,,18Feb2016:17:25:49
2,-rw-r--r--,username,users,sashelp_class.sashdat,82136,NONE,19Feb2016:17:02:37
3,-rwxr-xr-x,username,users,foo.csv,3716,,06Mar2016:16:59:11
4,-rwxr-xr-x,username,users,foo.sashdat,19472,NONE,06Mar2016:16:59:41
5,-rwxr-xr-x,username,users,foo.xlsx.sashdat,19472,NONE,06Mar2016:17:00:32
6,-rwxr-xr-x,username,users,foo.xls.sashdat,19472,NONE,06Mar2016:17:00:45
7,-rwxr-xr-x,username,users,hmeq_f.sas7bdat,786432,,19Nov2015:15:49:15
8,-rw-r--r--,username,users,class.csv,519,,14Apr2016:10:39:02
9,-rw-r--r--,username,users,stock_history_small_mv_blanks.csv,2865,,15Apr2016:10:20:25


Use the **fileinfo** action with the active CASLib (i.e., casuser).

In [6]:
conn.fileinfo('data')

Unnamed: 0,Permission,Owner,Group,Name,Size,Encryption,Time
0,-rw-r--r--,username,users,iris.csv,3716,,18Feb2016:17:25:43
1,-rw-r--r--,username,users,cars.csv,42177,,18Feb2016:17:25:49
2,-rw-r--r--,username,users,sashelp_class.sashdat,82136,NONE,19Feb2016:17:02:37
3,-rwxr-xr-x,username,users,foo.csv,3716,,06Mar2016:16:59:11
4,-rwxr-xr-x,username,users,foo.sashdat,19472,NONE,06Mar2016:16:59:41
5,-rwxr-xr-x,username,users,foo.xlsx.sashdat,19472,NONE,06Mar2016:17:00:32
6,-rwxr-xr-x,username,users,foo.xls.sashdat,19472,NONE,06Mar2016:17:00:45
7,-rwxr-xr-x,username,users,hmeq_f.sas7bdat,786432,,19Nov2015:15:49:15
8,-rw-r--r--,username,users,class.csv,519,,14Apr2016:10:39:02
9,-rw-r--r--,username,users,stock_history_small_mv_blanks.csv,2865,,15Apr2016:10:20:25


## Loading Data into a CAS Table

Load data from a server-side file using the **loadtable** action.

In [8]:
out = conn.loadtable('data/iris.csv', caslib='casuser')
out

ERROR: The table DATA.IRIS already exists in caslib CASUSER(username).
ERROR: The action stopped due to errors.


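The error above indicates that a table with the default name DATA.IRIS already exists in the CASLib.  To avoid the name clash, specify an output table name explicitly.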

In [9]:
out = conn.loadtable('data/iris.csv', caslib='casuser',
                     casout=dict(name='mydata', caslib='casuser'))
out

NOTE: Cloud Analytic Services made the file data/iris.csv available as table MYDATA in caslib CASUSER(username).


Get information about the table using the **tableinfo** action.

In [14]:
conn.tableinfo(name='data.iris', caslib='casuser')

Unnamed: 0,Name,Rows,Columns,Encoding,CreateTimeFormatted,ModTimeFormatted,JavaCharSet,CreateTime,ModTime,Global,Repeated,View,SourceName,SourceCaslib,Compressed,Creator,Modifier
0,DATA.IRIS,150,5,utf-8,01Nov2016:14:42:17,01Nov2016:14:42:17,UTF8,1793631000.0,1793631000.0,0,0,0,data/iris.csv,CASUSER(username),0,username,


Get information about the table columns using the **columninfo** action.

In [15]:
conn.columninfo(table=dict(name='data.iris', caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,NFL,NFD
0,sepal_length,1,double,8,12,0,0
1,sepal_width,2,double,8,12,0,0
2,petal_length,3,double,8,12,0,0
3,petal_width,4,double,8,12,0,0
4,species,5,varchar,10,10,0,0


## Displaying Data in a CAS Table

Use the **fetch** action to download rows of data.

In [16]:
conn.fetch(table=dict(name='data.iris', caslib='casuser'), to=5)

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


Specify sorting options to get a predictable set of data.

In [17]:
conn.fetch(table=dict(name='data.iris', caslib='casuser'), to=5,
           sortby=['sepal_length', 'sepal_width'])

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,4.3,3.0,1.1,0.1,setosa
1,4.4,2.9,1.4,0.2,setosa
2,4.4,3.0,1.3,0.2,setosa
3,4.4,3.2,1.3,0.2,setosa
4,4.5,2.3,1.3,0.3,setosa
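
The two-key ordering requested above behaves like a tuple sort: rows are ordered by sepal_length first, and sepal_width breaks ties.  The following is a minimal client-side sketch in plain Python, using a handful of value pairs copied from the output above.  The `rows` list and variable names are illustrative and are not part of SWAT.

```python
# Illustrative rows: (sepal_length, sepal_width), deliberately out of order.
rows = [(4.4, 3.2), (4.5, 2.3), (4.3, 3.0), (4.4, 2.9), (4.4, 3.0)]

# Sorting on a tuple key orders by the first element, then the second.
ordered = sorted(rows, key=lambda r: (r[0], r[1]))

for sepal_length, sepal_width in ordered:
    print(sepal_length, sepal_width)
```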


## Computing Simple Statistics

Run the **summary** action on the table.

In [18]:
conn.summary(table=dict(name='data.iris', caslib='casuser'))

Unnamed: 0,Column,Min,Max,N,NMiss,Mean,Sum,Std,StdErr,Var,USS,CSS,CV,TValue,ProbT
0,sepal_length,4.3,7.9,150.0,0.0,5.843333,876.5,0.828066,0.067611,0.685694,5223.85,102.168333,14.171126,86.425375,3.331256e-129
1,sepal_width,2.0,4.4,150.0,0.0,3.054,458.1,0.433594,0.035403,0.188004,1427.05,28.0126,14.197587,86.264297,4.3749770000000004e-129
2,petal_length,1.0,6.9,150.0,0.0,3.758667,563.8,1.76442,0.144064,3.113179,2583.0,463.863733,46.942721,26.090198,1.9943049999999999e-57
3,petal_width,0.1,2.5,150.0,0.0,1.198667,179.8,0.763161,0.062312,0.582414,302.3,86.779733,63.66747,19.236588,3.2097039999999996e-42
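
Several of the columns in this output are related by simple formulas.  The sketch below uses the conventional textbook definitions, which appear consistent with the sepal_length row above (for example, 0.828066 / sqrt(150) is approximately 0.067611).  It is an illustration only, not the CAS implementation, and `summary_stats` is a hypothetical helper.

```python
import math

def summary_stats(values):
    """Compute summary-style statistics for a list of numbers."""
    n = len(values)
    total = sum(values)
    mean = total / n
    uss = sum(v * v for v in values)            # uncorrected sum of squares
    css = sum((v - mean) ** 2 for v in values)  # corrected sum of squares
    var = css / (n - 1)                         # sample variance
    std = math.sqrt(var)
    stderr = std / math.sqrt(n)
    return {
        'N': n, 'Sum': total, 'Mean': mean, 'USS': uss, 'CSS': css,
        'Var': var, 'Std': std, 'StdErr': stderr,
        'CV': 100.0 * std / mean,   # coefficient of variation, in percent
        'TValue': mean / stderr,    # t statistic for the hypothesis mean == 0
    }

stats = summary_stats([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)
```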


## Dropping a CAS Table

In [19]:
conn.droptable('data.iris', caslib='casuser')

NOTE: Cloud Analytic Services dropped table data.iris from caslib CASUSER(username).


## The Active CASLib

Use the **caslibinfo** action to display information about CASLibs.  The Active column indicates whether the CASLib is the active CASLib.

In [20]:
conn.caslibinfo()

Unnamed: 0,Name,Type,Description,Path,Definition,Subdirs,Local,Active,Personal,Hidden
0,CASUSER(username),PATH,Personal File System Caslib,/u/username/,,1.0,0.0,1.0,1.0,0.0


You can get the active CASLib setting using the **getsessopt** action.

In [21]:
conn.getsessopt('caslib')

You can set the active CASLib using the **setsessopt** action.  The value must name a CASLib that exists in the session; here, otherlib has not been defined, so the action fails.

In [23]:
conn.setsessopt(caslib='otherlib')

ERROR: The value otherlib for session option CASLIB is invalid.
ERROR: The action stopped due to errors.


## Uploading Data Files to CAS Tables

Use the **upload** method on **CAS** connection objects to upload data from client-side files.  This uploads the file to the server as-is.  It is then parsed on the server.

In [25]:
conn.upload('/u/username/data/iris.csv')

NOTE: Cloud Analytic Services made the uploaded file available as table IRIS in caslib CASUSER(username).
NOTE: The table IRIS has been created in caslib CASUSER(username) from binary data uploaded to Cloud Analytic Services.


In [26]:
conn.columninfo(table=dict(name='iris', caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,NFL,NFD
0,sepal_length,1,double,8,12,0,0
1,sepal_width,2,double,8,12,0,0
2,petal_length,3,double,8,12,0,0
3,petal_width,4,double,8,12,0,0
4,species,5,varchar,10,10,0,0


Specify an explicit table name on **upload**.

In [27]:
conn.upload('/u/username/data/iris.csv', casout=dict(name='iris2', caslib='casuser'))

NOTE: Cloud Analytic Services made the uploaded file available as table IRIS2 in caslib CASUSER(username).
NOTE: The table IRIS2 has been created in caslib CASUSER(username) from binary data uploaded to Cloud Analytic Services.


The **upload** method will pass a given **importoptions=** parameter to the underlying **loadtable** action.

In [4]:
out = conn.upload('/u/username/data/iris.tsv',
                  importoptions=dict(filetype='csv', delimiter='\t'),
                  casout=dict(name='iris_tsv', caslib='casuser'))
out

NOTE: Cloud Analytic Services made the uploaded file available as table IRIS_TSV in caslib CASUSER(username).
NOTE: The table IRIS_TSV has been created in caslib CASUSER(username) from binary data uploaded to Cloud Analytic Services.
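
The call above assumes that a tab-delimited copy of the iris data exists on the client.  One way such a file could be produced from CSV with the standard library is sketched below.  The sample rows and the temporary path are illustrative.

```python
import csv
import io
import os
import tempfile

# A few illustrative rows in CSV form (same columns as iris.csv).
csv_text = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
)

# Write to a temporary directory so the sketch does not touch real data.
tsv_path = os.path.join(tempfile.mkdtemp(), 'iris.tsv')

with open(tsv_path, 'w', newline='') as out:
    writer = csv.writer(out, delimiter='\t', lineterminator='\n')
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow(row)

# Read the file back to confirm the delimiter that was written.
with open(tsv_path) as f:
    tsv_lines = f.read().splitlines()

print(tsv_lines[0])
```

The delimiter given to `csv.writer` here is the same character that `delimiter='\t'` names in **importoptions=**.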


## Uploading Data from URLs to CAS Tables

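Rather than specifying a filename, you can specify a URL.  The URL must return the data file itself, so for a file hosted on GitHub use the raw-content address rather than the HTML page that displays the file.

In [33]:
conn.upload('https://raw.githubusercontent.com/sassoftware/'
            'sas-viya-programming/master/data/class.csv')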

NOTE: Cloud Analytic Services made the uploaded file available as table CLASS in caslib CASUSER(username).
NOTE: The table CLASS has been created in caslib CASUSER(username) from binary data uploaded to Cloud Analytic Services.


## Uploading Data from a Pandas DataFrame to a CAS Table

In addition to files, you can upload Pandas **DataFrames**.  Note, however, that the **DataFrame** will be exported to a CSV file, then uploaded.

In [34]:
import pandas as pd

df = pd.read_csv('/u/username/data/iris.csv')
df.head()

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


In [36]:
conn.upload(df, casout=dict(name='iris_df', caslib='casuser'))

NOTE: Cloud Analytic Services made the uploaded file available as table IRIS_DF in caslib CASUSER(username).
NOTE: The table IRIS_DF has been created in caslib CASUSER(username) from binary data uploaded to Cloud Analytic Services.


In [37]:
conn.fetch(table=dict(name='iris_df', caslib='casuser'), to=5)

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


# Using Data Message Handlers

Data message handlers allow you to write custom data loaders.

In [2]:
from swat.cas import datamsghandlers as dmh

Display all of the SWAT-supplied data message handler subclasses.

In [39]:
dmh.CASDataMsgHandler.__subclasses__()

[swat.cas.datamsghandlers.PandasDataFrame, swat.cas.datamsghandlers.DBAPI]

In [40]:
dmh.PandasDataFrame.__subclasses__()

[swat.cas.datamsghandlers.JSON,
 swat.cas.datamsghandlers.CSV,
 swat.cas.datamsghandlers.Clipboard,
 swat.cas.datamsghandlers.Text,
 swat.cas.datamsghandlers.SQLQuery,
 swat.cas.datamsghandlers.SAS7BDAT,
 swat.cas.datamsghandlers.HTML,
 swat.cas.datamsghandlers.Excel,
 swat.cas.datamsghandlers.FWF,
 swat.cas.datamsghandlers.SQLTable]

## The HTML Data Message Handler

In [41]:
htmldmh = dmh.HTML('https://www.fdic.gov/bank/' +         
                   'individual/failed/banklist.html', index=0)

Display the **addtable** parameters created by the HTML data message handler.

In [42]:
htmldmh.args.addtable

{'datamsghandler': <swat.cas.datamsghandlers.HTML at 0x7f638ca7a630>,
 'reclen': 104,
 'vars': [{'length': 16,
   'name': 'Bank Name',
   'offset': 0,
   'rtype': 'CHAR',
   'type': 'VARCHAR'},
  {'length': 16,
   'name': 'City',
   'offset': 16,
   'rtype': 'CHAR',
   'type': 'VARCHAR'},
  {'length': 16,
   'name': 'ST',
   'offset': 32,
   'rtype': 'CHAR',
   'type': 'VARCHAR'},
  {'length': 8,
   'name': 'CERT',
   'offset': 48,
   'rtype': 'NUMERIC',
   'type': 'INT64'},
  {'length': 16,
   'name': 'Acquiring Institution',
   'offset': 56,
   'rtype': 'CHAR',
   'type': 'VARCHAR'},
  {'length': 16,
   'name': 'Closing Date',
   'offset': 72,
   'rtype': 'CHAR',
   'type': 'VARCHAR'},
  {'length': 16,
   'name': 'Updated Date',
   'offset': 88,
   'rtype': 'CHAR',
   'type': 'VARCHAR'}]}
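
In the parameters above, each variable's offset equals the sum of the lengths of the variables before it, and reclen equals the total of all lengths.  A small sketch that reproduces those numbers follows; `layout` is a hypothetical helper, not a SWAT function.

```python
def layout(lengths):
    """Return (offsets, reclen) for fixed-width fields packed back to back."""
    offsets = []
    position = 0
    for length in lengths:
        offsets.append(position)
        position += length
    return offsets, position

# Field lengths copied from the 'vars' list shown above.
lengths = [16, 16, 16, 8, 16, 16, 16]
offsets, reclen = layout(lengths)
print(offsets, reclen)
```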

Call the **addtable** action using the generated parameters.

In [43]:
out = conn.addtable(table='banklist', caslib='casuser',
                    **htmldmh.args.addtable)
out

In [45]:
conn.columninfo(table=dict(name='banklist', caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,NFL,NFD
0,Bank Name,1,varchar,90,90,0,0
1,City,2,varchar,17,17,0,0
2,ST,3,varchar,2,2,0,0
3,CERT,4,int64,8,12,0,0
4,Acquiring Institution,5,varchar,65,65,0,0
5,Closing Date,6,varchar,18,18,0,0
6,Updated Date,7,varchar,18,18,0,0


Parse the dates in the columns at zero-based positions 5 and 6 (Closing Date and Updated Date).  Use the **replace=** option in the new **addtable** call to replace the existing table.

In [46]:
htmldmh = dmh.HTML('https://www.fdic.gov/bank/' +
                   'individual/failed/banklist.html',
                   index=0, parse_dates=[5, 6])

out = conn.addtable(table='banklist', caslib='casuser',
                    replace=True,
                    **htmldmh.args.addtable)

In [47]:
conn.columninfo(table=dict(name='banklist', 
                           caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,Format,NFL,NFD
0,Bank Name,1,varchar,90,90,,0,0
1,City,2,varchar,17,17,,0,0
2,ST,3,varchar,2,2,,0,0
3,CERT,4,int64,8,12,,0,0
4,Acquiring Institution,5,varchar,65,65,,0,0
5,Closing Date,6,datetime,8,20,DATETIME,0,0
6,Updated Date,7,datetime,8,20,DATETIME,0,0


Fetch a few rows of the data using **sastypes=False** so that we get actual dates in the resulting **DataFrame**.

In [48]:
conn.fetch(table=dict(name='banklist', caslib='casuser'), 
           sastypes=False, to=3)

Unnamed: 0,Bank Name,City,ST,CERT,Acquiring Institution,Closing Date,Updated Date
0,Allied Bank,Mulberry,AR,91,Today's Bank,2016-09-23,2016-10-17
1,The Woodbury Banking Company,Woodbury,GA,11297,United Bank,2016-08-19,2016-10-17
2,First CornerStone Bank,King of Prussia,PA,35312,First-Citizens Bank & Trust Company,2016-05-06,2016-09-06


## The Excel Data Message Handler

In [49]:
exceldmh = dmh.Excel('http://www.fsa.usda.gov/Internet/' +  
                     'FSA_File/disaster_cty_list_ytd_14.xls')

Add the data to the server.

In [50]:
out = conn.addtable(table='crops', caslib='casuser', 
                    **exceldmh.args.addtable)
out

In [52]:
conn.columninfo(table=dict(name='crops', caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,Format,NFL,NFD
0,FIPS,1,int64,8,12,,0,0
1,County,2,varchar,17,17,,0,0
2,State,3,varchar,14,14,,0,0
3,Designation Code,4,int64,8,12,,0,0
4,Designation Number,5,varchar,5,5,,0,0
5,DROUGHT,6,int64,8,12,,0,0
6,"FLOOD, Flash flooding",7,int64,8,12,,0,0
7,"Excessive rain, moisture, humidity",8,int64,8,12,,0,0
8,"Severe Storms, thunderstorms",9,int64,8,12,,0,0
9,Ground Saturation\nStanding Water,10,int64,8,12,,0,0


## The PandasDataFrame Data Message Handler

In [53]:
import pandas as pd

Read the Excel file into a Pandas **DataFrame**.

In [54]:
exceldf = pd.read_excel('http://www.fsa.usda.gov/Internet/' + 
                        'FSA_File/disaster_cty_list_ytd_14.xls')

Create a **PandasDataFrame** data message handler.

In [55]:
exceldmh = dmh.PandasDataFrame(exceldf)

Add the table to the server.

In [56]:
out = conn.addtable(table='dfcrops', caslib='casuser', 
                    **exceldmh.args.addtable)
out

## Using Data Message Handlers with Databases

In [58]:
import csv
import sqlite3

Create a SQLite database stored in the file iris.db.

In [59]:
sqlc = sqlite3.connect('iris.db')

In [60]:
cur = sqlc.cursor()

Define the table.

In [61]:
cur.execute('''CREATE TABLE iris (sepal_length REAL, 
                                           sepal_width REAL, 
                                           petal_length REAL, 
                                           petal_width REAL, 
                                           species CHAR(10));''')

<sqlite3.Cursor at 0x7f638ab94f10>

Parse the iris CSV file and format it as tuples.

In [63]:
with open('/u/username/data/iris.csv', 'r') as iris:
    data = csv.DictReader(iris)
    rows = [(x['sepal_length'], 
             x['sepal_width'], 
             x['petal_length'], 
             x['petal_width'], 
             x['species']) for x in data]

Load the data into the database.

In [64]:
cur.executemany('''INSERT INTO iris (sepal_length, 
                                              sepal_width, 
                                              petal_length, 
                                              petal_width, 
                                              species) 
                            VALUES (?, ?, ?, ?, ?);''', rows)

<sqlite3.Cursor at 0x7f638ab94f10>

In [65]:
sqlc.commit()

Verify that the data looks correct.

In [66]:
cur.execute('SELECT * from iris')

<sqlite3.Cursor at 0x7f638ab94f10>

In [67]:
cur.fetchmany(5)

[(5.1, 3.5, 1.4, 0.2, 'setosa'),
 (4.9, 3.0, 1.4, 0.2, 'setosa'),
 (4.7, 3.2, 1.3, 0.2, 'setosa'),
 (4.6, 3.1, 1.5, 0.2, 'setosa'),
 (5.0, 3.6, 1.4, 0.2, 'setosa')]
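
The same load-and-verify steps can be tried without touching the file system by giving sqlite3 the special name ':memory:' and reading CSV text from a string.  This is a minimal sketch; the two-row sample, the shortened column set, and the explicit float conversion are illustrative choices.

```python
import csv
import io
import sqlite3

csv_text = (
    "sepal_length,sepal_width,species\n"
    "5.1,3.5,setosa\n"
    "7.0,3.2,versicolor\n"
)

# ':memory:' creates a database that lives only as long as the connection.
db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE iris (sepal_length REAL, sepal_width REAL, '
           'species CHAR(10))')

# Convert numeric fields explicitly instead of relying on column affinity.
rows = [(float(r['sepal_length']), float(r['sepal_width']), r['species'])
        for r in csv.DictReader(io.StringIO(csv_text))]

db.executemany('INSERT INTO iris VALUES (?, ?, ?)', rows)
db.commit()

loaded = db.execute('SELECT * FROM iris ORDER BY rowid').fetchall()
print(loaded)
db.close()
```

An in-memory SQLite database is private to the connection that created it, so a separate database engine would not see it.  A database file such as iris.db can be opened by more than one connection.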

Create an SQLAlchemy database engine.

In [68]:
eng = dmh.SQLTable.create_engine('sqlite:///iris.db')

Create the data message handler.

In [70]:
sqldmh = dmh.SQLTable('iris', eng)

Load the database table into CAS.

In [71]:
out = conn.addtable(table='iris_sql', caslib='casuser', 
                    **sqldmh.args.addtable)
out

Check the data in the server.

In [73]:
conn.columninfo(table=dict(name='iris_sql', caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,NFL,NFD
0,sepal_length,1,double,8,12,0,0
1,sepal_width,2,double,8,12,0,0
2,petal_length,3,double,8,12,0,0
3,petal_width,4,double,8,12,0,0
4,species,5,varchar,10,10,0,0


In [74]:
conn.fetch(table=dict(name='iris_sql', caslib='casuser'), to=5)

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


Set up a query to use with the data message handler.

In [76]:
sqldmh = dmh.SQLQuery('''SELECT * FROM iris
                             WHERE species = 'versicolor'
                                   AND sepal_length > 6.6''', eng)

Load the result of the query into CAS.

In [77]:
out = conn.addtable(table='iris_sql2', caslib='casuser', 
                    **sqldmh.args.addtable)
out

Check the data on the server.

In [79]:
conn.fetch(table=dict(name='iris_sql2', caslib='casuser'))

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,7.0,3.2,4.7,1.4,versicolor
1,6.9,3.1,4.9,1.5,versicolor
2,6.7,3.1,4.4,1.4,versicolor
3,6.8,2.8,4.8,1.4,versicolor
4,6.7,3.0,5.0,1.7,versicolor
5,6.7,3.1,4.7,1.5,versicolor


### Streaming Data from a Database into a CAS Table

In [80]:
import sqlite3

In [81]:
sqlc = sqlite3.connect('iris.db')

In [82]:
c = sqlc.cursor()

Execute a query on the database.

In [83]:
c.execute('SELECT * FROM iris')

<sqlite3.Cursor at 0x7f6389f60420>

Create a **DBAPI** data message handler to stream the data.

In [84]:
dbdmh = dmh.DBAPI(sqlite3, c, nrecs=10)
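
The general DB-API pattern for reading a result set in fixed-size batches is `cursor.fetchmany(n)`.  The sketch below shows that pattern with an in-memory SQLite table.  It illustrates batching in general and does not describe how **DBAPI** is implemented inside SWAT; the `batches` helper and the sample table are hypothetical.

```python
import sqlite3

def batches(cursor, nrecs):
    """Yield lists of at most nrecs rows until the cursor is exhausted."""
    while True:
        chunk = cursor.fetchmany(nrecs)
        if not chunk:
            return
        yield chunk

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE t (n INTEGER)')
db.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(25)])

cur = db.execute('SELECT n FROM t ORDER BY n')
sizes = [len(chunk) for chunk in batches(cur, 10)]
print(sizes)
db.close()
```

Reading in batches keeps only one chunk of rows in client memory at a time, which is the reason to stream a large query rather than fetch it all at once.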

Run the **addtable** action.

In [85]:
conn.addtable(table='iris_db', caslib='casuser', 
              **dbdmh.args.addtable)

Verify the data on the server.

In [86]:
conn.columninfo(table=dict(name='iris_db', caslib='casuser'))

Unnamed: 0,Column,ID,Type,RawLength,FormattedLength,NFL,NFD
0,sepal_length,1,double,8,12,0,0
1,sepal_width,2,double,8,12,0,0
2,petal_length,3,double,8,12,0,0
3,petal_width,4,double,8,12,0,0
4,species,5,varchar,10,10,0,0


In [87]:
conn.fetch(table=dict(name='iris_db', caslib='casuser'), to=5)

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


# Writing Your Own Data Message Handlers

Create a data message handler that subclasses from **CASDataMsgHandler**.

In [88]:
class MyDMH(dmh.CASDataMsgHandler):

    def __init__(self):
        self.data = [
            ('Alfred',  'M', 14, 69,   112.5),
            ('Alice',   'F', 13, 56.5, 84),
            ('Barbara', 'F', 13, 65.3, 98),
            ('Carol',   'F', 14, 62.8, 102.5),
            ('Henry',   'M', 14, 63.5, 102.5),           
        ] 

        vars = [
            dict(name='name', label='Name', type='varchar'),
            dict(name='sex', label='Sex', type='varchar'),
            dict(name='age', label='Age', type='int32'),
            dict(name='height', label='Height', type='double'),
            dict(name='weight', label='Weight', type='double'),
        ]

        super(MyDMH, self).__init__(vars)

    def getrow(self, row):
        try:
            return self.data[row]
        except IndexError:
            return
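
The **getrow** method above receives a zero-based row index and returns one row, or nothing once the data is exhausted.  The sketch below mimics that contract with a stand-in class that does not depend on SWAT, plus a hypothetical `drain` loop that stops at the first `None`.  SWAT's own loop may differ.

```python
class ListRows:
    """Stand-in with the same getrow contract as the handler above."""

    def __init__(self, data):
        self.data = data

    def getrow(self, row):
        try:
            return self.data[row]
        except IndexError:
            return None  # signals that there are no more rows


def drain(handler):
    """Collect rows by calling getrow(0), getrow(1), ... until None."""
    rows = []
    index = 0
    while True:
        record = handler.getrow(index)
        if record is None:
            return rows
        rows.append(record)
        index += 1


collected = drain(ListRows([('Alfred', 'M', 14), ('Alice', 'F', 13)]))
print(collected)
```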

Create an instance of the data message handler.

In [94]:
mydmh = MyDMH()

Call the **addtable** action.

In [95]:
conn.addtable(table='myclass', caslib='casuser',
              **mydmh.args.addtable)

Verify the data on the server.

In [97]:
conn.columninfo(table=dict(name='myclass', caslib='casuser'))

Unnamed: 0,Column,Label,ID,Type,RawLength,FormattedLength,NFL,NFD
0,name,Name,1,varchar,7,7,0,0
1,sex,Sex,2,varchar,1,1,0,0
2,age,Age,3,int32,4,12,0,0
3,height,Height,4,double,8,12,0,0
4,weight,Weight,5,double,8,12,0,0


In [98]:
conn.fetch(table=dict(name='myclass', caslib='casuser'), to=5)

Unnamed: 0,name,sex,age,height,weight
0,Alfred,M,14.0,69.0,112.5
1,Alice,F,13.0,56.5,84.0
2,Barbara,F,13.0,65.3,98.0
3,Carol,F,14.0,62.8,102.5
4,Henry,M,14.0,63.5,102.5


## Adding Data Transformers

Add a date column to the data message handler.

In [99]:
class MyDMH(dmh.CASDataMsgHandler):

    def __init__(self):
        self.data = [
            ('Alfred',  'M', 14, 69,   112.5, '1987-03-01'),
            ('Alice',   'F', 13, 56.5, 84,    '1988-06-12'),
            ('Barbara', 'F', 13, 65.3, 98,    '1988-12-13'),
            ('Carol',   'F', 14, 62.8, 102.5, '1987-04-17'),
            ('Henry',   'M', 14, 63.5, 102.5, '1987-01-30'),           
        ] 

        vars = [
            dict(name='name', label='Name', type='varchar'),
            dict(name='sex', label='Sex', type='varchar'),
            dict(name='age', label='Age', type='int32'),
            dict(name='height', label='Height', type='double'),
            dict(name='weight', label='Weight', type='double'),
            dict(name='birthdate', label='Birth Date',
                 type='date', format='DATE', formattedlength=12),
        ]

        super(MyDMH, self).__init__(vars)

    def getrow(self, row):
        try:
            return self.data[row]
        except IndexError:
            return

Take that same data message handler and add a **transformers=** parameter with a function that converts the date strings to CAS dates.

In [3]:
class MyDMH(dmh.CASDataMsgHandler):

    def __init__(self):
        self.data = [
            ('Alfred',  'M', 14, 69,   112.5, '1987-03-01'),
            ('Alice',   'F', 13, 56.5, 84,    '1988-06-12'),
            ('Barbara', 'F', 13, 65.3, 98,    '1988-12-13'),
            ('Carol',   'F', 14, 62.8, 102.5, '1987-04-17'),
            ('Henry',   'M', 14, 63.5, 102.5, '1987-01-30'),           
        ] 

        vars = [
            dict(name='name', label='Name', type='varchar'),
            dict(name='sex', label='Sex', type='varchar'),
            dict(name='age', label='Age', type='int32'),
            dict(name='height', label='Height', type='double'),
            dict(name='weight', label='Weight', type='double'),
            dict(name='birthdate', label='Birth Date',
                 type='date', format='DATE', formattedlength=12),
        ]

        transformers = {
            'birthdate': dmh.str2cas_date,
        }

        super(MyDMH, self).__init__(vars, transformers=transformers)

    def getrow(self, row):
        try:
            return self.data[row]
        except IndexError:
            return
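
A transformer is a function applied to a column's raw value before the row is sent to the server.  To make the idea concrete, the sketch below converts an ISO date string to a day count from 1 January 1960, the epoch that SAS uses for date values.  Both functions are hypothetical stand-ins written with the standard library; `dmh.str2cas_date` may parse its input and handle edge cases differently.

```python
from datetime import date, datetime

SAS_EPOCH = date(1960, 1, 1)  # assumption: SAS-style date epoch

def iso_to_sas_days(text):
    """Convert 'YYYY-MM-DD' to the number of days since 1960-01-01."""
    parsed = datetime.strptime(text, '%Y-%m-%d').date()
    return (parsed - SAS_EPOCH).days

def apply_transformers(row, names, transformers):
    """Apply per-column functions to one row; other columns pass through."""
    return tuple(transformers.get(name, lambda v: v)(value)
                 for name, value in zip(names, row))

names = ['name', 'birthdate']
converted = apply_transformers(('Alfred', '1987-03-01'), names,
                               {'birthdate': iso_to_sas_days})
print(converted)
```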


Create an instance of the data message handler.

In [4]:
mydmh = MyDMH()

Run the **addtable** action.

In [5]:
conn.addtable(table='myclass', caslib='casuser', replace=True,
              **mydmh.args.addtable)

Verify the data on the server.

In [7]:
conn.columninfo(table=dict(name='myclass', caslib='casuser'))

Unnamed: 0,Column,Label,ID,Type,RawLength,FormattedLength,Format,NFL,NFD
0,name,Name,1,varchar,7,7,,0,0
1,sex,Sex,2,varchar,1,1,,0,0
2,age,Age,3,int32,4,12,,0,0
3,height,Height,4,double,8,12,,0,0
4,weight,Weight,5,double,8,12,,0,0
5,birthdate,Birth Date,6,date,4,12,DATE,0,0


In [8]:
conn.fetch(table=dict(name='myclass', caslib='casuser'), sastypes=False)

Unnamed: 0,name,sex,age,height,weight,birthdate
0,Alfred,M,14,69.0,112.5,1987-03-01
1,Alice,F,13,56.5,84.0,1988-06-12
2,Barbara,F,13,65.3,98.0,1988-12-13
3,Carol,F,14,62.8,102.5,1987-04-17
4,Henry,M,14,63.5,102.5,1987-01-30


# Managing CASLibs

## Creating a CASLib

Create a new filesystem-based CASLib.

In [9]:
conn.addcaslib(path='/research/data', 
               caslib='research', 
               description='Research Data',  
               subdirs=False, 
               session=False, 
               activeonadd=False)

NOTE: Failed to resolve path /disk/research/data/ for caslib research.
NOTE: Cloud Analytic Services added the caslib 'research'.


Unnamed: 0,Name,Type,Description,Path,Definition,Subdirs,Local,Active,Personal,Hidden
0,research,PATH,Research Data,/disk/research/data/,,0.0,0.0,0.0,0.0,0.0


## Setting an Active CASLib

The active CASLib for a session can be set using the **setsessopt** action.

In [10]:
conn.setsessopt(caslib='research')

NOTE: 'research' is now the active caslib.


In [11]:
conn.caslibinfo(caslib='research')

Unnamed: 0,Name,Type,Description,Path,Definition,Subdirs,Local,Active,Personal,Hidden
0,research,PATH,Research Data,/disk/research/data/,,0.0,0.0,1.0,0.0,0.0


## Dropping a CASLib

In [12]:
conn.dropcaslib('research')

NOTE: 'CASUSER(username)' is now the active caslib.
NOTE: Cloud Analytic Services removed the caslib 'research'.


In [13]:
conn.close()