### Add our Data Files to the CAS Server

For this simple notebook, we'll create our caslibs and load the hourly CSVs into CAS, then write the data into a SASHDAT in their library.

Taking this step now saves us the load times later when we want to do our processing and analytics.

We do this twice here - once for windows host files, and once for netflow.

*Note that the Netflow file is larger than Github allows us to load; if you would like to use this, please contact `damian.herrick@sas.com` for instructions on how to generate this file.*

__Damian Herrick__  
__SAS Institute__  
__[damian.herrick@sas.com](mailto:damian.herrick@sas.com)__  

In [1]:
import os
import pandas as pd
import swat
from swat.cas import datamsghandlers as dmh

Standard connection details.

In [2]:
os.environ["CAS_CLIENT_SSL_CA_LIST"]="/home/ds/cascert.pem"

conn = swat.CAS("<your-CAS-server-url>", 5570)

Create the `WH` caslib, for windows host events.

If for some reason we've already created this caslib, we just drop it first and recreate from there.

In [3]:
conn.dropcaslib(caslib='LANL_NF', quiet=True)

conn.addcaslib(name='WH', path='/home/datasets/LANL/WH/', 
               description="Windows Host Events",
               session=False)
conn.setsessopt(caslib='WH')

NOTE: Cloud Analytic Services removed the caslib 'LANL_NF'.
NOTE: Caslib WH already exists.
NOTE: 'WH' is now the active caslib.


ERROR: The action stopped due to errors.


Now load the CSV file and add it to CAS.

Note:
* You are loading the CSV into the container's local memory.
* We want the data available to CAS on the CAS server.

By adding the table this way, we'll make the data available on the CAS server.

In [4]:
dfHost = pd.read_csv("/home/ds/datasets/WH/wls-day_02_hr13.csv")
dmhHost = dmh.PandasDataFrame(dfHost)
out = conn.addtable(table='wls_day_02_hr13', caslib='WH', **dmhHost.args.addtable)

Now we've loaded it, we'll go ahead and save it to disk.

In [5]:
conn.save(table='wls_day_02_hr13', name='wls_day_02_hr13.sashdat', caslib='WH')

ERROR: Connection failed. Server returned: Session reconnect failed: Could not find the specified session.


SWATError: An error occurred while sending request.

Repeat the same steps for the raw Netflow data.

In [None]:
conn.dropcaslib(caslib='LANL_NF', quiet=True)

conn.addcaslib(name='LANL_NF', path='/home/datasets/LANL/NF/', 
               description="LANL Netflow",
               session=False)
conn.setsessopt(caslib='LANL_NF')

In [None]:
dfNetflow = pd.read_csv("/home/ds/datasets/NF/netflow_day-02_hr13.csv")
dmhNetflow = dmh.PandasDataFrame(dfNetflow)
out1 = conn.addtable(table='nf_day_02_hr13', caslib='LANL_NF', **dmhNetflow.args.addtable)

In [None]:
conn.save(table='nf_day_02_hr13', name='nf_day_02_hr13.sashdat', caslib='LANL_NF')

In [None]:
conn.close()