# Demo: Using the SWAT Package Plotting Methods

### 1. Import Packages and Connect to the CAS Server

For details, see the documentation for the [SWAT (SAS Scripting Wrapper for Analytics Transfer)](https://sassoftware.github.io/python-swat/index.html) package.

In [None]:
## Import packages
import swat
import pandas as pd
import matplotlib.pyplot as plt
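## The 'seaborn' style was renamed 'seaborn-v0_8' in Matplotlib 3.6; use whichever name is available
style = 'seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn'
plt.style.use(style)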

## Set options
pd.set_option('display.max_columns', None)

## Connect to CAS
conn = swat.CAS('server.demo.sas.com', 30571, 'student', 'Metadata0', name = 'py04d01')

## Function to load the loans_raw.sashdat file into memory if necessary
def loadLoans():
    conn.loadTable(path = 'loans_raw.sashdat', caslib = 'PIVY',
                   casOut = {'name' : 'loans_raw',
                            'caslib' : 'casuser',
                            'promote' : True})

### 2. Explore Available CAS Tables

a. Use the tableInfo action to view all available in-memory tables in the **Casuser** caslib. If the **LOANS_RAW** CAS table is not available, uncomment the call to the loadLoans function and execute the cell.

In [None]:
#loadLoans()
conn.tableInfo(caslib = 'casuser')

b. Reference the **LOANS_RAW** CAS table and add the where parameter to keep only the rows where the **Category** column equals *Mortgage*.

In [None]:
mTbl = conn.CASTable('loans_raw', 
                     caslib = 'casuser', 
                     where = 'Category = "Mortgage"')
mTbl

c. Preview the **LOANS_RAW** table using the head method.

In [None]:
mTbl.head()

d. View the number of rows in the **LOANS_RAW** table where **Category** equals *Mortgage*. Notice that more than one million rows have a **Category** value of *Mortgage*.

In [None]:
numRows = mTbl.numRows()['numrows']

print("{:,}".format(numRows))
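
The `"{:,}"` format specification in the cell above inserts thousands separators. As a minimal sketch, the same formatting can be written with an f-string. The value `example_rows` is a made-up number used only for illustration, not the actual count from the CAS table.

```python
# Hypothetical row count used only for illustration; the real value
# comes from mTbl.numRows()['numrows'] in the cell above.
example_rows = 1234567

# str.format and f-strings share the same format specification,
# so both produce the same comma-separated string.
formatted_with_format = "{:,}".format(example_rows)
formatted_with_fstring = f"{example_rows:,}"

print(formatted_with_format)
print(formatted_with_fstring)
```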

### 3. CASTable Plotting

a. Use the scatter method to plot the **CASTable** object. Notice that Python returns a plot and a warning. Because plotting is done on the client side, data must be returned to the client to create the visualization. The cas.dataset.max_rows_fetched option limits the number of rows sent to the client, and its default value is 10,000.

In [None]:
(mTbl
 .plot
 .scatter(x = 'Salary', y = 'Amount', figsize = (8,6), title = 'Mortgage Amount by Income'))

b. Execute the same code as the previous cell. Notice that the two graphs differ slightly. This occurs because each call retrieves a new random sample of 10,000 rows from the CAS table.

In [None]:
(mTbl
 .plot
 .scatter(x = 'Salary', y = 'Amount', figsize = (8,6), title = 'Mortgage Amount by Income'))
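
The reason two unseeded samples differ can be imitated on the client without a CAS server. The sketch below is an analogy only: it uses Python's `random` module on a made-up range of row identifiers (`row_ids` is hypothetical) and does not reproduce how CAS selects rows.

```python
import random

# Stand-in for the row identifiers of a large table (hypothetical data).
row_ids = range(1_000_000)

# Two independent, unseeded generators: each draw picks its own 10,000 rows.
first_draw = random.Random().sample(row_ids, 10_000)
second_draw = random.Random().sample(row_ids, 10_000)

# The two samples may share some rows by chance, but they are
# almost certainly not the same set, so plots built from them differ.
overlap = len(set(first_draw) & set(second_draw))
print(f"rows in common: {overlap} of 10,000")
```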

c. Add the sample_seed option to draw the same random sample from the CAS table on every call. Here, a function creates a scatter plot of the CAS table and passes the sample_seed option to the plotting method. Notice that the two plots are identical because both calls use the same seed.

In [None]:
## Function to create the scatter plot using sample_seed
def createScatter(_ax):
    (mTbl
     .plot
     .scatter(x = 'Salary', y = 'Amount', 
              title='Mortgage Amount by Income', 
              sample_seed = 1, ax=_ax))

## Draw the same plot on each Axes object
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize = (16,6))
createScatter(ax1)
createScatter(ax2)
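
The effect of a fixed seed can also be sketched on the client. The example below is an analogy that uses Python's `random` module, not the CAS sampling routine; the helper `sample_rows` and its parameters are hypothetical names introduced for illustration.

```python
import random

def sample_rows(n_rows, k, seed):
    """Return k row indices drawn from range(n_rows) using a fixed seed."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return rng.sample(range(n_rows), k)

draw_a = sample_rows(1_000_000, 10_000, seed=1)
draw_b = sample_rows(1_000_000, 10_000, seed=1)
draw_c = sample_rows(1_000_000, 10_000, seed=2)

# Same seed: the same rows are selected, so a plot would be reproducible.
print(draw_a == draw_b)
# Different seed: a different set of rows is selected.
print(draw_a == draw_c)
```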

### 4. Terminate the CAS Session

It is best practice to terminate the CAS session when you are done.

In [None]:
conn.terminate()
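
One way to make sure the session ends even when earlier code raises an error is a try/finally block. The sketch below uses a hypothetical `FakeConnection` stand-in so that it runs without a server. With a real session, the object would be the `conn` connection created in step 1.

```python
class FakeConnection:
    """Hypothetical stand-in for a CAS connection; not part of SWAT."""

    def __init__(self):
        self.terminated = False

    def terminate(self):
        # Record that the session was ended.
        self.terminated = True


demo_conn = FakeConnection()
try:
    # Work that might fail partway through.
    raise RuntimeError("simulated failure during analysis")
except RuntimeError as err:
    print(f"caught: {err}")
finally:
    # Runs whether or not the work above succeeded.
    demo_conn.terminate()

print(demo_conn.terminated)
```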