# Demo: SAS Viya and Python Integration Fundamentals
The goal of combining SAS Viya and Python is to process large data on the massively parallel processing engine in SAS Viya and return only the smaller, summarized results to your Python client. Once the results are on the client, you can work with them using familiar Python packages.

### 1. Import Packages and Connect to SAS Viya

In [None]:
## Import packages
import swat
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
## Apply the seaborn theme (newer matplotlib releases no longer accept the 'seaborn' style name)
sns.set_theme()

## Set options
pd.set_option('display.max_columns', None)

## Connect to SAS Viya
conn = swat.CAS('server.demo.sas.com', 30571, 'student', 'Metadata0', name = 'py00d01')
display(conn)
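
The connection call above places the host, port, user name, and password directly in the notebook. As a minimal sketch of one alternative, the connection values could be read from environment variables. The variable names `CAS_HOST`, `CAS_PORT`, `CAS_USER`, and `CAS_PASSWORD` and the helper `cas_connection_args` are hypothetical and are not part of SWAT.

```python
import os

REQUIRED = ('CAS_HOST', 'CAS_PORT', 'CAS_USER', 'CAS_PASSWORD')  # hypothetical names

def cas_connection_args(env=None):
    """Return (host, port, user, password) built from environment variables."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if key not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return (env['CAS_HOST'], int(env['CAS_PORT']),
            env['CAS_USER'], env['CAS_PASSWORD'])

## Example with an explicit mapping instead of the real environment
args = cas_connection_args({'CAS_HOST': 'server.demo.sas.com',
                            'CAS_PORT': '30571',
                            'CAS_USER': 'student',
                            'CAS_PASSWORD': 'secret'})
print(args[:3])

## With a reachable CAS server, the tuple could be passed on like this:
## conn = swat.CAS(*args, name = 'py00d01')
```

The port is converted to an integer because environment variables are always strings.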

### 2. Explore Available Data in SAS Viya

In [None]:
## View available data sources in SAS Viya
ci = conn.caslibInfo()

## View available data source files
fi = conn.fileInfo(caslib = 'PIVY')

## View available in-memory tables
ti = conn.tableInfo(caslib = 'casuser')

display(ci, fi, ti)

### 3. Load Data into Memory in SAS Viya

In [None]:
## Load a table into memory and display the output
lt = conn.loadTable(path = 'loans_raw.sashdat', caslib = 'PIVY',
                    casOut = {'caslib' : 'casuser', 
                              'replace' : True})

## View available in-memory tables
ti = conn.tableInfo(caslib = 'casuser')

display(lt, ti)

### 4. Make a Reference to a Table in SAS Viya

In [None]:
## Create a reference to a table and view the object
tbl = conn.CASTable('loans_raw', caslib = 'casuser')
tbl

### 5. Explore a Table

In [None]:
## View the dimensions of the table
shape = tbl.shape

## Preview the table
df_head = tbl.head()

## View column attributes
df_ci = tbl.columnInfo()

## Obtain summary statistics
colNames = ['Age', 'Salary', 'EmpLength', 'Amount', 'InterestRate']
df_summary = tbl.summary(input = colNames)

## Obtain missing and distinct values
maxDistinct = 10000
df_distinct = (tbl
               .distinct(maxNVals = maxDistinct)['Distinct']
               .query(f'NDistinct != {maxDistinct}'))

## Display the results from SAS Viya
display(shape, df_head, df_ci, df_summary, df_distinct)

## Plot the summarized results using Pandas
fig, (ax1, ax2) = plt.subplots(ncols = 2, figsize = (18,6))

## ax1
(df_distinct
 .sort_values('NDistinct', ascending = False)
 .plot(kind = 'bar', x = 'Column', y = 'NDistinct', 
       ax = ax1, 
       title = 'Number of Distinct Values in Each Column (10,000 value limit)'))

## ax2
(df_distinct
 .sort_values('NMiss', ascending = False)
 .plot(kind = 'bar', x = 'Column', y = 'NMiss', 
       ax = ax2, 
       title = 'Number of Missing Values in Each Column'));
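
The `.query` step in the cell above runs on the client, because the result of `distinct` is returned as an ordinary DataFrame. It drops every column whose distinct count equals the limit, presumably because that count stopped at the limit and is not a true count. A minimal sketch of the same filter on a small, made-up DataFrame (the rows are illustrative and are not taken from the loans table):

```python
import pandas as pd

max_distinct = 10000

## Made-up stand-in for the 'Distinct' result table
distinct = pd.DataFrame({
    'Column': ['ID', 'Category', 'LoanGrade'],
    'NDistinct': [10000, 5, 7],
    'NMiss': [0, 0, 12],
})

## Keep only the columns whose count stayed below the limit
below_limit = distinct.query(f'NDistinct != {max_distinct}')
print(below_limit)
```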

### 6. Analyze a Table

a. Determine the percentage of loans by each loan **Category**.

In [None]:
## Calculate the frequency of Category in SAS Viya
df = (tbl
      .Category
      .value_counts(normalize = True))
display(df)

## Plot the summarized results on the client using Pandas
df.plot(kind = 'bar', 
        figsize = (10,6), 
        title = 'Percent of Loans by Category');
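
With `normalize = True`, `value_counts` returns proportions between 0 and 1, not percentages, so the bar heights in the plot above are fractions. A minimal pandas sketch of the conversion, using a small made-up Series in place of the SAS Viya column (the category values are illustrative):

```python
import pandas as pd

## Made-up stand-in for the Category column
category = pd.Series(['Mortgage', 'Credit Card', 'Credit Card', 'Car Loan'],
                     name = 'Category')

## Proportions sum to 1
proportions = category.value_counts(normalize = True)

## Multiply by 100 to label the results as percentages
percent = (proportions * 100).round(1)
print(percent)
```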

b. View the total amount of mortgage loans by **Year**.

In [None]:
## Sum the Amount of mortgage loans by Year in SAS Viya
df = (tbl
      .query('Category = "Mortgage"') 
      .groupby('Year') 
      .Amount
      .sum())

display(df)

## Plot the summarized results on the client using Pandas
df.plot(kind = 'line', 
        figsize = (10,6), 
        title = 'Total Amount of New Mortgage Loans by Year');
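
The method chain above mirrors the pandas API, with one visible difference: the filter uses a single `=`, which suggests the expression is evaluated by SAS Viya and not by pandas. A client-side pandas equivalent, sketched on a small made-up DataFrame, needs `==` in the query string (the rows and amounts are illustrative):

```python
import pandas as pd

## Made-up stand-in for the loans table
loans = pd.DataFrame({
    'Category': ['Mortgage', 'Mortgage', 'Credit Card', 'Mortgage'],
    'Year': [2019, 2019, 2019, 2020],
    'Amount': [200000, 150000, 5000, 300000],
})

## Same filter, group, and sum, but run entirely on the client
mortgage_by_year = (loans
                    .query('Category == "Mortgage"')
                    .groupby('Year')
                    .Amount
                    .sum())
print(mortgage_by_year)
```

The local version would have to hold every row in client memory, which is the cost the SAS Viya version avoids by returning only the per-year totals.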

### 7. Create a New Table in SAS Viya
Create a new table that contains only the rows where **Category** equals *Credit Card*. Add a computed column named **AccOpenDate** that combines the **Month**, **Day**, and **Year** columns into a single date on which the credit card account was opened. The new table keeps only the specified columns.

In [None]:
## Add parameters to the input table
tbl.where = 'Category = "Credit Card"'
tbl.computedVars = [dict(name = 'AccOpenDate', format = 'mmddyy10.')]
tbl.computedVarsProgram = 'AccOpenDate = mdy(Month, Day, Year)'
tbl.vars = ['ID', 'AccNumber', 'LoanGrade', 'AccOpenDate', 'Salary', 'Category',
            'Amount', 'InterestRate', 'Cancelled', 'CancelledReason',
            'LastPurchase', 'Promotion']

## Specify output table information
newTbl = dict(name = 'CreditCards', 
              caslib = 'casuser', 
              replace = True)

## Create a new table in SAS Viya
ct = tbl.copyTable(casOut = newTbl)
display(ct)

## View available in-memory tables
ti = conn.tableInfo(caslib = 'casuser')
display(ti)

## Preview the newly created table
ccTbl = conn.CASTable('creditcards', caslib = 'casuser')
ccTbl.head()
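
The computed column relies on two SAS conventions: `mdy` returns a SAS date value, which is a count of days since January 1, 1960, and the `mmddyy10.` format displays that number as mm/dd/yyyy. A rough local sketch of both using only the Python standard library; the helper names `mdy` and `format_mmddyy10` are illustrative stand-ins, not functions from SWAT:

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)   ## SAS date values count days from this date

def mdy(month, day, year):
    """Local approximation of SAS mdy(): days since 1960-01-01."""
    return (date(year, month, day) - SAS_EPOCH).days

def format_mmddyy10(sas_date):
    """Render a SAS date value as mm/dd/yyyy, like the mmddyy10. format."""
    as_date = date.fromordinal(SAS_EPOCH.toordinal() + sas_date)
    return as_date.strftime('%m/%d/%Y')

opened = mdy(3, 15, 2020)
print(opened, format_mmddyy10(opened))
```

Unlike the SAS function, this sketch raises an error for missing or invalid date parts.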

### 8. Terminate Connection to SAS Viya

In [None]:
conn.terminate()
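
If an earlier cell raises an error, the final `terminate` call may never run and the session can stay open on the server. One way to guard against that in a script is a small context manager that always calls `terminate` on exit. This is a sketch only: `terminating` and `FakeConnection` are hypothetical names, and the fake class stands in for a real connection so that the example runs without a server.

```python
from contextlib import contextmanager

@contextmanager
def terminating(conn):
    """Yield conn, then call conn.terminate() even if an error occurred."""
    try:
        yield conn
    finally:
        conn.terminate()

class FakeConnection:
    """Stand-in for a CAS connection; records whether terminate() ran."""
    def __init__(self):
        self.terminated = False

    def terminate(self):
        self.terminated = True

fake = FakeConnection()
with terminating(fake) as c:
    pass  ## CAS actions would run here
print(fake.terminated)
```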