# Week 14 Part 1 (SASPy)
### SASPy is a Python Application Programming Interface (API) to the SAS system

* Enabling communication between Jupyter and SAS when using the SAS Kernel
* Running Python code from commonly used IDEs other than Jupyter Notebook
* Loading SAS data sets into Python-Pandas DataFrame objects
* Converting Python-Pandas DataFrame objects into SAS data sets
* Using Python convenience methods on SAS data sets
* Imitating the SAS macro facility
* Generating SAS code from Python code

##### If you have licensed SAS software installed on your computer and want to use [SASPy](https://support.sas.com/en/software/saspy.html):

* Install the [SASPy](https://support.sas.com/en/software/saspy.html) module into your Python environment
* Configure it to connect to your SAS environment
    * Set up the sascfg_personal.py configuration file
    * Make the SAS-supplied Java .jar files available to SASPy
* Install the Anaconda distribution of Python for JupyterLab
* Install the SAS kernel to enable communication between Jupyter and SAS
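
As a rough illustration of the configuration step, here is a minimal sketch of what `sascfg_personal.py` might contain for a local Windows SAS installation. The configuration name `winlocal` matches the one used later in this document; the `java` and `encoding` values are assumptions that depend on your machine, and depending on the SASPy version a `classpath` entry listing the SAS-supplied .jar files may also be needed.

```python
# sascfg_personal.py (sketch; the values below are assumptions for illustration)

# Names of the configurations SASPy is allowed to use.
SAS_config_names = ["winlocal"]

# Local SAS on Windows, reached through SASPy's Java-based access method.
winlocal = {
    "java": "java",              # assumed: java is on PATH; otherwise give the full path to java.exe
    "encoding": "windows-1252",  # assumed: encoding of the local SAS session
}
```

With a file like this in place, `saspy.SASsession(cfgname='winlocal')` selects the `winlocal` entry by name.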

SASPy can be used from:
* Jupyter Notebook or JupyterLab
* Any Python console/scripting environment


##### Loading a SAS data set into a Python Object using the sasdata method
* Import saspy
* Create a connection with SAS, which authenticates and starts a SAS session; winlocal is the configuration name defined during setup
* Create a Python object using the sasdata method
* Run descriptive statistics

In [16]:
import saspy
sas = saspy.SASsession(cfgname='winlocal')
iris = sas.sasdata("iris","SASHELP")
iris.describe()

SAS Connection established. Subprocess id is 212



| | Variable | Label | N | NMiss | Median | Mean | StdDev | Min | P25 | P50 | P75 | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SepalLength | Sepal Length (mm) | 150 | 0 | 58.0 | 58.433333 | 8.280661 | 43 | 51 | 58.0 | 64 | 79 |
| 1 | SepalWidth | Sepal Width (mm) | 150 | 0 | 30.0 | 30.573333 | 4.358663 | 20 | 28 | 30.0 | 33 | 44 |
| 2 | PetalLength | Petal Length (mm) | 150 | 0 | 43.5 | 37.58 | 17.652982 | 10 | 16 | 43.5 | 51 | 69 |
| 3 | PetalWidth | Petal Width (mm) | 150 | 0 | 13.0 | 11.993333 | 7.622377 | 1 | 3 | 13.0 | 18 | 25 |


If you are using "JupyterLab in SAS University Edition", use the following code block instead. Note that SASsession() is called with no arguments in the second line.
``` Python
import saspy
sas = saspy.SASsession()
iris = sas.sasdata("iris","SASHELP")
iris.describe()
```

##### print() function

* print(type(iris)) prints the class of the object that is passed to the type() function.

In [17]:
print(type(iris))

<class 'saspy.sasdata.SASdata'>


##### type() function

In [18]:
type(iris)

saspy.sasdata.SASdata
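
The two cells above show the same class in two different forms. The sketch below, which uses only the standard library and needs no SAS session, illustrates why: `print()` writes the `str()` of the class, while a bare expression in a notebook is rendered by IPython's pretty-printer, which shows a class as its module-qualified name. The helper `qualified_name` is hypothetical and only imitates that display.

```python
from collections import OrderedDict


def qualified_name(cls):
    """Return module.ClassName, imitating how a notebook displays a class."""
    return f"{cls.__module__}.{cls.__qualname__}"


obj = OrderedDict()
as_printed = str(type(obj))               # what print(type(obj)) writes
as_displayed = qualified_name(type(obj))  # similar to the output of a bare type(obj) cell

print(as_printed)    # <class 'collections.OrderedDict'>
print(as_displayed)  # collections.OrderedDict
```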

##### Transferring a data set between SAS and Python - Paired methods

* df2sd: converts a pandas DataFrame to a SAS data set
* sd2df: converts a SAS data set to a pandas DataFrame
* Data types, column names, and other basic elements are retained on the destination side
* Some metadata unique to SAS data sets is not retained on the Python side

[A Basic Introduction to SASPy and Jupyter Notebooks By Jason Philips. 2018](https://support.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2018/2822-2018.pdf)
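
The paired methods can be combined into a round trip. The following is a minimal sketch, assuming an already established `saspy.SASsession` is passed in as `sas`; the helper name `round_trip` and the default table name `class_copy` are illustrative and do not appear elsewhere in this document.

```python
def round_trip(sas, df, table="class_copy", libref="work"):
    """Copy a DataFrame to SAS with df2sd, then read it back with sd2df.

    `sas` is an existing saspy.SASsession; `table` and `libref` are
    illustrative defaults.
    """
    # DataFrame -> SAS data set; returns a SASdata object that points at it.
    sas_data = sas.df2sd(df, table=table, libref=libref)
    # SAS data set -> DataFrame.
    df_back = sas.sd2df(table=table, libref=libref)
    return sas_data, df_back
```

Comparing `df.dtypes` with `df_back.dtypes` after a call such as `round_trip(sas, df)` is one way to check which basic elements survived the transfer.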

In [1]:
import saspy
sas = saspy.SASsession(cfgname='winlocal')
class_sds = sas.sd2df(table='class', libref='sashelp')
class_sds.describe()  

SAS Connection established. Subprocess id is 12156



| | Age | Height | Weight |
|---|---|---|---|
| count | 19.0 | 19.0 | 19.0 |
| mean | 13.315789 | 62.336842 | 100.026316 |
| std | 1.492672 | 5.127075 | 22.773933 |
| min | 11.0 | 51.3 | 50.5 |
| 25% | 12.0 | 58.25 | 84.25 |
| 50% | 13.0 | 62.8 | 99.5 |
| 75% | 14.5 | 65.9 | 112.25 |
| max | 16.0 | 72.0 | 150.0 |


In [None]:
print(type(class_sds))

##### Using the to_df() method to load a SAS data set into a pandas DataFrame

In [None]:
import saspy
import pandas as pd
pd_iris = iris.to_df()
print(type(pd_iris))

##### Using the %cd magic command to change to the folder that contains a SAS data set

In [None]:
import saspy
import pandas as pd
sas = saspy.SASsession(cfgname='winlocal')
%cd C:\Data
p_cars = pd.read_sas('cars.sas7bdat', format='sas7bdat', encoding="utf-8")
p_cars.describe()
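
`%cd` is an IPython magic command, so the cell above runs only inside Jupyter or IPython. Note also that `pd.read_sas` reads the file directly and does not go through the SAS session. As an alternative that works in a plain Python script, the full path can be built explicitly. This is a sketch that reuses the folder and file name from the cell above; `PureWindowsPath` is used only so that the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

data_dir = PureWindowsPath(r"C:\Data")   # folder used in the cell above
cars_path = data_dir / "cars.sas7bdat"   # join folder and file name

print(cars_path)  # C:\Data\cars.sas7bdat

# With pandas installed and the file present, the path can be passed directly:
# p_cars = pd.read_sas(str(cars_path), format="sas7bdat", encoding="utf-8")
```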

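##### Steps in the next code cell
* Import SASPy
* Import pandas
* Establish a SAS connection
* Assign a libref that points to the folder containing the SAS data set
* Convert the SAS data set to a DataFrame, keeping only DUPERSID, the variables whose names start with VAR, and VSTCTGRY
* Use the astype() method to change the data type of VARSTR and VARPSU to object
* Concatenate VARSTR and VARPSU as strings to create CLUSTER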

In [21]:
import saspy
import pandas as pd
sas = saspy.SASsession(cfgname='winlocal')
sas.saslib(libref='new', path="C:\\Data")
py_obvisits17 = sas.sd2df(table='h197g', libref='new', dsopts={"keep": "dupersid var: VSTCTGRY"})
py_obvisits17 = py_obvisits17.astype({"VARSTR":'object', "VARPSU":'object'})
py_obvisits17['CLUSTER']= py_obvisits17['VARSTR'].astype(str)+py_obvisits17['VARPSU'].astype(str)
py_obvisits17.info()

SAS Connection established. Subprocess id is 9784

3                                                          The SAS System                           06:44 Saturday, December 7, 2019

21         
22         libname new    'C:\Data'  ;
NOTE: Libref NEW was successfully assigned as follows: 
      Engine:        V9 
      Physical Name: C:\Data
23         
24         
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 170491 entries, 0 to 170490
Data columns (total 5 columns):
DUPERSID    170491 non-null object
VSTCTGRY    170491 non-null int64
VARSTR      170491 non-null object
VARPSU      170491 non-null object
CLUSTER     170491 non-null object
dtypes: int64(1), object(4)
memory usage: 6.5+ MB
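
The concatenation step can be tried without SAS on a toy DataFrame. The values below are made up for illustration and are not taken from the h197g file. The sketch also shows a caveat: when codes vary in width, joining them without a separator can merge distinct pairs. The `_` separator is an assumption added here and is not what the cell above does.

```python
import pandas as pd

# Made-up stratum/PSU codes for illustration only.
toy = pd.DataFrame({"VARSTR": [12, 1, 1], "VARPSU": [1, 21, 2]})

# Same approach as the cell above: cast to str, then concatenate.
toy["CLUSTER"] = toy["VARSTR"].astype(str) + toy["VARPSU"].astype(str)

# Variant with a separator, which keeps (12, 1) and (1, 21) distinct.
toy["CLUSTER_SEP"] = toy["VARSTR"].astype(str) + "_" + toy["VARPSU"].astype(str)

print(toy["CLUSTER"].tolist())      # ['121', '121', '12']
print(toy["CLUSTER_SEP"].tolist())  # ['12_1', '1_21', '1_2']
```

When every code in a column has the same number of digits, both variants yield the same number of distinct clusters.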


##### display()
* display() from IPython.display renders each value it is given, so a single cell can show several outputs

In [22]:
import saspy
import pandas as pd
from IPython.display import display
display(len(py_obvisits17['CLUSTER'].unique().tolist()))
display(len(py_obvisits17['VARSTR'].unique().tolist()))
display(len(py_obvisits17['VARPSU'].unique().tolist()))
display(len(py_obvisits17['DUPERSID'].unique().tolist()))

621

282

3

22352
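
pandas also provides `nunique()`, which counts distinct values directly instead of building a list first. A minimal sketch on made-up data (the column names are borrowed from the cell above, the values are not):

```python
import pandas as pd

toy = pd.DataFrame({
    "DUPERSID": ["A1", "A1", "B2", "C3"],
    "VARPSU": [1, 1, 2, 3],
})

# Same result as len(series.unique().tolist()) when the column has no missing values.
n_ids = toy["DUPERSID"].nunique()

# One call for every column; returns a Series indexed by column name.
counts = toy.nunique()

print(n_ids)   # 3
print(counts)
```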

In [23]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
len(py_obvisits17['CLUSTER'].unique().tolist())
len(py_obvisits17['VARSTR'].unique().tolist())
len(py_obvisits17['VARPSU'].unique().tolist())
len(py_obvisits17['DUPERSID'].unique().tolist())

22352
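
Only the last value appears above, most likely because the `ast_node_interactivity` setting applies to cells run after it is changed; running the four `len(...)` lines again in a later cell should show all four values. The following is a simplified emulation, not IPython's actual implementation, of the difference between the default `"last_expr"` mode and `"all"`. It uses the standard library `ast` module, and `run_cell` is a hypothetical helper.

```python
import ast


def run_cell(source, interactivity="last_expr"):
    """Execute `source` and return the values a notebook would display."""
    tree = ast.parse(source)
    namespace = {}
    shown = []
    last_index = len(tree.body) - 1
    for i, node in enumerate(tree.body):
        if isinstance(node, ast.Expr):
            # Expression statement: evaluate it to obtain its value.
            code = compile(ast.Expression(node.value), "<cell>", "eval")
            value = eval(code, namespace)
            if interactivity == "all" or i == last_index:
                shown.append(value)
        else:
            # Any other statement (assignment, import, ...) is simply executed.
            code = compile(ast.Module([node], type_ignores=[]), "<cell>", "exec")
            exec(code, namespace)
    return shown


cell = "x = [1, 2, 2, 3]\nlen(x)\nlen(set(x))"
print(run_cell(cell))         # [3]
print(run_cell(cell, "all"))  # [4, 3]
```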

##### Generating a profile of the Python data object created from the SAS data set

In [3]:
import saspy
import pandas
import pandas_profiling
sas = saspy.SASsession(cfgname='winlocal')
df_heart = sas.sd2df(table='heart', libref='sashelp')
pandas_profiling.ProfileReport(df_heart)

SAS Connection terminated. Subprocess id was 12156
SAS Connection established. Subprocess id is 6828





##### Generating SAS code by using the SASPy module

In [2]:
import saspy
import pandas as pd
sas = saspy.SASsession(cfgname='winlocal')
cars = sas.sasdata("CARS","SASHELP")
sas.teach_me_SAS(True)  # show the generated SAS code instead of running it
cars.columnInfo()


SAS Connection established. Subprocess id is 3856

proc contents data=SASHELP.CARS ;ods select Variables;run;
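
Once `teach_me_SAS` is switched on, it stays on for the rest of the session, so later method calls print code instead of returning results. One way to limit it to a single call is sketched below, assuming an existing `saspy.SASsession` is passed in as `sas`; the helper name `show_generated_sas` is hypothetical.

```python
def show_generated_sas(sas, table, libref):
    """Print the SAS code SASPy generates for columnInfo() without running it.

    `sas` is an existing saspy.SASsession.
    """
    data = sas.sasdata(table, libref)  # create the SASdata object first, as in the cell above
    sas.teach_me_SAS(True)             # switch to "show the code" mode
    try:
        data.columnInfo()
    finally:
        sas.teach_me_SAS(False)        # restore normal execution for later cells
```

A call such as `show_generated_sas(sas, "CARS", "SASHELP")` would then print the PROC CONTENTS step shown above and leave the session in its normal mode.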


##### Running a SAS program by using a Jupyter Notebook magic command (%%SAS)

In [15]:
%%SAS
proc print data=sashelp.class (obs=5); 
run;

Using SAS Config named: winlocal
SAS Connection established. Subprocess id is 4808



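| Obs | Name | Sex | Age | Height | Weight |
|---|---|---|---|---|---|
| 1 | Alfred | M | 14 | 69.0 | 112.5 |
| 2 | Alice | F | 13 | 56.5 | 84.0 |
| 3 | Barbara | F | 13 | 65.3 | 98.0 |
| 4 | Carol | F | 14 | 62.8 | 102.5 |
| 5 | Henry | M | 14 | 63.5 | 102.5 |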
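
The `%%SAS` magic is available only inside a notebook. From a plain Python script, a similar result can be obtained with `SASsession.submit()`, which returns a dictionary with `LOG` and `LST` entries. The sketch below also builds the SAS code from Python values, which loosely imitates SAS macro variables; both helper names are hypothetical, and `sas` is assumed to be an existing `saspy.SASsession`.

```python
def proc_print_code(table, libref="sashelp", obs=5):
    """Build PROC PRINT code from Python values, much like SAS macro variables."""
    return f"proc print data={libref}.{table} (obs={obs});\nrun;"


def print_first_rows(sas, table, libref="sashelp", obs=5):
    """Submit the generated code through an existing saspy.SASsession."""
    result = sas.submit(proc_print_code(table, libref, obs), results="text")
    return result["LST"]  # listing output; result["LOG"] holds the SAS log


print(proc_print_code("class"))
```

The final `print` call only shows the generated code; nothing is sent to SAS until `print_first_rows(sas, "class")` is called with a live session.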


[Tips, Tricks, Hacks, and Magic: How to Effortlessly Optimize Your Jupyter Notebook by Anne Bonner](https://towardsdatascience.com/how-to-effortlessly-optimize-jupyter-notebooks-e864162a06ee)

[Profiling and Optimizing Jupyter Notebooks - A Comprehensive Guide by Muriz Serifovic](https://towardsdatascience.com/speed-up-jupyter-notebooks-20716cbe2025)