# Getting Python Version

In [1]:
import sys
print(sys.version)

3.5.1 (default, May 12 2016, 17:09:49) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-16)]


_Note:_
<br>
- The command prints the Python version together with build information: here, Python 3.5.1 built with the Red Hat release of GCC 4.4.7, which indicates a Red Hat based system.
<br>
- This Notebook was developed using the Python 3.5.1 kernel provided with SAS University Edition.
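
The version string above is convenient to read but awkward to test in code. As a minimal sketch using only the standard library, the same information can be read in structured form; the 3.5 threshold below is an assumption based on the kernel this Notebook was developed with, not a documented requirement.

```python
import platform
import sys

# sys.version_info is a named tuple, so it can be compared without parsing text.
major, minor = sys.version_info.major, sys.version_info.minor
print("Python {}.{}".format(major, minor))

# platform reports interpreter and operating-system details separately.
print("Version string:", platform.python_version())
print("Operating system:", platform.system())
print("Compiler:", platform.python_compiler())

# Assumed guard: stop early if the kernel is older than the one used here.
if sys.version_info < (3, 5):
    raise RuntimeError("This notebook assumes Python 3.5 or later")
```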

# Using SASPy to connect to a SAS kernel 

In [3]:
from saspy import SASsession
sas = SASsession()
type(sas)

Using SAS Config named: default
SAS Connection established. Subprocess id is 13628



saspy.sasbase.SASsession

_Note:_
<br>
- A session object $sas$ was created by calling the constructor $SASsession()$ with no parameters (object-oriented programming), so SASPy used the configuration named default
- $sas = SASsession()$ only needs to be executed once; all subsequent cells in this Notebook assume it has already been run

# Running SAS Code in Python

In [20]:
fish = sas.sasdata('fish','sashelp')
print(type(fish))
print(fish['LOG'])
fish.columnInfo()

<class 'saspy.sasbase.SASdata'>
LOG
<class 'str'>
None


Unnamed: 0,Member,Num,Variable,Type,Len,Pos
0,SASHELP.FISH,6,Height,Num,8,32
1,SASHELP.FISH,3,Length1,Num,8,8
2,SASHELP.FISH,4,Length2,Num,8,16
3,SASHELP.FISH,5,Length3,Num,8,24
4,SASHELP.FISH,1,Species,Char,9,48
5,SASHELP.FISH,2,Weight,Num,8,0
6,SASHELP.FISH,7,Width,Num,8,40


_Note:_
<br>
- We use the SAS dataset $fish$ in the sashelp library as an example
- The method $sasdata$ of $sas$ returns a SASdata object that refers to a table in a SAS library defined within this SAS session
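- Indexing the SASdata object with _fish['LOG']_ returns None, as the output above shows; the _'LOG'_ (log output) and _'LST'_ (results output) keys belong to the dictionary returned by _sas.submit()_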
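
In SASPy, the `'LOG'` and `'LST'` keys are found on the dictionary returned by the session's `submit()` method. The following is a minimal sketch, assuming an open session such as `sas` from the earlier cell; the helper name `run_sas` is hypothetical and not part of SASPy.

```python
def run_sas(session, code):
    """Submit SAS code and return (log, listing) as two strings."""
    # submit() returns a dict: 'LOG' holds the SAS log, 'LST' the results output.
    result = session.submit(code)
    return result["LOG"], result["LST"]

# Usage in a notebook with an open session (not executed here):
# log, lst = run_sas(sas, "proc print data=sashelp.fish(obs=3); run;")
# print(log)
```

Keeping the log and the listing separate makes it easier to check the log for errors before displaying any results.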

In [23]:
%%SAS sas
proc means data=sashelp.fish;
    class species;
    var Weight;
run;

Analysis Variable : Weight
Species,N Obs,N,Mean,Std Dev,Minimum,Maximum
Bream,35,34,626.0,206.604585,242.0,1000.0
Parkki,11,11,154.8181818,78.7550864,55.0,300.0
Perch,56,56,382.2392857,347.6177172,5.9,1100.0
Pike,17,17,718.7058824,494.140765,200.0,1650.0
Roach,20,20,152.05,88.828916,0.0,390.0
Smelt,14,14,11.1785714,4.1315258,6.7,19.9
Whitefish,6,6,531.0,309.6029716,270.0,1000.0


_Note:_
- The _%%SAS_ cell magic sends the rest of the cell to the SAS session named after it (here $sas$) and displays the results

# Convert SAS dataset into a DataFrame

In [35]:
fishdf = sas.sasdata2dataframe('fish','sashelp')
print(type(fishdf))
fishdf.dtypes

<class 'pandas.core.frame.DataFrame'>


Species     object
Weight     float64
Length1    float64
Length2    float64
Length3    float64
Height     float64
Width      float64
dtype: object

_Note:_
<br>
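- _sasdata2dataframe_ copies the SAS dataset sashelp.fish into _fishdf_, a pandas DataFrame; the character variable Species becomes an object column and the numeric variables become float64 columns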

In [36]:

fishdf.groupby('Species')['Weight'].agg(['count', 'std', 'mean', 'min', 'max'])

Species,count,std,mean,min,max
Bream,34,206.604585,626.0,242.0,1000.0
Parkki,11,78.755086,154.818182,55.0,300.0
Perch,56,347.617717,382.239286,5.9,1100.0
Pike,17,494.140765,718.705882,200.0,1650.0
Roach,20,88.828916,152.05,0.0,390.0
Smelt,14,4.131526,11.178571,6.7,19.9
Whitefish,6,309.602972,531.0,270.0,1000.0
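
The count of 34 for Bream matches N rather than N Obs (35) in the PROC MEANS output, because the pandas `count` aggregation skips missing values. A small sketch of the difference, using a hypothetical toy table rather than the real fish data:

```python
import pandas as pd

# Hypothetical miniature of the fish data: one Bream weight is missing.
toy = pd.DataFrame({
    "Species": ["Bream", "Bream", "Bream", "Smelt", "Smelt"],
    "Weight": [242.0, float("nan"), 290.0, 6.7, 7.5],
})

# size counts every row (like N Obs); count skips NaN (like N).
summary = toy.groupby("Species")["Weight"].agg(["size", "count"])
summary.columns = ["n_obs", "n"]
print(summary)
```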


In [55]:
fishdf[:1]  # selecting the first row

Unnamed: 0,Species,Weight,Length1,Length2,Length3,Height,Width
0,Bream,242.0,23.2,25.4,30.0,11.52,4.02
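
Slicing a DataFrame with `[:1]` selects rows, not columns. The sketch below contrasts row and column selection on a hypothetical toy table; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical three-row table standing in for fishdf.
toy = pd.DataFrame({
    "Species": ["Bream", "Roach", "Pike"],
    "Weight": [242.0, 40.0, 200.0],
})

first_row = toy[:1]            # slice of rows: a one-row DataFrame
first_row_alt = toy.iloc[[0]]  # the same row selected by position
first_col = toy.iloc[:, 0]     # first column by position: a Series
species = toy["Species"]       # the same column selected by label
```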


In [56]:
fishdf.describe()  # summary statistics for the numeric columns

Unnamed: 0,Weight,Length1,Length2,Length3,Height,Width
count,158.0,159.0,159.0,159.0,159.0,159.0
mean,398.69557,26.24717,28.415723,31.227044,8.970994,4.417486
std,359.086204,9.996441,10.716328,11.610246,4.286208,1.685804
min,0.0,7.5,8.4,8.8,1.7284,1.0476
25%,120.0,19.05,21.0,23.15,5.9448,3.38565
50%,272.5,25.2,27.3,29.4,7.786,4.2485
75%,650.0,32.7,35.5,39.65,12.3659,5.5845
max,1650.0,59.0,63.4,68.0,18.957,8.142
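
By default `describe()` summarizes only the numeric columns of a mixed table, which is why Species does not appear above. As a sketch on a hypothetical toy table, passing `include="all"` adds text columns to the summary as well:

```python
import pandas as pd

# Hypothetical table with one text column and one numeric column.
toy = pd.DataFrame({
    "Species": ["Bream", "Bream", "Pike"],
    "Weight": [242.0, 290.0, 200.0],
})

numeric_only = toy.describe()             # Weight only
everything = toy.describe(include="all")  # Species and Weight

# Text columns report count, unique, top and freq; numeric statistics
# such as mean are left as NaN for them.
print(everything)
```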
