Let us first load a dataset to practice the procedures on

In [1]:
data practice_dataset;
    set sashelp.class;
    bmi = weight/(height**2)*703;   /*BMI from weight in pounds and height in inches*/
    format bmi 5.2;
run;

proc print data=practice_dataset;
run;


Obs,Name,Sex,Age,Height,Weight,bmi
1,Alfred,M,14,69.0,112.5,16.61
2,Alice,F,13,56.5,84.0,18.5
3,Barbara,F,13,65.3,98.0,16.16
4,Carol,F,14,62.8,102.5,18.27
5,Henry,M,14,63.5,102.5,17.87
6,James,M,12,57.3,83.0,17.77
7,Jane,F,12,59.8,84.5,16.61
8,Janet,F,15,62.5,112.5,20.25
9,Jeffrey,M,13,62.5,84.0,15.12
10,John,M,12,59.0,99.5,20.09


PRINT PROCEDURE BASIC SYNTAX

In [2]:
PROC SORT data = practice_dataset;    /*A BY statement in a later procedure requires the data to be sorted by the same variables*/
    by descending sex;
run;

PROC PRINT DATA=practice_dataset;     /*If no dataset is specified, the most recently created dataset is printed*/
    BY descending sex;                /*Prints a separate section for each BY group. It does not sort, so the data must already be in this order (ascending by default)*/
    VAR sex age height weight;        /*Limits the variables in the output. If omitted, all variables are printed*/
    SUM height weight;                /*Prints the total of each listed column at the end. Do not confuse it with a GROUP BY statement*/
    pageby sex;                       /*Starts a new page whenever the value of the specified BY variable changes*/
    sumby  sex;                       /*Prints subtotals whenever the value of the specified BY variable changes*/
    Label Height="Height in inches";  /*Renames the variable in the OUTPUT only, not in the SOURCE DATASET; shown only when the LABEL option is set on PROC PRINT*/
run;

Obs,Sex,Age,Height,Weight
1,M,14.0,69.0,112.5
2,M,14.0,63.5,102.5
3,M,12.0,57.3,83.0
4,M,13.0,62.5,84.0
5,M,12.0,59.0,99.5
6,M,16.0,72.0,150.0
7,M,12.0,64.8,128.0
8,M,15.0,67.0,133.0
9,M,11.0,57.5,85.0
10,M,15.0,66.5,112.0

Obs,Sex,Age,Height,Weight
11,F,13.0,56.5,84.0
12,F,13.0,65.3,98.0
13,F,14.0,62.8,102.5
14,F,12.0,59.8,84.5
15,F,15.0,62.5,112.5
16,F,11.0,51.3,50.5
17,F,14.0,64.3,90.0
18,F,12.0,56.3,77.0
19,F,15.0,66.5,112.0
Sex,,,545.3,811.0


PRINT PROCEDURE OUTPUT FORMATTING

In [3]:
PROC PRINT data = practice_dataset
    (
    firstobs = 3   /*Number of the first observation to read*/
    obs = 5        /*Number of the last observation to read (not a count of observations)*/
    );
run;

Obs,Name,Sex,Age,Height,Weight,bmi
3,James,M,12,57.3,83.0,17.77
4,Jeffrey,M,13,62.5,84.0,15.12
5,John,M,12,59.0,99.5,20.09


FREQ PROCEDURE

1. Produces frequency and crosstabulation tables
2. Computes tests and measures of association for two-way tables
3. Supports stratified analysis, providing statistics within and across strata
4. Output can be saved as SAS datasets

In [4]:
PROC FREQ DATA = practice_dataset;
    /*OUT= saves the frequency table to another dataset; NOPRINT suppresses the printed output*/
    tables sex*Age /out = freq_table noprint ;
    /*Frequencies and percentages are calculated for the variables named in TABLES*/
    /*To cross more than one variable, write the variables separated by an asterisk*/
    /*The LIST option prints the output in list format; without it the output is a crosstab*/
    /*LIST is generally used when the frequency output will be used further as a dataset*/
    /*SAS excludes missing values by default. To include them, add the MISSING option to TABLES*/
run;

proc print data = freq_table;
RUN;

Obs,Sex,Age,COUNT,PERCENT
1,F,11,1,5.2632
2,F,12,2,10.5263
3,F,13,2,10.5263
4,F,14,2,10.5263
5,F,15,2,10.5263
6,M,11,1,5.2632
7,M,12,3,15.7895
8,M,13,1,5.2632
9,M,14,2,10.5263
10,M,15,2,10.5263
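
The LIST and MISSING options described in the comments above could be used as in the following minimal sketch. It was not part of the original run, and sashelp.class is not expected to contain missing values, so MISSING should not change the counts for this dataset.

```sas
/* Sketch: print the sex-by-age frequencies in list format */
/* and count missing values as a category of their own.    */
proc freq data = practice_dataset;
    tables sex*age / list missing;
run;
```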


PROC CONTENTS

1. HOW TO USE THE OUTPUT OF PROC CONTENTS AS A DATASET

In [5]:
PROC CONTENTS 
        DATA = practice_dataset 
        OUT = CONTENTS_DATASET 
        NOPRINT ;
/*SAVE THE CONTENTS TO A DATASET, THEN USE THAT DATASET TO WORK WITH THE VARIABLE NAMES*/

PROC PRINT DATA = CONTENTS_DATASET;
VAR NAME LENGTH ;
RUN;

Obs,NAME,LENGTH
1,Age,8
2,Height,8
3,Name,8
4,Sex,1
5,Weight,8
6,bmi,8
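
One way to put the saved contents to use is to build a variable list from it. The sketch below assumes CONTENTS_DATASET exists as created above and relies on its TYPE column (1 for numeric, 2 for character); the macro variable name numeric_vars is only an illustrative choice.

```sas
/* Collect the names of all numeric variables into a macro variable */
proc sql noprint;
    select name into :numeric_vars separated by ' '
    from contents_dataset
    where type = 1;
quit;

%put Numeric variables: &numeric_vars;

/* Use the generated list instead of typing the variable names by hand */
proc means data = practice_dataset;
    var &numeric_vars;
run;
```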


PROC DATASETS
1. copy SAS files from one library to another
2. append, rename, delete and list SAS files in a SAS library
3. modify attributes of SAS datasets and variables within datasets

No example is run here. PROC DATASETS is mainly used for file handling when a UI is not available.
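
For reference, the operations listed above might look like the following sketch, which is not executed in this notebook. The library name archive, its path and the new dataset name sex_age_freq are hypothetical; the directory is assumed to exist and to be writable.

```sas
libname archive "/tmp/archive";   /* hypothetical path */

/* Copy two datasets from WORK to the ARCHIVE library */
proc datasets library = work nolist;
    copy out = archive;
        select practice_dataset freq_table;
    run;
quit;

/* Work on the copies so the WORK datasets stay untouched */
proc datasets library = archive nolist;
    change freq_table = sex_age_freq;       /* rename a dataset */
    modify practice_dataset;                /* change a variable attribute */
        label bmi = "Body mass index";
    run;
    contents data = _all_ nods;             /* list the members of the library */
    run;
    delete sex_age_freq;                    /* delete a dataset */
    run;
quit;
```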

PROC FORMAT
1. PROC FORMAT is used to create user-defined formats.
2. The formats can be numeric or character.
3. Useful for grouping the values of a variable.

This is generally an alternative to the IF statements used to create categorical variables in data steps.
The benefit of this procedure is that a format is defined only once and can then be applied to various datasets, unlike IF statements, which must be rewritten in each data step.

In [6]:
/*Let us define BMI categories as character labels*/

proc format;
     value bmi_ind
         LOW - 16   = "GONNA DIE"
         16 <- 22   = "SHALL LIVE"      /*<- excludes the lower bound, so 16 belongs to the first range only*/
         22 <- HIGH = "WILL REIGN"
    ;
run;

data practice_dataset_1;
     set practice_dataset;
     bmi_indicator = put(bmi,bmi_ind.);     /*Remember that a format name must be followed by a . (dot); without it SAS reports an error*/
run;

proc print data = practice_dataset_1 (obs = 5);
run;

Obs,Name,Sex,Age,Height,Weight,bmi,bmi_indicator
1,Alfred,M,14,69.0,112.5,16.61,SHALL LIVE
2,Henry,M,14,63.5,102.5,17.87,SHALL LIVE
3,James,M,12,57.3,83.0,17.77,SHALL LIVE
4,Jeffrey,M,13,62.5,84.0,15.12,GONNA DIE
5,John,M,12,59.0,99.5,20.09,SHALL LIVE
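
Because the format is defined independently of any dataset, it can also be applied directly inside a procedure without creating a new variable. The following is a minimal sketch, assuming the bmi_ind format defined above is still available in the session.

```sas
/* Group the BMI values by their formatted category on the fly */
proc freq data = practice_dataset;
    tables bmi;
    format bmi bmi_ind.;
run;
```

PROC FREQ counts observations by formatted value, so the table should show one row per BMI category rather than one row per distinct BMI.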


PROC EXPORT
1. Export SAS data to flat files (CSV, Text Delimited Files)

Generic Syntax

PROC EXPORT DATA=<dataset>
    OUTFILE="filename"
    <DBMS=identifier>;
RUN;
If DBMS= dlm , then also specify
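
As a concrete illustration of the generic syntax, the sketch below exports the practice dataset twice. The output paths are hypothetical and assume a writable /tmp directory; REPLACE overwrites an existing file, and the DELIMITER statement is one way to set the separator for a delimited file.

```sas
/* Comma-separated file */
proc export data = practice_dataset
    outfile = "/tmp/practice_dataset.csv"   /* hypothetical path */
    dbms = csv
    replace;
run;

/* Pipe-delimited text file */
proc export data = practice_dataset
    outfile = "/tmp/practice_dataset.txt"   /* hypothetical path */
    dbms = dlm
    replace;
    delimiter = '|';
run;
```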

PROC CHART

1. Syntax:

proc chart data = practice_dataset;
    vbar height;   /*vertical bar chart*/
    hbar weight;   /*horizontal bar chart*/
    pie sex;       /*pie chart*/
run;

2. Not run here, as the output charts are too large to accommodate in the ipynb file format

PROC REPORT

1. OUTPUTS DATA AS REPORTS

In [7]:
proc report data=sashelp.class nowindows headline headskip;
Columns name age sex height weight;   /*BMI is not listed here, so the computed column defined below does not appear in the output*/
/*With DEFINE, COMPUTE and BREAK we can do the following things:
Rename a variable in the output
Group the final output by a variable
Calculate the mean/median/mode etc. of a variable using the ANALYSIS keyword
Compute a variable using the COMPUTE and ENDCOMP keywords
Create a break in the report output using BREAK AFTER
Format the summary line with a double overline (DOL) and double underline (DUL)
*/
Define name / display "Name_of_Staff " width = 7;
Define sex /group "Staff_Gender " width = 9;
Define age / analysis mean "Age_of_Staff " width = 5;
Define Height / analysis mean "Height_of_Staff " format= 7.2;
Define weight / analysis mean "Weight_of_Staff " format =7.2;
Define BMI /computed format = 7.2;
Compute BMI ;
BMI = weight.mean/(height.mean**2)*703;   /*same BMI formula as in the first cell*/
Endcomp ;
Break after sex/skip summarize DOL DUL;
Run;

Name_of_Staff,Age_of_Staff,Staff_Gender,Height_of_Staff,Weight_of_Staff
Alice,13.0,F,56.5,84.0
Barbara,13.0,,65.3,98.0
Carol,14.0,,62.8,102.5
Jane,12.0,,59.8,84.5
Janet,15.0,,62.5,112.5
Joyce,11.0,,51.3,50.5
Judy,14.0,,64.3,90.0
Louise,12.0,,56.3,77.0
Mary,15.0,,66.5,112.0
,13.222222,F,60.59,90.11


PROC TABULATE

1.  GENERATE A PROFESSIONAL LOOKING TABLE

In [10]:
PROC TABULATE DATA = PRACTICE_DATASET OUT = TABULATE_OUTPUT ;
CLASS SEX AGE;    /*CLASSIFICATION VARIABLES ON WHICH THE REPORT IS BASED*/
VAR HEIGHT   ;    /*ANALYSIS VARIABLE FOR THE CROSSTAB CELLS; IT IS NOT USED IN THE TABLE BELOW, SO THE CELLS SHOW COUNTS (N)*/
TABLE SEX*AGE;    /*VARIABLES WHICH FORM THE CROSSTAB COLUMNS*/
RUN;

Sex,Sex,Sex,Sex,Sex,Sex,Sex,Sex,Sex,Sex,Sex
F,F,F,F,F,M,M,M,M,M,M
Age,Age,Age,Age,Age,Age,Age,Age,Age,Age,Age
11,12,13,14,15,11,12,13,14,15,16
N,N,N,N,N,N,N,N,N,N,N
1,2,2,2,2,1,3,1,2,2,1
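
The table above only shows counts because HEIGHT is declared in VAR but never used in TABLE. A minimal sketch of how the analysis variable could fill the cells, with SEX as rows and AGE as columns, is given below; it was not part of the original run.

```sas
/* Mean height for each sex (rows) and age (columns) */
proc tabulate data = practice_dataset;
    class sex age;
    var height;
    table sex, age*height*mean;
run;
```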
