# SAS Code Structure

## Example SAS Code

In [1]:
/*
This code creates a dataset called "example" and prints out the SAS table 
*/

data example; 
    input name $ age height; 
    datalines;
Alice 30 5.5
Bob 25 6.1
; 
run; 

proc print data=example; 
run;

Obs,name,age,height
1,Alice,30,5.5
2,Bob,25,6.1


## Structure of a SAS Program

A SAS program consists of a sequence of steps. Each step is either a DATA step or a PROC step, and a program can combine any number of both.

- DATA steps usually read, process, or create data. A DATA step can also perform a variety of data manipulations, including filtering rows, computing new columns, and joining tables.
  - These steps always begin with the SAS keyword DATA.
- PROC (procedure) steps are where reporting, managing, and analyzing the data occur.
  - These steps always begin with the SAS keyword PROC.

The example code above has 2 steps: 1 DATA step and 1 PROC step. Most steps end with a RUN statement, while some procedures end with a QUIT statement instead.

Most SAS statements (not steps) begin with an identifying keyword, and ALL statements end with a semicolon. A step is usually a sequence of SAS statements.
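
The difference between steps that end with RUN and steps that end with QUIT can be sketched as follows. This is a minimal illustration that reuses the `example` table created above; the table name `adults` is hypothetical, and PROC SQL is used here only as one well-known procedure that ends with QUIT.

```sas
/* DATA step: ends with RUN */
data adults;               /* adults is a hypothetical table name */
    set example;
    where age >= 30;
run;

/* PROC step: ends with RUN */
proc print data=adults;
run;

/* PROC SQL step: ends with QUIT instead of RUN */
proc sql;
    select name, age
    from example
    where age >= 30;
quit;
```

Every line that ends with a semicolon is one statement, so the DATA step above contains 4 statements.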

### Comments 

Multi-line comments start with `/*` and end with `*/`; everything between the two is commented out. 
A single-line comment starts with `*` and, like a SAS statement, ends with a semicolon.

To comment or uncomment a line, you can also highlight the code and press Ctrl + /. 
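
Both comment styles can be sketched as below, assuming the `example` table from the first code cell exists.

```sas
/* Block comment: can span
   several lines */

* Statement comment: starts with an asterisk and ends at the semicolon;

proc print data=example; /* a block comment can also follow code on the same line */
run;
```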

In [8]:
/* This code sets the sashelp.class table into a dataset called myclass and calculates height in cm 
while also computing its summary statistics */

data myclass;
    set sashelp.class;
    heightcm=height*2.54;
run;

proc print data=myclass;
run;

proc means data=myclass;
    var age heightcm;
run;

Obs,Name,Sex,Age,Height,Weight,heightcm
1,Alfred,M,14,69.0,112.5,175.26
2,Alice,F,13,56.5,84.0,143.51
3,Barbara,F,13,65.3,98.0,165.862
4,Carol,F,14,62.8,102.5,159.512
5,Henry,M,14,63.5,102.5,161.29
6,James,M,12,57.3,83.0,145.542
7,Jane,F,12,59.8,84.5,151.892
8,Janet,F,15,62.5,112.5,158.75
9,Jeffrey,M,13,62.5,84.0,158.75
10,John,M,12,59.0,99.5,149.86

Variable,N,Mean,Std Dev,Minimum,Maximum
Age,19,13.3157895,1.4926722,11.0000000,16.0000000
heightcm,19,158.3355789,13.0227711,130.3020000,182.8800000


This new example code has 3 steps: 1 DATA step and 2 PROC steps. The whole program has 9 statements (the `/* ... */` comment is not counted). The only statement that does not begin with an identifying keyword is the assignment statement that creates heightcm.

### General SAS Statement Rules 
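- All SAS statements must end with a semicolon. The raw data lines that follow a DATALINES statement are not statements and do not take one.
- Most SAS statements begin with a SAS keyword.
- SAS statements are not case-sensitive (text inside quotation marks and data values are).
- Words are usually separated by blanks or special characters.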
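
The rules above can be illustrated with one step written three ways. This is a small sketch that assumes the `example` table from the first code cell exists; SAS should treat all three versions the same.

```sas
/* Lowercase, all on one line */
proc print data=example; run;

/* Uppercase keywords and names */
PROC PRINT DATA=EXAMPLE; RUN;

/* One statement split across lines; it still ends at the semicolon */
proc print
    data=example;
run;
```

Because a statement ends at its semicolon rather than at the end of the line, spacing and line breaks are a matter of readability.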

### SAS Table
A SAS table is structured data with defined columns and rows. You can think of it as an Excel sheet or an R data frame. The file extension of a SAS table is usually `.sas7bdat`. A SAS table has 2 parts: a **descriptor** portion and a **data** portion. 
- The descriptor portion contains the attributes of the table, including the table name, the number of rows, and the date and time the table was created.
- The data portion contains the data values in the columns.

SAS Tables have the following column attributes:
- **name**
- **type**
- **length**
 <img src="_static/sas_col.png" width = "700">

Column length is the number of bytes allocated for a column's values, and it depends on the column type. Numeric columns default to 8 bytes (about 16 significant digits), and character columns can be between 1 and 32,767 bytes. For an existing dataset, you can view the column attributes by running `proc contents data=datasetname; run;`, where `datasetname` is the name of your table.
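
As a sketch of how column attributes can be inspected and set, the code below runs PROC CONTENTS on the built-in `sashelp.class` table and then uses a LENGTH statement to give a character column room for longer values. The table name `example2` and its data values are made up for illustration.

```sas
/* Show the descriptor portion and column attributes of a built-in table */
proc contents data=sashelp.class;
run;

/* Set a character column's length (in bytes) before the column is read */
data example2;               /* example2 is a hypothetical table name */
    length name $ 20;
    input name $ age height;
    datalines;
Alexandria 30 5.5
Bob 25 6.1
;
run;

/* name should now be listed as Char with length 20 */
proc contents data=example2;
run;
```

Without the LENGTH statement, a character column read this way gets a default length of 8 bytes, so a 10-character name such as `Alexandria` would be truncated.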

## SAS Libraries

A SAS library is a collection of SAS files that are stored in the same folder or directory on your computer.
A library can be created with the LIBNAME statement. It does not need a RUN statement at the end since it is a global statement (more on that later!). **Note: A library reference is temporary and is removed at the end of every session, so the LIBNAME statement must be rerun in each new session. The files in the folder are not deleted.**

### Importing Overwatch Dataset into a Library
After uploading the Overwatch dataset to the SAS Studio session, we can create the MA505 library with the following code: 

In [None]:
/* creates library called ma505 */
libname ma505 "/home/u63936157";

This code creates a library called `ma505` that points to my SAS Studio home folder, the same folder where I uploaded the `overwatch_stats.csv` dataset. To import the Overwatch dataset into the `ma505` library, we will use a `proc import` step. 

### Importing a CSV

In [None]:
/* imports overwatch csv into ma505 library */
proc import datafile="/home/u63936157/overwatch_stats.csv" dbms=csv 
		out=ma505.overwatch;
run;

After running the code above, every time we need the `overwatch_stats` dataset, we can refer to it as `ma505.overwatch` instead of writing out its file path.
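
Once the import has run, the two-level name can be used anywhere a table name is expected. The sketch below assumes the import succeeded; the columns depend on the contents of the CSV, so none are named here.

```sas
/* List the column names, types, and lengths of the imported table */
proc contents data=ma505.overwatch;
run;

/* Print only the first 5 rows; OBS= is a data set option */
proc print data=ma505.overwatch(obs=5);
run;
```

Checking the PROC CONTENTS output after an import is a quick way to confirm that each column was read with the expected type.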

### Importing an Excel File
In a similar manner, we can read Excel files through a library using the code structure below, where 
- `lib` is the name of the library you would like to create
- `XLSX` is the engine that reads Excel files
- `path/file.xlsx` is the file you are importing

In [None]:
LIBNAME lib XLSX "path/file.xlsx";
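
A possible way to work with such a library is sketched below. The path is the same placeholder as above, `Sheet1` is a hypothetical worksheet name, and `sheet1_copy` is a hypothetical table name; the sketch also assumes the XLSX engine is available in your SAS installation.

```sas
/* Assign the library; replace the placeholder path with your own file */
libname lib xlsx "path/file.xlsx";

/* List the worksheets that the library exposes as tables */
proc contents data=lib._all_ nods;
run;

/* Copy one worksheet into a WORK table; Sheet1 is a hypothetical sheet name */
data work.sheet1_copy;
    set lib.Sheet1;
run;

/* Clear the library reference when finished with the file */
libname lib clear;
```

With this approach each worksheet is read as its own table, referenced as `lib.sheetname`.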