# Lecture 5: Creating List Reports

# Basic Reports

We have already used the PRINT procedure to print the contents (variables and observations) of a SAS data set. Using the PRINT procedure, we can create quite informative list reports. The PRINT procedure takes the following general form:

       PROC PRINT options;
           statement1;
           statement2;
           etc;
       RUN;
       
As usual, throughout this lesson, we'll look at some examples in order to learn about the various options and features of the PRINT procedure.

In [1]:
OPTIONS LS = 75 PS = 58 NODATE;

DATA basic;
  input subj 1-4 name $ 6-23 clinic $ 25-28 
        gender 30 no_vis 32-33 type_vis 35-37
        expense 39-45;
  DATALINES;
1024 Alice Smith        LEWN 1  7 101 1001.98
1167 Maryann White      LEWN 1  2 101 2999.34
1168 Thomas Jones       ALTO 2 10 190 3904.89
1201 Benedictine Arnold ALTO 2  1 190 1450.23
1302 Felicia Ho         MNMC 1  7 190 1209.94
1471 John Smith         MNMC 2  6 187 1763.09
1980 Jane Smiley        MNMC 1  5 190 3567.00
  ;
RUN;

PROC PRINT data = basic;
RUN;

Obs,subj,name,clinic,gender,no_vis,type_vis,expense
1,1024,Alice Smith,LEWN,1,7,101,1001.98
2,1167,Maryann White,LEWN,1,2,101,2999.34
3,1168,Thomas Jones,ALTO,2,10,190,3904.89
4,1201,Benedictine Arnold,ALTO,2,1,190,1450.23
5,1302,Felicia Ho,MNMC,1,7,190,1209.94
6,1471,John Smith,MNMC,2,6,187,1763.09
7,1980,Jane Smiley,MNMC,1,5,190,3567.0


By default: 

- all observations and variables in the data set are printed, 
- a column for observation numbers appears on the far left, and 
- variables appear in the order in which they occur in the data set.

# VAR Statement

By default, the PRINT procedure lists all of the variables contained in a SAS data set. We can use the PRINT procedure's VAR statement to not only select variables, but also to control the order in which the variables appear in our reports.

In [2]:
PROC PRINT data = basic;
   var name no_vis expense;
RUN;


Obs,name,no_vis,expense
1,Alice Smith,7,1001.98
2,Maryann White,2,2999.34
3,Thomas Jones,10,3904.89
4,Benedictine Arnold,1,1450.23
5,Felicia Ho,7,1209.94
6,John Smith,6,1763.09
7,Jane Smiley,5,3567.0


# NOOBS Option

Using the NOOBS option, we can suppress the printing of the default observation number column.

In [3]:
PROC PRINT data = basic noobs;
   var name no_vis expense;
RUN;


name,no_vis,expense
Alice Smith,7,1001.98
Maryann White,2,2999.34
Thomas Jones,10,3904.89
Benedictine Arnold,1,1450.23
Felicia Ho,7,1209.94
John Smith,6,1763.09
Jane Smiley,5,3567.0


# Identifying Observations: ID Statement

Using the ID statement, we can emphasize one or more key variables. The ID statement, which automatically suppresses the printing of the observation number, tells SAS to print the variable(s) specified in the ID statement as the first column(s) of your output. Thus, the ID statement allows you to use the values of the variables to identify observations, rather than the (usually meaningless) observation number. The following SAS code illustrates the use of the ID statement:

In [4]:
PROC PRINT data = basic;
   id name;
   var gender expense;
RUN;


name,gender,expense
Alice Smith,1,1001.98
Maryann White,1,2999.34
Thomas Jones,2,3904.89
Benedictine Arnold,2,1450.23
Felicia Ho,1,1209.94
John Smith,2,1763.09
Jane Smiley,1,3567.0


# Selecting Observations

By default, the PRINT procedure displays all of the observations in a SAS data set. You can control which observations are printed by:

- using the FIRSTOBS= and OBS= options to tell SAS which range of observation numbers to print
- using the WHERE statement to print only those observations that meet a certain condition

In [5]:
OPTIONS LS = 75 PS = 58 NODATE;

PROC PRINT data = basic (FIRSTOBS = 2 OBS = 5);
   var subj name no_vis expense;
RUN;

Obs,subj,name,no_vis,expense
2,1167,Maryann White,2,2999.34
3,1168,Thomas Jones,10,3904.89
4,1201,Benedictine Arnold,1,1450.23
5,1302,Felicia Ho,7,1209.94


### Note on the FIRSTOBS= and OBS= options
The FIRSTOBS= option tells SAS the first observation to print, and the OBS= option tells SAS the last observation to print. Because both are data set options, they must be placed in parentheses, and the parentheses must immediately follow the data set name given in the DATA= option. You will get a syntax error if you try to use the options without also using the DATA= option. (Incidentally, if you don't use the DATA= option to tell SAS which data set to print, SAS will print the most recently created data set.)
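
As a small sketch that is not part of the original lecture programs, the two options can also be used one at a time. OBS= alone stops printing after the stated observation, and FIRSTOBS= alone starts at the stated observation and runs to the end of the data set. The sketch assumes the basic data set created earlier.

```sas
/* Print observations 1 through 3 only */
PROC PRINT data = basic (OBS = 3);
   var subj name;
RUN;

/* Print observation 6 through the last observation */
PROC PRINT data = basic (FIRSTOBS = 6);
   var subj name;
RUN;
```

With the seven observations in basic, the first step should list observations 1 to 3 and the second should list observations 6 and 7.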

In [6]:
PROC PRINT data = basic;
   var name no_vis type_vis expense;
   where no_vis > 5;
RUN;

Obs,name,no_vis,type_vis,expense
1,Alice Smith,7,101,1001.98
3,Thomas Jones,10,190,3904.89
5,Felicia Ho,7,190,1209.94
6,John Smith,6,187,1763.09


### Note on WHERE statement in PRINT procedure

- Only one WHERE statement takes effect in each PRINT procedure; if you specify more than one, the last one replaces the earlier ones.
- You can specify any variable in your SAS data set in a WHERE statement, not just the variables that appear in the VAR statement.
- The WHERE statement works for both character and numeric variables. To specify a condition based on the value of a character variable, enclose the value in quotation marks and match the case of the stored value exactly.
- Any of the comparison operators (eq, ne, gt, lt, ge, le) and logical operators (and, or, not) that you learned for if-then-else statements can be used in a WHERE statement.
- You can also use the CONTAINS operator to select observations whose character value includes the specified substring.
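
To make the points about character values and combined conditions concrete, here is a minimal sketch. It assumes the basic data set from the first example; the particular condition is chosen only for illustration.

```sas
/* Character values go in quotes and must match the stored case exactly. */
/* Two conditions are joined with the logical operator AND.              */
PROC PRINT data = basic;
   var name clinic expense;
   where clinic = 'MNMC' and expense gt 1500;
RUN;
```

With the data shown above, this should select John Smith and Jane Smiley; Felicia Ho is also at MNMC, but her expense is below 1500.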


In [7]:
PROC PRINT data = basic;
   var name gender no_vis type_vis expense;
   where name contains 'Smi';
RUN;

Obs,name,gender,no_vis,type_vis,expense
1,Alice Smith,1,7,101,1001.98
6,John Smith,2,6,187,1763.09
7,Jane Smiley,1,5,190,3567.0


# Sorting Data

By default, the PRINT procedure displays observations in the order in which they appear in your data set. Alternatively, you can use the SORT procedure to first sort your data set based on the values of one or more variables. Then, when you use the PRINT procedure, SAS will display the observations in the order in which you sorted the data.

In [8]:
PROC SORT data = basic out = srtd_basic;
   by clinic no_vis;
RUN;

PROC PRINT data = srtd_basic NOOBS;
   var clinic no_vis subj name gender type_vis expense;
RUN;

clinic,no_vis,subj,name,gender,type_vis,expense
ALTO,1,1201,Benedictine Arnold,2,190,1450.23
ALTO,10,1168,Thomas Jones,2,190,3904.89
LEWN,2,1167,Maryann White,1,101,2999.34
LEWN,7,1024,Alice Smith,1,101,1001.98
MNMC,5,1980,Jane Smiley,1,190,3567.0
MNMC,6,1471,John Smith,2,187,1763.09
MNMC,7,1302,Felicia Ho,1,190,1209.94


### Note about SORT procedure
- The SORT procedure's BY statement is required; its OUT= option is optional. If you omit OUT=, however, the SORT procedure replaces the data set specified in the DATA= option with the sorted version. Therefore, if you need your data sorted only to produce a particular report, use the OUT= option with a temporary SAS data set name.

- By default, SAS sorts the values of the variables appearing in the BY statement in ascending order. To sort a variable in descending order, place the BY statement's DESCENDING option immediately before that variable's name. DESCENDING applies only to the variable that follows it, so it must be repeated for each variable that should be sorted in descending order.

In [9]:
PROC SORT data = basic out = srtd_basic;
   by descending clinic no_vis;
RUN;

PROC PRINT data = srtd_basic NOOBS;
   var clinic no_vis subj name gender type_vis expense;
RUN;

clinic,no_vis,subj,name,gender,type_vis,expense
MNMC,5,1980,Jane Smiley,1,190,3567.0
MNMC,6,1471,John Smith,2,187,1763.09
MNMC,7,1302,Felicia Ho,1,190,1209.94
LEWN,2,1167,Maryann White,1,101,2999.34
LEWN,7,1024,Alice Smith,1,101,1001.98
ALTO,1,1201,Benedictine Arnold,2,190,1450.23
ALTO,10,1168,Thomas Jones,2,190,3904.89


In [10]:
PROC SORT data = basic out = srtd_basic;
   by descending clinic descending no_vis;
RUN;

PROC PRINT data = srtd_basic NOOBS;
   var clinic no_vis subj name gender type_vis expense;
RUN;

clinic,no_vis,subj,name,gender,type_vis,expense
MNMC,7,1302,Felicia Ho,1,190,1209.94
MNMC,6,1471,John Smith,2,187,1763.09
MNMC,5,1980,Jane Smiley,1,190,3567.0
LEWN,7,1024,Alice Smith,1,101,1001.98
LEWN,2,1167,Maryann White,1,101,2999.34
ALTO,10,1168,Thomas Jones,2,190,3904.89
ALTO,1,1201,Benedictine Arnold,2,190,1450.23


# Column Totals

There may be situations in which you want SAS to calculate and present column totals for some of the numeric variables appearing in your reports. In that case, you'll want to take advantage of the SUM statement. We'll investigate the use of the statement here.

In [11]:
PROC PRINT data = basic;
   id name;
   var clinic no_vis;
   where type_vis = 190;
   sum no_vis;
RUN;

name,clinic,no_vis
Thomas Jones,ALTO,10
Benedictine Arnold,ALTO,1
Felicia Ho,MNMC,7
Jane Smiley,MNMC,5
,,23


- There may be situations in which you want not just column totals, but also column subtotals. Using the PRINT procedure's BY statement, you can tell SAS to print observations in groups based on the values of the different BY variables. When a SUM statement is specified in the presence of a BY statement, SAS produces subtotals each time the value of a BY variable changes.
- As with any BY statement in a DATA step or procedure, the data set must first be sorted by the variables specified in the BY statement. If it is not, the step halts, and SAS prints a message in the log indicating that the data set is not properly sorted.

In [12]:
PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;

PROC PRINT data = srtd_basic;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
RUN;

Obs,subj,name,no_vis,type_vis,expense
1,1168,Thomas Jones,10,190,3904.89
2,1201,Benedictine Arnold,1,190,1450.23
clinic,,,,,5355.12

Obs,subj,name,no_vis,type_vis,expense
3,1024,Alice Smith,7,101,1001.98
4,1167,Maryann White,2,101,2999.34
clinic,,,,,4001.32

Obs,subj,name,no_vis,type_vis,expense
5,1302,Felicia Ho,7,190,1209.94
6,1471,John Smith,6,187,1763.09
7,1980,Jane Smiley,5,190,3567.0
clinic,,,,,6540.03
,,,,,15896.47


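If the SORT step is skipped and the BY statement is applied to the unsorted basic data set, SAS prints only the first BY group (LEWN) and then stops with an error in the log, because the ALTO observations follow the LEWN observations out of order:

In [13]: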
PROC PRINT data = basic;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
RUN;

Obs,subj,name,no_vis,type_vis,expense
1,1024,Alice Smith,7,101,1001.98
2,1167,Maryann White,2,101,2999.34


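When the same variable appears in both the BY and ID statements, the Obs column is dropped and the value of that variable identifies each BY group and its subtotal in the leftmost column. The UNIFORM option asks SAS to use the same column widths on every page of the report:

In [14]: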
PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;

PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
   id clinic;
RUN;

clinic,subj,name,no_vis,type_vis,expense
ALTO,1168,Thomas Jones,10,190,3904.89
ALTO,1201,Benedictine Arnold,1,190,1450.23
ALTO,,,,,5355.12

clinic,subj,name,no_vis,type_vis,expense
LEWN,1024,Alice Smith,7,101,1001.98
LEWN,1167,Maryann White,2,101,2999.34
LEWN,,,,,4001.32

clinic,subj,name,no_vis,type_vis,expense
MNMC,1302,Felicia Ho,7,190,1209.94
MNMC,1471,John Smith,6,187,1763.09
MNMC,1980,Jane Smiley,5,190,3567.0
MNMC,,,,,6540.03
,,,,,15896.47


# PAGEBY Statement 

The PAGEBY statement tells SAS to print the data for each clinic on a separate page. Note that the variable that is specified in the PAGEBY statement must also be specified in the PRINT procedure's BY statement.

In [15]:
PROC SORT data = basic out = srtd_basic;
  by clinic;
RUN;

PROC PRINT data = srtd_basic UNIFORM;
   by clinic;
   var subj name no_vis type_vis expense;
   sum expense;
   id clinic;
   pageby clinic;
RUN;

clinic,subj,name,no_vis,type_vis,expense
ALTO,1168,Thomas Jones,10,190,3904.89
ALTO,1201,Benedictine Arnold,1,190,1450.23
ALTO,,,,,5355.12

clinic,subj,name,no_vis,type_vis,expense
LEWN,1024,Alice Smith,7,101,1001.98
LEWN,1167,Maryann White,2,101,2999.34
LEWN,,,,,4001.32

clinic,subj,name,no_vis,type_vis,expense
MNMC,1302,Felicia Ho,7,190,1209.94
MNMC,1471,John Smith,6,187,1763.09
MNMC,1980,Jane Smiley,5,190,3567.0
MNMC,,,,,6540.03
,,,,,15896.47


# Output Appearance

So far, we've focused on how to alter the content and structure of our PRINT procedure's output. Now, we'll focus a bit on how to "prettify" our output using TITLE and FOOTNOTE statements and the DOUBLE option.

In [16]:
OPTIONS LS = 72 PS = 20 NODATE NONUMBER;

PROC PRINT data = basic;
   title 'Our BASIC Data Set';
   footnote1 'Clinic: ALTO = Altoona,  LEWN = Lewistown,  MNMC = Mount Nittany';
   footnote3 'Type_vis: 101 = Gynecology, 190 = Physical Therapy, 187 = Cardiology';
   footnote10 'Gender: 1 = female,  2 = male';
RUN;

footnote;

Obs,subj,name,clinic,gender,no_vis,type_vis,expense
1,1024,Alice Smith,LEWN,1,7,101,1001.98
2,1167,Maryann White,LEWN,1,2,101,2999.34
3,1168,Thomas Jones,ALTO,2,10,190,3904.89
4,1201,Benedictine Arnold,ALTO,2,1,190,1450.23
5,1302,Felicia Ho,MNMC,1,7,190,1209.94
6,1471,John Smith,MNMC,2,6,187,1763.09
7,1980,Jane Smiley,MNMC,1,5,190,3567.0


# TITLE and FOOTNOTE Statements 

- In general, the TITLE and FOOTNOTE statements can appear anywhere in your code, as they are global statements. 
- They each work as a "toggle" statement: once you specify a title and footnote, they are used for all of the subsequent output your program generates until you define another title and footnote or cancel them with empty TITLE and FOOTNOTE statements. 
- The last footnote statement in the above code is an empty footnote statement that just "turns off" the previously specified footnotes.
- You can have up to ten titles and ten footnotes appearing in a single SAS program, each denoted by a number: title1, title2, ..., title10 and footnote1, footnote2, ..., footnote10. The number tells SAS on which of ten lines you'd like the title or footnote printed. The footnotes in the above program tell SAS to print the footnotes on the first, third, and tenth footnote line.
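
The following sketch shows numbered titles and the cancelling statements together. It assumes the basic data set from earlier; the second title text is made up for illustration.

```sas
/* Two title lines and one footnote line stay in effect until changed */
title1 'Our BASIC Data Set';
title2 'Patients with more than five visits';
footnote1 'Gender: 1 = female,  2 = male';

PROC PRINT data = basic;
   where no_vis > 5;
RUN;

/* Empty statements cancel all titles and footnotes currently in effect */
title;
footnote;
```

Placing the empty TITLE and FOOTNOTE statements after the step keeps these lines from carrying over into later output.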

# DOUBLE Option

If you want to make your output more readable by double-spacing it, you can use the PRINT procedure's DOUBLE option.
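
A minimal sketch of the option, assuming the basic data set from earlier. Note that DOUBLE is aimed at traditional listing output, so other output destinations such as HTML may show no difference.

```sas
/* DOUBLE writes a blank line between observations in listing output */
PROC PRINT data = basic DOUBLE;
   var name no_vis expense;
RUN;
```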

# Descriptive Labels: LABEL Statement

Your variable names may not be particularly meaningful to other people reading your reports. In that case, you can display descriptive labels in place of the variable names by using:
- a LABEL statement to assign a descriptive label to a variable, and
- the LABEL option in the PROC PRINT statement to specify that labels, rather than variable names, be displayed.

The LABEL statement can be placed either in a DATA step or directly in the PRINT procedure. 
- When you place the LABEL statement in a DATA step, the label gets permanently affixed to the variable and therefore is available for all subsequent procedures. That is, you permanently change the variable's label attribute. 
- When you place the LABEL statement directly in the PRINT procedure, the label is available for use only in the PRINT procedure in which it is specified.
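
The DATA step placement can be sketched as follows. The data set name basic_lbl is hypothetical and is used only so that the original basic data set is left unchanged.

```sas
/* Hypothetical copy of basic in which two labels are stored permanently */
DATA basic_lbl;
   set basic;
   label no_vis  = 'Number of Visits'
         expense = 'Expense';
RUN;

/* No LABEL statement is needed here; the LABEL option displays stored labels */
PROC PRINT data = basic_lbl LABEL;
   var name no_vis expense;
RUN;
```

Because the labels are stored with basic_lbl, any later procedure that uses labels can pick them up without repeating the LABEL statement.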

### IMPORTANT: By default, SAS does not print labels. You must use the LABEL option to tell it to do so.

In [17]:
PROC PRINT data = basic LABEL;
   label name = 'Name'
         no_vis = 'Number of Visits'
         type_vis = 'Type of Visit'
         expense = 'Expense';
   id name;
   var no_vis type_vis expense;
RUN;

Name,Number of Visits,Type of Visit,Expense
Alice Smith,7,101,1001.98
Maryann White,2,101,2999.34
Thomas Jones,10,190,3904.89
Benedictine Arnold,1,190,1450.23
Felicia Ho,7,190,1209.94
John Smith,6,187,1763.09
Jane Smiley,5,190,3567.0


# SPLIT Option: Splitting Label

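The SPLIT= option names a character at which SAS breaks a label into separate lines of the column heading; the split character itself is not printed. Specifying SPLIT= implies the LABEL option, so LABEL does not have to be specified as well. Here, the label 'Number of/Visits' is displayed with "Number of" on one line and "Visits" on the next:

In [18]: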
PROC PRINT data = basic SPLIT='/';
   label name = 'Name';
   label no_vis = 'Number of/Visits';
   label type_vis = 'Type of Visit';
   label expense = 'Expense';
   id name;
   var no_vis type_vis expense;
RUN;

Name,Number of Visits,Type of Visit,Expense
Alice Smith,7,101,1001.98
Maryann White,2,101,2999.34
Thomas Jones,10,190,3904.89
Benedictine Arnold,1,190,1450.23
Felicia Ho,7,190,1209.94
John Smith,6,187,1763.09
Jane Smiley,5,190,3567.0


# FORMAT Statement: Formatting Data Values

You might recall that informats are used to tell SAS how to read special data values into your SAS data sets, and formats are used to tell SAS how to display those special data values in your reports. As you might recall from your prior (but admittedly brief) work with dates, when SAS stores special data values, it doesn't necessarily store numbers that would be meaningful to a casual reader of your reports. As a result, you have to use a FORMAT statement to tell SAS to display the stored numbers in a way that is meaningful to you and your readers.

In [19]:
PROC PRINT data = basic LABEL;
   label name = 'Name'
         clinic = 'Clinic'
         expense = 'Expense';
   format expense dollar9.2;
   id name;
   var clinic expense;
RUN;

Name,Clinic,Expense
Alice Smith,LEWN,"$1,001.98"
Maryann White,LEWN,"$2,999.34"
Thomas Jones,ALTO,"$3,904.89"
Benedictine Arnold,ALTO,"$1,450.23"
Felicia Ho,MNMC,"$1,209.94"
John Smith,MNMC,"$1,763.09"
Jane Smiley,MNMC,"$3,567.00"


- The FORMAT statement tells SAS to associate, for the duration of the PRINT procedure, the dollar9.2 format with the expense variable. 
- The dollar9.2 format tells SAS to display the expense values using dollar signs, commas (when appropriate), and two decimal places. 
- The 9 tells SAS that it will need at most 9 spaces to accommodate each expense value: 1 for the dollar sign, 1 for the comma, 4 for the digits before the decimal point, 1 for the decimal point, and 2 for the digits after it.

# Most Commonly Used SAS Formats

In general, you can use a separate FORMAT statement for each variable, or you can format several variables in a single FORMAT statement. The list below shows how some of the most commonly used SAS formats display values:

- COMMAw.d: in w spaces, with commas and d decimal places
- DOLLARw.d: in w spaces, with a dollar sign, commas, and d decimal places
- MMDDYYw.: as date values of the form 10/03/08 (mmddyy8.) or 10/03/2008 (mmddyy10.)
- w.: rounded to the nearest integer in w spaces
- w.d: rounded to d decimal places in w spaces
- \$w.: as character values in w spaces
- DATEw.: as date values of the form 02OCT08 (date7.) or 02OCT2008 (date9.)
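
As a minimal sketch of formatting several variables in one FORMAT statement, the step below uses a hypothetical one-row data set; the data set name, variable names, and values are made up for illustration.

```sas
/* Hypothetical one-row data set with a date value and a numeric amount */
DATA fmt_demo;
   visit  = '02OCT2008'd;
   amount = 12345.678;
RUN;

/* One FORMAT statement assigns a format to each of the two variables */
PROC PRINT data = fmt_demo noobs;
   format visit mmddyy10. amount comma10.2;
RUN;
```

Here visit should display as 10/02/2008 and amount as 12,345.68. Without the FORMAT statement, visit would print as the number SAS stores for that date.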

Of course, you can find the other formats that are available using the SAS Help and Documentation.