### GWU STAT 4197/STAT 6197
##### Week 6 SAS Code Examples: Summarizing Data
[Williams, Christianna. (2015). PROC SQL for PROC SUMMARY Stalwarts.  SESUG](https://support.sas.com/resources/papers/proceedings15/3154-2015.pdf)

[Carpenter, A. L. (2010). The MEANS/SUMMARY Procedure: Getting Started. SAS Global Forum.](https://support.sas.com/resources/papers/proceedings10/135-2010.pdf)

[A Simple Proc Summary Example in SAS](https://sasnrd.com/proc-summary-sas-example/)

The following two code blocks (DATA step and PROC SQL) produce the same results. Other DATA step solutions exist but are not shown here.

In [1]:
options nocenter nodate nonumber;
*Count occurrences of dates by patient id;
data HAVE;
input pt_id $ vdate :$10.;
cards;
1      09/01/2017   
1      09/01/2017   
1      03/04/2018   
2      05/01/2017  
2      06/03/2017 
;
run;
proc sort data=HAVE;
by pt_id vdate;
run;
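
As a minimal sketch of the DATA step / PROC SQL pair described above, the two steps below count how often each date occurs per patient. The output data set names (`count_ds`, `count_sql`) and the variable `n_visits` are hypothetical names chosen for illustration; the code assumes `HAVE` has been created and sorted by `pt_id vdate` as in the cell above.

```sas
* DATA step: count rows within each pt_id/vdate group;
* (requires HAVE to be sorted by pt_id vdate);
data count_ds;
  set HAVE;
  by pt_id vdate;
  if first.vdate then n_visits = 0;  * reset at the start of each group;
  n_visits + 1;                      * sum statement: value is retained across rows;
  if last.vdate then output;         * one row per pt_id/vdate;
run;

* PROC SQL: the same counts using GROUP BY;
proc sql;
  create table count_sql as
  select pt_id, vdate, count(*) as n_visits
  from HAVE
  group by pt_id, vdate
  order by pt_id, vdate;
quit;
```

Because `vdate` is read as a character variable, both results are ordered by the character value of the date string rather than chronologically.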




##### PROC SUMMARY 


In [33]:
*Ex8_Sum_Retain_Do_Until.sas (Part 5);
options nocenter nodate nonumber;
** Summarize data using PROC SUMMARY;
proc summary data=have nway;
  var value;
  class id;  
  output out=want (drop =_type_ _freq_)
    sum=sum_value;
run;
title1 'Summarizing by Group - using PROC SUMMARY';
proc print data=want noobs; run;

ID,sum_value
1,90
2,80
3,60


In [40]:
*Ex8B_Collapse_multi_records.sas (Part 1);
options nocenter nodate nonumber;
data have;
 input id type value count;
 cards;
 1 1 32 2
 1 1 10 7
 1 2 20 10
 1 2 59 2
 1 3 54 1
 1 3 82 4
 1 4 68 2
 1 5 56 0
 2 1 52 8
 2 5 64 9
 2 3 76 6
 2 4 98 8
 2 2 39 9
 2 4 96 5
 3 1 58 2
 3 2 63 6
 3 4 72 3
 3 5 99 4
 3 3 37 1
 3 1 66 0
 ;
 title1 'Example Data Set';
proc print noobs; run;

id,type,value,count
1,1,32,2
1,1,10,7
1,2,20,10
1,2,59,2
1,3,54,1
1,3,82,4
1,4,68,2
1,5,56,0
2,1,52,8
2,5,64,9


###### PROC SUMMARY

* The procedure computes descriptive statistics, including:
    * N, MIN, MAX, MEAN, and STD (the defaults when no statistic is requested)
    * SUM and other statistics, when requested by keyword
* The OUTPUT statement with the OUT= option creates a SAS data set containing the summary statistics
* You can name more than one statistic on the OUTPUT statement
* Unlike PROC MEANS, it does not produce printed output by default
* It adds automatic variables to the output data set:
    * `_TYPE_`: Indicates which combination of the class variables was used to compute the statistic
    * `_FREQ_`: The number of observations that contribute to each output row
    * `_STAT_`: The name of the statistic in each row; created only when the OUTPUT statement names no statistic keyword
* When NWAY is specified, it outputs only the rows in which all class variables are used together (the highest `_TYPE_` value), omitting the overall and partial summaries
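
To make the automatic variables concrete, here is a small sketch that assumes the `have` data set from Ex8B (Part 1) is available. The output data set names `all_types` and `default_stats` are hypothetical. The first step omits NWAY so that every `_TYPE_` level appears; the second omits the statistic keyword so that `_STAT_` is created.

```sas
* Without NWAY: rows for every combination of class variables;
* _TYPE_=0 overall, 1 by type, 2 by id, 3 by id and type;
proc summary data=have;
  class id type;
  var value;
  output out=all_types sum=sum_value;
run;
proc print data=all_types noobs; run;

* No statistic keyword on OUTPUT: the default statistics are written;
* one row per statistic, identified by _STAT_;
proc summary data=have nway;
  class id;
  var value;
  output out=default_stats;
run;
proc print data=default_stats noobs; run;
```

Adding NWAY to the first step would keep only the `_TYPE_=3` rows, which is the form used in the examples in this section.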


In [41]:
*Ex8B_Collapse_multi_records.sas (Part 2);
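title1 'Aggregate the values of the numeric variables';
options nocenter nodate nonumber;
 * SUM= with no names keeps the original variable names (value, count);
 * DROP=_: removes the automatic variables _TYPE_ and _FREQ_;
 proc summary data=have nway missing;
   class id type;
   var value count;
   output out=sums(drop=_:) sum=;
 run;
 proc print data=sums noobs; run;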

id,type,value,count
1,1,42,9
1,2,79,12
1,3,136,5
1,4,68,2
1,5,56,0
2,1,52,8
2,2,39,9
2,3,76,6
2,4,194,13
2,5,64,9


In [42]:
 *Ex8B_Collapse_multi_records.sas (Part 3);
options nocenter nodate nonumber;
title1 'Transpose the Aggregated table';
 proc transpose data=sums out=trans
   prefix=type_;
   by id;
   id type;
 proc print data=trans noobs; run;

id,_NAME_,type_1,type_2,type_3,type_4,type_5
1,value,42,79,136,68,56
1,count,9,12,5,2,0
2,value,52,39,76,194,64
2,count,8,9,6,13,9
3,value,124,63,37,72,99
3,count,2,6,1,3,4


In [43]:
 *Ex8B_Collapse_multi_records.sas (Part 4);
options nocenter nodate nonumber;
title1 'Merge the Transposed data using two SET statements';
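 data want;
   * First SET reads the COUNT row, then the counts are totaled before type_1-type_5 are overwritten;
   set trans(where=(_name_='count'));
   count=sum(of type_:);
   * Second SET reads the matching VALUE row (rows align one-to-one by position);
   set trans(where=(_name_='value'));
   drop _name_;
 run;
 proc print data=want noobs; run;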


id,type_1,type_2,type_3,type_4,type_5,count
1,42,79,136,68,56,28
2,52,39,76,194,64,45
3,124,63,37,72,99,16
