### Analyzing MEPS-HC Data with SAS® 9.4 M6 
#### By Pradip K. Muhuri

## Exercise 1

#### Objective
* Generate the following estimates:
     * mean health care expenses per person
     * mean health care expenses per person with an expense (overall and by age group)

#### Data and Analysis
* Use the 2017 MEPS Full-Year Consolidated File
* Run PROC FREQ for data checks
* Run PROC SURVEYMEANS for complex survey estimates

### MEPS Full-Year Consolidated File, 2017

This is a person-level data file that includes annual variables such as:
* total annual healthcare expenditures by type of care
* payment source, and type of provider seen
* annual and monthly health insurance type indicators
* health conditions, healthcare access and utilization
* quality of care, patient satisfaction, and demographics

[Read more about the 2017 Full-Year Consolidated File here.](https://meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-201)

This file contains a total of 31,880 persons, each of whom was part of one of the two MEPS panels for which data were collected in 2017:

* 2017 portion of Round 3, Rounds 4 and 5 for Panel 21
* Rounds 1, 2 and the 2017 portion of Round 3 for Panel 22


## Why Use PROC SURVEYMEANS

The MEPS-HC uses a complex sample design that includes stratification, clustering, and oversampling. [Because of these complexities in the MEPS-HC sample design](https://meps.ahrq.gov/data_files/publications/mr33/mr33.shtml), we must specify the survey weight and the design variables in the PROC SURVEYMEANS step when estimating parameters for the U.S. civilian noninstitutionalized population.

[See SAS/STAT® 15.1 User’s Guide Introduction to Survey Sampling and Analysis Procedures](https://support.sas.com/documentation/onlinedoc/stat/151/introsamp.pdf)
```sas
PROC SURVEYMEANS DATA=WORK.PUF201;
   VAR TOTEXP17;
   STRATUM VARSTR;
   CLUSTER VARPSU;
   WEIGHT PERWT17F;
RUN;
```

In the example code above:

* The VAR statement identifies the variable to be analyzed.
* The STRATUM statement lists the variable that forms the strata.
* The CLUSTER statement specifies the cluster identification variable.
* The WEIGHT statement names the sampling weight variable.

Notes: If you do not specify any statistic keywords in the PROC SURVEYMEANS statement, the procedure computes the NOBS, MEAN, STDERR, and CLM statistics by default. If you list the statistic keywords you want and include SUM (the estimated population total when the appropriate sampling weights are used), the procedure also computes STD, shown in the output as the standard error of the sum, by default.
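
As a minimal sketch of the keyword behavior described in the notes, the step below lists the statistics explicitly instead of relying on the defaults. It assumes that WORK.PUF201 has already been created from the HC-201 file, as in the DATA step later in this exercise; the particular choice of keywords is only an illustration.

```sas
/* Sketch: request specific statistics instead of relying on the defaults. */
/* Assumes WORK.PUF201 exists (it is created in the DATA step below).      */
PROC SURVEYMEANS DATA=WORK.PUF201 MEAN STDERR CLM SUM STD CLSUM;
   VAR TOTEXP17;       /* analysis variable                       */
   STRATUM VARSTR;     /* variance estimation stratum             */
   CLUSTER VARPSU;     /* variance estimation PSU                 */
   WEIGHT PERWT17F;    /* person-level sampling weight            */
RUN;
```

Here CLM and CLSUM request confidence limits for the mean and for the total, respectively. Once any keyword is listed, only the requested statistics (plus STD when SUM is requested) are computed.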


The SAS program below, split into four separate code cells, generates estimates of national health care expenses for the civilian noninstitutionalized population.

### Code snippets for DATA Step
* Subset the number of variables
* Create new variables

```sas
options nocenter nodate nonumber;
ods html close;
proc datasets lib=work nolist kill; quit; /* delete all files in the WORK library */
libname CDATA "C:\DATA";
/* READ IN DATA FROM 2017 CONSOLIDATED DATA FILE (HC-201) */
DATA WORK.PUF201;
  SET CDATA.H201 (KEEP = TOTEXP17 AGELAST VARSTR VARPSU PERWT17F);
  /* Create a new TOTEXP17_X variable */
  TOTEXP17_x = TOTEXP17;
run;
```


### Code snippets for PROC FORMAT

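```sas
options nocenter nodate nonumber nosource;
ods html close;
PROC FORMAT;
  /* Age groups applied to AGELAST */
  VALUE AGECAT
        0-64 = '0-64'
       65-high = '65+';

  /* NOTE: the range 0-high includes 0, so persons with zero expenses are  */
  /* also labeled 'Some Expense'; a range of 0<-high would exclude them.   */
   VALUE totexp17_x
       0-high     = 'Some Expense'
       other      = 'None';
RUN;
```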
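
The objective lists PROC FREQ for data checks. A minimal sketch of such a check is shown here; it assumes WORK.PUF201 and the AGECAT and TOTEXP17_X formats from the preceding cells, and the title text is only illustrative.

```sas
/* Sketch of a data check: unweighted counts of the formatted categories.   */
/* Assumes WORK.PUF201 and the AGECAT. and TOTEXP17_X. formats exist.       */
TITLE 'UNWEIGHTED FREQUENCIES FOR DATA CHECKS';
PROC FREQ DATA=WORK.PUF201;
   /* LIST prints the cross-classification as one list; MISSING keeps       */
   /* missing values in the tables so that they are not silently dropped.   */
   TABLES TOTEXP17_X AGELAST TOTEXP17_X*AGELAST / LIST MISSING;
   FORMAT TOTEXP17_X TOTEXP17_X. AGELAST AGECAT.;
RUN;
TITLE; /* clear the title */
```

Because no WEIGHT statement is used, these are unweighted person counts for all records in the file, including records with nonpositive weights. They are useful for confirming that each formatted category contains the records you expect before running the survey procedure.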

### Code snippets for PROC SURVEYMEANS

```sas
options nocenter nodate nonumber;
ods html close;
ods graphics off; /* Suppress the graphics */
TITLE 'OVERALL EXPENSES';
PROC SURVEYMEANS DATA=WORK.PUF201 NOBS SUMWGT MEAN STDERR SUM;
    VAR TOTEXP17;
    STRATUM VARSTR;
    CLUSTER VARPSU;
    WEIGHT PERWT17F;
RUN;
```

| Data Summary | |
|---|---|
| Number of Strata | 282 |
| Number of Clusters | 621 |
| Number of Observations | 31880 |
| Number of Observations Used | 30716 |
| Number of Obs with Nonpositive Weights | 1164 |
| Sum of Weights | 324779909 |

Statistics

| Variable | Label | N | Sum of Weights | Mean | Std Error of Mean | Sum | Std Error of Sum |
|---|---|---|---|---|---|---|---|
| TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 30716 | 324779909 | 5305.562271 | 125.92792 | 1723140000000.0 | 46588497185 |
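
As a quick consistency check on the output above, the estimated total should equal the weighted mean multiplied by the sum of weights, provided TOTEXP17 has no missing values among the observations used. The sketch below simply redoes that arithmetic with the two numbers copied from the table; the variable names are hypothetical.

```sas
/* Sketch: check that the estimated total = weighted mean x sum of weights. */
/* Both inputs are copied from the Statistics table above.                  */
DATA _NULL_;
   MEAN_EXP  = 5305.562271;          /* weighted mean of TOTEXP17    */
   SUM_WGT   = 324779909;            /* sum of PERWT17F              */
   TOTAL_EXP = MEAN_EXP * SUM_WGT;   /* implied population total     */
   PUT 'Implied total expenses: ' TOTAL_EXP COMMA22.;
RUN;
```

The value written to the log should agree with the Sum column to the precision shown, that is, roughly 1.72 trillion dollars.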


#### Code explanation for the DOMAIN statement
TOTEXP17_X('Some Expense')*AGELAST in the code cell below indicates that only the results for the TOTEXP17_X='Some Expense' subpopulation are of interest here, reported for each category of AGELAST as grouped by the AGECAT format.
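
For comparison, a simpler single-variable domain request is sketched here. It assumes WORK.PUF201 and the AGECAT format from the earlier cells, and it is an illustration of the DOMAIN syntax rather than part of the original program. The DOMAIN statement is used instead of a BY statement because, according to the SAS/STAT documentation, BY groups are analyzed as completely separate samples and do not give a statistically valid domain analysis.

```sas
/* Sketch: estimates for every formatted age group (no expense restriction). */
/* Assumes WORK.PUF201 and the AGECAT. format from the earlier code cells.   */
ODS GRAPHICS OFF;
PROC SURVEYMEANS DATA=WORK.PUF201 NOBS MEAN STDERR;
   VAR TOTEXP17;
   STRATUM VARSTR;
   CLUSTER VARPSU;
   WEIGHT PERWT17F;
   DOMAIN AGELAST;            /* one domain per formatted value of AGELAST */
   FORMAT AGELAST AGECAT.;    /* domain levels follow the format's groups  */
RUN;
```

Adding a level in parentheses after a domain variable, as in the cell below, limits the displayed output to that level.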

```sas
options nocenter nodate nonumber ls=132;
TITLE 'MEAN EXPENSE PER PERSON WITH AN EXPENSE, FOR OVERALL, AGE 0-64, AND AGE 65+';
ods graphics off; /* Suppress the graphics */
ODS EXCLUDE STATISTICS; /* Do not generate output for the overall population */
PROC SURVEYMEANS DATA=WORK.PUF201 NOBS SUMWGT MEAN STDERR SUM;
    VAR TOTEXP17;
    STRATUM VARSTR;
    CLUSTER VARPSU;
    WEIGHT PERWT17F;
    DOMAIN TOTEXP17_X('Some Expense') TOTEXP17_X('Some Expense')*AGELAST;
    FORMAT TOTEXP17_X TOTEXP17_X. AGELAST agecat.;
RUN;
```

| Data Summary | |
|---|---|
| Number of Strata | 282 |
| Number of Clusters | 621 |
| Number of Observations | 31880 |
| Number of Observations Used | 30716 |
| Number of Obs with Nonpositive Weights | 1164 |
| Sum of Weights | 324779909 |

Statistics for TOTEXP17_x Domains

| TOTEXP17_x | Variable | Label | N | Sum of Weights | Mean | Std Error of Mean | Sum | Std Error of Sum |
|---|---|---|---|---|---|---|---|---|
| Some Expense | TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 30716 | 324779909 | 5305.562271 | 125.92792 | 1723140000000.0 | 46588497185 |

Statistics for TOTEXP17_x*AGELAST Domains

| TOTEXP17_x | AGELAST | Variable | Label | N | Sum of Weights | Mean | Std Error of Mean | Sum | Std Error of Sum |
|---|---|---|---|---|---|---|---|---|---|
| Some Expense | 0-64 | TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 25829 | 272065819 | 4104.254934 | 119.828968 | 1116627500000.0 | 36159659869 |
| | 65+ | TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 4887 | 52714089 | 11506.0 | 351.766501 | 606512548182.0 | 24232442940 |


```sas
options nocenter nodate nonumber;
ods html close;
proc datasets lib=work nolist kill; quit; /* delete all files in the WORK library */
libname CDATA "C:\DATA";
/* READ IN DATA FROM 2017 CONSOLIDATED DATA FILE (HC-201) */
DATA WORK.PUF201;
  SET CDATA.H201 (KEEP = TOTEXP17 OBVEXP17 OPTEXP17 ERTEXP17 IPTEXP17 RXEXP17
                         AGELAST VARSTR VARPSU PERWT17F);
  /* Create a new TOTEXP17_X variable */
  TOTEXP17_x = TOTEXP17;
run;
```

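```sas
options nocenter nodate nonumber nosource;
ods html close;
PROC FORMAT;
  /* Three age groups applied to AGELAST */
  VALUE AGECAT
       0-34 = '0-34'
       35-64 = '35-64'
       65-high = '65+';

  /* NOTE: the range 0-high includes 0, so persons with zero expenses are  */
  /* also labeled 'Some Expense'; a range of 0<-high would exclude them.   */
   VALUE totexp17_x
       0-high     = 'Some Expense'
       other      = 'None';
RUN;
```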

```sas
options nocenter nodate nonumber ls=132;
libname CDATA "C:\DATA";
TITLE 'MEAN EXPENSE PER PERSON WITH AN EXPENSE, FOR OVERALL, AGE 0-34, AGE 35-64, AND AGE 65+';
ods graphics off; /* Suppress the graphics */
ODS select domain; /* Generate output for domain only */
PROC SURVEYMEANS DATA=work.puf201;
    VAR TOTEXP17 OBVEXP17 OPTEXP17 ERTEXP17 IPTEXP17 RXEXP17;
    STRATUM VARSTR;
    CLUSTER VARPSU;
    WEIGHT PERWT17F;
    DOMAIN TOTEXP17_X('Some Expense') TOTEXP17_X('Some Expense')*AGELAST;
    FORMAT TOTEXP17_X TOTEXP17_X. AGELAST agecat.;
RUN;
```

Statistics for TOTEXP17_x Domains

| TOTEXP17_x | Variable | Label | N | Mean | Std Error of Mean | 95% CL for Mean (lower) | 95% CL for Mean (upper) |
|---|---|---|---|---|---|---|---|
| Some Expense | TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 30716 | 5305.562271 | 125.92792 | 5057.86376 | 5553.26078 |
| | OBVEXP17 | TOTAL OFFICE-BASED EXP 17 | 30716 | 1311.941694 | 36.212779 | 1240.71165 | 1383.17174 |
| | OPTEXP17 | TOTAL OUTPATIENT FAC + DR EXP 17 | 30716 | 465.383297 | 23.720744 | 418.72491 | 512.04168 |
| | ERTEXP17 | TOTAL ER FACILITY + DR EXP 17 | 30716 | 197.528578 | 7.24858 | 183.27072 | 211.78644 |
| | IPTEXP17 | TOT HOSP IP FACILITY + DR EXP 17 | 30716 | 1300.459866 | 79.885776 | 1143.32563 | 1457.59411 |
| | RXEXP17 | TOTAL RX-EXP 17 | 30716 | 1260.571022 | 43.095949 | 1175.80187 | 1345.34017 |
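
The 95% confidence limits for TOTEXP17 above can be reproduced by hand as the mean plus or minus a t critical value times the standard error. The sketch below assumes the default Taylor series degrees of freedom, the number of clusters minus the number of strata (621 - 282 = 339), taken from the Data Summary output; the variable names are hypothetical.

```sas
/* Sketch: rebuild the 95% confidence limits for the mean of TOTEXP17.     */
/* Mean and standard error are copied from the domain table above;         */
/* strata and cluster counts come from the Data Summary output.            */
DATA _NULL_;
   DF       = 621 - 282;                    /* clusters minus strata        */
   T_CRIT   = QUANTILE('T', 0.975, DF);     /* two-sided 95% critical value */
   MEAN_EXP = 5305.562271;
   SE       = 125.92792;
   LOWER    = MEAN_EXP - T_CRIT * SE;
   UPPER    = MEAN_EXP + T_CRIT * SE;
   PUT DF= T_CRIT= 8.4 LOWER= 12.2 UPPER= 12.2;
RUN;
```

The limits written to the log should be close to the tabulated 5057.86 and 5553.26; small differences can arise from rounding in the copied inputs.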

Statistics for TOTEXP17_x*AGELAST Domains

| TOTEXP17_x | AGELAST | Variable | Label | N | Mean | Std Error of Mean | 95% CL for Mean (lower) | 95% CL for Mean (upper) |
|---|---|---|---|---|---|---|---|---|
| Some Expense | 0-34 | TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 14225 | 2503.337764 | 129.826867 | 2247.9701 | 2758.7055 |
| | | OBVEXP17 | TOTAL OFFICE-BASED EXP 17 | 14225 | 691.992413 | 25.014473 | 642.7893 | 741.1955 |
| | | OPTEXP17 | TOTAL OUTPATIENT FAC + DR EXP 17 | 14225 | 176.283632 | 16.958637 | 142.9262 | 209.641 |
| | | ERTEXP17 | TOTAL ER FACILITY + DR EXP 17 | 14225 | 143.952879 | 9.742985 | 124.7886 | 163.1172 |
| | | IPTEXP17 | TOT HOSP IP FACILITY + DR EXP 17 | 14225 | 646.419685 | 104.552299 | 440.7667 | 852.0726 |
| | | RXEXP17 | TOTAL RX-EXP 17 | 14225 | 390.05481 | 28.546296 | 333.9046 | 446.205 |
| | 35-64 | TOTEXP17 | TOTAL HEALTH CARE EXP 17 | 11604 | 6006.545902 | 193.515751 | 5625.903 | 6387.1888 |
| | | OBVEXP17 | TOTAL OFFICE-BASED EXP 17 | 11604 | 1554.145385 | 76.934449 | 1402.8164 | 1705.4744 |
| | | OPTEXP17 | TOTAL OUTPATIENT FAC + DR EXP 17 | 11604 | 615.379593 | 44.102435 | 528.6307 | 702.1285 |
| | | ERTEXP17 | TOTAL ER FACILITY + DR EXP 17 | 11604 | 229.032803 | 12.508893 | 204.428 | 253.6376 |
