### Analyzing MEPS-HC Data with SAS® 9.4M6 
#### By Pradip K. Muhuri, PhD

## Exercise 3

### Objective

* Estimate mean out-of-pocket health care expenses for individuals aged 26-30 with high income who were uninsured for the whole year

### Data and Analysis
* Combine data from 2016 and 2017 MEPS Full-Year Consolidated Files 
* Run PROC FREQ and PROC MEANS for data checks
* Run PROC SURVEYMEANS for complex survey estimates


### Concatenate two full-year data files into one SAS data set
* KEEP only needed variables when loading the original SAS data sets
* RENAME year-specific variables (e.g., INSCOV16 and INSCOV17) to common names without the year suffix before concatenating the files
* Create new variables including the subpopulation variable

In [9]:
options nodate nonumber nosource;
ods html close;
LIBNAME CDATA 'C:\DATA';
DATA POOL;
    SET CDATA.H192 
       (in=a KEEP= DUPERSID INSCOV16 PERWT16F VARSTR VARPSU POVCAT16 AGELAST TOTSLF16
        RENAME=(INSCOV16=INSCOV PERWT16F=PERWT POVCAT16=POVCAT TOTSLF16=TOTSLF))
        
       CDATA.H201 
        (in=b KEEP= DUPERSID INSCOV17 PERWT17F VARSTR VARPSU POVCAT17 AGELAST TOTSLF17
        RENAME=(INSCOV17=INSCOV PERWT17F=PERWT POVCAT17=POVCAT TOTSLF17=TOTSLF));
                
     /* Create a YEAR Variable for checking data*/
     if a=1 then year=2016;
     else if b=1 then year=2017;

     POOLWT = PERWT/2 ;  /* Pooled survey weight: annual weight divided by the number of pooled years (2) */

     /*Create a dichotomous SUBPOP variable 
     (POPULATION WITH AGE=26-30, UNINSURED WHOLE YEAR, AND HIGH INCOME)
     */
     IF 26 LE AGELAST LE 30 AND POVCAT=5 AND INSCOV=3 THEN SUBPOP=1;
     ELSE SUBPOP=2;    
 run;
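
The division by 2 in `POOLWT` follows from pooling two annual files: each year's person weights sum to roughly that year's target population, so stacking both years unchanged would count that population about twice. A sketch of the arithmetic, where $w_i$ stands for the annual person weight (`PERWT`) and $Y$ for the number of pooled years (notation introduced here only for illustration):

```latex
\[
w_i^{\text{pool}} = \frac{w_i}{Y}, \qquad Y = 2,
\qquad\text{so}\qquad
\sum_i w_i^{\text{pool}}
  = \frac{1}{2}\Bigl(\sum_{i \in 2016} w_i + \sum_{i \in 2017} w_i\Bigr)
\]
```

With this scaling, weighted totals represent an average annual population over 2016-2017 rather than a two-year sum. The Sum of Weights that PROC SURVEYMEANS reports later in this exercise (323,960,798) is on this average-annual scale.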


### Code snippet for PROC FORMAT

In [10]:
options nocenter nodate nonumber nosource;
PROC FORMAT;
    VALUE POVCAT
        1 = '1 POOR/NEGATIVE'
        2 = '2 NEAR POOR'
        3 = '3 LOW INCOME'
        4 = '4 MIDDLE INCOME'
        5 = '5 HIGH INCOME';

    VALUE INSF
        1 = '1 ANY PRIVATE'
        2 = '2 PUBLIC ONLY'
        3 = '3 UNINSURED';

    VALUE SUBPOP (MAX=50)
        1 = 'AGE 26-30, UNINSURED WHOLE YEAR, AND HIGH INCOME'
        2 = 'OTHERS';
run;


#### Use PROC FREQ and PROC MEANS for data checks

In [11]:
TITLE "COMBINED MEPS DATA FROM 2016 and 2017 Consolidated Files";
PROC FREQ DATA=WORK.POOL;
    TABLES YEAR*SUBPOP*POVCAT*INSCOV /LIST MISSING nopercent ;
           FORMAT SUBPOP SUBPOP. POVCAT POVCAT. INSCOV INSF.;
RUN;
PROC MEANS DATA=POOL;
RUN;

| year | SUBPOP | POVCAT | INSCOV | Frequency | Cumulative Frequency |
|---|---|---|---|---:|---:|
| 2016 | AGE 26-30, UNINSURED WHOLE YEAR, AND HIGH INCOME | 5 HIGH INCOME | 3 UNINSURED | 73 | 73 |
| 2016 | OTHERS | 1 POOR/NEGATIVE | 1 ANY PRIVATE | 1112 | 1185 |
| 2016 | OTHERS | 1 POOR/NEGATIVE | 2 PUBLIC ONLY | 5337 | 6522 |
| 2016 | OTHERS | 1 POOR/NEGATIVE | 3 UNINSURED | 1087 | 7609 |
| 2016 | OTHERS | 2 NEAR POOR | 1 ANY PRIVATE | 551 | 8160 |
| 2016 | OTHERS | 2 NEAR POOR | 2 PUBLIC ONLY | 1284 | 9444 |
| 2016 | OTHERS | 2 NEAR POOR | 3 UNINSURED | 315 | 9759 |
| 2016 | OTHERS | 3 LOW INCOME | 1 ANY PRIVATE | 2155 | 11914 |
| 2016 | OTHERS | 3 LOW INCOME | 2 PUBLIC ONLY | 2609 | 14523 |
| 2016 | OTHERS | 3 LOW INCOME | 3 UNINSURED | 905 | 15428 |

| Variable | Label | N | Mean | Std Dev | Minimum | Maximum |
|---|---|---:|---:|---:|---:|---:|
| AGELAST | PERSON'S AGE LAST TIME ELIGIBLE | 66535 | 37.1551965 | 23.1197665 | 0 | 85.0000000 |
| POVCAT | FAMILY INC AS % OF POVERTY LINE - CATEGO | 66535 | 3.3853911 | 1.4828090 | 1.0000000 | 5.0000000 |
| INSCOV | HEALTH INSURANCE COVERAGE INDICATOR 2016 | 66535 | 1.5502668 | 0.6699385 | 1.0000000 | 3.0000000 |
| TOTSLF | TOTAL AMT PAID BY SELF/FAMILY 16 | 66535 | 504.3511084 | 1712.03 | 0 | 130816.00 |
| PERWT | FINAL PERSON WEIGHT, 2016 | 66535 | 9738.06 | 8326.14 | 0 | 104865.55 |
| VARSTR | VARIANCE ESTIMATION STRATUM - 2016 | 66535 | 1314.88 | 416.0162751 | 1001.00 | 2117.00 |
| VARPSU | VARIANCE ESTIMATION PSU - 2016 | 66535 | 1.6346284 | 0.6268726 | 1.0000000 | 3.0000000 |
| year | | 66535 | 2016.48 | 0.4995687 | 2016.00 | 2017.00 |
| POOLWT | | 66535 | 4869.03 | 4163.07 | 0 | 52432.78 |
| SUBPOP | | 66535 | 1.9981964 | 0.0424304 | 1.0000000 | 2.0000000 |


#### Code explanation for the next cell
* With no statistic keywords specified in the PROC SURVEYMEANS statement, the procedure computes the NOBS, MEAN, STDERR, and CLM statistics by default.

* ODS GRAPHICS OFF;  - suppresses the graphics 
* ODS EXCLUDE STATISTICS; - suppresses the Statistics table, which reports estimates for the overall population

* PROC SURVEYMEANS estimates mean out-of-pocket health care expenses for individuals aged 26-30 with high income who were uninsured for the whole year; the formatted value in parentheses on the DOMAIN statement limits the domain output to that subpopulation
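
To make the default behavior described above concrete, the following is a minimal sketch that names the four default statistic keywords explicitly. It assumes the `WORK.POOL` data set and the `SUBPOP.` format created earlier in this exercise, and it deliberately omits the level restriction on DOMAIN, so estimates for both SUBPOP levels would be displayed. It is an illustrative variant, not the call used in this exercise.

```sas
/* Sketch: request NOBS, MEAN, STDERR, and CLM explicitly   */
/* (the same statistics PROC SURVEYMEANS reports by default) */
ODS GRAPHICS OFF;
PROC SURVEYMEANS DATA=WORK.POOL NOBS MEAN STDERR CLM;
    VAR     TOTSLF;    /* out-of-pocket expenses                 */
    STRATUM VARSTR;    /* variance estimation stratum            */
    CLUSTER VARPSU;    /* variance estimation PSU                */
    WEIGHT  POOLWT;    /* pooled person weight                   */
    DOMAIN  SUBPOP;    /* no level restriction: both levels shown */
    FORMAT  SUBPOP SUBPOP.;
RUN;
```

Listing the keywords explicitly makes the intended output visible to a reader of the code. Adding or dropping a keyword changes which columns are reported.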


In [12]:
TITLE2 'WEIGHTED ESTIMATE FOR OUT-OF-POCKET EXPENSES FOR PERSONS AGES 26-30, UNINSURED WHOLE YEAR, AND HIGH INCOME';
ODS GRAPHICS OFF;
ODS EXCLUDE STATISTICS; 
PROC SURVEYMEANS DATA=WORK.POOL; 
    VAR  TOTSLF;
    STRATUM VARSTR ;
    CLUSTER VARPSU ;
    WEIGHT  POOLWT;
    DOMAIN  SUBPOP("AGE 26-30, UNINSURED WHOLE YEAR, AND HIGH INCOME");
    FORMAT SUBPOP SUBPOP.;
RUN;

| Data Summary | |
|---|---:|
| Number of Strata | 282 |
| Number of Clusters | 625 |
| Number of Observations | 66535 |
| Number of Observations Used | 63975 |
| Number of Obs with Nonpositive Weights | 2560 |
| Sum of Weights | 323960798 |

Statistics for SUBPOP Domains

| SUBPOP | Variable | Label | N | Mean | Std Error of Mean | 95% CL for Mean (Lower) | 95% CL for Mean (Upper) |
|---|---|---|---:|---:|---:|---:|---:|
| AGE 26-30, UNINSURED WHOLE YEAR, AND HIGH INCOME | TOTSLF | TOTAL AMT PAID BY SELF/FAMILY 16 | 98 | 300.038263 | 93.384198 | 116.360484 | 483.716042 |
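
As a rough check on the domain estimate above, the confidence limits can be reproduced from the mean and its standard error. This sketch assumes the degrees of freedom equal the number of clusters minus the number of strata (625 - 282 = 343), the usual convention for Taylor series variance estimation, and uses the approximate critical value $t_{0.975,\,343} \approx 1.9669$:

```latex
\[
\bar{y} \pm t_{0.975,\,343}\,\mathrm{SE}(\bar{y})
  = 300.038 \pm 1.9669 \times 93.384
  \approx 300.038 \pm 183.677
  \approx (116.36,\ 483.72)
\]
```

The interval is wide relative to the mean because only 98 sample persons contribute to this domain estimate.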
