Medical Expenditure Panel Survey data

https://meps.ahrq.gov/mepsweb/

A quick guide:

cd into this code directory and run

Rscript download_data.R

You should see the files h181.csv and h192.csv in the code directory. Then, to clean the raw files and create the datasets, run

python main_clean_and_save_to_csv.py

Now, you should see 3 new files: meps_19_reg.csv, meps_20_reg.csv, and meps_21_reg.csv. These are the csv files that we used in our experiments.
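As a rough aid for the two steps above, the sketch below reports which of them still needs to run in a given folder. The file names and commands come from this guide; the helper names `missing` and `pending_steps` are hypothetical and are not part of the repository.

```python
from pathlib import Path

# File names as listed in this README; the helpers below are illustrative only.
RAW_FILES = ["h181.csv", "h192.csv"]
CLEAN_FILES = ["meps_19_reg.csv", "meps_20_reg.csv", "meps_21_reg.csv"]


def missing(directory, names):
    """Return the entries of `names` that are not files inside `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).is_file()]


def pending_steps(directory="."):
    """List the quick-guide commands that still need to run in `directory`."""
    if not missing(directory, CLEAN_FILES):
        return []  # all three cleaned datasets already exist
    steps = []
    if missing(directory, RAW_FILES):
        # At least one raw file is absent, so the download step comes first.
        steps.append(["Rscript", "download_data.R"])
    steps.append(["python", "main_clean_and_save_to_csv.py"])
    return steps


if __name__ == "__main__":
    for command in pending_steps():
        print("still to run:", " ".join(command))
```

Run from the code directory, it prints nothing once all three cleaned files are present. It only checks that the files exist, not that their contents are complete.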

The following sections provide more detailed explanation.

Note: the code and the following text are copied from IBM's AIF360 package.

The Medical Expenditure Panel Survey (MEPS) consists of large-scale surveys of families and individuals, medical providers, and employers. It collects data on the health services respondents use, the cost and frequency of those services, demographics, and more.

Please refer to https://github.com/IBM/AIF360 for more details.

Source / Data Set Description:

Data Use Agreement

As the user of the data, it is your responsibility to read and abide by any copyright/usage rules and restrictions stated on the MEPS web site before downloading the data.

Download instructions

To use the MEPS datasets, follow the directions below to download the data files and convert them into csv files.

Follow either set of instructions below, for R or for SPSS. Further instructions for SAS and Stata are available at the AHRQ MEPS Github repository.

  • Generating CSV files with R

    In the current folder, run the R script download_data.R. R can be downloaded from CRAN. If you are working on Mac OS X, the easiest way to get R command line support is to install it with Homebrew: brew install R.

    Rscript download_data.R

    Example output:

    Loading required package: foreign
    
    trying URL 'https://meps.ahrq.gov/mepsweb/data_files/pufs/h181ssp.zip'
    Content type 'application/zip' length 13303652 bytes (12.7 MB)
    ==================================================
    downloaded 12.7 MB
    
    Loading dataframe from file: h181.ssp
    Exporting dataframe to file: h181.csv
    
    trying URL 'https://meps.ahrq.gov/mepsweb/data_files/pufs/h192ssp.zip'
    Content type 'application/zip' length 15505898 bytes (14.8 MB)
    ==================================================
    downloaded 14.8 MB
    
    Loading dataframe from file: h192.ssp
    Exporting dataframe to file: h192.csv
    
  • Generating CSV files with SPSS

    The instructions below require the use of SPSS.

    1. 2015 Full Year Consolidated Data File

      • Download the Data File, ASCII format
      • Extract the file h181.dat from the downloaded zip archive
      • Convert the file to comma-delimited format, h181.csv, and save it in this folder.
        • To convert the .dat file into csv format, download one of the programming statements files, such as the SPSS Programming Statements file.
        • Edit this file to change the FILE HANDLE name to the complete path/name of the downloaded data file, execute the SPSS programming statements to load the data, and 'save as' a comma-delimited file called 'h181.csv' in this folder.
    2. 2016 Full Year Consolidated Data File

      • Download the Data File, ASCII format
      • Extract the file h192.dat from the downloaded zip archive
      • Convert the file to comma-delimited format, h192.csv, and save it in this folder.
        • To convert the .dat file into csv format, download one of the programming statements files, such as the SPSS Programming Statements file.
        • Edit this file to change the FILE HANDLE name to the complete path/name of the downloaded data file, execute the SPSS programming statements to load the data, and 'save as' a comma-delimited file called 'h192.csv' in this folder.
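Whichever route produced h181.csv and h192.csv, a quick sanity check before cleaning can catch an interrupted download or an incomplete export. The following is a minimal sketch using only the Python standard library. The helper name `summarize_csv` is hypothetical, and the sketch makes no assumption about which columns the files contain.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (number of columns, number of data rows) for a CSV file."""
    with Path(path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)  # first row is treated as the header
        if header is None:
            raise ValueError(f"{path} is empty")
        n_rows = sum(1 for _ in reader)  # count the remaining rows
    return len(header), n_rows


if __name__ == "__main__":
    # Raw file names as given in the instructions above.
    for name in ("h181.csv", "h192.csv"):
        if Path(name).is_file():
            n_cols, n_rows = summarize_csv(name)
            print(f"{name}: {n_cols} columns, {n_rows} rows")
        else:
            print(f"{name}: not found in the current folder")
```

If a file reports zero rows or only a handful of columns, rerunning the download or the SPSS export is probably the safest next step.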

Cleaning the Data

To clean the raw files and create the 3 MEPS datasets used in our paper, run

python main_clean_and_save_to_csv.py

which produces the files: 'meps_19_reg.csv', 'meps_20_reg.csv', and 'meps_21_reg.csv'.