A repo for working with data from CMS's EHR Incentive Programs Data and Program Reports page. A tweet from @Cascadia and a subsequent chat through direct messages inspired me to do something with the data.
Screenshot last updated 9/12/2013 at 12pm EST
- Disclaimer, front and center on the live site: according to CMS's EHR Incentive Programs Data and Program Reports page, only hospital-level data on Medicare eligible hospitals (EH) and providers (EP) are available, as the HITECH Act "does not require CMS to post the names of eligible professionals, eligible hospitals and CAHs that have received Medicaid EHR Incentive Program payments."
- All files in this repo's data directory are from CMS's EHR Incentive Programs Data and Program Reports page. They are included only for convenience to fellow developers looking to get up and running with a copy of the data.
- Provider addresses are geocoded with the Data Science Toolkit (DSTK). The public DSTK host started returning 500 Internal Server Errors, so I brought up my own instance (m1.medium) on Amazon EC2. If you choose to do the same, edit the DSTK_HOST variable in lib/tasks/geocode.rake
- ProvidersPaidByEHRProgram_June2013 data files have been normalized by @geek_nurse and @skram to make them more suitable for database querying
- When geocoding, the address information from the CMS ProvidersPaidByEHRProgram is used, if available. If the provider has not received incentive payments or no address is available, the address from the Hospital General Information data set is used in the geocoding process
- The normalized EP spreadsheet has about 1,300 duplicate NPIs out of 190,000+. This is after the normalization effort.
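The address fallback described in the geocoding note above can be sketched roughly as follows. This is a minimal illustration, not the code in lib/tasks/geocode.rake: the method name geocoding_address is hypothetical, the "PROVIDER ..." keys are the CMS column names used elsewhere in this README, and the keys under "general" are assumed names for the Hospital General Information fields.

```ruby
# Hypothetical helper: choose which address to send to the geocoder.
# The "PROVIDER ..." keys match the CMS payments CSV columns; the keys
# under "general" are ASSUMED names for the Hospital General Information
# fields and may differ from what the rake tasks actually store.
def geocoding_address(doc)
  cms = [doc["PROVIDER ADDRESS"], doc["PROVIDER CITY"],
         doc["PROVIDER STATE"], doc["PROVIDER ZIP 5 CD"]]
  general = doc["general"] || {}
  fallback = [general["address"], general["city"],
              general["state"], general["zip_code"]]

  # Try the CMS address first, then the general-info address.
  [cms, fallback].each do |parts|
    # Require a street address before trusting this source.
    next if parts.first.to_s.strip.empty?
    return parts.map { |p| p.to_s.strip }.reject(&:empty?).join(", ")
  end
  nil # no usable address in either source
end
```

A document with neither address yields nil, which a caller could treat as "skip geocoding for this provider".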
EH: Providers Paid By EHR Program: September 2013 Eligible Hospitals
-
Create a directory for the raw data and later exports:
mkdir -p public/data/ProvidersPaidByEHRProgram_Sep2013_EH/geojson
-
Download data file:
curl http://www.cms.gov/Regulations-and-Guidance/Legislation/EHRIncentivePrograms/Downloads/EH_ProvidersPaidByEHRProgram_Sep2013_FINAL.zip -o public/data/ProvidersPaidByEHRProgram_Sep2013_EH/EH_ProvidersPaidByEHRProgram_Sep2013_FINAL.zip
-
Unzip data file:
unzip public/data/ProvidersPaidByEHRProgram_Sep2013_EH/EH_ProvidersPaidByEHRProgram_Sep2013_FINAL.zip -d public/data/ProvidersPaidByEHRProgram_Sep2013_EH/
-
Import CSV into MongoDB and ensure the fields are properly formatted.
bundle exec rake hospitals:ingest_latest_payments_csv
bundle exec rake hospitals:ensure_fields_are_properly_formatted
-
Bring in additional data from the Hospital General Information and HCAHPS (patient experience) data sets on Socrata:
bundle exec rake hospitals:ingest_general_info
bundle exec rake hospitals:ingest_hcahps
bundle exec rake hospitals:ingest_joint_commission_ids
bundle exec rake hospitals:ingest_hc_hais
bundle exec rake hospitals:ingest_hc_hacs
bundle exec rake hospitals:ingest_ahrq_m
bundle exec rake hospitals:ingest_ooc
bundle exec rake hospitals:ingest_cms_form_2552_10
-
Geocode provider addresses:
bundle exec rake geocode
-
Print out a nice little report about hospital counts with different types of data (geo, general info, hcahps):
bundle exec rake hospitals:simple_report
-
Export select information to CSV for safe keeping and offline analysis:
mongoexport --csv -d cms_incentives -c ProvidersPaidByEHRProgram_June2013_EH -o public/data/ProvidersPaidByEHRProgram_June2013_EH/ProvidersPaidByEHRProgram_June2013_EH-normalized-geocodedAndSelectedData.csv -f "PROVIDER NPI,PROVIDER CCN,PROVIDER - ORG NAME,PROVIDER STATE,PROVIDER CITY,PROVIDER ADDRESS,PROVIDER ZIP 5 CD,PROVIDER ZIP 4 CD,PROVIDER PHONE NUM,PROVIDER PHONE EXT,PROGRAM YEAR 2011,PROGRAM YEAR 2012,PROGRAM YEAR 2013,geo.provider,geo.updated_at,geo.data.types.0,geo.data.geometry.location.lat,geo.data.geometry.location.lng,general.hospital_type,general.hospital_owner,general.emergency_services,general.country_name,hcahps.survey_response_rate_percent,hcahps.number_of_completed_surveys,hcahps.percent_of_patients_who_reported_yes_they_would_definitely_recommend_the_hospital_,jc.org_id,hc_hais"
-
Create MongoDB indexes:
bundle exec rake mongodb:mongoid_create_indexes
-
If you intend to run the visualization in a production environment:
bundle exec ruby app.rb -p 4567 -e development
curl http://localhost:4567/db/cms_incentives/EH/all.geojson -o public/data/ProvidersPaidByEHRProgram_Sep2013_EH/geojson/all.geojson
rm public/static/*
bundle exec rake assetpack:build
git push heroku master
bundle exec rake mongodb:export_to_mongohq
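For readers curious what a file like all.geojson might contain, here is a rough sketch of turning stored hospital documents into a GeoJSON FeatureCollection. It is an illustration only and not the code behind the app's endpoint: hospital_feature and feature_collection are hypothetical names, the choice of properties is an assumption, and the field paths are taken from the mongoexport field list above.

```ruby
require "json"

# Hypothetical converter from a stored hospital document to a GeoJSON Feature.
# Returns nil when the document has not been geocoded.
def hospital_feature(doc)
  location = doc.dig("geo", "data", "geometry", "location")
  return nil unless location && location["lat"] && location["lng"]

  {
    "type" => "Feature",
    "geometry" => {
      "type" => "Point",
      # GeoJSON orders coordinates as [longitude, latitude].
      "coordinates" => [location["lng"], location["lat"]]
    },
    # Which properties to expose is an assumption made for this sketch.
    "properties" => {
      "name" => doc["PROVIDER - ORG NAME"],
      "npi" => doc["PROVIDER NPI"]
    }
  }
end

# Wrap all geocoded documents in a FeatureCollection; documents without
# coordinates are skipped rather than emitted with a null geometry.
def feature_collection(docs)
  features = docs.map { |doc| hospital_feature(doc) }.compact
  { "type" => "FeatureCollection", "features" => features }
end
```

Calling JSON.generate(feature_collection(docs)) on an array of documents gives a string that could be written to a flat file.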
EP: Providers Paid By EHR Program: June 2013 Eligible Providers
-
Create a directory for the raw data and later exports:
mkdir -p public/data/ProvidersPaidByEHRProgram_June2013_EP/
-
Download data file:
curl http://www.cms.gov/Regulations-and-Guidance/Legislation/EHRIncentivePrograms/Downloads/ProvidersPaidByEHRProgram_June2013_EP.zip -o public/data/ProvidersPaidByEHRProgram_June2013_EP/ProvidersPaidByEHRProgram_June2013_EP.zip
-
Unzip data file:
unzip public/data/ProvidersPaidByEHRProgram_June2013_EP/ProvidersPaidByEHRProgram_June2013_EP.zip -d public/data/ProvidersPaidByEHRProgram_June2013_EP/
-
Import CSV into MongoDB and ensure the fields are properly formatted. The note for step 4 of the EH section above applies to EPs as well.
mongoimport --type csv -d cms_incentives -c ProvidersPaidByEHRProgram_June2013_EP --headerline --file public/data/ProvidersPaidByEHRProgram_June2013_EP/ProvidersPaidByEHRProgram_June2013_EP-normalizedByBrianNorris.csv
bundle exec rake providers:ensure_fields_are_properly_formatted
-
Update for the latest CSV, which includes payment data:
mkdir -p public/data/ProvidersPaidByEHRProgram_Sep2013_EP/geojson
curl http://www.cms.gov/Regulations-and-Guidance/Legislation/EHRIncentivePrograms/Downloads/EP_ProvidersPaidByEHRProgram_Sep2013_FINAL.zip -o public/data/ProvidersPaidByEHRProgram_Sep2013_EP/ProvidersPaidByEHRProgram_Sep2013_EP.zip
unzip public/data/ProvidersPaidByEHRProgram_Sep2013_EP/ProvidersPaidByEHRProgram_Sep2013_EP.zip -d public/data/ProvidersPaidByEHRProgram_Sep2013_EP/
iconv -f ISO-8859-1 -t UTF-8 public/data/ProvidersPaidByEHRProgram_Sep2013_EP/EP_ProvidersPaidByEHRProgram_Sep2013_FINAL.csv > public/data/ProvidersPaidByEHRProgram_Sep2013_EP/EP_ProvidersPaidByEHRProgram_Sep2013_FINAL-utf8.csv
bundle exec rake providers:ingest_latest_payments_csv
-
If you are running in a production environment, export the geojson to flat files (instead of hitting the database) by running the following rake task:
bundle exec rake providers:output_provider_geojson_by_state
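Since the normalized EP spreadsheet is known to contain duplicate NPIs, it may be worth checking for them before treating NPI as a unique key. The sketch below is one possible way to do that and is not part of the repo's rake tasks: duplicate_npis is a hypothetical name, and "PROVIDER NPI" is assumed to be the column name in the EP data, as it is in the EH export.

```ruby
# Hypothetical check: return a hash of NPI => occurrence count for every
# NPI that appears more than once. `records` is any enumerable of
# hash-like rows (for example parsed CSV rows or MongoDB documents).
# The "PROVIDER NPI" key is an ASSUMPTION about the EP column name.
def duplicate_npis(records, key = "PROVIDER NPI")
  counts = Hash.new(0)
  records.each do |record|
    # Normalize to a trimmed string so 100 and "100 " count as the same NPI.
    npi = record[key].to_s.strip
    counts[npi] += 1 unless npi.empty?
  end
  counts.select { |_npi, count| count > 1 }
end
```

Rows read with Ruby's CSV library (with headers enabled) respond to the same record[key] lookup, so they can be passed in directly.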