This repository hosts code and data gathered for the study "2018-07-15: How well are the National Guideline Clearinghouse and the National Quality Measures Clearinghouse Archived?", posted on the Web Science and Digital Libraries Research Group blog.
The goal was to determine how much of www.guideline.gov and qualitymeasures.ahrq.gov was archived before the sites disappeared after July 16, 2018. The code was written over the course of 3 days and is not of great quality. There is no intention to clean it up.
Because I was concerned that the datasets containing the URIs used in that study might be too large for GitHub, they are hosted here: