Aneja-Lab-Public-MissingData

Research project examining the prevalence of missing data (values that cannot be ascertained from the medical record) and the associated survival outcomes for cancer patients. The manuscript is currently under submission. The project uses the National Cancer Database (NCDB) Participant Use Files (PUF).

To reproduce our analysis:

  • Convert the PUF files to .dta format per the NCDB instructions
  • Run process_missing.do to recode all missing and unknown values to the same sentinel values
  • Categorize all variables of interest in an Excel file, with the variable name in column A ("variable") and its category in column B ("category")
  • Run analysis.do to reproduce the analysis
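
For readers unfamiliar with the two middle steps, the idea can be sketched in Python as below. This is only an illustration and not the contents of process_missing.do or analysis.do: the sentinel value, the variable names, the unknown codes, and the category labels are all hypothetical placeholders. The real codes should come from the NCDB PUF data dictionary.

```python
from collections import defaultdict

# Hypothetical sentinel; the value actually used by process_missing.do is not stated here.
SENTINEL = -1

# Hypothetical per-variable codes that mean "unknown" (placeholders, not real NCDB codes).
UNKNOWN_CODES = {
    "var_a": {9},
    "var_b": {99, 999},
}


def recode_missing(record, unknown_codes=UNKNOWN_CODES, sentinel=SENTINEL):
    """Return a copy of record with missing (None) and unknown codes set to the sentinel."""
    out = {}
    for name, value in record.items():
        if value is None or value in unknown_codes.get(name, set()):
            out[name] = sentinel
        else:
            out[name] = value
    return out


def group_by_category(rows):
    """Group variables by category.

    rows is an iterable of (variable, category) pairs, mirroring
    columns A and B of the spreadsheet described above.
    """
    groups = defaultdict(list)
    for variable, category in rows:
        groups[category].append(variable)
    return dict(groups)


if __name__ == "__main__":
    print(recode_missing({"var_a": 9, "var_b": 5, "var_c": None}))
    print(group_by_category([("var_a", "demographic"), ("var_b", "tumor")]))
```

Mapping every missing or unknown code to one sentinel lets later steps count missingness per variable with a single comparison, and the variable-to-category table lets those counts be summarized by category.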