Note: this course is code-agnostic, but will presume R. If you use any other
coding environment, make sure that it is fully reproducible (for instance, a
Jupyter notebook). The .gitignore file in the repository is specifically for
R. You may wish to modify this file if you are using Python, Julia, or any other
such language.
This individual extra credit opportunity is worth up to 1 absolute percentage point on your semester grade. The purpose of this extra credit assignment is to further explore reproducible research and survival analysis.
Your goal is to reproduce Tables 1 and 4 from the NEJM article on the DIG trial.
The data are available as dig.csv on the course Sakai page under Resources.
A documentation file is also provided, as is the NEJM paper. You must create your
own private GitHub repository; please invite the instructor at @Yue-J.
Reproducible code and tables are due Thursday, February 25. In addition, briefly comment on whether you were able to reproduce the results exactly and, if there were any discrepancies, where they occur.
You must submit a .pdf document to Gradescope that corresponds to an .Rmd file on a GitHub repository in order to receive credit for this extra credit opportunity.
- This is an individual extra credit assignment.
- Everything in your repository is for your eyes only, apart from the instructor and TAs.
- As always, you must cite any code you use as inspiration. A failure to cite is plagiarism.
By submitting an assignment, you pledge to uphold the Duke Community Standard:
- I will not lie, cheat, or steal in my academic endeavors;
- I will conduct myself honorably in all my endeavors; and
- I will act if the Standard is compromised.
You additionally agree not to use these data beyond Case 01, share the datasets, or share any results and/or reports written using these data.
These data were prepared to enable students to reproduce the analysis leading to the results of the NEJM paper. Some data not discussed in the NEJM article are included in the teaching data set (body mass index, serum creatinine, serum potassium, systolic and diastolic blood pressure, etc.). In order to create an anonymous dataset that protects patient confidentiality, most variables have been permuted over the set of patients within treatment group. Therefore, this dataset can reproduce the results of the NEJM paper; however, **it would be inappropriate to use this dataset for other research or publication purposes.** Multidimensional relationships not included in the NEJM article may not have been preserved during the permutation process.
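To see why permuting within treatment group keeps per-group summaries intact while breaking relationships between variables, the following is a minimal sketch in base R on simulated data. It is not the procedure actually used to prepare dig.csv: the data frame `toy`, its columns `trt`, `age`, and `sbp`, and the helper `permute_within` are all hypothetical, and the sketch assumes each variable is shuffled independently of the others.

```r
# Simulated toy data; NOT the DIG data. All names here are hypothetical.
set.seed(1)
n <- 200
toy <- data.frame(
  trt = rep(c(0, 1), each = n / 2),
  age = rnorm(n, mean = 63, sd = 10)
)
# Build a second variable that is correlated with age.
toy$sbp <- 100 + 0.5 * toy$age + rnorm(n, sd = 5)

# Shuffle the values of x among the rows that share the same group label.
permute_within <- function(x, group) {
  ave(x, group, FUN = function(v) v[sample.int(length(v))])
}

# Assumption: each variable is permuted independently of the others.
perm <- toy
perm$age <- permute_within(toy$age, toy$trt)
perm$sbp <- permute_within(toy$sbp, toy$trt)

# Per-group summaries of a single variable match before and after.
print(tapply(toy$age, toy$trt, mean))
print(tapply(perm$age, perm$trt, mean))

# The relationship between two variables is generally not preserved.
print(cor(toy$age, toy$sbp))
print(cor(perm$age, perm$sbp))
```

Single-variable summaries by treatment group survive this kind of shuffle, which is what a table of baseline characteristics by arm needs, while any analysis that depends on how two permuted variables move together would not be trustworthy.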
Grading will be based on whether the student has reproduced the tables and their comment on any potential discrepancies. Students will either earn the extra credit point or not; there is no partial credit.
Note: Submissions that do not include the code used to produce the Gradescope submission will automatically receive 0 credit!