-
Notifications
You must be signed in to change notification settings - Fork 0
/
assignment.qmd
164 lines (131 loc) · 7.67 KB
/
assignment.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
# Group project assignment {#sec-assignment}
To maximize how much you learn and how much you will retain, you as a
group will take what you learn in the course and apply it to create a
reproducible project. This project, based on a simple data analysis of a
dataset of your choice, will have a GitHub repository and an HTML
document as a report to demonstrate the reproducibility of the analysis.
The dataset **cannot be the same** as the one we used in class and must
be one of the ones listed below. The datasets we've selected are tidy
enough to start working on without too much wrangling.
During the last session of the course you will work on this assignment.
In the last \~20 minutes of this session, the lead instructor will
download your project from GitHub and re-generate your report to check
that it is reproducible.
## Specific tasks
You will be collaborating as a team using Git and GitHub to manage your
group assignment. We will set up the project with Git and GitHub for you
so you can quickly start collaborating together on the project. You will
be pushing and pulling a lot of content, so you will need to maintain
regular and open communication with your team members.
Your specific tasks are:
1. All team members need to **clone your team's repository to their own
computer**. The steps are found through
`File -> New Project -> Version Control -> Git` (partly learned in
@sec-remotes-setting-up).
2. **Select one of the open datasets** to use for your analysis and
report:
- [The Global Carbon Project's fossil CO2 emissions
dataset](https://zenodo.org/record/7215364). Click this
[link](https://zenodo.org/record/7215364/files/GCB2022v27_MtCO2_flat.csv?download=1)
to view and save the data directly (by right-clicking while in
the page and selecting "Save As").
- Data from a published study: [*Early-life disease exposure and
associations with adult survival, cause of death, and
reproductive success in preindustrial
humans*](https://zenodo.org/record/4993147). Click this
[link](https://zenodo.org/record/4993147/files/Hayward%20et%20al_Early%20disease%20and%20later%20fitness_DATA.csv?download=1)
to view and save the data directly (by right-clicking while in
the page and selecting "Save As").
- Data from a published study: [*Clinical and metabolic
characteristics among Mexican children with different types of
diabetes mellitus*](https://zenodo.org/record/5008784). Click
this
[link](https://zenodo.org/record/5008784/files/Database_Mexican_children_with_different_types_of_DM.csv?download=1)
to view and save the data directly (by right-clicking while in
the page and selecting "Save As").
3. Assign one team member to **download the dataset** and put it into
the `data-raw/` folder of your project.
- If there is any documentation on the dataset, e.g. variable
definitions, put those files in the `data-raw/` folder as well.
- This team member should then add and commit the new file(s) in
the project, so it gets saved to the Git history. Then push the
changes to the team's GitHub repository.
- Next, have all team members pull the new changes to their own
computers.
4. **Each team member creates a Quarto file** named
`report-YOURNAME.qmd` (replacing `YOURNAME` with your actual name)
in the `doc/` folder of your project (covered in
@sec-reproducible-documents). This means that each team member will
have their own document to work in without conflicting with the
other team members. **Note**, if there are several members with the
same name, make sure your `qmd` documents are unique (for instance,
by using your full name).
- For each team member's `.qmd` file, include a `setup` code chunk
at the top that lists the packages used. Include code to import
your data here.
- Create a few figures of the data. Discuss with your team what
figures to create and divide which figures to make between your
team members.
- Create at least one table. Discuss with your team what table to
make, if you want to create more than one, and assign who will
create it.
- When making both the figures and the tables, make sure to
incorporate some data wrangling code from `{dplyr}`.
- Use the "Git workflow" by adding to the staged area, commit,
push, and pull the changes you've made.
- Make sure to "render" your document to HTML often to ensure it
is always reproducible.
- **Note**: For now, *do not* add and commit the HTML output file.
5. Assign one team member as the "report coordinator" and have them
**merge all the individual `report-YOURNAME.qmd` into one
`report.qmd`** file on their computer (after having everyone push,
and the coordinator will pull). They do this by copying and pasting
the content of the reports into section headers (using `##`) in the
report. After they are done, add and commit the `report.qmd`.
Lastly, delete all the report files *except* the `report.qmd` and
add and commit these deleted files to the Git history. There should
only be the `report.qmd` file in the repository.
- Create section headers (e.g. `# Results`, `## Tables`,
`## Figures`, and `# Discussion`).
- As the report coordinator merges the report file, assign another
team member to update the `README.md` file with some details
about the project and about who was on the team. Save, add,
commit, and push these changes to the GitHub repository.
- Once there is one report, as a group discuss and write a few
sentences in the "Discussion" section of their report file on
some things you liked about doing the project with the tools you
learned and a few sentences on some challenges you experienced.
6. **Generate an HTML of the report** and commit it to Git, then push
it up to GitHub. Include all the updated code and files on GitHub
for the presentation.
These tasks may seem like a lot, with a lot of new terminology and tools
to use. But don't worry! We will be going over many of these topics and
you will have time to complete the project over the three days. Make use
of the website to remind you and use our
[cheatsheet](../includes/cheatsheet.pdf) as a reference.
At the end, the lead instructor will download each of the teams' Git
projects, render the documents to test their reproducibility, and show
them on the screen for each team to present on.
## Quick "checklist" for a good project
- Project used Git and is on GitHub.
- Included a README with some basic details.
- Used Quarto for making figures and tables.
- Added some basic challenges and general experiences in the
discussion section.
- Generated a final HTML file from a single `report.qmd` Quarto file.
## Expectations for the project
What we expect you to do for the group project:
- Use Git and GitHub throughout your work.
- Work collaboratively as a group and share responsibilities and
tasks.
- Use as much of what we covered in the course to practice what you
learned.
What we don't expect:
- Complicated analysis or coding. The simpler it is, the easier is to
for you to do the coding and understand what is going on. It also
helps us to see that you've practiced what you've learned.
- Clever or overly concise code. Clearly written and readable code is
*always* better than clever or concise code. Keep it simple and
understandable!
Essentially, the group project is a way to reinforce what you learned
during the course, but in a more relaxed and collaborative setting.