---
format:
  revealjs:
    width: 1600
    height: 900
    slide-number: true
    incremental: true
    theme: ["slides-style.scss"]
---
# [Writing a Data Analysis]{.white} <br> [report]{.white} {background-image="figures/patrick-fore-typewriter.jpg" background-size="cover"}
<h2>[ISI-BUDS 2023]{.light-blue}</h2>
<h3>[Federica Zoe Ricci]{.light-green}</h3>
##
### Link to slides
On the website with the [ISI-BUDS Summer 2023 Schedule](https://isi-buds.github.io/program-2023/)
![](figures/link-to-slides.jpg){width="300"}
<br>
### GitHub
In the [isi-buds organization](https://github.com/isi-buds) on GitHub, find and clone your research-team repo\
`writing-workshop-team-X`
<br>
### Set up
Sit close to your research-team members
## About Me
::: columns
::: {.column width="50%"}
![](figures/federica-photo.jpg){width="536"}
:::
::: {.column width="50%"}
**Federica Zoe Ricci**
4th Year PhD student in Statistics at UC Irvine
[{{< fa link >}} Website](https://federicazoe.github.io)
[{{< fa file-lines >}} Publications](https://federicazoe.github.io/research.html)
<br>
### Interests
Network data, Bayesian nonparametrics, Stats edu
<br>
### Background
BS in Management for Arts, Culture and Communication (Bocconi University, Milan)
MS in Economics (Bocconi University, Milan)
:::
:::
## About you: DA reading experience
![](figures/survey-ever-read.jpg){width="536"}
## About you: DA structure familiarity
![](figures/survey-familiar-structure.jpg){width="536"}
## About you: DA writing experience
![](figures/survey-ever-written.jpg){width="536"}
## About you: what you may [like]{.medium-green} about writing
<br>
- ***making** the analyzed data to **look easy**.*
<br>
- *I liked the puzzle of structuring the report and figuring out the best way to **tell a story***
<br>
- *It's satisfying to be able to present research to others. There isn't much **point to research** without sharing it.*
## About you: what you may [dislike]{.maroon} about writing
<br>
- *Not sure what is important, and which one is not. **What to include or not***
<br>
- *Figuring out **where to start** and **formatting***
<br>
- *I **don't think I'm very good at writing**, and thus I can sometimes iterate too many times over my work and obsess over making it better*
## About this workshop
<br>
- **What we will learn**: writing a data analysis (DA) report
    - USRESP competition guidelines
    - writing papers in RStudio + references management
<br>
- **How we will learn**\
{{< fa circle-question >}} Asking ourselves questions (e.g. why we like/dislike a title)
{{< fa eye >}} Examples
{{< fa feather >}} Practice
{{< fa comments >}} Learn from each other
# DA Report: *Big* picture
## DA report [VS.]{.maroon} Creative Writing
. . .
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
How is a DA report different from creative writing?
:::
- There is a lot of **structure**
. . .
> *"It felt very straight forward \[...\] as in there seemed to be a set of common things people did in data analysis reports that you just needed to do"*
>
> (From one of your responses to *What did you like about writing the report?*)
- There are **bricks** to build with
    - They are determined by your research project, e.g. figures, tables, previous research findings
. . .
{{< fa smile >}} You don't really start from a blank page
. . .
{{< fa meh >}} Not a ton of room for creativity\
*"The writing in data analysis reports often feels very clinical and boring"*
## Types of DA report
USRESP identifies 2 main types of projects:
- [application]{.medium-green} focused
- [methodology]{.orange} focused (i.e. focused on the properties of a statistical method)
. . .
::: {.callout-tip icon="false"}
## [{{< fa feather >}} Practice]{.black}
Can you tell the type from the title of past USRESP submissions?
- *Performance of LDA And QDA On Non-Normally Distributed Predictors*
    - [methodology]{.orange}
- *Spatial Modeling Of Bird Populations Using Citizen Science Data*
    - [application]{.medium-green}
- *Behind The Smoke: An Extreme Value Analysis Of Air Pollution In Minnesota*
    - [application]{.medium-green}
- *An Evaluation Of Regularization Methods: When There Are More Predictors Than Observations*
    - [methodology]{.orange}
:::
## A good DA report
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
What makes for a good DA report?
:::
- It is clear and easy to read
- It tells an interesting story
- It makes a good statistical analysis and explains the results well
. . .
From *Assessment of the USRESP projects* on [causeweb.org/usproc/usresp](https://www.causeweb.org/usproc/usresp)
. . .
> Some **general criteria** that the judges may use include:
>
> 1. Overall [clarity]{.maroon} and [presentation]{.maroon}
> 2. [Originality, creativity]{.maroon}, and [significance]{.maroon} of the study
> 3. [Accuracy]{.maroon} of data analysis, conclusions, and discussion
. . .
> *... you should construct a paper that is [understandable to a reader with little knowledge of any applied domains]{.maroon} that relate to your paper.*
## A good DA report: [clarity and presentation]{.maroon}
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
- **Make notation concrete**
From *An Evaluation Of Regularization Methods: When There Are More Predictors Than Observations* (Kenny Chen, honorable mention at 2021 Fall USRESP)
![](figures/example-clarity-1.png){fig-align="center"}
:::
## A good DA report: [clarity and presentation]{.maroon}
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
- **Design visualizations to help digest complicated methods**
- **Write clear figure captions**
From *Storm Chasers: Synthesizing New England Weather Data On A Dashboard For Emergency Response Workers* (Irene Foster, Sunshine Schneider, Caitlin Timmons, Katelyn Diaz, winner at 2022 Fall USRESP)
![](figures/example-clarity-2.png){fig-align="center" width="700"}
:::
## A good DA report: [originality, creativity and significance]{.maroon}
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
- **Creativity**
From *Behind The Smoke: An Extreme Value Analysis Of Air Pollution In Minnesota* (Yicheng Shen, Jacob Flignor, Libby Nachreiner, & Karen Wang, winner at 2022 Spring USRESP)
![](figures/example-originality.png){fig-align="center" width="700"}
:::
## A good DA report: [originality, creativity and significance]{.maroon}
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
- **Significance**
From *Exploring Missingness and its Implications on Traffic Stop Data* (Amber Lee, winner at 2020 Fall USRESP)
![](figures/example-significance-1.png){fig-align="center"}
:::
## A good DA report: [accuracy]{.maroon}
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
- **Choosing a statistical model that accounts for specific aspects of the application considered (and motivating the choice)**
From *Psychiatric Comorbidity In Opioid Use Treatment Outcomes* (Linda Tang, winner at 2021 Fall USRESP)
![](figures/example-accuracy.png){fig-align="center"}
:::
# DA Report: *Fine* picture
## Two questions
::: columns
::: {.column width="50%"}
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
**Why** do we write a DA report?
:::
- To let **someone else** learn about:
    - an interesting problem (that they might not know)
    - ways to approach the problem (that they might not know!)
    - what results one gets when they approach the problem in these ways
    - what insights these results tell us (and how they relate to other insights that people could read elsewhere)
:::
::: {.column width="50%"}
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
What are the **common sections** of a DA report?
:::
- From [USRESP - Report Template](https://www.causeweb.org/usproc/report-template-USRESP):
    - Title
    - Abstract
    1. Introduction (aka Background)
    2. Methods
    3. Results
    4. Discussion/Conclusion
    - References
:::
:::
## ~~Two questions~~ Two ways of asking the same question
::: columns
::: {.column width="50%"}
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
**Why** do we write a DA report?
:::
::: nonincremental
- To let someone else learn about
    1. [an interesting problem (that they might not know)]{.orange}
    2. [ways to approach the problem (that they might not know!)]{.medium-green}
    3. [what results one gets when they approach the problem in these ways]{.maroon}
    4. [what insights these results tell us (and how they relate to other insights that people can find elsewhere)]{.dark-pink}
:::
:::
::: {.column width="50%"}
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
What are the **common sections** of a DA report?
:::
::: nonincremental
- From USRESP - Report Template:
    - Title
    - Abstract
    1. [Introduction (aka Background)]{.orange}
    2. [Methods]{.medium-green}
    3. [Results]{.maroon}
    4. [Discussion/Conclusion]{.dark-pink}
    - References
:::
:::
:::
## Introduction
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
Guess what makes for a good introduction according to USRESP judging criteria?
:::
- Does the background and significance have a **logical organization**? Does it move from the **general to the specific**?
- Has **sufficient background** been provided **to understand** the paper? How does this work **relate to other work** in the scientific literature?
- Has a reasonable explanation been given for why the research was done? **Why is the work important?** Why is it relevant?
- Does this section end with **statements about the hypothesis/goals** of the paper?
## Introduction
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
Individually, read the snippet in `01-practice-introduction.qmd` in the folder `practice`
[(5 min)]{.maroon}
:::
. . .
::: {.callout-tip icon="false"}
## [{{< fa feather >}} Practice]{.black}
Discuss with your team members:\
- How do you think the snippet did, with reference to the USRESP judging criteria?
- What could be improved?
[(5 min)]{.maroon}
:::
. . .
::: {.callout-important icon="false"}
## [{{< fa comments >}} Learn from each other]{.black}
Let's share our thoughts between all groups.
[(5 min)]{.maroon}
:::
## Methods
What should be included (according to USRESP template):
- **Data collection**\
Explain how the data was collected/experiment was conducted. Additionally, you should provide information on the individuals who participated to assess representativeness. Non-response rates and other relevant data collection details should be mentioned here if they are an issue. However, you should not discuss the impact of these issues here - save that for the limitations section.
- **Variable creation**\
Detail the variables in your analysis and how they are defined (if necessary). For example, if you created a combined (frequency times quantity) drinking variable you should describe how. If you are talking about gender no further explanation is really needed.
- **Analytic Methods**\
Explain the statistical procedures that will be used to analyze your data. E.g. Boxplots are used to illustrate differences in GPA across gender and class standing. Correlations are used to assess the impacts of gender and class standing on GPA.
## Methods
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
Guess what makes for a good method section according to USRESP judging criteria?
:::
- Could the study be **repeated** based on the information given here? Is the material organized into logical categories \[like the ones in the previous slide\]?
## Methods
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
Individually, read the snippet in `02-practice-methods.qmd` in the folder `practice`
[(5 min)]{.maroon}
:::
. . .
::: {.callout-tip icon="false"}
## [{{< fa feather >}} Practice]{.black}
Discuss with your team members:
::: nonincremental
- Parts of methods covered in the snippet
- What you understand well by reading the snippet
- What is unclear from the snippet
:::
[(5 min)]{.maroon}
:::
. . .
::: {.callout-important icon="false"}
## [{{< fa comments >}} Learn from each other]{.black}
Let's share our thoughts between all groups (if there is time).
[(5 min)]{.maroon}
:::
## Results
How the USRESP guidelines suggest framing the results section:
- typically, results sections **start with descriptive statistics**
- information presented must be relevant in helping to answer the research question(s) of interest
- typically, **inferential** (i.e. hypothesis tests) **statistics come next**.
- **Tables and figures** are useful in this section and should be labeled, embedded in the text, and **referenced appropriately**.
. . .
And here are the USRESP assessment questions:
- Is the content appropriate for a results section? Is there a clear description of the results?
- Are the results/data analyzed well? Given the data in each figure/table is the interpretation accurate and logical? Is the analysis of the data thorough (anything ignored?)
- Are the figures/tables appropriate for the data being discussed? Are the figure legends and titles clear and concise?
## Results
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
Let's read this snippet together
(From the Results section in *Psychiatric Comorbidity In Opioid Use Treatment Outcomes* by Linda Tang, winner at 2021 Fall USRESP)
:::
> Addressing our primary analysis, we observed that having psychiatric comorbidity is associated with higher incidence of treatment dropout and lower incidence of treatment completion. More specifically, holding all else constant, a client with psychiatric comorbidity is expected to have 1.05 times the subdistribution hazard of dropping out of the treatment and 0.91 times the subdistribution hazard of completing a treatment compared to a client without psychiatric comorbidity. Although the effect size of this association is modest, it still highlights that the current treatment programs need to better accommodate the special needs of this subgroup of clients.
. . .
::: {.callout-tip icon="false"}
## [{{< fa feather >}} Practice]{.black}
Can you guess:
- one thing that I like about the snippet
- two things that I think could be improved
:::
## Discussion
Here are the USRESP assessment criteria:
- Does the author clearly state whether the results answer the question (support or disprove the hypothesis)?
- Were specific data cited from the results to support each interpretation? Does the author clearly articulate the basis for supporting or rejecting each hypothesis?
- Does the author adequately relate the results of the current work to previous research?
. . .
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
Individually, read the snippet in `04-practice-discussion.qmd` in the folder `practice` [(2 min)]{.maroon}
:::
. . .
::: {.callout-tip icon="false"}
## [{{< fa feather >}} Practice]{.black}
Think about these questions: [(2 min)]{.maroon}
::: nonincremental
- Can you identify the strong elements in this snippet (according to USRESP assessment criteria)?
- Are there any weaker elements in this snippet (according to USRESP assessment criteria)? What would you suggest changing?
:::
:::
. . .
::: {.callout-important icon="false"}
## [{{< fa comments >}} Learn from each other]{.black}
Let's share our thoughts between all groups (if there is time).
[(5 min)]{.maroon}
:::
## Abstract
::: {.callout-warning icon="false"}
## [{{< fa circle-question >}} Question]{.black}
What is an abstract?
:::
. . .
> The abstract provides a brief summary of the entire paper (background, methods, results and conclusions). The suggested length is no more than 150 words. This allows you **approximately 1 sentence (and likely no more than two sentences) summarizing each of the following sections**. Typically, abstracts are the last thing you write.
**Assessment**: Are the main points of the paper described clearly and succinctly?
## Abstract
From *Investigation Of NCAA Basketball's Three Point Strategy Using Logistic Mixed Effects Regression Model* (Che Hoon Jeong, winner at 2022 Fall USRESP):
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
Several studies have presented the increase of three point shot attempts and its significance in winning games in the National Basketball Association (NBA). However, there are limited quantitative research on whether collegiate basketball reflects the three-point strategy of the NBA. This paper conducts a Seasonal Mann-Kendall test to present a statistically significant increase in three-point shot distance in NCAA Division 1 basketball, which reflects the trend of the NBA. The Logistic Mixed Effects Regression reveals that each three-point attempt decreases the probability of winning on average, whereas a made three-point shot increases the probability of winning the most out of the sixteen variables used in the model. Thus, designating three-point shots primarily for efficient three-point shooters may increase the chances of winning. Nevertheless, drafting sharpshooters and developing three-point shooting skills may benefit teams in the long-run, who may capitalize on the impact of successful three-point shots.
:::
## Abstract
From *Investigation Of NCAA Basketball's Three Point Strategy Using Logistic Mixed Effects Regression Model* (Che Hoon Jeong, winner at 2022 Fall USRESP):
::: {.callout-note icon="false"}
## [{{< fa eye >}} Example]{.black}
[Several studies have presented the increase of three point shot attempts and its significance in winning games in the National Basketball Association (NBA). However, there are limited quantitative research on whether collegiate basketball reflects the three-point strategy of the NBA.]{.orange} [This paper conducts a Seasonal Mann-Kendall test]{.medium-green} [to present a statistically significant increase in three-point shot distance in NCAA Division 1 basketball, which reflects the trend of the NBA. The Logistic Mixed Effects Regression reveals that each three-point attempt decreases the probability of winning on average, whereas a made three-point shot increases the probability of winning the most out of the sixteen variables used in the model.]{.maroon} [Thus, designating three-point shots primarily for efficient three-point shooters may increase the chances of winning. Nevertheless, drafting sharpshooters and developing three-point shooting skills may benefit teams in the long-run, who may capitalize on the impact of successful three-point shots.]{.dark-pink}
:::
:::{.nonincremental}
1. [Introduction (aka Background)]{.orange}
2. [Methods]{.medium-green}
3. [Results]{.maroon}
4. [Discussion/Conclusion]{.dark-pink}
:::
# DA Report: *Tools*
## Demonstration
In the `template` folder in your research-team workshop repo.
- Quarto for DA reports
    - child-documents
    - section planning
    - cross-references to Sections, Equations, Figures and Tables
- Zotero for references
    - how I look for literature
    - Zotero [+]{.maroon} Quarto
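A minimal sketch of what cross-references and citations might look like in a Quarto report. The labels (`sec-methods`, `eq-model`, `fig-gpa`), the file names and the citation key `smith2020` are made up for illustration; the files in the `template` folder may be organized differently.

```markdown
---
title: "My DA report"
bibliography: references.bib   # hypothetical file, e.g. exported from Zotero
---

## Methods {#sec-methods}

We fit the model in @eq-model, following @smith2020.

$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$$ {#eq-model}

![GPA by class standing](figures/gpa-boxplot.png){#fig-gpa}

@fig-gpa shows the distribution of GPA; details are in @sec-methods.
Earlier studies report similar patterns [@smith2020].
```

The label prefix (`sec-`, `eq-`, `fig-`, `tbl-`) is what lets Quarto number the item and turn `@label` into a link, so renumbering is handled for you when sections or figures move.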
# DA Report: *USRESP*
## USRESP
<br>
- **Next deadline:** [December 20th, 2023]{.maroon}
<br>
- **Everything** from USRESP shown today is on their website!
    - [General Information](https://www.causeweb.org/usproc/usresp)
    - [Template and evaluation criteria](https://www.causeweb.org/usproc/report-template-USRESP)
    - [Past winners and honorable mentions](https://www.causeweb.org/usproc/projects/winners)
## About you: submitting to USRESP
![](figures/survey-submit-usresp.jpg){width="536"}
# DA Report: *Other remarks*
## Feedback
- **Give others** feedback
    - Highlight strengths (so they know what is good and what they don't need to change)
    - Identify potential weak points or issues and, when possible, suggest ways to improve
    - We have been doing it all along!
- **Give yourself** feedback
    - \[Hint\] Once you have put your report together, set it aside for a few days if you can, then come back and review it.
- **Receive** feedback
    - Remember: [Everything can be improved]{.medium-green}.
    - Someone took time to read your work and tell you what they thought of it. You get the chance to see how your work reads in someone else's hands. Your goal is to understand from them what you could improve, e.g. what you could make clearer.
    - Sometimes we receive "bad" feedback (non-constructive, perhaps even offensive). Try to feel grateful anyway (see point above). Don't lose all your confidence, but also question yourself. Say thank you anyway and ask follow-up questions.
## How do you become a better writer?
<br>
- [Read]{.medium-green} good reports.
<br>
- [Develop a critical eye]{.medium-green}: ask yourself
    - why do you like what you are reading? what makes it good?
    - what could be improved? how would you improve it?
<br>
- Learn more tools, e.g. [tables]{.medium-green} in Quarto: check out the [kable and kableExtra vignette](https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html)
## Questions?
<br>
.. on writing DA reports
<br>
.. on submitting to USRESP
<br>
.. something else you're curious about
<br>
. . .
For the source code of the slides, on the `isi-buds` GitHub organization:\
[program-2023 --> writing-workshop --> slides.qmd ](https://github.com/isi-buds/program-2023/blob/main/writing-workshop/slides.qmd)
## Your Feedback for me {{< fa handshake >}}
::: {.columns .v-center-container}
::: {.column width="50%"}
### Link
<br>
<br>
[bit.ly/isi-buds-writers](https://bit.ly/isi-buds-writers)
:::
::: {.column width="50%"}
### QR code
![](figures/feedback-qr-code.jpeg){width="536"}
:::
:::
# Thank you! <br> *Reach out to me for any questions*