NTID Reproducible Data Analysis Workshop (March 25-29 2019)
data
img
.gitignore
01_rstudio_setup.R
02_ggplot2_data_visualization.R
03_dplyr_data_manipulation.R
04_tidyr_data_tidying.R
05_exercise_solutions.R
2019-ntid-data-workshop.Rproj
99_data_cleaning.R
README.md
RStudio_Setup.md
presentation_slides.pdf

Launch Into The Tidyverse! Reproducible Data Analysis in R 🚀

March 25-29, 2019 at NTID, Rochester, NY

Objective

Scientists are increasingly required to make their data preparation, analysis, and statistics reproducible. ☝️ Reproducible research means anyone should be able to take your dataset and, using your analysis plan, arrive at exactly the same results as you did. That's good science! 🙌

However, in the data analysis stage of a science project, preparing the data takes up 90% of the time. 😱 In this workshop, you will learn how to do reproducible data preparation and analysis using RStudio and the tidyverse. With these tools, you'll do better science 😎, save a lot of time 😁, reduce errors 😅, and even make some pretty graphs. 😍

Schedule

Mon. March 25, 6:00-8:00 PM (2-Hour Workshop @ Rosica Conference Room)

  1. Introduction to reproducible data analysis and why scientists in every discipline need to engage in it
  2. RStudio & R projects
  3. Data visualization with ggplot2

Tue. March 26, 6:00-8:00 PM (2-Hour Workshop @ Rosica Conference Room)

  1. Reproducible research recap
  2. Manipulating and filtering data with dplyr
  3. Tidying and reshaping data with tidyr

Wed. March 27 - Fri. March 29: Small Group Data Sprints (Rosica 1140)

This is going to be fun! 🎉 With your lab or research group and some data, we'll meet for one-hour intensive sessions where we'll review your research questions, data, and requirements. Then you'll put your newly acquired reproducible data analysis skills to immediate use, and together we will do data preparation, analysis, visualization, or all of the above. Depending on how many groups there are, we might meet more than once. These sprints are especially helpful if you are looking to move your team away from SPSS and finally take control of your data analysis!

Frequently Asked Questions

  • ANOVAs make my head hurt. Do I need to know how to do stats?

    • Nope! We won't discuss statistical testing in the workshops, but the datasets you prepare using RStudio and the tidyverse will make your statistical testing easier to run and interpret. We can go over stats in the small group data sprints, though.
  • I don't know how to code! Is that okay?

    • Coding experience is not required! The best way to learn is to just dive in. If you want to get your feet wet with some basic programming (e.g., get into the mindset of a computer and how it thinks), there's no better place to start than Codecademy's (free!) Python course.
  • I already know how to do some stuff in R. Will I be bored?

    • Our code discussions will primarily focus on manipulating data in the tidyverse. If you've been using R but aren't familiar with select(), %>%, or ggplot(), you will definitely find this a life-change-R! 🙈
  • What do I need to bring to the workshop? Do I need to install software on my laptop?

    • Your laptop! If you don't have one you can use, let me know.
    • We'll be using RStudio.cloud - an instance of RStudio that runs entirely within your browser. Like Google Docs, but for R! No installation needed. You'll probably want to eventually have a copy of R and RStudio on your computer anyway for the data sprints. I'll upload (or link to) installation instructions and some test code.
  • What do I need for the small group data sprints?

    • Review with your lab or research group and decide what project you'd like to cover. Prior to the workshop, you'll submit to me your dataset and a description of which magic tricks 🔮 you need to do on it. More information and a calendar will be shared soon.
  • I'm not a scientist or working in a research lab. Should this interest me?

    • 👊👊. If you're looking into data-oriented careers (e.g., data analyst, data scientist, or data engineer), the skills you learn here are highly desired in those jobs (one example - me!).
  • What's the language of the workshop?

    • The workshop will be presented in ASL without English interpretation. All slides will be in English.
  • How do I join?

    • Unfortunately, the workshop is already full. However, there's some wiggle room here. I'm especially open to more deaf participants because, let's face it, how many deaf/ASL-centric data science workshops are out there? 😉 Email me and we'll discuss.

Instructor: Adam Stone, PhD

I'm a deaf data scientist at Convo. I first cut my teeth on data programming and statistics while working on my Ph.D. in Educational Neuroscience at Gallaudet University. I used MATLAB and R for my dissertation data analysis of infants' brain responses to visual linguistic and non-linguistic patterning. After that, I did a postdoc with Dr. Rain Bosworth at UCSD where I really dove into R 24/7, using it to analyze and report eye-tracking data of infants and adults watching sign language videos. At Convo, I still use R every day to do data analyses, and I've also branched out into SQL and building data stacks to make sure all our data is flowing into the right places and is easily retrieved, understood, and interpreted by different business teams. I live in Edinburgh, Scotland, where my husband and I enjoy the great view of a 1,000-year-old castle from our kitchen window.

Acknowledgments

Hosted by Dr. Matt Dye (Director) & Dr. Geo Kartheiser (Postdoctoral Researcher) from the deaf x lab. Funding provided by Academic Affairs at the National Technical Institute for the Deaf.
