title: NRES 898 Week 13
layout: page
root: .
venue: University of Nebraska-Lincoln

Week 13: Data entry/organization in spreadsheets

See the bottom of this page for this week's challenge.

Preparation

This week's videos and exercises are based on educational materials from the DataONE project. We will do a little bit of R, so create a new project in RStudio for week 13. The data files needed for the challenge can be downloaded here as a zip archive. Extract the files into the data subdirectory of your RStudio project directory. These are Microsoft Excel files, so they cannot (easily) be read directly into R. Part of this week's challenge is getting the data into R so you can make some plots!
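One possible route, sketched here under assumptions, is to save a sheet from Excel as a CSV file and read that with `read.csv()`. The file name and columns below are made up so the example runs on its own. In your project the path would point into your data subdirectory instead.

```r
# Stand-in for a sheet saved from Excel as CSV.
# The file name and the columns are hypothetical, not the course data.
csv_path <- file.path(tempdir(), "example_sheet.csv")
writeLines(
  c("site,date,count",
    "A,2016-03-01,12",
    "B,2016-03-02,7"),
  csv_path
)

# Read the CSV; keep text columns as character rather than factor
dat <- read.csv(csv_path, stringsAsFactors = FALSE)

# Check that each column came in with the type you expected
str(dat)

# Dates arrive as text, so convert them explicitly
dat$date <- as.Date(dat$date)
```

Checking `str()` right after import is a cheap way to catch a numeric column that was read as text because of a stray note or unit typed into a cell.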

Introduction to data management: why worry?

Tidy data and data validation

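There are good and bad ways to arrange your data in a spreadsheet. Hadley Wickham (author of dplyr and ggplot2) calls the good arrangement "tidy data": each variable gets its own column and each observation gets its own row.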
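As a small illustration of the idea, here is a made-up table stored "wide" (one column per year) and the same values rearranged so each row is a single observation. The data and column names are invented for this sketch, and base R's `reshape()` is just one of several ways to do it.

```r
# Hypothetical "wide" layout: the year is hidden in the column names
wide <- data.frame(
  site       = c("A", "B"),
  count_2014 = c(12, 7),
  count_2015 = c(15, 9)
)

# Rearranged layout: one row per site-year observation,
# with year and count as ordinary columns
tidy <- reshape(
  wide,
  direction = "long",
  varying   = c("count_2014", "count_2015"),
  v.names   = "count",
  timevar   = "year",
  times     = c(2014, 2015),
  idvar     = "site"
)
rownames(tidy) <- NULL
tidy
```

With year as its own column, sorting, subsetting, and plotting by year no longer require picking apart column names.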

Quality Assurance / Quality Control for data

Assignment

This week's assignment includes some plain text writing. Please write this in a text editor, then copy and paste it into the Bb submission area. You have to click "Write Submission" next to "Text Submission"; it is directly above the attach submission button. If you enter your assignment into the comments box, it is difficult for me to extract.

You will also attach a commented R script. The file should be saved with a name like yourlastname_week13.R, and uploaded as an attachment. At the top of the file write a comment (start the line with #) with the week number and your name on one line. If none of that makes sense, go watch the start of the video for subsetting vectors.

  • Open the 3 Excel files downloaded above and inspect them. Based on what you have learned so far about data management, what are some problems in the way the data are currently organized, and how could you fix them?
  • Examine each of the files using the quality assurance strategies described above. Describe all the problems you identify in the text submission.
  • Choose one of the files, import it into R, and repeat the quality assurance steps using R code (e.g. looking at head() or tail() of data sorted by different columns, plotting). Document your actions in an R script that you attach to this week's assignment.
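The kind of checks named in the last bullet could look something like the sketch below. The data frame is invented, with a few problems planted on purpose, so treat it as a pattern to adapt rather than the expected answer.

```r
# Made-up data with deliberate problems (not from the course files)
obs <- data.frame(
  site   = c("A", "A", "B", "b", "B", "C"),
  mass_g = c(21.5, 19.8, 210, 20.4, NA, 22.1),
  stringsAsFactors = FALSE
)

# Sort by a column, then look at both ends for suspicious values
sorted <- obs[order(obs$mass_g), ]
head(sorted, 3)
tail(sorted, 3)  # order() puts NA last by default

# Count missing values in each column
missing_counts <- colSums(is.na(obs))

# Tabulate a text column to reveal inconsistent labels ("b" vs "B")
site_counts <- table(obs$site)

# Values that a boxplot would draw as outlying points
flagged <- boxplot.stats(obs$mass_g)$out
```

In an interactive session, `plot(obs$mass_g)` or `boxplot(obs$mass_g)` shows the same outlier visually. Repeating the sort for each column is worthwhile because a typo usually ends up at one extreme of the sorted values.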

I won't be able to run this week's script because the steps you took in Excel to make the data readable in R will differ from what I might have done. But I still want to see what you tried.

This challenge is due on Friday of Week 13 (April 8) at 5 pm. Late assignments will receive a score of zero unless prior approval is granted.