- Read/annotate: Recipe #2. You can refer back to this document at any point during the lab activity.
- Review using Markdown syntax for formatting, including tables, numbering sections, and citations and references
- Practice interleaving code and prose in a Quarto document
- Learn to read, inspect, and write structured data using R functions and Quarto code blocks
- Practice describing data using prose and code in a Quarto document
- Create a Quarto document using the RStudio toolbar
- Provide the title: "Lab 02: Dive into datasets"
- Provide the author: <Your Name>
- Render the Quarto document (without changes)
- Click the 'Render' button on the RStudio toolbar
- Save the Quarto file with the name `lab-02.qmd`.
In the repository for this lab, you will find three data files corresponding to the data origin, data dictionary, and the data itself.
- `data_origin.csv`
- `data_dictionary.csv`
- `data.csv`
- Create the following sections in your `.qmd` document:
- About the data
- Inspect the data
- Subset the data
- Write the data
- In the section "About the data", use strategies from Recipe 2 to load the packages needed to read and inspect rectangular data, then read the data origin and data dictionary files. Then, write a paragraph describing the data. Include the following information:
- What is the name of the data source?
- Where did it come from?
- What is the sampling frame?
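As a starting point for this section, here is a minimal sketch of loading packages and reading a data origin file. It assumes `readr` and `dplyr` are the packages Recipe 2 uses; substitute whatever the recipe recommends. The stand-in file and its contents are hypothetical placeholders so the sketch runs anywhere. In your `.qmd`, the code would sit in an executable `{r}` code cell and read `data_origin.csv` from the lab repository instead.

```r
# Packages assumed for this sketch; follow Recipe 2 if it differs
library(readr)  # read_csv() and write_lines()
library(dplyr)  # glimpse()

# Hypothetical stand-in for "data_origin.csv" so the example is runnable.
# In the lab, skip this step and pass the real file path to read_csv().
origin_path <- tempfile(fileext = ".csv")
write_lines(
  c(
    "attribute,description",
    "Resource name,(hypothetical placeholder)",
    "Data source,(hypothetical placeholder)"
  ),
  origin_path
)

# Read the file into a data frame and look at it
data_origin <- read_csv(origin_path, show_col_types = FALSE)
data_origin
glimpse(data_origin)
```

The data dictionary file can be read the same way by pointing `read_csv()` at `data_dictionary.csv`.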
- In the section "Inspect the data", read and inspect the dataset. Write a paragraph describing the data. Include the following information:
- How many variables are included?
- What are the variable types?
- How many observations are included?
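The three questions above map onto a few inspection calls. The sketch below assumes `readr` and `dplyr`; only the `pos` column name comes from this lab, and the other columns and all values are hypothetical stand-ins for the contents of `data.csv`.

```r
library(readr)
library(dplyr)

# Hypothetical stand-in for "data.csv"; in the lab, read the real file
data_path <- tempfile(fileext = ".csv")
write_csv(
  data.frame(
    doc_id = c(1, 1, 2),
    token = c("The", "dog", "barks"),
    pos = c("DT", "NN", "VBZ")
  ),
  data_path
)
dataset <- read_csv(data_path, show_col_types = FALSE)

# Compact overview: variable names, types, and the first few values
glimpse(dataset)

# Values you can report in your prose
n_vars <- ncol(dataset)              # number of variables
n_obs <- nrow(dataset)               # number of observations
var_types <- sapply(dataset, class)  # type of each variable
var_types
```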
- In the section "Subset the data", use strategies from Recipe 2 to subset the data to some combination of columns and rows that you find relevant to extract from the original dataset. Write a paragraph describing the data you have subsetted. Include the following information:
- How does the subsetted data differ from the original data?
- What are the dimensions of the subsetted data?
- What are the variable types?
Note: You may find the Penn Treebank part-of-speech tagset useful for understanding the values of the 'pos' variable. You can find the tagset here.
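One possible way to subset is to filter rows and then select columns with `dplyr`. This is a sketch under assumptions: the data frame below is a hypothetical stand-in, with `pos` holding Penn-style tags, and keeping only nouns is just one illustrative choice of subset.

```r
library(dplyr)

# Hypothetical stand-in for the dataset read from "data.csv"
dataset <- data.frame(
  doc_id = c(1, 1, 2, 2),
  token = c("The", "dog", "A", "cat"),
  pos = c("DT", "NN", "DT", "NN")
)

# Keep only rows tagged as singular nouns, then keep two columns
nouns <- dataset |>
  filter(pos == "NN") |>
  select(doc_id, token)

# Dimensions (rows, columns) and variable types of the subset
dim(nouns)
glimpse(nouns)
```

Comparing `dim(dataset)` with `dim(nouns)` gives a direct answer to how the subset differs from the original.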
- In the section "Write the data", use strategies from Recipe 2 to write the subsetted data to a file. Describe where the file is located and what format it is in.
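Writing the subset might look like the sketch below, assuming `readr` and a CSV output format. The directory layout and the file name `data_subset.csv` are hypothetical, and the temporary directory is used only so the sketch runs anywhere; in the lab you would write to a folder inside your project.

```r
library(readr)

# Hypothetical stand-in for the subsetted data
nouns <- data.frame(
  doc_id = c(1, 2),
  token = c("dog", "cat")
)

# Hypothetical output location; replace tempdir() with your project folder
out_dir <- file.path(tempdir(), "data", "derived")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
out_file <- file.path(out_dir, "data_subset.csv")

# Write the subset as a comma-separated file and confirm it exists
write_csv(nouns, out_file)
file.exists(out_file)
```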
- Render the `.qmd` as a PDF.
- (optional) Explore adding a markdown table to your Quarto document and make a cross-reference in your summary prose. The `knitr` package provides a function, `kable()`, for creating markdown tables from rectangular data. You can read more about creating tables from R data frames on the Quarto website.
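A minimal sketch of a cross-referenceable table follows. The data frame and the label `tbl-variables` are hypothetical. The two `#|` lines act as cell options when the code sits in an executable `{r}` cell of a `.qmd` file; a label starting with `tbl-` lets you refer to the table in prose as `@tbl-variables`.

```r
#| label: tbl-variables
#| tbl-cap: "Variables in the dataset (hypothetical example)."

library(knitr)

# Hypothetical summary of variables to display as a table
var_summary <- data.frame(
  variable = c("doc_id", "token", "pos"),
  type = c("numeric", "character", "character")
)

# kable() turns the data frame into a markdown table
tbl <- kable(var_summary)
tbl
```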
- Add a section which describes your learning in this lab.
Some questions to consider:
- What did you learn?
- What did you find most/least challenging?
- What resources did you consult?
- Instructor? R or Quarto documentation? Websites (provide links)?
- What more would you like to know about reading, inspecting, and/or writing data in R and/or Quarto?
- Find potential resources you might consult to continue your learning. Provide links and a brief description of the resource.
- To prepare your lab report for submission you will need to render your Quarto document to PDF or Word.
- Download this file to your local computer.
- Submit your report as described by your instructor.
This work is licensed under a Creative Commons Attribution 4.0 International License.