<div class="alert alert-block alert-danger">

# 2A: Data: It All Starts With Measurement (COMPLETE)

**Lesson assumes students have read up through page: 2.5**

<div class="alert alert-block alert-warning">

#### Summary of Notebook:

Students take a brief survey, an excerpt of the [Student Census](https://ww2.amstat.org/CensusAtSchool/index.cfm) survey, to collect some class data. Teachers can modify the survey, adding or removing questions to better suit their class. The resulting dataset is useful to reference throughout the course when demonstrating concepts, because it is personalized: each data point represents an individual student. There is also an optional activity to participate in a memory experiment.

#### Includes:

- Practice thinking about where data come from
- Distinguishing measurement error from mistakes
- Practice operationalizing ideas such as the *best* way to measure your finger
- Taking part in an experiment (memory experiment)
- Thinking about how variation in one variable (e.g., level of processing) can affect another variable (e.g., memory recall)

```r
# This code will load the R packages we will use
suppressPackageStartupMessages({
    library(coursekata)
})
```

## 1.0 - How do we get data?
We use data to try and understand variation in all kinds of things: climate change, poverty, politics, psychology, health, etc. But that means we need to measure all kinds of things that are hard to measure. How do we measure how smart, happy, or healthy someone is? How powerful, free, or secure a country is? Once we get that data, we want to use data science to figure out whether this stuff changes, improves, differs across people, or predicts some future outcome, and much more.

But it all starts with figuring out measurement -- how would we know how much of something exists? How would we know how that thing varies across people, animals, families, companies, countries?
 
Let’s start by measuring something easy: Yourself!  

We'll take some measurements and collect some data using a portion of the [Student Census](https://ww2.amstat.org/CensusAtSchool/index.cfm) survey to find out: How much do you and your classmates vary? How do you vary compared to other classes and past years?

## 2.0 - A Little Bit About You

Of course, we know it's hard to truly capture who you are as a person in just a few simple variables, and no one really likes being reduced to a "number" or a "category". Still, there are times when doing so can be quite useful and informative. Therein lies a challenge: What's the *best* way to measure something?

2.1 - Let's start by collecting some data with an excerpt from the Student Census [survey](https://docs.google.com/document/d/1lXXWdQQ8LmFNOtYXJr2mboXgh8-SxVJUoAO1jN7-Tjc/edit?usp=sharing). 


<div class="alert alert-block alert-warning">

**Note for Instructors**

Teachers can add/exchange questions from the [full version](https://docs.google.com/document/d/1ZfYeric-3WrdlQn0BgM-woXeLQhzSkw6U1itoWHBgJU/edit?usp=sharing) of the census, if needed.

2.2 - Think about your experience completing the survey. Did you find all of the questions in the survey easy to answer? Do you feel like you answered everything correctly? Is there a chance any of the information is slightly wrong, even though it is about yourself? Do you think you can get stuff wrong about yourself? Why or why not? Summarize your thoughts.

2.3 - Do you think there might have been a better way to ask any of the questions? Do you think that would have changed any of your answers?

For example, what's the ***best*** way to measure:
- someone's age?
- someone's favorite season?
- someone's height?
- the length of someone's index finger?

2.4 - Do you think any of your responses would be different if *someone else* measured you? Would another person measure your finger in the same way? Would they measure your height exactly the same? Could they measure which season is your favorite?

2.5 - Let's try it! Without showing them your own measurements, have a classmate measure your index finger length.

<div class="alert alert-block alert-warning">

**Note for Instructors**

If your classroom is remote, this can still be attempted during synchronous class time if students are able and willing to turn on their cameras. The student can hold up their finger near a ruler in front of the camera, and their partner can try their best to estimate the measurement (and likely experience an authentic case of measurement error).

2.6 - Did your partner come up with an identical measurement for your index finger? Why do you think that is?

2.7 - If you and another person measure the same thing but get different results, who is ultimately correct? Is one person's measurement a "mistake"?

## 3.0 - Measurement Error

3.1 - What are some things you could do if you wanted to make these measurements more similar? That is, how can you reduce the variability in your measurements?

3.2 - Do you think we can completely eliminate the variability in measurement? Is there a way to perfectly measure something? Why or why not?
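
To make the idea of variability in measurement concrete, here is a minimal R sketch. It assumes five classmates each measured the same index finger; the values in `finger_mm` are made up for illustration and are not real class data.

```r
# Hypothetical measurements (in mm) of the SAME index finger,
# each taken by a different classmate. These values are invented.
finger_mm <- c(65, 66, 64, 66, 65)

# Smallest and largest measurement
lowest_highest <- range(finger_mm)

# How far apart the two most extreme measurements are
spread_mm <- diff(lowest_highest)

lowest_highest
spread_mm
```

If everyone agreed on a procedure (for example, where on the hand to start measuring), we might expect `spread_mm` to shrink, though it would probably not reach zero.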

3.3 - **Whole Class Discussion**: Small differences in finger lengths may not seem to matter a lot. One way of measuring gave us a length of 65 mm and another gave us 66 mm. Not a big deal. But later we are going to analyze hate crimes from the FBI database. What counts as a hate crime? Who decides? How do we decide? How is the determination of “hate crime” kind of like the measurement error we see in finger lengths?

## 4.0 - Some Experimental Data

Aside from surveys, data can come from lots of sources, and for lots of different reasons. One reason might be to collect data for an experiment. Have you ever designed or participated in an experiment? What makes an experiment different from our survey or other types of data collection?

Let's add to our class data by participating in a short experiment. Your teacher will provide you with the instructions.

<div class="alert alert-block alert-warning">

**Note for Instructors**

*Feel free to skip this part, or to select from one of the suggested simple memory experiments below.*

#### Experiment Idea 1:

Does level of processing (deep vs shallow) affect memory?

Students review a word list and are randomly assigned to either count the number of vowels in each word (shallow processing) or rate how much they like the word from 1-5 (deep processing). After a distractor task, they are asked to recall the words (followed up with an optional word recognition task). Tally up the total number of words recalled and recognized and record which group they were in (Deep or Shallow).

*Materials:*
- [Levels of Processing Instructions](https://docs.google.com/document/d/1wsl2ail1vqSyGAWxfhqtUaD7LQPgpFadNeQhmI7YQO8/edit?usp=sharing)
- [Shallow Form](https://docs.google.com/document/d/1uRvXZL3l4vGVYImo11xSUGJnOfdNsdyhbpUNu6EoNHo/edit?usp=sharing)
- [Deep Form](https://docs.google.com/document/d/1ImdZezeb551Ddt7AEn3cSXxHl42pXUpbRq9MK4fYu-I/edit?usp=sharing)
- [Recognition Test Answers](https://docs.google.com/document/d/1xBiB1OPHcNyk8VSFtcuhzoL4e-ZXtaDIb_yTTgPo64c/edit?usp=sharing)
- [Recognition Test Form](https://docs.google.com/document/d/1xDo0U1SNyAL8bSfi8cZbxLraTCNNzJKTc1HNYxrJI58/edit?usp=sharing)
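
One possible way to handle the random assignment is sketched below in base R. The roster names and the seed value are placeholders (assumptions, not part of the lesson materials); swap in your own class list.

```r
# Hypothetical roster; replace with your own class list
students <- c("Student01", "Student02", "Student03", "Student04",
              "Student05", "Student06", "Student07", "Student08")

# Assumption: any fixed seed works; it only makes the shuffle reproducible
set.seed(2024)

# Build a balanced set of condition labels, then shuffle their order
conditions <- rep(c("Deep", "Shallow"), length.out = length(students))
assignment <- data.frame(
  student   = students,
  condition = sample(conditions)
)

# Check that the two groups are the same size
table(assignment$condition)
```

Shuffling a balanced vector of labels keeps the groups equal in size (or within one student of each other for an odd-sized class), which would not be guaranteed if each student's group were decided by a separate coin flip.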

#### Experiment Idea 2: 

Does music affect memory?

If Experiment Idea 1 does not meet your needs, an easy alternative is to have students study the Gettysburg Address (or some other passage) while wearing headphones. Randomly assign half the students to listen to music and the other half to listen to silence (while still wearing headphones, to keep things constant) as they study the passage for about 5 minutes. Insert a distractor task, then ask students to recall as much of the passage as possible within 2 minutes. Tally up the total number of words each student recalled and record which group they were in (Music or No Music).
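
Whichever experiment you run, the tallies could be recorded with one row per student. The sketch below shows one possible layout in base R; the column names `condition` and `recalled` and every value in the data frame are invented placeholders, not real results.

```r
# Placeholder data: one row per student, with made-up recall counts
memory <- data.frame(
  condition = c("Music", "Music", "Music",
                "No Music", "No Music", "No Music"),
  recalled  = c(10, 12, 9, 14, 11, 13)
)

# How many students ended up in each group
group_sizes <- table(memory$condition)

# Average number of words recalled within each group
group_means <- tapply(memory$recalled, memory$condition, mean)

group_sizes
group_means
```

Keeping the group label and the outcome in separate columns makes it straightforward to compare groups later in the course.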

## 5.0 - Explaining Variation

In our next lesson, we will compile all of our survey data and import it into R so we can take a closer look at how our class varies. While it will be interesting to find out things like what type of music our class prefers and which types of sports we like to play, what we would also like to do is use our data to explain *why* these things vary the way that they do.

5.1 - So what do you think? Why do you think some people prefer classical music? Or what are some reasons people might prefer playing tennis? Are any of these possibilities captured in our data (i.e., did we measure them)? If not, should we? Could we?

5.2 - If I showed you the following 6 random heights (in inches) that I picked from our class and mixed up, could you tell which ones were the heights of the people who liked summer, and which ones were the heights of the people who liked spring, or winter?

**73, 65, 58, 60, 70, 63**

5.3 - If I showed you those same random heights, could you tell which ones were likely the heights of male students, and which ones were likely the heights of female students?

**73, 65, 58, 60, 70, 63**

5.4 - Which one of those things, season preference or gender, do you think would be better at explaining variation in height? Why? What is different about those two variables?
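
As a small sketch, the six heights from the questions above can be typed into R to see how much they vary. The variable names here are our own choice; nothing below says which group any height belongs to.

```r
# The six heights (in inches) from questions 5.2 and 5.3
heights <- c(73, 65, 58, 60, 70, 63)

# Put them in order from shortest to tallest
sorted_heights <- sort(heights)

# Shortest and tallest, and the gap between them
height_range <- range(heights)
height_gap   <- diff(height_range)

sorted_heights
height_range
height_gap
```

A variable that helps explain variation in height would let us make better guesses about which of these sorted values belong together.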

5.5 - Thinking about our experimental data, what is it about our experiment that can help us explain variation? What is the thing we are trying to explain anyway? How can the experimental conditions help us explain why that varies?

<div class="alert alert-block alert-warning">

**Note for Instructors**

*Cut this question if you skipped the experiment.*
