<div class="alert alert-block alert-danger">

# 8B: Does Laptop Use Distract Other Students? (COMPLETE)

**Use with textbook version 6.0+**

**Lesson assumes students have read up through page: 8.7**

</div>

<div class="alert alert-block alert-warning">

#### Summary of Notebook:

In this lesson, students will explore some data that researchers collected to determine whether using your laptop to multitask during class is distracting to you and/or other students. Students will generate models of their hypotheses, then evaluate the Data Generating Process (DGP) by considering whether the empty model could have produced the sample $b_1$ estimate. There is an emphasis on thinking about the empty model of the DGP as the model with only the mean (or $\beta_0$) plus error versus the explanatory model which includes the effect of the explanatory variable (where $\beta_1$ does not equal 0).

#### Highlighted Skills and Concepts:
- Fitting group models
- Visually estimating and interpreting $b_1$ estimates
- Thinking about and modeling the empty model of the DGP with `shuffle()`
- Thinking about the empty model of the DGP as $Y_i = \beta_0 + \epsilon_i$ or data = mean + error

***Lesson Handout:***

[This drawing handout](https://docs.google.com/document/d/1SJgiMoF-aUK7RaCjM6wI5zHmVRYvyymnDEZxeDEQMK0/edit#) should be printed and passed out to your students as part of a drawing activity they will be asked to do in section 3. It includes printouts of some of the visualizations in the lesson so students can draw on them to illustrate important concepts.

</div>

<div class="alert alert-block alert-success">

## Approximate time to complete Notebook: 55-70 Mins
    
</div>

In [None]:
# Load the CourseKata library
suppressPackageStartupMessages({
    library(coursekata)
    library(gridExtra)
})

<div class="alert alert-block alert-success">

### 1.0 - Approximate Time: 5-8 mins
    
</div>

## Laptop Distraction Study

Do you ever use your laptop to "multitask" during class (e.g., check social media, surf websites, shop)? Is it possible that your laptop use is affecting the other students around you? 

Some researchers decided to study this question with students trying to learn during a meteorology lecture. These students were randomly assigned to be seated right behind students using their laptop to multitask *or* right behind students who were taking paper notes. 

<img src="https://coursekata-course-assets.s3.us-west-1.amazonaws.com/UCLATALL/czi-stats-course/jnb_dJ1DDtFP-image.png">
 
At the end of the lecture, all participants were tested on their factual comprehension of the lecture (20 questions) and their ability to apply the knowledge they learned (20 questions).

#### Study information

Sana, F., Weston, T., & Cepeda, N. J. (2013) https://doi.org/10.1016/j.compedu.2012.10.003



## 1.0 - The Data & Hypothesis

**1.1:** What are your thoughts about viewing other students' multitasking? Do you think it would have an effect on learning? 


**1.2:** In the cell below, take a look at the data frame `laptops`. Each row in the data frame contains data from a student participant.

Which columns should we learn more about?

In [None]:
# Load the data frame
link <- "https://docs.google.com/spreadsheets/d/e/2PACX-1vQA7gzpSepI7u5zu7JaMpxdEDNpreOGuHQlPcPx_CsONMmqGSEoU2qePuOWnVh_kErQnbnJ_eCqIzzz/pub?gid=1809544437&single=true&output=csv"
laptops <- read.csv(link, header = TRUE)

# Take a look at the data frame

# COMPLETE
str(laptops)
head(laptops)
glimpse(laptops)

If you do have questions about the variables, take a look at these **variable descriptions**:

- `id` ID for the participant.
- `condition` Whether the student could see another student on a laptop multitasking (view) or not (no-view)
- `fact` The proportion of fact-based questions answered correctly by a student.
- `applied` The proportion of knowledge application questions answered correctly by a student.
- `total` The proportion of all questions answered correctly by a student.
- `age` The age of the student in years
- `gender` Whether the student was female (1) or male (2)
- `english_first` Whether English is the student’s first language (1) or not (2)
- `familiarity` How much familiarity the student had with the lecture content, self-rated from none (1), to somewhat (4), to very (7)
- `interesting` How interesting the student found the lecture content, self-rated from none (1), to somewhat (4), to very (7)
- `engaging` How engaging the student found the lecture content, self-rated from none (1), to somewhat (4), to very (7)
- `notes_preference` The method that the student generally prefers to take notes: pen (1), laptop (2), audio (3), or none (4)
- `distracted` How distracted the student felt by the confederates, self-rated from not applicable (0) or none (1), to somewhat (4), to very (7)
- `distraction_effect` How detrimental the student felt the distraction was to their learning, self-rated from not applicable (0) or none (1), to somewhat (4), to very (7)

In addition to these variables, the quantity and quality of the notes were rated by the experimenter most familiar with the material. The experimenter did not know which condition the notes were from (i.e., they were *blind* to participants' condition) while scoring. 

- `notes_quantity` The amount of notes the student took during the lecture, as rated by the most knowledgeable experimenter, who did not know which condition the notes came from; experimenter-rated from few (1), to average (4), to a lot (7)
- `notes_quality` The quality of the notes the student took during the lecture, rated in the same way as the notes quantity; experimenter-rated from poor (1), to average (4), to great (7)



**1.3:** If viewing other people's laptops indeed affects students, which of the variables above might be interesting outcomes to consider? 


<div class="alert alert-block alert-warning">

**Sample Response**

*Answers vary. Some possible responses include:*

Important variables might be:
- condition
- performance: fact, applied, total
- activity: notes_quantity
- metacog: distracted, distraction_effect

</div>

<div class="alert alert-block alert-success">

### 2.0 - Approximate Time: 8-10 mins
    
</div>

## 2.0 - Explore and Model Variation

**Hypothesis:** The researchers had predicted that `total` performance would differ based on which `condition` students were in.

**2.1:** Write a word equation and modify the jitter plot below to help us explore this hypothesis. What do you think of this hypothesis from the data that you see?

<div class="alert alert-block alert-warning">

**Sample Response**
 
**total = condition + other stuff**

</div>

In [None]:
#gf_jitter( ~ , data = laptops, width = .1) 


#COMPLETE 2.1
gf_jitter(total ~ condition, data = laptops, width = .1) 


#COMPLETE 2.2
condition_model <- lm(total ~ condition, data = laptops)
condition_model

gf_jitter(total ~ condition, data = laptops, width = .1) %>%
  gf_model(condition_model, color = "green4")

**2.2:** In the code cell *above*, find the best fitting model of your word equation and put it on the visualization above.

**2.3:** Write your best fitting model using GLM notation and interpret the parameter estimates. 

$$Y_i = ... X_i+ e_i$$

$$total_i = ... condition_i+ e_i$$

<div class="alert alert-block alert-warning">

**Sample Response**


GLM notation:
- $Y_i = .73 + -.18X_i + e_i$
- $total_i = .73 + -.18condition_i + e_i$

Interpretation:
- $b_0 = .73$: The average `total` of the no-view group (the mean line for group `no-view` in the plot).
- $b_1 = -.18$: The adjustment to $b_0$ to get the mean `total` of the view group (the downward distance between the mean line for the `no-view` group and the mean line for the `view` group).    

</div>
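To see why $b_0$ and $b_1$ can be read as a group mean and a difference between group means, here is a minimal sketch in base R. The `toy` data frame and its scores are made up for illustration (they are not the study's data), and the names `toy_model`, `toy_b0`, and `toy_b1` are hypothetical.

```r
# Made-up scores for illustration only (NOT the study's data)
toy <- data.frame(
  condition = rep(c("no-view", "view"), each = 5),
  total = c(0.80, 0.70, 0.75, 0.65, 0.85,   # no-view group
            0.55, 0.60, 0.50, 0.65, 0.45)   # view group
)

# Fit the two-group model; "no-view" comes first alphabetically,
# so it is treated as the reference group
toy_model <- lm(total ~ condition, data = toy)

# Pull out the two parameter estimates
toy_b0 <- coef(toy_model)[["(Intercept)"]]
toy_b1 <- coef(toy_model)[["conditionview"]]

# Compare them with the group means computed directly
group_means <- tapply(toy$total, toy$condition, mean)
toy_b0                                               # mean of the no-view group
group_means[["no-view"]]
toy_b1                                               # adjustment to reach the view group mean
group_means[["view"]] - group_means[["no-view"]]
```

With any two-group data set, the same comparison should hold: the intercept matches the reference group's mean, and the second estimate matches the difference between the two group means.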

<div class="alert alert-block alert-success">

### 3.0 - Approximate Time: 15-18 mins
    
</div>

## 3.0 - Evaluate Models

### Evaluate the Model Fit to the Sample Data

**3.1:** What are some ways you have learned to evaluate this model?

<div class="alert alert-block alert-warning">

**Sample Response**
 
- $b_1$: the size of the difference between the two groups; the average score is .18 (18 percentage points) lower for students in view of multi-taskers
- PRE: .37 reduction in error  
- SS Model: .300 SS Model compared to .800 SS Total (SS Model is 37% of SS Total)
- F: the variance in the predictions (between the groups) is 21 times the error variance (within the groups)
- Cohen's d: 1.5 standard deviations 

</div>
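If students ask where these numbers come from, the sketch below works through PRE, F, and a pooled-standard-deviation version of Cohen's d by hand in base R. The `toy` scores are made up for illustration (not the study's data), all object names are hypothetical, and the pooled version of d is assumed here to be comparable to what `cohensD()` reports.

```r
# Made-up scores for illustration only (NOT the study's data)
toy <- data.frame(
  condition = rep(c("no-view", "view"), each = 5),
  total = c(0.80, 0.70, 0.75, 0.65, 0.85,   # no-view group
            0.55, 0.60, 0.50, 0.65, 0.45)   # view group
)

# Fit the empty model and the condition model
toy_empty <- lm(total ~ NULL, data = toy)
toy_cond  <- lm(total ~ condition, data = toy)

# Sums of squares: leftover error under each model
ss_total <- sum(resid(toy_empty)^2)   # error around the grand mean
ss_error <- sum(resid(toy_cond)^2)    # error around the group means
ss_model <- ss_total - ss_error       # error reduced by the condition model

# PRE: proportion of the total error that the model removes
pre <- ss_model / ss_total

# F: variance explained per model parameter relative to error variance
df_model <- 1
df_error <- nrow(toy) - 2
f_ratio  <- (ss_model / df_model) / (ss_error / df_error)

# Cohen's d: difference in means expressed in pooled standard deviations
mean_diff <- abs(coef(toy_cond)[["conditionview"]])
pooled_sd <- sqrt(ss_error / df_error)
d <- mean_diff / pooled_sd

c(PRE = pre, F = f_ratio, d = d)
```

All three measures are built from the same two ingredients: how far apart the group means are, and how much the scores vary within the groups.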

In [None]:
# COMPLETE
b1(condition_model)
supernova(condition_model)
cohensD(total ~ condition, data = laptops)

### Evaluate the Models of the DGP

It's great that the best fitting model explains some of the variation in *this sample of data* (e.g., PRE = .37). But what we care about is whether viewing multi-taskers explains some of the variation in the DGP!

**Word Equations (and GLM Models) of the DGP**

Here's a word equation and model in GLM notation for the idea that `condition` has *something* to do with the variation in `total` in the DGP.

- **total = condition + other stuff**
- $total_i = \beta_0 + \beta_1condition_i + \epsilon_i$

**3.1:** Write a word equation and GLM notation for the idea that `condition` has *nothing* to do with the variation in `total` (it's just other stuff).

<div class="alert alert-block alert-warning">

**Sample Response**

- **total = mean + other stuff**
- $total_i = \beta_0 + \epsilon_i$
- to make it explicit that condition has 0 effect, here we have made $\beta_1=0$ --> $total_i = \beta_0 + (0)condition_i + \epsilon_i$


</div>
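As a quick illustration of the empty model, the sketch below fits it to a made-up `toy` data frame (not the study's data; all names are hypothetical). It shows that the single estimate is the grand mean and that the model makes the same prediction for every student, regardless of condition.

```r
# Made-up scores for illustration only (NOT the study's data)
toy <- data.frame(
  condition = rep(c("no-view", "view"), each = 5),
  total = c(0.80, 0.70, 0.75, 0.65, 0.85,   # no-view group
            0.55, 0.60, 0.50, 0.65, 0.45)   # view group
)

# The empty model has only one parameter estimate: b0
toy_empty <- lm(total ~ NULL, data = toy)
toy_empty_b0 <- coef(toy_empty)[["(Intercept)"]]

toy_empty_b0       # the estimate from the empty model
mean(toy$total)    # the grand mean of all scores

# Every student gets the same prediction, whatever their condition
empty_predictions <- predict(toy_empty)
unique(round(empty_predictions, 10))
```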

**3.2:** Which DGP is being mimicked when we use the `shuffle()` function? Implement it in the plot provided below. 

<div class="alert alert-block alert-warning">

**Sample Responses (all equivalent)**

- **total = mean + other stuff**
- the empty model
- the model where $\beta_1=0$

</div>
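To make concrete what shuffling does, here is a minimal sketch that uses base R's `sample()` as a stand-in for `shuffle()` (an assumption: both are treated here as a random reordering of the values). The `toy` scores are made up for illustration and are not the study's data.

```r
set.seed(1)  # only so the illustration is reproducible

# Made-up scores for illustration only (NOT the study's data)
toy <- data.frame(
  condition = rep(c("no-view", "view"), each = 5),
  total = c(0.80, 0.70, 0.75, 0.65, 0.85,   # no-view group
            0.55, 0.60, 0.50, 0.65, 0.45)   # view group
)

# Randomly reorder the scores while leaving the condition labels in place.
# This breaks any link between condition and total.
toy$shuffled_total <- sample(toy$total)

# The same scores are still there, so the grand mean is unchanged
identical(sort(toy$shuffled_total), sort(toy$total))
mean(toy$total)
mean(toy$shuffled_total)

# But the group means are now different from the original ones
tapply(toy$total, toy$condition, mean)
tapply(toy$shuffled_total, toy$condition, mean)
```

Because the scores are handed out to the two groups at random, any difference between the shuffled group means comes from randomness alone, which is what a DGP with $\beta_1=0$ would produce.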

In [None]:
# modify this code to mimic such a DGP
#gf_jitter(total ~ condition, data = laptops, width = .1, color = "navyblue") 


# COMPLETE VERSION - add shuffle() to the outcome variable
# what students should do
gf_jitter(shuffle(total) ~ condition, data = laptops, width = .1, color = "navyblue") 


In [None]:
# COMPLETE 
# what teacher might want to show after 3.3; side by side plots of original and shuffled plots
original_plot <- gf_jitter(total ~ condition, data = laptops, width = .1, size = 1) %>%
  gf_labs(title="original data") %>%
  gf_lims(y = c(.30, 1))

shuff_plot1 <- gf_jitter(shuffle(total) ~ condition, data = laptops, width = .1, color = "navyblue", size = 1) %>%
  gf_labs(title="shuffled data")%>%
  gf_lims(y = c(.30, 1))

shuff_plot2 <- gf_jitter(shuffle(total) ~ condition, data = laptops, width = .1, color = "navyblue", size = 1) %>%
  gf_labs(title="shuffled data")%>%
  gf_lims(y = c(.30, 1))

shuff_plot3 <- gf_jitter(shuffle(total) ~ condition, data = laptops, width = .1, color = "navyblue", size = 1) %>%
  gf_labs(title="shuffled data")%>%
  gf_lims(y = c(.30, 1))

grid.arrange(original_plot, shuff_plot1, shuff_plot2, shuff_plot3, ncol=2, nrow=2)


**3.3:** How is the data from the empty model of the DGP (aka shuffle) different from the original data? 



<div class="alert alert-block alert-warning">

**Sample Response**


- The "shift" down in the view group isn't as strongly present in any of the shuffled data. In the shuffled data, although the no-view and view groups aren't EXACTLY the same, one group isn't generally scoring higher than the other group.
- Because shuffling is random, some students will get groups that do look a little different (either no view higher than view or vice versa). Ask them, hey! How did you generate THAT data? They should say -- uhm, shuffling (through randomness). Point out that even randomness can lead to a difference between groups sometimes. But most of the time, groups of shuffled data don't look that different.

</div>

**3.4, Draw:** On the graph handout your instructor has printed for you, visually estimate and draw in the means for the original and shuffled data. We know the sample $b_1$ (the difference between the means) is -0.18. How do the shuffled $b_1$s compare to the original? 

<div class="alert alert-block alert-warning">

**Sample Drawing**

- To help students process the question, you might prompt: are the $b_1$s longer? smaller? closer to 0?
- Because $b_1$s can be negative or positive, it may be helpful to use this "length" language to help them focus on the absolute value of the $b_1$.

<img src="https://coursekata-course-assets.s3.us-west-1.amazonaws.com/UCLATALL/czi-stats-course/jnb_nxN6MYTT-image.png" width = 80%>

</div>

<div class="alert alert-block alert-success">

### 4.0 - Approximate Time: 12-15 mins
    
</div>

## 4.0 - Focus on the DGP where $\beta_1=0$

We saw in the graphs that most of the shuffled data have a $b_1$ close to 0. 

Here we want you to consider an analogy: DGPs generate data just like parents give birth to kids. The $\beta_1$ in the DGP is the parent and the $b_1$s in these samples are the kids. The kids ($b_1$s) tend to be similar to the parent ($\beta_1=0$).

**4.1:** Although the graphs were helpful for us to see that the $b_1$s are close to 0, we can skip straight to looking at the shuffled $b_1$s.

Run the code below a few times. Why does one stay the same and the other change?

In [None]:
sample_b1 <- b1(total ~ condition, data = laptops)
sample_b1

b1(shuffle(total) ~ condition, data = laptops)

<div class="alert alert-block alert-warning">

**Sample Response**

The first number is our sample $b_1$ and since our sample doesn't change, it stays the same.

The second number changes because the `shuffle()` function mimics "no effect of laptops" and people are randomly put into the `condition` groups. Each time we get a different $b_1$ (based on this random DGP).

    
</div>

**4.2:** Modify the code below to generate a bunch of $b_1$s (like 10 or 20) from the DGP where $\beta_1=0$. 

Recall that the sample $b_1$ in this laptops experiment was approximately -0.18. Where does the sample $b_1$ fall in relation to these shuffled $b_1$s?


In [None]:
b1(shuffle(total) ~ condition, data = laptops)

# COMPLETE
do(10) * b1(shuffle(total) ~ condition, data = laptops)

# students might find it easier to look at these b1s if they are arranged in order
bunch_of_b1s <- do(10) * b1(shuffle(total) ~ condition, data = laptops)
arrange(bunch_of_b1s, b1)

<div class="alert alert-block alert-warning">

**Sample Response**

- A lot of these $b_1$s are between -0.10 and +0.10. There are some negative and some positive ones.
- The sample $b_1$ (-0.18) is more negative than all of these $b_1$s from shuffling.
    
</div>
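For instructors who want to show the same idea without the CourseKata helpers, the sketch below generates a bunch of shuffled $b_1$s in base R and compares them with the sample $b_1$. It is only an illustration: the `toy` scores are made up (not the study's data), `diff_in_means()` is a hypothetical helper, and `sample()` and `replicate()` stand in for `shuffle()` and `do()`.

```r
set.seed(8)  # only so the illustration is reproducible

# Made-up scores for illustration only (NOT the study's data)
toy <- data.frame(
  condition = rep(c("no-view", "view"), each = 5),
  total = c(0.80, 0.70, 0.75, 0.65, 0.85,   # no-view group
            0.55, 0.60, 0.50, 0.65, 0.45)   # view group
)

# Hypothetical helper: b1 for a two-group model is the difference in group means
diff_in_means <- function(total, condition) {
  group_means <- tapply(total, condition, mean)
  group_means[["view"]] - group_means[["no-view"]]
}

# The b1 from the (toy) sample stays the same every time
sample_b1_toy <- diff_in_means(toy$total, toy$condition)

# Twenty b1s from shuffled data, i.e., from a DGP where beta_1 = 0
shuffled_b1s <- replicate(20, diff_in_means(sample(toy$total), toy$condition))

# Put the shuffled b1s in order to make them easier to scan
sort(round(shuffled_b1s, 2))

# How many shuffled b1s are at least as far from 0 as the sample b1?
sum(abs(shuffled_b1s) >= abs(sample_b1_toy))
```

If few or none of the shuffled $b_1$s are as far from 0 as the sample $b_1$, that is the pattern students are asked to notice in this question.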

**4.3:** So what do you think about the DGP where $\beta_1=0$? Could it have been the DGP that generated our sample $b_1$?

<div class="alert alert-block alert-warning">

**Sample Response**

- I haven't seen any $b_1$s that are *that negative* from this DGP.
- A more sophisticated answer is to acknowledge that it might be possible but it's really unlikely or rare to see from this DGP.
    
</div>

**4.4:** Going back to the researchers, what does this mean for their hypothesis? 

<div class="alert alert-block alert-warning">

**Sample Response**

It means that we don't think the empty model (**total = mean + other stuff**) generated our data. So maybe the condition model (**total = condition + other stuff**) is a better model.

This means that there really might be an effect of viewing multitasking on learning in the DGP. Students who saw multitasking got lower total scores than the no-view students and this decrease was more extreme than we would expect of two random groups.
    
</div>

<div class="alert alert-block alert-success">

### 5.0 - Approximate Time: 15-18 mins
    
</div>

## 5.0 - Exploring Other Outcome Variables

**5.1:** There are other outcome variables in this data frame that the researchers measured. Perhaps `condition` had an effect on other outcomes as well. Develop your own hypothesis and analyze the data. 


<div class="alert alert-block alert-warning">

**Note to Instructors**

Consider having students write/present these parts:
- Exploring Variation (explanation of your hypothesis, word equations, graphs)
- Modeling Variation (the model in GLM notation, a visualization of your model, interpreting the parameter estimates)
- Evaluating the Model (how good is this model? could the empty model of the DGP have produced this data?)
- Overall Conclusions (relate your analysis back to your hypothesis)

*Students may choose to pursue other hypotheses with `condition` as the explanatory variable and other variables as the outcome, or they may pursue other ideas (outside of `condition`) as well. The goal is to help them try the above process on their own and to practice using text and R code to report findings.* 
    
</div>