Commit

update spacing on week 3 R resources, double check Lab 4 before class
atheobold committed Apr 27, 2023
1 parent 145122d commit f6d47ef
Showing 3 changed files with 16 additions and 14 deletions.
4 changes: 2 additions & 2 deletions _freeze/labs/lab-4/execute-results/html.json
@@ -1,7 +1,7 @@
{
"hash": "ca76063e5e2b9809eb7ffac100237360",
"hash": "4957c7db2826ebbf22e81d4f6823c4fe",
"result": {
"markdown": "---\ntitle: \"Lab 4: Simple Linear Regression\"\nauthor: \"Your group's names here!\"\ndate: \"April 28, 2023\"\nformat: \n html:\n embed-resources: true\n standalone: true\neditor: visual\nexecute: \n eval: false\n echo: true\n warning: false\n message: false\n---\n\n\n## Old Packages\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\nlibrary(lterdatasampler)\n```\n:::\n\n\n## New Packages!\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(moderndive)\n```\n:::\n\n\n## Data for Today\n\nToday we'll be working with data on lake ice duration for two lakes surrounding Madison, Wisconsin. This dataset contains information on the number of days of ice (ice duration) on each lake for years between 1851 and 2019. These data are stored in the `ntl_icecover` dataset, which lives in the **lterdatsampler** package.\n\nAccording to the EPA, lake ice duration can be an indicator of climate change. This is because lake ice is dependent on several environmental factors, so changes in these factors will influence the formation of ice on top of lakes. As a result, the study and analysis of lake ice formation can inform scientists about how quickly the climate is changing, and are critical to minimizing disruptions to lake ecosystems.\n\n## Inspecting the Data\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 1 goes here!\n```\n:::\n\n\n**Question 1 -- How large is the `ntl_icecover` dataset? (i.e. How many rows and columns does it have?)**\n\n## Visualize a Simple Linear Regression\n\nLet's start with tools to visualize and summarize linear regression.\n\n### Tools\n\n1. Visualize the relationship between x & y -- `geom_point()`\n2. 
Visualize the linear regression line -- `geom_smooth()`\n\nWe will be investigating the relationship between the `ice_duration` of each lake and the `year`.\n\n### Step 1\n\n**Question 2 -- Make a scatterplot of the relationship between the `ice_duration` (response) and the `year` (explanatory).** \n\n*Be sure to make the axis labels look nice, including any necessary units!*\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 2 goes here!\n```\n:::\n\n\n**Question 3 -- Describe the relationship you see in the scatterplot. Be sure to address the four aspects we discussed in class: form, direction, strength, and unusual points.**\n\n### Step 2\n\nTo add a regression line on top of a scatterplot, you add (`+`) a `geom_smooth()` layer to your plot. However, if you add a \"plain\" `geom_smooth()` to the plot, it uses a wiggly line. You need to tell `geom_smooth()` what type of smoother line you want for it to use! We can get a straight line by including `method = \"lm\"` **inside** of `geom_smooth()`.\n\n**Question 4 -- Add a linear regression line to the scatterplot you made in Question 3.**\n\n*No code goes here, you need to modify your scatterplot from Question 3!*\n\n## Fit a Simple Linear Regression Model\n\nNext, we are going to summarize the relationship between `ice_duration` and `year` with a linear regression equation.\n\n### Tools\n\n1. Calculate the correlation between x & y -- `get_correlation()`\n2. Model the relationship between x & y -- `lm()`\n3. Explore coefficient estimates -- `get_regression_table()`\n\n### Step 1\n\n**Question 5 -- Calculate the correlation between these variables, using the `get_correlation()` function.**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 5 goes here!\n```\n:::\n\n\n### Step 2\n\nNext, we will \"fit\" a linear regression with the `lm()` function. Remember, the \"formula\" for `lm()` is `response_variable ~ explanatory_variable`. 
Also recall that you need to tell `lm()` where the data live using `data =`!\n\n**Question 6 -- Fit a linear regression modeling the relationship between between `ice_duration` and `year`. Save your linear regression into an object named `ice_lm` so you can use it later.**\n\n*To create an object, you need to use the assignment arrow (`<-`)!*\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 6 goes here!\n```\n:::\n\n\n### Step 3\n\nFinally, to get the regression equation, we need grab the coefficients out of the linear model object you made in Step 2. The `get_regression_table()` function is a handy tool to do just that!\n\n**Question 7 -- Use the `get_regression_table()` function to obtain the coefficient estimates for the `ice_lm` regression you fit in Question 6.**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 7 goes here!\n```\n:::\n\n\n**Question 8 -- Using the coefficient estimates above, write out the estimated regression equation.**\n*Your equation needs to be in the context of the variables, not in generic $x$ and $y$ statements!*\n\n\n**Question 9 -- Interpret the value of the slope coefficient.**\n*Your interpretation needs to be in the context of the variables, not in generic $x$ and $y$ statements!*\n\n\n**Question 10 -- What do you expect to happen to the duration of ice if the number of years is increased by 100?**\n\n## A preview of what's to come\n\nIn our analysis above, we only looked at the relationship between ice duration and year, not accounting for which lake the durations were for. That is another explanatory variable we could include in our regression model!\n\n**Question 11 -- Using the code you wrote for Question 2 (with the regression line added), add a `color` for the name of the lake (`lakeid`).**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 11 goes here!\n```\n:::\n",
"markdown": "---\ntitle: \"Lab 4: Simple Linear Regression\"\nauthor: \"Your group's names here!\"\ndate: \"April 27, 2023\"\nformat: \n html:\n embed-resources: true\n standalone: true\neditor: visual\nexecute: \n eval: false\n echo: true\n warning: false\n message: false\n---\n\n\n## Old Packages\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\nlibrary(lterdatasampler)\n```\n:::\n\n\n## New Package!\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(moderndive)\n```\n:::\n\n\n## Data for Today\n\nToday we'll be working with data on lake ice duration for two lakes surrounding Madison, Wisconsin. This dataset contains information on the number of days of ice (ice duration) on each lake for years between 1851 and 2019. These data are stored in the `ntl_icecover` dataset, which lives in the **lterdatsampler** package.\n\nAccording to the EPA, lake ice duration can be an indicator of climate change. This is because lake ice is dependent on several environmental factors, so changes in these factors will influence the formation of ice on top of lakes. As a result, the study and analysis of lake ice formation can inform scientists about how quickly the climate is changing, and are critical to minimizing disruptions to lake ecosystems.\n\n## Inspecting the Data\n\n**Question 1 -- How large is the `ntl_icecover` dataset? (i.e. How many rows and columns does it have?)**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 1 goes here!\n```\n:::\n\n\n## Visualize a Simple Linear Regression\n\nLet's start with tools to visualize and summarize linear regression.\n\n### Tools\n\n1. Visualize the relationship between x & y -- `geom_point()`\n2. 
Visualize the linear regression line -- `geom_smooth()`\n\nWe will be investigating the relationship between the `ice_duration` of each lake and the `year`.\n\n### Step 1\n\n**Question 2 -- Make a scatterplot of the relationship between the `ice_duration` (response) and the `year` (explanatory).**\n\n*Be sure to make the axis labels look nice, including any necessary units!*\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 2 goes here!\n```\n:::\n\n\n**Question 3 -- Describe the relationship you see in the scatterplot. Be sure to address the four aspects we discussed in class: form, direction, strength, and unusual points.**\n\n*Hint: You need to explicitly state __where__ the unusual observations are!*\n\n### Step 2\n\nTo add a regression line on top of a scatterplot, you add (`+`) a `geom_smooth()` layer to your plot. However, if you add a \"plain\" `geom_smooth()` to the plot, it uses a wiggly line. You need to tell `geom_smooth()` what type of smoother line you want for it to use! We can get a straight line by including `method = \"lm\"` **inside** of `geom_smooth()`.\n\n**Question 4 -- Add a linear regression line to the scatterplot you made in Question 3.**\n\n*No code goes here, you need to modify your scatterplot from Question 3!*\n\n## Fit a Simple Linear Regression Model\n\nNext, we are going to summarize the relationship between `ice_duration` and `year` with a linear regression equation.\n\n### Tools\n\n1. Calculate the correlation between x & y -- `get_correlation()`\n2. Model the relationship between x & y -- `lm()`\n3. Explore coefficient estimates -- `get_regression_table()`\n\n### Step 1\n\n**Question 5 -- Calculate the correlation between these variables, using the `get_correlation()` function.**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 5 goes here!\n```\n:::\n\n\n### Step 2\n\nNext, we will \"fit\" a linear regression with the `lm()` function. 
Remember, the \"formula\" for `lm()` is `response_variable ~ explanatory_variable`. Also recall that you need to tell `lm()` where the data live using `data =`!\n\n**Question 6 -- Fit a linear regression modeling the relationship between between `ice_duration` and `year`. Save your linear regression into an object named `ice_lm` (using the `<-` assignment arrow) so you can use it later.**\n\n*Remember what order you need to put the response and explanatory variables!*\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 6 goes here!\n```\n:::\n\n\n### Step 3\n\nFinally, to get the regression equation, we need grab the coefficients out of the linear model object you made in Step 2. The `get_regression_table()` function is a handy tool to do just that!\n\n**Question 7 -- Use the `get_regression_table()` function to obtain the coefficient estimates for the `ice_lm` regression you fit in Question 6.**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 7 goes here!\n```\n:::\n\n\n**Question 8 -- Using the coefficient estimates above, write out the estimated regression equation.** \n\n*Your equation needs to be in the context of the variables, not in generic* $x$ and $y$ statements!\n\n**Question 9 -- Interpret the value of the slope coefficient.** \n\n*Your interpretation needs to be in the context of the variables, not in generic* $x$ and $y$ statements!\n\n**Question 10 -- What do you expect to happen to the duration of ice if the number of years is increased by 100?**\n\n## A preview of what's to come\n\nIn our analysis above, we only looked at the relationship between ice duration and year, not accounting for which lake the duration was for. 
That is another explanatory variable we could include in our regression model!\n\n**Question 11 -- Using the code you wrote for Question 2 (with the regression line added), add a `color` for the name of the lake (`lakeid`).**\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Code to answer question 11 goes here!\n```\n:::\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
26 changes: 14 additions & 12 deletions labs/lab-4.qmd
@@ -1,7 +1,7 @@
---
title: "Lab 4: Simple Linear Regression"
author: "Your group's names here!"
-date: "April 28, 2023"
+date: "April 27, 2023"
format:
html:
embed-resources: true
@@ -21,7 +21,7 @@ library(tidyverse)
library(lterdatasampler)
```

-## New Packages!
+## New Package!

```{r}
library(moderndive)
@@ -35,14 +35,14 @@ According to the EPA, lake ice duration can be an indicator of climate change. T

## Inspecting the Data

+**Question 1 -- How large is the `ntl_icecover` dataset? (i.e. How many rows and columns does it have?)**

```{r dataset-info}
# Code to answer question 1 goes here!
```

-**Question 1 -- How large is the `ntl_icecover` dataset? (i.e. How many rows and columns does it have?)**

## Visualize a Simple Linear Regression

Let's start with tools to visualize and summarize linear regression.
@@ -56,7 +56,7 @@ We will be investigating the relationship between the `ice_duration` of each lak

### Step 1

-**Question 2 -- Make a scatterplot of the relationship between the `ice_duration` (response) and the `year` (explanatory).**
+**Question 2 -- Make a scatterplot of the relationship between the `ice_duration` (response) and the `year` (explanatory).**

*Be sure to make the axis labels look nice, including any necessary units!*

@@ -68,6 +68,8 @@ We will be investigating the relationship between the `ice_duration` of each lak

**Question 3 -- Describe the relationship you see in the scatterplot. Be sure to address the four aspects we discussed in class: form, direction, strength, and unusual points.**

+*Hint: You need to explicitly state __where__ the unusual observations are!*

### Step 2

To add a regression line on top of a scatterplot, you add (`+`) a `geom_smooth()` layer to your plot. However, if you add a "plain" `geom_smooth()` to the plot, it uses a wiggly line. You need to tell `geom_smooth()` what type of smoother line you want for it to use! We can get a straight line by including `method = "lm"` **inside** of `geom_smooth()`.
@@ -100,9 +102,9 @@ Next, we are going to summarize the relationship between `ice_duration` and `yea

Next, we will "fit" a linear regression with the `lm()` function. Remember, the "formula" for `lm()` is `response_variable ~ explanatory_variable`. Also recall that you need to tell `lm()` where the data live using `data =`!

-**Question 6 -- Fit a linear regression modeling the relationship between between `ice_duration` and `year`. Save your linear regression into an object named `ice_lm` so you can use it later.**
+**Question 6 -- Fit a linear regression modeling the relationship between between `ice_duration` and `year`. Save your linear regression into an object named `ice_lm` (using the `<-` assignment arrow) so you can use it later.**

-*To create an object, you need to use the assignment arrow (`<-`)!*
+*Remember what order you need to put the response and explanatory variables!*

```{r lm}
# Code to answer question 6 goes here!
@@ -122,19 +124,19 @@ Finally, to get the regression equation, we need grab the coefficients out of th
```

-**Question 8 -- Using the coefficient estimates above, write out the estimated regression equation.**
-*Your equation needs to be in the context of the variables, not in generic $x$ and $y$ statements!*
+**Question 8 -- Using the coefficient estimates above, write out the estimated regression equation.**

+*Your equation needs to be in the context of the variables, not in generic* $x$ and $y$ statements!

-**Question 9 -- Interpret the value of the slope coefficient.**
-*Your interpretation needs to be in the context of the variables, not in generic $x$ and $y$ statements!*
+**Question 9 -- Interpret the value of the slope coefficient.**

+*Your interpretation needs to be in the context of the variables, not in generic* $x$ and $y$ statements!

**Question 10 -- What do you expect to happen to the duration of ice if the number of years is increased by 100?**

## A preview of what's to come

-In our analysis above, we only looked at the relationship between ice duration and year, not accounting for which lake the durations were for. That is another explanatory variable we could include in our regression model!
+In our analysis above, we only looked at the relationship between ice duration and year, not accounting for which lake the duration was for. That is another explanatory variable we could include in our regression model!

**Question 11 -- Using the code you wrote for Question 2 (with the regression line added), add a `color` for the name of the lake (`lakeid`).**

Binary file modified resources/week3.docx
Binary file not shown.
