Complete the exercises below for **Assignment #3**.

## Linear model with one numerical variable

Execute the following cell to load the [ISLR2](https://cran.rstudio.com/web/packages/ISLR2/index.html) and [Tidyverse](https://www.tidyverse.org/) packages.

In [None]:
library('tidyverse')
library('ISLR2')

The `ISLR2` package provides a dataset called `Boston` that we will use in this assignment.

In [None]:
Boston |> glimpse()

🚨 Use the **JupyterLab Contextual Help feature** to see the documentation for this data.

❓ **In the markdown cell below, add definitions for the `lstat` and `medv` variables in the `Boston` data.** 

- `lstat`: *WRITE DEFINITION HERE*
- `medv`: *WRITE DEFINITION HERE*

Let's plot `medv` (y-axis) versus `lstat` (x-axis).

In [None]:
p = ggplot(Boston, aes(x = lstat, y = medv)) + geom_point()

p

❓Does the relationship appear to be positive or negative? Does it look to be reasonably linear?

**Answer:**

### Let's build a model!

We first need to load the [Tidymodels](https://www.tidymodels.org/) package.

In [None]:
library('tidymodels')

First, we specify our model as a linear regression (`linear_reg()`) that uses the `lm` engine.

In [None]:
mod = linear_reg() |> set_engine("lm")

mod

Next we "fit" our model by supplying the `formula` and the data.

In [None]:
mod_fit = mod |> fit(medv ~ lstat, data = Boston)

mod_fit
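
For reference, the formula `medv ~ lstat` corresponds to the simple linear regression model sketched below. The symbols $\beta_0$ (intercept), $\beta_1$ (slope), and $\varepsilon_i$ (error term) are notation introduced here for illustration and do not appear elsewhere in this notebook.

$$
\text{medv}_i = \beta_0 + \beta_1 \, \text{lstat}_i + \varepsilon_i
$$

Fitting the model estimates $\beta_0$ and $\beta_1$ from the data. These estimates are the two numbers printed under "Coefficients" in the output above.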

### Get the regression table

In [None]:
# We can use the tidy function to get a table of our model information
tidy(mod_fit)
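
The table returned by `tidy` is an ordinary data frame, so the usual `dplyr` verbs work on it. The sketch below is only an illustration and is not part of the assignment: it fits a model to the built-in `mtcars` data, and the names `demo_fit` and `demo_table` are made up for this example.

```r
library('tidymodels')

# Illustrative fit on the built-in mtcars data (not the Boston data)
demo_fit = linear_reg() |>
    set_engine("lm") |>
    fit(mpg ~ wt, data = mtcars)

# One row per model term: the intercept and the slope for wt
demo_table = tidy(demo_fit)

# Keep only the slope row and a few columns of interest
demo_table |>
    filter(term == "wt") |>
    select(term, estimate, p.value)
```

The same pattern should apply to `tidy(mod_fit)`, with `lstat` in place of `wt`.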

❓Comparing the chart above with the regression table, does the **sign** of the `estimate` for the `lstat` term coefficient fit your expectations?

**Answer:**

### Making predictions and visualizing the model

We can use the `augment` function to "predict" `medv` for all the values in our original dataset. We will capture these predictions in a new data frame called `Boston2`. The predicted values are found in the `.pred` column.

In [None]:
Boston2 = augment(mod_fit, Boston)

glimpse(Boston2)
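
To see where the `.pred` values come from, the sketch below recomputes them by hand as intercept plus slope times the predictor. It is an illustration on the built-in `mtcars` data rather than on `Boston`, and the names `demo_fit`, `demo_aug`, `b0`, `b1`, and `by_hand` are made up for this example.

```r
library('tidymodels')

# Illustrative fit on the built-in mtcars data (not the Boston data)
demo_fit = linear_reg() |>
    set_engine("lm") |>
    fit(mpg ~ wt, data = mtcars)

# Pull the intercept and slope out of the regression table
coefs = tidy(demo_fit)
b0 = coefs$estimate[coefs$term == "(Intercept)"]
b1 = coefs$estimate[coefs$term == "wt"]

# Add the model predictions, then recompute them from the coefficients
demo_aug = augment(demo_fit, mtcars) |>
    mutate(by_hand = b0 + b1 * wt)

# Compare the two columns up to floating-point tolerance
all.equal(demo_aug$.pred, demo_aug$by_hand, check.attributes = FALSE)
```

If the two columns agree, every prediction lies on a single straight line, which is why `geom_line` is a natural way to draw the fitted model.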

Let's visualize our model.

In [None]:
p = ggplot(Boston2, aes(x = lstat)) +
    geom_point(aes(y = medv)) + 
    geom_line(aes(y = .pred), color = 'coral', linewidth = 1.5)

p

### Put your skills to practice independently!

In the cells below, build a model of `medv` with the `rm` variable as a predictor.

**Include the following:**
- Show a regression table of your model parameters.
- Visualize the model with `ggplot2`.

❓Does your model indicate a positive relationship between number of rooms and home value?

**Answer:**

## Linear model with one categorical variable

We will use the `Carseats` data from the `ISLR2` package for the following exercise.

In [None]:
Carseats |> glimpse()

Below is a plot of `Sales` versus `ShelveLoc`.

In [None]:
p = ggplot(Carseats, aes(x = ShelveLoc, y = Sales)) + 
    geom_point(position = position_jitter(width = 0.3, height = 0))

p

❓Does it look like a "Good" shelf location is associated with more car seat sales?

**Answer:**

In the cells below, use the `Carseats` data to build a model of `Sales` with the `ShelveLoc` variable as a predictor.

**Include the following:**
- Show a regression table of your model parameters.
- Visualize the model with `ggplot2`.

📊 *Here is some example code for plotting your model.*

```r
# The code below assumes your predictions column is called ".pred" and is in a
# data frame called "Carseats2"

ggplot(Carseats2, aes(x = ShelveLoc)) + 
    geom_point(aes(y = Sales), 
               na.rm = TRUE, position = position_jitter(height = 0, width = 0.2, seed = 42)) +
    geom_crossbar(aes(y = .pred, ymin = .pred, ymax = .pred), 
                  color = 'coral')
```
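
When a predictor is a factor, the regression table has one row for the intercept and one row for each level except one. The sketch below is an illustration on the built-in `PlantGrowth` data, not on `Carseats`. It prints the factor levels, the regression table, and the mean outcome within each level so that the three can be compared side by side. The names `demo_cat_fit`, `demo_cat_table`, and `demo_means` are made up for this example.

```r
library('tidymodels')

# Built-in data with a three-level factor called group (not the Carseats data)
levels(PlantGrowth$group)

demo_cat_fit = linear_reg() |>
    set_engine("lm") |>
    fit(weight ~ group, data = PlantGrowth)

# Regression table: compare the term names with the factor levels above
demo_cat_table = tidy(demo_cat_fit)
demo_cat_table

# Mean outcome within each level, for comparison with the estimates
demo_means = PlantGrowth |>
    group_by(group) |>
    summarize(mean_weight = mean(weight))
demo_means
```

Running the same three steps on your own `Carseats` model may help with the question below.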

❓Which `ShelveLoc` category does your intercept term represent?

**Answer:**