Just run this code block to set things up:

In [None]:
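# Keep inline plots small; print plain (non-scientific) numbers with 4 significant digits
options(repr.plot.width=4, repr.plot.height=4)
options("scipen"=100, "digits"=4)
# readr is installed and loaded here, although the data is read with base R's read.csv below
if(!require("readr")) install.packages("readr")
library("readr")
url<- "https://docs.google.com/spreadsheets/d/e/2PACX-1vT_oNaK_QKTtFQYdh7Pl17_prektSRuDVwRD71Vo8daBd0biyeG-Oiic4dMN_EL--voDWHAc5MmXNYH/pub?gid=0&single=true&output=csv"
df<-read.csv(url)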

After running the cell above, the data frame `df` holds the Sales, Radio
and Newspaper ad data.

For convenience, let's define three variables to hold the columns:

In [None]:
Radio<-df$Radio
Newspaper<-df$Newspaper
Sales<-df$Sales

str(df)

Let's look at the correlation coefficient R for Newspaper and Sales:

In [None]:
R = cor(Newspaper, Sales)
R
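
As a rough illustration of what `cor` computes, the sketch below rebuilds Pearson's R from the covariance and the two standard deviations and compares it with the built-in result. The vectors `x` and `y` are made-up values chosen for illustration; they are not the notebook's data.

```r
# Made-up illustrative values, NOT the notebook's data
x <- c(10, 20, 30, 40, 50)
y <- c(120, 150, 210, 230, 300)

# Pearson's R: covariance divided by the product of the standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y)

# The two values should agree up to floating-point tolerance
all.equal(r_manual, r_builtin)
```

The same check should work on the notebook's own columns by swapping in `Newspaper` and `Sales`.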

Here is the scatterplot of the two:

In [None]:
plot(Newspaper,Sales)

You can set the x-axis label with `xlab`, the y-axis label with `ylab`,
and the title of the whole graph with `main`.

You can also draw filled dots instead of hollow ones with `pch=19`.

In [None]:
plot(Newspaper,Sales, main="Newspaper Expenditures and Sales", xlab="Newspaper Ad Spending ($000)", ylab="Sales in Dollars ($000)", pch=19)

Here is the correlation coefficient R for Radio and Sales:

In [None]:
R = cor(Radio, Sales)
R

And here is the scatterplot for them:

In [None]:
plot(Radio,Sales, main="Radio Ad Expenditures and Sales", xlab="Radio Ad Spending ($000)", ylab="Sales in Dollars ($000)", pch=19)

The following shows the correlation matrix and the pairwise scatterplots
for all variables:

In [None]:
cor(df)
plot(df)

Let's tidy the plots a little and show only the lower panels:

In [None]:
plot(df, upper.panel = NULL, pch = 19, cex.labels=1.25)

Let's run our regression model:

In [None]:
model = lm(Sales ~ Radio + Newspaper, data=df)
summary(model)

Test for overall significance of the linear relationship:

$$ \text{p-value} = 0.00000015 $$

p-value for Radio:

$$ \text{p-value} = 0.00000049 $$

p-value for Newspaper:

$$ \text{p-value} = 0.00001831 $$
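
The p-values above can be read off the printed summary, but they can also be pulled out programmatically. The sketch below shows one way to do that. It uses a small made-up data frame named `toy` (a hypothetical stand-in, not the notebook's data) so that it runs on its own and does not overwrite `df` or `model`.

```r
# Made-up illustrative data, NOT the notebook's data
toy <- data.frame(
  Radio     = c(10, 15, 20, 25, 30, 35, 40, 45),
  Newspaper = c(30, 12, 45, 20, 38, 15, 50, 25),
  Sales     = c(820, 560, 1210, 890, 1240, 900, 1580, 1250)
)
toy_model <- lm(Sales ~ Radio + Newspaper, data = toy)
s <- summary(toy_model)

# Per-coefficient p-values: the "Pr(>|t|)" column of the coefficient table
coef_p <- coef(s)[, "Pr(>|t|)"]
coef_p

# Overall F-test p-value, rebuilt from the F statistic and its degrees of freedom
f <- s$fstatistic
overall_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
unname(overall_p)
```

Replacing `toy_model` with `model` should give the three values listed above for the real data.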

The regression equation is this:

$$ sales = 13.08(radio) + 16.80(newspaper) + 156.43 $$

The residual standard error is:

$$ \text{residual standard error} = 159 $$
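
Assuming the figure above is the "Residual standard error" line of `summary(model)`, it is the square root of the residual sum of squares divided by the residual degrees of freedom. The sketch below checks that relationship on a small made-up data frame named `toy` (hypothetical values, not the notebook's data).

```r
# Made-up illustrative data, NOT the notebook's data
toy <- data.frame(
  Radio     = c(10, 15, 20, 25, 30, 35, 40, 45),
  Newspaper = c(30, 12, 45, 20, 38, 15, 50, 25),
  Sales     = c(820, 560, 1210, 890, 1240, 900, 1580, 1250)
)
toy_model <- lm(Sales ~ Radio + Newspaper, data = toy)

# Residual sum of squares
rss <- sum(residuals(toy_model)^2)

# Residual degrees of freedom: n minus the 3 estimated coefficients
rse_manual <- sqrt(rss / df.residual(toy_model))

# sigma() returns the same quantity directly
rse_builtin <- sigma(toy_model)
all.equal(rse_manual, rse_builtin)
```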

Make some predictions for:

-   spending 30,000 on Radio ads and 35,000 on Newspaper ads
-   spending 40,000 on Radio ads and 45,000 on Newspaper ads

In [None]:
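# Inputs are in thousands of dollars, matching the units of the data
predictors = data.frame(Radio=c(30,40), Newspaper=c(35,45))
# Predicted Sales for each row of predictors, rounded to whole numbers
predictions<-predict(model, predictors)
round(predictions)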
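
As a sanity check, the same predictions can be worked out by hand from the regression equation. The sketch below plugs the inputs passed to `predict` into the rounded coefficients quoted earlier. For the first row this is 13.08(30) + 16.80(35) + 156.43 = 1136.83. Because those coefficients are rounded, the results may differ slightly from what `predict` returns.

```r
# Rounded coefficients copied from the regression equation in the text
b_radio     <- 13.08
b_newspaper <- 16.80
intercept   <- 156.43

# Same inputs as the predict() call, in thousands of dollars
radio     <- c(30, 40)
newspaper <- c(35, 45)

# Apply the equation to both scenarios at once (vectorised arithmetic)
by_hand <- b_radio * radio + b_newspaper * newspaper + intercept
round(by_hand)
```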