These scripts are based on the lecture notes from STAT 501 - Regression Methods, a course from Pennsylvania State University ( Penn State ).
These notes describe the basic concepts of simple linear regression: how to measure the strength of the linear association between the predictor and the response, and how to interpret the coefficient of determination ( r-squared ).
- The Best Line is the one that makes the prediction errors as small as possible in some overall sense
- Least Squares Criterion: Minimize the sum of the squared prediction errors.
- Error ( Residual ): y - yhat
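As a sketch of the least squares criterion, the slope and intercept that minimize the sum of squared prediction errors can be computed directly with NumPy (the data below is made up for illustration):

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least squares estimates for yhat = b0 + b1 * x.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

yhat = b0 + b1 * x
residuals = y - yhat            # error (residual) = y - yhat

# The quantity the criterion minimizes: sum of squared prediction errors.
sse = np.sum(residuals ** 2)
print(b0, b1, sse)
```

A useful property of the least squares line: the residuals always sum to zero.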
The simple linear regression model rests on the four conditions below:
- The mean of the response, at each value of the predictor, is a Linear function of x
- The errors are Independent
- The errors, at each value of the predictor, are Normally Distributed
- The errors, at each value of the predictor, have Equal Variance
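A minimal sketch of data that satisfies all four conditions, simulated with NumPy (the parameter values are illustrative assumptions, not from the course):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)

# Illustrative true parameters (assumed values).
beta0, beta1, sigma = 2.0, 0.5, 1.0

# Errors drawn independently from one normal distribution:
# Independent, Normally distributed, Equal variance at every x.
errors = rng.normal(0.0, sigma, n)

# Mean of the response is a Linear function of x.
y = beta0 + beta1 * x + errors
```

Data generated this way is exactly what the simple linear regression model assumes; violating any one line (e.g. making sigma depend on x) breaks the corresponding condition.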
Measure the strength of the relationship
- SSR ( Regression Sum of Squares ): Quantifies how far the estimated regression line, yhat, is from the horizontal "no relationship" line, ybar.
- SSE ( Error Sum of Squares ): Quantifies how much the data points, yi, vary around the estimated regression line, yhat.
- SSTO ( Total Sum of Squares ): Quantifies how much the data points, yi, vary around their mean, ybar.
- SSTO = SSR + SSE
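The decomposition SSTO = SSR + SSE can be verified numerically on a least squares fit; this sketch uses toy data invented for illustration:

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Least squares fit.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

ssr = np.sum((yhat - y.mean()) ** 2)   # regression sum of squares
sse = np.sum((y - yhat) ** 2)          # error sum of squares
ssto = np.sum((y - y.mean()) ** 2)     # total sum of squares

print(ssr, sse, ssto)                  # ssr + sse equals ssto
```

The identity holds exactly (up to floating point) for any least squares fit, not just this data.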
r-squared = SSR / SSTO = 1 - SSE / SSTO is the percentage of the variation in y that is reduced by taking into account the predictor x; equivalently, the percentage of the variation in y that is 'explained by' the variation in predictor x.
- r-squared is a number between 0 and 1.
- If r-squared = 1, all of the data points fall perfectly on the regression line. The predictor x accounts for all of the variation in y!
- If r-squared = 0, the estimated regression line is perfectly horizontal. The predictor x accounts for none of the variation in y!
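Both formulas for r-squared give the same number, as this sketch on made-up data shows:

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Least squares fit.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

ssr = np.sum((yhat - y.mean()) ** 2)
sse = np.sum((y - yhat) ** 2)
ssto = np.sum((y - y.mean()) ** 2)

r2_from_ssr = ssr / ssto          # fraction of variation explained by x
r2_from_sse = 1 - sse / ssto      # same value, via the error sum of squares
print(r2_from_ssr, r2_from_sse)
```

The two expressions agree because SSTO = SSR + SSE.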
Measure the sign of the relationship
- The ( Pearson ) correlation coefficient r carries the sign of the estimated slope, while r-squared does not: r = ±sqrt( r-squared ), with the sign of b1.
Some cautions
- The r-squared quantifies the strength of a linear relationship. An r-squared of 0 tells us only that if there is a relationship between x and y, it is not linear.
- A large r-squared value should not be interpreted as meaning that the estimated regression line fits the data well. A large value suggests that taking the predictor into account is better than not doing so; it doesn't tell us whether we could still do better.
- The r-squared can be greatly affected by just one data point (or a few data points).
- Correlation (or association) does not imply causation.
- Ecological correlations, correlations that are based on rates or averages, tend to overstate the strength of an association.
- A statistically significant r-squared does not imply that the slope Beta1 is meaningfully different from 0.
- A large r-squared value does not necessarily mean that a useful prediction of the response can be made. The resulting prediction intervals or confidence intervals may still be too wide to be useful.
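The caution that one data point can greatly affect r-squared is easy to demonstrate: below, a cloud of simulated points with essentially no linear relationship gets one extreme point appended, and r-squared jumps (the data and seed are illustrative assumptions):

```python
import numpy as np

def r_squared(x, y):
    """r-squared of the least squares fit of y on x."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    yhat = b0 + b1 * x
    return np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 30)
y = rng.normal(5.0, 1.0, 30)      # y does not depend on x at all

base = r_squared(x, y)

# Append a single extreme, influential point and recompute.
x2 = np.append(x, 50.0)
y2 = np.append(y, 50.0)
inflated = r_squared(x2, y2)

print(base, inflated)             # one point drives r-squared up sharply
```

This is why r-squared should always be read alongside a scatterplot of the data.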
This lesson presents two alternative methods for testing whether a linear association exists between the predictor x and the response y in a simple linear regression model, i.e. testing H0: Beta1 = 0 versus HA: Beta1 ≠ 0:
- The t-test for the slope
- Analysis of variance (ANOVA) F-test
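The two tests are equivalent in simple linear regression: the F-statistic equals the square of the t-statistic for the slope. A sketch computing both by hand, on made-up data:

```python
import numpy as np

# Toy data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.9])
n = len(x)

# Least squares fit.
sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

ssr = np.sum((yhat - y.mean()) ** 2)
sse = np.sum((y - yhat) ** 2)
mse = sse / (n - 2)               # mean squared error, n - 2 df

t = b1 / np.sqrt(mse / sxx)       # t-statistic for H0: Beta1 = 0
f = ssr / mse                     # ANOVA F-statistic (MSR = SSR / 1 df)

print(t, f)                       # f equals t squared
```

Because SSR = b1^2 * Sxx for a least squares fit, the identity F = t^2 holds exactly, and both tests always give the same p-value here.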