ayoakin/LinearRegression

Linear Regression From Scratch Using Matrices

Project Overview

This project implements linear regression from scratch using two Python libraries: pandas for handling the data and numpy for the matrix operations. It then compares the result to the linear regression implementation in scikit-learn. The dataset is a two-column salary dataset with 30 rows. The independent variable "YearsExperience" is the number of years of job experience, and the dependent variable "Salary" is the salary earned.
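As a rough illustration of the matrix approach, the sketch below fits a line by solving the normal equations with numpy. It is a minimal example under stated assumptions, not necessarily the notebook's exact code: the function name `fit_normal_equation` is hypothetical, and the input values are made up rather than taken from the salary dataset.

```python
import numpy as np

def fit_normal_equation(x, y):
    """Return [intercept, slope] minimising squared error for y ~ b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones (for the intercept) next to the feature column.
    X = np.column_stack([np.ones_like(x), x])
    # Solve (X^T X) beta = X^T y instead of forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up illustrative values, NOT rows from the salary dataset.
years = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
salary = 30000.0 + 9000.0 * years  # exactly linear, so the fit should recover 30000 and 9000

beta = fit_normal_equation(years, salary)
print("intercept:", beta[0], "slope:", beta[1])
```

Solving the linear system with `np.linalg.solve` is generally preferred over computing the inverse of `X.T @ X` directly, since it is cheaper and numerically more stable.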

Recreating the linear regression algorithm has several benefits:

  • understanding what happens under the hood of the algorithm
  • understanding whether, and why, linear regression is a good choice for predicting on a given dataset
  • being able to customize the model

For a detailed explanation of the theory behind this computation, check out the accompanying article on Medium.

Code

You can find the code for this project here.

File overview:

  • LinearRegressionScratch.ipynb - the full code from this project

Environment Setup

Installation

To follow this project, please install the following locally:

  • Python 3.8+
  • Python packages
    • pandas
    • numpy
    • scikit-learn
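With these packages installed, a comparison against scikit-learn could look roughly like the sketch below. The data here is synthetic (randomly generated with a fixed seed), and the variable names are assumptions for illustration; the notebook itself uses the salary dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (assumption): 30 noisy points around a straight line.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=30)
y = 25000.0 + 9500.0 * x + rng.normal(0.0, 5000.0, size=30)

# From-scratch least squares fit: design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# scikit-learn expects a 2-D feature array and adds the intercept itself.
model = LinearRegression().fit(x.reshape(-1, 1), y)
sklearn_beta = np.array([model.intercept_, model.coef_[0]])

# Both approaches solve the same least squares problem, so they should agree.
coefficients_match = np.allclose(beta, sklearn_beta)
print("from scratch:", beta)
print("scikit-learn:", sklearn_beta)
print("match:", coefficients_match)
```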

Data

The data used for this implementation is the salary dataset originally published on Kaggle.

You can download the file we'll use in this project here:
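Once the file is downloaded, the two columns can be loaded with pandas and turned into the matrices used for the computation. The sketch below reads from an in-memory string so that it runs on its own; the three rows are made up for illustration, and only the column names "YearsExperience" and "Salary" come from the dataset description.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the downloaded CSV: made-up rows with the dataset's two column names.
csv_text = """YearsExperience,Salary
1.0,40000
2.5,52000
4.0,65000
"""

# With the real file, pass its path to pd.read_csv instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

# Independent variable as a 1-D array, dependent variable as the target vector.
x = df["YearsExperience"].to_numpy(dtype=float)
y = df["Salary"].to_numpy(dtype=float)

# Design matrix with a leading column of ones for the intercept term.
X = np.column_stack([np.ones_like(x), x])
print(X.shape, y.shape)
```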
