masonrahmani/Simple-Linear-Regression-using-statsmodel

Simple-Linear-Regression-using-statsmodel

Predicting employee salary based on their experience using linear regression.

SIMPLE LINEAR REGRESSION

# Import the relevant libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import seaborn as snb
snb.set()

Load the dataset

data=pd.read_csv("Salary_Data.csv")
data
    YearsExperience    Salary
0               1.1   39343.0
1               1.3   46205.0
2               1.5   37731.0
3               2.0   43525.0
4               2.2   39891.0
5               2.9   56642.0
6               3.0   60150.0
7               3.2   54445.0
8               3.2   64445.0
9               3.7   57189.0
10              3.9   63218.0
11              4.0   55794.0
12              4.0   56957.0
13              4.1   57081.0
14              4.5   61111.0
15              4.9   67938.0
16              5.1   66029.0
17              5.3   83088.0
18              5.9   81363.0
19              6.0   93940.0
20              6.8   91738.0
21              7.1   98273.0
22              7.9  101302.0
23              8.2  113812.0
24              8.7  109431.0
25              9.0  105582.0
26              9.5  116969.0
27              9.6  112635.0
28             10.3  122391.0
29             10.5  121872.0
data.describe()
       YearsExperience         Salary
count        30.000000      30.000000
mean          5.313333   76003.000000
std           2.837888   27414.429785
min           1.100000   37731.000000
25%           3.200000   56720.750000
50%           4.700000   65237.000000
75%           7.700000  100544.750000
max          10.500000  122391.000000

Create regression

Split the data into the independent variable X (years of experience) and the dependent variable y (salary).

X=data['YearsExperience']
y=data['Salary']
X
0      1.1
1      1.3
2      1.5
3      2.0
4      2.2
5      2.9
6      3.0
7      3.2
8      3.2
9      3.7
10     3.9
11     4.0
12     4.0
13     4.1
14     4.5
15     4.9
16     5.1
17     5.3
18     5.9
19     6.0
20     6.8
21     7.1
22     7.9
23     8.2
24     8.7
25     9.0
26     9.5
27     9.6
28    10.3
29    10.5
Name: YearsExperience, dtype: float64
y
0      39343.0
1      46205.0
2      37731.0
3      43525.0
4      39891.0
5      56642.0
6      60150.0
7      54445.0
8      64445.0
9      57189.0
10     63218.0
11     55794.0
12     56957.0
13     57081.0
14     61111.0
15     67938.0
16     66029.0
17     83088.0
18     81363.0
19     93940.0
20     91738.0
21     98273.0
22    101302.0
23    113812.0
24    109431.0
25    105582.0
26    116969.0
27    112635.0
28    122391.0
29    121872.0
Name: Salary, dtype: float64

Explore the data

plt.scatter(X,y,color="blue")
plt.xlabel("YEARS OF EXPERIENCE", fontsize=20,color="blue")
plt.ylabel("SALARY",fontsize=20,color="blue")
plt.show()

[Figure: scatter plot of Salary against YearsExperience]

Creating the regression with statsmodels

x=sm.add_constant(X)
results=sm.OLS(y,x).fit()
results.summary()
                            OLS Regression Results
Dep. Variable:                 Salary   R-squared:                       0.957
Model:                            OLS   Adj. R-squared:                  0.955
Method:                 Least Squares   F-statistic:                     622.5
Date:                Thu, 12 Nov 2020   Prob (F-statistic):           1.14e-20
Time:                        17:21:23   Log-Likelihood:                -301.44
No. Observations:                  30   AIC:                             606.9
Df Residuals:                      28   BIC:                             609.7
Df Model:                           1
Covariance Type:            nonrobust

                       coef    std err          t      P>|t|      [0.025      0.975]
const             2.579e+04   2273.053     11.347      0.000    2.11e+04    3.04e+04
YearsExperience   9449.9623    378.755     24.950      0.000    8674.119    1.02e+04

Omnibus:                        2.140   Durbin-Watson:                   1.648
Prob(Omnibus):                  0.343   Jarque-Bera (JB):                1.569
Skew:                           0.363   Prob(JB):                        0.456
Kurtosis:                       2.147   Cond. No.                         13.2

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
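
As a cross-check on the summary above, the headline numbers can be reproduced by hand with the textbook closed-form formulas for simple linear regression. This is only a sketch in plain Python, not how statsmodels computes the fit internally; the lists `years` and `salary` are copied from the dataset printed earlier, and all other variable names are illustrative.

```python
import math

# (YearsExperience, Salary) values copied from the dataset printed above.
years = [1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7,
         3.9, 4.0, 4.0, 4.1, 4.5, 4.9, 5.1, 5.3, 5.9, 6.0,
         6.8, 7.1, 7.9, 8.2, 8.7, 9.0, 9.5, 9.6, 10.3, 10.5]
salary = [39343.0, 46205.0, 37731.0, 43525.0, 39891.0, 56642.0,
          60150.0, 54445.0, 64445.0, 57189.0, 63218.0, 55794.0,
          56957.0, 57081.0, 61111.0, 67938.0, 66029.0, 83088.0,
          81363.0, 93940.0, 91738.0, 98273.0, 101302.0, 113812.0,
          109431.0, 105582.0, 116969.0, 112635.0, 122391.0, 121872.0]

n = len(years)
mean_x = sum(years) / n
mean_y = sum(salary) / n

# Sum of squares of x and sum of cross-products, both around the means.
sxx = sum((x - mean_x) ** 2 for x in years)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, salary))

# Least-squares estimates: slope = Sxy / Sxx, intercept from the means.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R-squared = 1 - (residual sum of squares) / (total sum of squares).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, salary))
sst = sum((y - mean_y) ** 2 for y in salary)
r_squared = 1 - sse / sst

# Standard error and t statistic of the slope, using n - 2 residual
# degrees of freedom ("Df Residuals: 28" in the summary).
se_slope = math.sqrt(sse / (n - 2) / sxx)
t_slope = slope / se_slope

print(f"intercept = {intercept:.2f}")
print(f"slope     = {slope:.4f}")
print(f"R-squared = {r_squared:.3f}")
print(f"std err (slope) = {se_slope:.3f}, t = {t_slope:.3f}")
```

The printed values should agree with the `const`, `YearsExperience`, `R-squared`, `std err` and `t` entries of the summary up to rounding. The slope reads as roughly 9,450 additional salary units per extra year of experience.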

Plot the regression

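plt.scatter(X,y,color="red")
# Take the coefficients from the fitted model instead of retyping rounded values from the summary
y_pred=results.params['const']+X*results.params['YearsExperience']
plt.plot(X,y_pred,color="black")
plt.xlabel("YEARS OF EXPERIENCE",fontsize=20)
plt.ylabel("SALARY",fontsize=20)
plt.show()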

[Figure: scatter plot of Salary against YearsExperience with the fitted regression line]
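
Once the line is fitted, it can be used to estimate a salary for a given number of years of experience. The sketch below is a minimal, hypothetical helper (`predict_salary` is not part of the notebook) that plugs the coefficients printed in the OLS summary into the line equation. The intercept is shown there only to four significant figures (2.579e+04), so the results are approximate; inside the notebook, `results.params` holds the unrounded values.

```python
# Coefficients as printed in the OLS summary above.
# Assumption: the rounded intercept 2.579e+04 is accurate enough for illustration.
INTERCEPT = 2.579e+04
SLOPE = 9449.9623


def predict_salary(years_experience: float) -> float:
    """Estimate salary from the fitted line: intercept + slope * years."""
    return INTERCEPT + SLOPE * years_experience


# Example inputs chosen inside the observed range of the data (1.1 to 10.5 years).
for years_experience in (2.0, 5.0, 8.0):
    estimate = predict_salary(years_experience)
    print(f"{years_experience:4.1f} years -> estimated salary {estimate:,.2f}")
```

The model was fitted on 30 observations between 1.1 and 10.5 years of experience, so estimates far outside that range are extrapolations and should be treated with caution.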
