## Simple Linear Regression Model


### Statsmodels Code:

**Method 1: Using ```sm.OLS()```**

``` Python
import statsmodels.api as sm

# Prepare data
y = data['Y']
X = data['X']

# sm.OLS() does not add an intercept by default, so add a constant column manually
X = sm.add_constant(X)

# Create the model object
model = sm.OLS(y, X)

# Fit the model to the dataset
result = model.fit()

# Print the summary output
print(result.summary())
```

**Method 2: Using ```sm.formula.ols()```, R-style formulas**

``` Python
import statsmodels.api as sm

# No need to split the data or add a constant:
# df is a DataFrame that contains the Y and X columns

# Create the model object
model = sm.formula.ols(formula = 'Y ~ X', data = df)

# Fit the model to the dataset
result = model.fit()

# Print the summary output
print(result.summary())
```
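
As a minimal sketch of Method 2, the block below fits a formula-based model end to end. The toy DataFrame is made up purely for illustration, and `statsmodels.formula.api` is imported directly as `smf`, which exposes the same `ols()` function as `sm.formula.ols()`.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data, invented for illustration only
df = pd.DataFrame({
    "X": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Y": [3.1, 4.9, 7.2, 8.8, 11.0],
})

# The formula interface adds the intercept automatically,
# so sm.add_constant() is not needed here
result = smf.ols(formula="Y ~ X", data=df).fit()

# Estimated coefficients are stored in a Series indexed by term name
intercept = result.params["Intercept"]
slope = result.params["X"]
print(intercept, slope)
```

With the formula interface, the intercept term is labelled `Intercept` in `result.params`, whereas `sm.add_constant()` in Method 1 labels it `const`.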

### SKLearn Code:

``` Python
from sklearn.linear_model import LinearRegression

# Prepare data
y = data['Y']
# scikit-learn expects a 2-D feature matrix, so select X with double brackets
X = data[['X']]

# Create the model object
reg = LinearRegression()

# Fit the model to the dataset
result = reg.fit(X, y)

# Retrieve the estimated slope coefficient (an array with one entry per feature)
result.coef_

# Retrieve the estimated intercept
result.intercept_
```
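
The scikit-learn workflow above can be sketched on a small dataset as follows. The toy values are hypothetical and chosen only so the block runs on its own. Unlike `sm.OLS()`, `LinearRegression()` fits an intercept by default (`fit_intercept=True`), so no constant column is added.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data, invented for illustration only
data = pd.DataFrame({
    "X": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Y": [3.1, 4.9, 7.2, 8.8, 11.0],
})

# Double brackets keep X two-dimensional: shape (n_samples, n_features)
X = data[["X"]]
y = data["Y"]

# fit() returns the fitted estimator itself
reg = LinearRegression().fit(X, y)

# coef_ holds one slope per feature; intercept_ is a scalar here
slope = reg.coef_[0]
intercept = reg.intercept_

# score() returns the R-squared of the fit on the given data
r_squared = reg.score(X, y)
print(slope, intercept, r_squared)
```

Note that scikit-learn provides no equivalent of `result.summary()`; standard errors and p-values are easier to obtain from statsmodels.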

In [None]:
import statsmodels.api as sm
import numpy as np
duncan_prestige = sm.datasets.get_rdataset("Duncan", "carData")
Y = duncan_prestige.data['income']
X = duncan_prestige.data['education']

In [None]:
# Try to build the model with sm.OLS()
# sm.OLS() does not add an intercept by default, so add a constant column manually
X = sm.add_constant(X)

# Create the model object
model = sm.OLS(Y, X)

# Fit the model to the dataset
result = model.fit()

# Print the summary output
result.summary()


In [None]:
# Try to build the model with sm.formula.ols()



In [None]:
# Try to build the model with sklearn LinearRegression()

