# Applied Simple Linear Regression Model using Python for Beginners
#### Follow this link for the full post: [https://www.gettingstarted.ai/applied-simple-linear-regression-model-using-python-for-beginners/](https://www.gettingstarted.ai/applied-simple-linear-regression-model-using-python-for-beginners/)
In this project, we put into practice what we covered in the introductory linear regression article. Using Python and scikit-learn, we build a simple linear regression model that predicts a house's price from its square footage.

## Import required modules

In [1]:
# Import required libraries

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

## Create DataFrame from Dataset

In [11]:
# Create the DataFrame
data = pd.DataFrame({
    'House': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    'SquareFootage': [1000, 1200, 1500, 1800, 2200, 1350, 2000, 1750, 1650, 1900, 1300, 2500, 1400, 2050, 2250, 1600, 1950, 2200, 1800, 1250],
    'Price': [100000, 150000, 200000, 250000, 300000, 175000, 225000, 210000, 195000, 240000, 160000, 325000, 170000, 235000, 275000, 190000, 230000, 300000, 250000, 145000]
})

## Demo Data Wrangling

In [12]:
# Drop the 'House' column (a row identifier, not a predictive feature)
data = data.drop('House', axis=1)

# Perform other data wrangling operations if necessary...

In [13]:
# Print first five rows of data
data.head()

|   | SquareFootage | Price |
|---|---|---|
| 0 | 1000 | 100000 |
| 1 | 1200 | 150000 |
| 2 | 1500 | 200000 |
| 3 | 1800 | 250000 |
| 4 | 2200 | 300000 |


In [14]:
# Describe the dataframe (statistical summary of dataset)
data.describe()

|   | SquareFootage | Price |
|---|---|---|
| count | 20.0 | 20.0 |
| mean | 1732.5 | 216250.0 |
| std | 405.642765 | 58125.88426 |
| min | 1000.0 | 100000.0 |
| 25% | 1387.5 | 173750.0 |
| 50% | 1775.0 | 217500.0 |
| 75% | 2012.5 | 250000.0 |
| max | 2500.0 | 325000.0 |


## Split the Dataset for Training and Testing

In [15]:
from sklearn.model_selection import train_test_split

# Split the data into training and testing sets
X = data['SquareFootage'].values.reshape(-1, 1)  # Independent variable, reshaped to 2D (n_samples, 1) as scikit-learn expects
y = data['Price'].values  # Dependent variable

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
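
As a quick sanity check on the split, the sketch below rebuilds the same 20 square-footage and price values and confirms that `test_size=0.2` holds out 4 houses and leaves 16 for training. It assumes scikit-learn and NumPy are installed. The names `sqft` and `price` are introduced here for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Same values as the DataFrame above, kept inline so the snippet runs on its own
sqft = np.array([1000, 1200, 1500, 1800, 2200, 1350, 2000, 1750, 1650, 1900,
                 1300, 2500, 1400, 2050, 2250, 1600, 1950, 2200, 1800, 1250]).reshape(-1, 1)
price = np.array([100000, 150000, 200000, 250000, 300000, 175000, 225000, 210000,
                  195000, 240000, 160000, 325000, 170000, 235000, 275000, 190000,
                  230000, 300000, 250000, 145000])

# 20% of 20 rows is 4 rows for testing; the remaining 16 are used for training
X_train, X_test, y_train, y_test = train_test_split(
    sqft, price, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)  # (16, 1) (4, 1)
```

Fixing `random_state` makes the shuffle repeatable, so rerunning the notebook yields the same training and testing rows.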

## Fit the Model

In [16]:
from sklearn.linear_model import LinearRegression

# Create the regression model object
model = LinearRegression()

# Fit the model to the training data
model.fit(X_train, y_train)
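
To make the fitting step less of a black box: with a single feature, the least-squares line has a well-known closed form, where the slope is the covariance of x and y divided by the variance of x, and the intercept is chosen so the line passes through the point of means. The helper `ols_fit` below is a hypothetical illustration of that formula in plain Python, not a description of what scikit-learn runs internally, and the toy data is made up.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line through the points (x, y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations (proportional to the covariance of x and y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Sum of squared deviations of x (proportional to the variance of x)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Hypothetical toy data lying exactly on the line y = 1 + 2x
intercept, slope = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)  # 1.0 2.0
```

Applied to the training rows of the house data, the same formula should agree with `model.intercept_` and `model.coef_[0]` up to floating-point rounding.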

## Print Intercept and Slope

In [17]:
# Print the values of β₀ and β₁
print("Intercept (β₀):", model.intercept_)
print("Slope (β₁):", model.coef_[0])

Intercept (β₀): -7623.808416647284
Slope (β₁): 129.38851429900024
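
These two numbers define the fitted line: Price ≈ β₀ + β₁ × SquareFootage. Each extra square foot adds roughly $129 to the predicted price. The negative intercept is only where the line crosses zero square feet and should not be read as a real price. The sketch below plugs the printed values into that equation by hand; `predict_price` is a hypothetical helper name, not part of scikit-learn.

```python
# Values copied from the printed output above
intercept = -7623.808416647284
slope = 129.38851429900024

def predict_price(sqft):
    """Evaluate the fitted line: price = intercept + slope * sqft."""
    return intercept + slope * sqft

print(f'{predict_price(1100):.2f}')  # 134703.56
```

This is the same arithmetic that `model.predict` performs for a single-feature linear model.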


## Evaluate the Model

In [18]:
# Calculate R-squared on the testing set
r_squared = model.score(X_test, y_test)
print(f'R-squared: {r_squared:.2f}')

R-squared: 0.95
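
For a regressor, `model.score` returns the coefficient of determination, R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. The sketch below computes it by hand; the function name `r_squared` and the three sample values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    # Squared errors between the true values and the predictions
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Squared deviations of the true values from their mean
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical true values and predictions
y_true = [100, 200, 300]
y_pred = [110, 190, 300]
print(round(r_squared(y_true, y_pred), 2))  # 0.99
```

Keep in mind that the test set here contains only 4 houses, so the 0.95 score is a rough estimate and could shift noticeably with a different split.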


## Make a Prediction

In [20]:
# Prediction time!
new_sqft = [[1100], [1700], [2000]]
predicted_prices = model.predict(new_sqft)

# Print the predicted prices
for sqft, price in zip(new_sqft, predicted_prices):
    print(f'sqft: {sqft[0]} => price: {price:.2f}')

sqft: 1100 => price: 134703.56
sqft: 1700 => price: 212336.67
sqft: 2000 => price: 251153.22


### Author
jeff @ [gettingstarted.ai](https://www.gettingstarted.ai) © 2023