# Introduction to Linear Regression for K-12 Students

Linear regression is like a detective tool in math and science. It helps us find out how one thing (like how much you study) is related to another thing (like your grades).

## Simple Linear Regression

Think of it like drawing a straight line through some dots on a graph. Each dot is a piece of information. For example, the dots could show how many hours students study and what grades they get. The line tries to be as close as possible to all these dots and shows us a pattern.
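One common way to measure "as close as possible" is to add up the squared up-and-down distances between each dot and the line: the smaller the total, the better the line. The sketch below uses made-up numbers and a hypothetical helper named `total_squared_miss` to compare a line drawn by eye with the line that NumPy's `np.polyfit` finds.

```python
import numpy as np

# Made-up data for illustration only: hours studied and grades for five students.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
grades = np.array([52.0, 58.0, 61.0, 68.0, 71.0])


def total_squared_miss(slope, intercept):
    """Sum of squared vertical distances between the dots and a candidate line."""
    predicted = slope * hours + intercept
    return float(np.sum((grades - predicted) ** 2))


# A line drawn by eye (a hypothetical guess): grade = 4 * hours + 50.
guess_error = total_squared_miss(slope=4.0, intercept=50.0)

# The least-squares line: np.polyfit with degree 1 returns [slope, intercept].
best_slope, best_intercept = np.polyfit(hours, grades, 1)
best_error = total_squared_miss(best_slope, best_intercept)

print(f"Guessed line error: {guess_error:.2f}")
print(f"Best line: grade = {best_slope:.2f} * hours + {best_intercept:.2f}")
print(f"Best line error: {best_error:.2f}")
```

With these numbers the guessed line has a total squared miss of 10, while the fitted line brings it down to 3.6. No other straight line can do better on this data.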

## Multiple Linear Regression

Sometimes, we need to look at more than one thing. For example, not just how much you study, but also how much you sleep before a test. This time, instead of a line, we can picture a flat surface (a plane) that passes as close as possible to all our dots.
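As a rough sketch, that flat surface can be written as a formula with one number for each factor. The symbols $b_0$, $b_1$, and $b_2$ are labels assumed here for illustration: $b_0$ is the starting value, and $b_1$ and $b_2$ say how many points each extra hour of study or sleep adds.

$$
\text{score} = b_0 + b_1 \cdot \text{study hours} + b_2 \cdot \text{sleep hours}
$$

Fitting the model means finding the three numbers that bring the surface closest to the dots.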

## Why It's Useful

Linear regression helps us make predictions. For instance, it can help teachers see how factors like sleep and study time are linked to a student's grades, and estimate a grade from them.


## Python Example 1: Predicting Grades Based on Study Hours

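This code fits a straight line to the grades of eight students and plots the result. The dots are the actual grades, and the line is what our model predicts. In this data every extra hour of study goes with 5 more points, so the dots fall exactly on the line.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Simple Linear Regression: Predicting Grades based on Study Hours
# scikit-learn expects the inputs as a 2D array (one row per student),
# which is why we call reshape(-1, 1).
study_hours = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)
grades = np.array([50, 55, 60, 65, 70, 75, 80, 85])

# Fit the line to the data, then ask it for the grade at each study time.
model = LinearRegression()
model.fit(study_hours, grades)
predicted_grades = model.predict(study_hours)

# Plot the actual grades as dots and the fitted line on top.
plt.scatter(study_hours, grades, color='green', label='Actual Grades')
plt.plot(study_hours, predicted_grades, color='orange', label='Predicted Line')
plt.title('Study Hours vs Grades')
plt.xlabel('Hours of Study')
plt.ylabel('Grades')
plt.legend()
plt.show()

# Predict grades for students who study 9 and 10 hours.
model.predict(np.array([9, 10]).reshape(-1, 1))
```

Output:

```
array([90., 95.])
```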
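The fitted line itself can be read from the model. The sketch below refits the Example 1 data and uses scikit-learn's `coef_` and `intercept_` attributes to print the line, then checks that using the line by hand gives the same prediction as `model.predict`. The variable names `slope`, `intercept`, `by_hand`, and `by_model` are chosen here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as Example 1.
study_hours = np.array([1, 2, 3, 4, 5, 6, 7, 8]).reshape(-1, 1)
grades = np.array([50, 55, 60, 65, 70, 75, 80, 85])

model = LinearRegression().fit(study_hours, grades)

slope = model.coef_[0]        # points gained per extra hour of study
intercept = model.intercept_  # predicted grade at zero hours of study

print(f"grade = {slope:.1f} * hours + {intercept:.1f}")

# Using the line by hand gives the same answer as model.predict.
hours = 9
by_hand = slope * hours + intercept
by_model = model.predict(np.array([[hours]]))[0]
print(f"{hours} hours -> by hand: {by_hand:.1f}, by model: {by_model:.1f}")
```

For this data the line is grade = 5 × hours + 45, so each extra hour of study goes with 5 more points.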

## Python Example 2: Predicting Test Scores Based on Study and Sleep Hours

In this example, we predict test scores based on two factors: how much students study and how much they sleep. It shows that linear regression can consider multiple factors at once.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

study_hours = np.array([2, 3, 4, 5, 6, 7, 8, 9])
sleep_hours = np.array([6, 7, 5, 8, 7, 7, 8, 6])
test_scores = np.array([75, 80, 85, 90, 95, 90, 95, 90])

# Put the two factors side by side: one row per student, one column per factor.
X = np.column_stack((study_hours, sleep_hours))
print(X.shape)

model = LinearRegression()
model.fit(X, test_scores)
predicted_scores = model.predict(X)

# Compare each student's actual score with the model's prediction.
for actual, predicted, study, sleep in zip(test_scores, predicted_scores, study_hours, sleep_hours):
    print(f"Study: {study} hrs, Sleep: {sleep} hrs, Actual Score: {actual}, Predicted Score: {round(predicted, 2)}")

# Predict scores for two new students:
# (6.5 hrs study, 8 hrs sleep) and (10 hrs study, 10 hrs sleep).
model.predict(np.array([[6.5, 8], [10, 10]]))
```

Output:

```
(8, 2)
Study: 2 hrs, Sleep: 6 hrs, Actual Score: 75, Predicted Score: 78.53
Study: 3 hrs, Sleep: 7 hrs, Actual Score: 80, Predicted Score: 82.59
Study: 4 hrs, Sleep: 5 hrs, Actual Score: 85, Predicted Score: 80.95
Study: 5 hrs, Sleep: 8 hrs, Actual Score: 90, Predicted Score: 88.79
Study: 6 hrs, Sleep: 7 hrs, Actual Score: 95, Predicted Score: 89.05
Study: 7 hrs, Sleep: 7 hrs, Actual Score: 90, Predicted Score: 91.21
Study: 8 hrs, Sleep: 8 hrs, Actual Score: 95, Predicted Score: 95.26
Study: 9 hrs, Sleep: 6 hrs, Actual Score: 90, Predicted Score: 93.62
array([ 92.02586207, 103.36206897])
```

The predictions do not match the actual scores exactly, because these dots do not all lie on one flat surface. The last prediction is above 100: the model simply extends the pattern, so predictions for inputs outside the range of the data (here, 10 hours of study and 10 hours of sleep) should be treated with caution.
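To see how far off the Example 2 predictions are overall, one option is to average the size of the misses. The sketch below refits the same data and computes the misses with plain NumPy. The names `misses`, `average_miss`, and `biggest_miss` are chosen here for illustration, and averaging the absolute misses is just one reasonable way to summarize them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as Example 2.
study_hours = np.array([2, 3, 4, 5, 6, 7, 8, 9])
sleep_hours = np.array([6, 7, 5, 8, 7, 7, 8, 6])
test_scores = np.array([75, 80, 85, 90, 95, 90, 95, 90])

X = np.column_stack((study_hours, sleep_hours))
model = LinearRegression().fit(X, test_scores)
predicted_scores = model.predict(X)

# A miss is the actual score minus the predicted score.
misses = test_scores - predicted_scores

# Average size of the misses, ignoring whether they are above or below.
average_miss = float(np.mean(np.abs(misses)))
biggest_miss = float(np.max(np.abs(misses)))

print(f"Average miss: {average_miss:.2f} points")
print(f"Biggest miss: {biggest_miss:.2f} points")
```

On this data the model is off by about 2.8 points on average. Its biggest miss, close to 6 points, is for the student who studied 6 hours and slept 7.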