# Amr Hacoglu - #GRIPAUGUST2024

# 📝 #1 Linear Regression: Predicting Student Scores

## 📋 Overview
This notebook fits a simple linear regression model that predicts a student's percentage score from the number of hours studied. It covers the following steps:

1. Importing Libraries
2. Loading and Displaying the Dataset
3. Plotting the Data
4. Preparing the Data
5. Splitting the Data into Training and Test Sets
6. Training the Algorithm
7. Making Predictions
8. Plotting the Regression Line
9. Predicting the Score for a Given Input
10. Evaluating the Model

Let's start! 🚀


# Importing Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics

# Loading and Displaying the Dataset

In [None]:
data = {
    'Hours': [2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8],
    'Scores': [21, 47, 27, 75, 30, 20, 88, 60, 81, 25, 85, 62, 41, 42, 17, 95, 30, 24, 67, 69, 30, 54, 35, 76, 86]
}
s_data = pd.DataFrame(data)

In [None]:
s_data.head()

# Plotting the Data

In [None]:
plt.scatter(s_data['Hours'], s_data['Scores'], color='blue')
plt.title('Hours vs Percentage')
plt.xlabel('Hours Studied')
plt.ylabel('Percentage Score')
plt.show()
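
Before fitting a model, it can help to put a number on the trend visible in the scatter plot. The cell below is a minimal sketch, not part of the original workflow, that computes the Pearson correlation coefficient with NumPy. The data is repeated so the cell runs on its own, and the names `hours_arr`, `scores_arr`, and `r` are illustrative.

```python
import numpy as np

# Same values as the notebook's dataset, repeated so this cell is self-contained
hours_arr = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5,
                      3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8])
scores_arr = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25, 85, 62, 41,
                       42, 17, 95, 30, 24, 67, 69, 30, 54, 35, 76, 86])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is Pearson's r
r = np.corrcoef(hours_arr, scores_arr)[0, 1]
print(f"Pearson correlation between hours and scores: {r:.3f}")
```

A value close to 1 indicates a strong positive linear relationship, which is the situation where a straight-line model is a reasonable first choice.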

# Preparing the Data

In [None]:
# scikit-learn expects a 2-D feature matrix (n_samples, n_features) and a 1-D target
X = s_data[['Hours']].values
y = s_data['Scores'].values
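
scikit-learn estimators expect the features as a 2-D array of shape `(n_samples, n_features)` and the target as a 1-D array, which is why the two columns are extracted differently. The sketch below illustrates this on the first four rows of the dataset; the names `demo`, `X_demo`, `y_demo`, and `X_from_series` are illustrative only.

```python
import numpy as np
import pandas as pd

# First four rows of the notebook's data, enough to illustrate the shapes
demo = pd.DataFrame({'Hours': [2.5, 5.1, 3.2, 8.5], 'Scores': [21, 47, 27, 75]})

X_demo = demo.iloc[:, :-1].values   # every column except the last -> 2-D array
y_demo = demo.iloc[:, 1].values     # second column only -> 1-D array
print(X_demo.shape)  # (4, 1)
print(y_demo.shape)  # (4,)

# Selecting by name with double brackets gives the same 2-D layout
print(np.array_equal(X_demo, demo[['Hours']].values))  # True

# A 1-D column can also be converted with reshape(-1, 1)
X_from_series = demo['Hours'].values.reshape(-1, 1)
print(np.array_equal(X_demo, X_from_series))  # True
```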

# Splitting the Data into Training and Test Sets

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
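
With 25 rows and `test_size=0.2`, the split holds out 5 samples for testing and keeps 20 for training, and fixing `random_state` makes the partition reproducible. The sketch below checks both points on placeholder arrays of the same length as the dataset; `X_demo`, `y_demo`, and the other names are illustrative stand-ins, not the notebook's variables.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 25 placeholder samples standing in for the notebook's 25 rows
X_demo = np.arange(25, dtype=float).reshape(-1, 1)
y_demo = np.arange(25, dtype=float)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(len(X_tr), len(X_te))  # 20 5

# Calling again with the same random_state yields the same partition
X_tr2, X_te2, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(np.array_equal(X_te, X_te2))  # True
```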

# Training the Algorithm

In [None]:
regressor = LinearRegression()
regressor.fit(X_train, y_train)
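
After fitting, the model is fully described by a slope and an intercept, exposed as `coef_` and `intercept_`. The sketch below repeats the data and the split so that it runs on its own, prints the fitted line, and checks that `predict` applies the same formula. The names `model`, `slope`, `intercept`, and `manual` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same values as the notebook's dataset, repeated so this cell is self-contained
hours_arr = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5,
                      3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8])
scores_arr = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25, 85, 62, 41,
                       42, 17, 95, 30, 24, 67, 69, 30, 54, 35, 76, 86])

X_all = hours_arr.reshape(-1, 1)
X_tr, X_te, y_tr, y_te = train_test_split(X_all, scores_arr, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)

slope = model.coef_[0]        # change in score per extra hour studied
intercept = model.intercept_  # predicted score at zero hours
print(f"score = {slope:.2f} * hours + {intercept:.2f}")

# predict() evaluates the same straight line
manual = slope * 9.25 + intercept
print(np.isclose(manual, model.predict([[9.25]])[0]))  # True
```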

# Making Predictions

In [None]:
y_pred = regressor.predict(X_test)
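
A side-by-side table makes the predictions easier to judge than the raw array. The sketch below rebuilds the same pipeline so that it runs on its own and places actual and predicted scores in one DataFrame; the `comparison` frame and its `Error` column are additions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same values as the notebook's dataset, repeated so this cell is self-contained
hours_arr = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5,
                      3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8])
scores_arr = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25, 85, 62, 41,
                       42, 17, 95, 30, 24, 67, 69, 30, 54, 35, 76, 86])

X_tr, X_te, y_tr, y_te = train_test_split(
    hours_arr.reshape(-1, 1), scores_arr, test_size=0.2, random_state=0)
y_hat = LinearRegression().fit(X_tr, y_tr).predict(X_te)

# One row per held-out student: true score, predicted score, signed error
comparison = pd.DataFrame({'Hours': X_te.ravel(), 'Actual': y_te, 'Predicted': y_hat})
comparison['Error'] = comparison['Predicted'] - comparison['Actual']
print(comparison.round(2))
```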

# Plotting the Regression Line

In [None]:
plt.scatter(X_train, y_train, color='blue')
plt.plot(X_train, regressor.predict(X_train), color='red')
plt.title('Hours vs Percentage (Training set)')
plt.xlabel('Hours Studied')
plt.ylabel('Percentage Score')
plt.show()

# Predicting the Score for a Given Input

In [None]:
hours = np.array([[9.25]])  # 2-D input: one sample, one feature
predicted_score = regressor.predict(hours)[0]
print(f"Predicted score if a student studies for {hours[0, 0]} hours/day: {predicted_score:.2f}")
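
One caveat worth checking for any single prediction is whether the input lies inside the range of the observed data. The sketch below is an added sanity check, not part of the original workflow; `query_hours` and `is_extrapolating` are illustrative names.

```python
import numpy as np

# Hours column of the notebook's dataset, repeated so this cell is self-contained
hours_arr = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5,
                      3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8])

query_hours = 9.25
is_extrapolating = not (hours_arr.min() <= query_hours <= hours_arr.max())
print(f"Observed range: {hours_arr.min()} to {hours_arr.max()} hours")
print(f"Query of {query_hours} hours is outside the observed range: {is_extrapolating}")
```

Here the query sits slightly above the largest observed value (9.2 hours), so the prediction is a mild extrapolation and relies on the linear trend continuing past the data.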

# Evaluating the Model

In [None]:
mae = metrics.mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error: {mae}")
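
Mean absolute error is the average of the absolute differences between actual and predicted scores. The sketch below recomputes it by hand to show what the metric measures, and adds two other common regression metrics, RMSE and R², as optional extras that the original evaluation does not report. The pipeline is repeated so that the cell runs on its own, and all variable names are illustrative.

```python
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same values as the notebook's dataset, repeated so this cell is self-contained
hours_arr = np.array([2.5, 5.1, 3.2, 8.5, 3.5, 1.5, 9.2, 5.5, 8.3, 2.7, 7.7, 5.9, 4.5,
                      3.3, 1.1, 8.9, 2.5, 1.9, 6.1, 7.4, 2.7, 4.8, 3.8, 6.9, 7.8])
scores_arr = np.array([21, 47, 27, 75, 30, 20, 88, 60, 81, 25, 85, 62, 41,
                       42, 17, 95, 30, 24, 67, 69, 30, 54, 35, 76, 86])

X_tr, X_te, y_tr, y_te = train_test_split(
    hours_arr.reshape(-1, 1), scores_arr, test_size=0.2, random_state=0)
y_hat = LinearRegression().fit(X_tr, y_tr).predict(X_te)

# MAE by hand: mean of |actual - predicted|
mae_manual = np.mean(np.abs(y_te - y_hat))
mae_sklearn = metrics.mean_absolute_error(y_te, y_hat)

# RMSE penalises large errors more heavily; R^2 is the share of variance explained
rmse = np.sqrt(metrics.mean_squared_error(y_te, y_hat))
r2 = metrics.r2_score(y_te, y_hat)

print(f"MAE (manual):  {mae_manual:.3f}")
print(f"MAE (sklearn): {mae_sklearn:.3f}")
print(f"RMSE:          {rmse:.3f}")
print(f"R^2:           {r2:.3f}")
```

With only five held-out samples, all of these figures are rough estimates and can shift noticeably with a different `random_state`.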