# Scikit-learn for Beginners

This notebook teaches **Scikit-learn** using **one simple real-life example**: predicting student results.

Think of Scikit-learn as a toolkit for **teaching a machine to learn patterns from examples**.

You will learn:
- What Machine Learning is
- What Scikit-learn does
- Training a simple model
- Making predictions
- Checking accuracy

👉 This notebook is **Google Colab ready**.


In [None]:
# Step 1: Import required libraries
# Scikit-learn helps machines learn from data
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

print("Libraries imported successfully")


## 📊 Our One Example: Student Study Hours vs Marks

Imagine:
- Each student studies for a certain number of hours
- Their marks tend to rise with study time

We will teach the computer this pattern.


In [None]:
# Create a simple dataset
data = {
    "Study_Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [35, 40, 50, 55, 65, 70, 78, 85]
}

df = pd.DataFrame(data)
df


## 🧠 What is Machine Learning?

Instead of writing rules like:
> IF study hours = 5 THEN marks = 65

We let the computer **learn the rule by itself**.
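
To make the contrast concrete, here is a small sketch that puts a hand-written rule next to a learned model. The helper names `rule_based` and `learned` are made up for illustration, and the model is fitted on all eight rows of the example data purely to keep the sketch short.

```python
from sklearn.linear_model import LinearRegression

# Same study-hours data as the notebook's example
hours = [[1], [2], [3], [4], [5], [6], [7], [8]]
marks = [35, 40, 50, 55, 65, 70, 78, 85]

# Hand-written rules only cover the cases someone typed in
rules = {5: 65}

def rule_based(h):
    """Hypothetical helper: look up a hand-written rule, or None if there is none."""
    return rules.get(h)

# A learned model estimates the pattern from the examples instead
model = LinearRegression().fit(hours, marks)

def learned(h):
    """Hypothetical helper: ask the fitted model for a prediction."""
    return float(model.predict([[h]])[0])

print("Rule for 5 hours:   ", rule_based(5))
print("Rule for 4.5 hours: ", rule_based(4.5))   # no rule was written for this case
print("Model for 4.5 hours:", learned(4.5))
```

The rule table has no answer for 4.5 hours, while the fitted model can still produce an estimate because it learned a general pattern rather than individual cases.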


## ✂️ Split Data into Training and Testing

- Training data → used to teach the model
- Testing data → held back to check what the model learned
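
As a quick illustration of what a split does, the sketch below divides the example table and checks that no row ends up in both parts. Splitting the whole DataFrame in one call (instead of `X` and `y` separately) and the names `train_df` and `test_df` are choices made for this sketch only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Same example data as the notebook
df = pd.DataFrame({
    "Study_Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [35, 40, 50, 55, 65, 70, 78, 85],
})

# test_size=0.25 holds back a quarter of the 8 rows for testing;
# random_state fixes the shuffle so the split is repeatable
train_df, test_df = train_test_split(df, test_size=0.25, random_state=42)

print("Training rows:", sorted(train_df.index))
print("Testing rows: ", sorted(test_df.index))

# A row must never be in both sets, otherwise the test is not a fair check
overlap = set(train_df.index) & set(test_df.index)
print("Rows in both sets:", len(overlap))
```

With 8 rows and `test_size=0.25`, the split gives 6 training rows and 2 testing rows.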


In [None]:
X = df[["Study_Hours"]]  # input: double brackets keep X as a 2-D table, which scikit-learn expects
y = df["Marks"]          # output: the values we want to predict

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

print("Training size:", len(X_train))
print("Testing size:", len(X_test))


## 🏋️ Train the Model (Linear Regression)

Linear Regression learns a **straight-line relationship** between the input (study hours) and the output (marks).
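
A straight line is described by two numbers, a slope and an intercept. The sketch below reads both from a fitted model through its `coef_` and `intercept_` attributes and rebuilds one prediction by hand. For brevity it assumes the model is fitted on all eight rows, so the numbers will differ from a model fitted on the training split only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same example data as the notebook
df = pd.DataFrame({
    "Study_Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [35, 40, 50, 55, 65, 70, 78, 85],
})
X = df[["Study_Hours"]]
y = df["Marks"]

# Assumption for this sketch: fit on every row, not just a training split
model = LinearRegression().fit(X, y)

slope = model.coef_[0]        # extra marks per extra hour of study
intercept = model.intercept_  # marks the line predicts at 0 hours

print(f"Marks = {intercept:.2f} + {slope:.2f} * Study_Hours")

# A prediction is just the line evaluated at the given number of hours
hours = 5
by_hand = intercept + slope * hours
by_model = model.predict(pd.DataFrame({"Study_Hours": [hours]}))[0]
print("By hand: ", by_hand)
print("By model:", by_model)
```

The two printed predictions should agree, which shows that `predict` is doing nothing more than evaluating the learned line.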


In [None]:
# Create the model, then fit it to the training data only
model = LinearRegression()
model.fit(X_train, y_train)

print("Model trained successfully")


## 🔮 Make Predictions

Now the model will predict marks for new study hours.


In [None]:
# Predict marks for students who study 5 and 7 hours
new_hours = pd.DataFrame({"Study_Hours": [5, 7]})
predicted_marks = model.predict(new_hours)

pd.DataFrame({
    "Study_Hours": new_hours["Study_Hours"],
    "Predicted_Marks": predicted_marks
})


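## 📏 Check Model Accuracy (Simple Way)

We calculate the **mean absolute error**: the average gap between the predicted marks and the actual marks, ignoring whether each prediction was too high or too low. A lower value means better predictions. With only 2 test rows here, treat the number as a rough check.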
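
The arithmetic behind the average error is short enough to do by hand. The sketch below uses two made-up actual/predicted pairs (not results from the notebook's model) and compares the hand calculation with `mean_absolute_error`.

```python
from sklearn.metrics import mean_absolute_error

# Hypothetical values, chosen only to make the arithmetic easy to follow
actual = [40, 70]
predicted = [42, 67]

# Step 1: size of each miss, ignoring direction -> [2, 3]
abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]

# Step 2: average the misses -> (2 + 3) / 2
mae_by_hand = sum(abs_errors) / len(abs_errors)

# scikit-learn performs the same calculation
mae_sklearn = mean_absolute_error(actual, predicted)

print("By hand:     ", mae_by_hand)
print("scikit-learn:", mae_sklearn)
```

Both values come out to 2.5, meaning the predictions in this toy case are off by 2.5 marks on average.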


In [None]:
# Predict marks for the held-back test rows, then compare with the real marks
y_pred = model.predict(X_test)
error = mean_absolute_error(y_test, y_pred)

print("Average prediction error:", error)


## ✅ Key Takeaway

👉 **Scikit-learn helps machines learn from examples and make predictions**.

Used in:
- Machine Learning
- AI systems
- Prediction problems
