# 🤖 Scikit-learn — Machine Learning in Python

Scikit-learn (imported as `sklearn`) is an open-source Python library for classical machine learning, built on top of NumPy and SciPy.

It provides tools for:

- Classification (e.g. spam or not spam)
- Regression (e.g. predict house prices)
- Clustering (e.g. group similar customers)
- Model evaluation & preprocessing
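
As a rough illustration of the tasks listed above, here is a minimal sketch that touches regression, preprocessing, and clustering. The house sizes, prices, and customer spend figures are invented for this example and do not come from any real dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data, invented for illustration only.
sizes = np.array([[50.0], [80.0], [100.0], [120.0]])  # house size
prices = np.array([150.0, 240.0, 300.0, 360.0])       # price = 3 * size

# Regression: learn a continuous target from the feature.
reg = LinearRegression().fit(sizes, prices)
predicted_price = reg.predict(np.array([[90.0]]))[0]
print("Predicted price for size 90:", predicted_price)

# Preprocessing: rescale a feature to zero mean and unit variance.
spend = np.array([[10.0], [12.0], [11.0], [95.0], [100.0], [98.0]])
scaled = StandardScaler().fit_transform(spend)

# Clustering: group similar customers without using any labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
labels = km.labels_
print("Cluster labels:", labels)
```

Every estimator follows the same pattern: create the object, call `fit`, then call `predict` or `transform`. The classification examples later in this document use that same pattern.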

To install:
```bash
pip install scikit-learn
```

## ✅ Example: Predicting Student Pass/Fail

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Step 1: Sample data
data = {
    "Hours_Studied": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Passed": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 1 = Passed, 0 = Failed
}
df = pd.DataFrame(data)

# Step 2: Features and label
X = df[["Hours_Studied"]]  # input (2D: one column)
y = df["Passed"]           # output (1D)

# Step 3: Train-test split (30% of the 10 rows, i.e. 3 rows, are held out)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 4: Choose model
model = LogisticRegression()

# Step 5: Train
model.fit(X_train, y_train)

# Step 6: Predict
predictions = model.predict(X_test)

# Step 7: Evaluate
accuracy = accuracy_score(y_test, predictions)
print("Accuracy:", accuracy)

# Predict for new data
print("Prediction for 7.5 hours studied:", model.predict([[7.5]]))
```

Output:

```
Accuracy: 0.6666666666666666
Prediction for 7.5 hours studied: [1]
```

Because the test set holds only three rows, an accuracy of about 0.67 means the model classified two of the three correctly.
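
`predict` only returns the final class. To see how confident the model is, and roughly where it draws the line between pass and fail, one option is to look at `predict_proba` and the fitted coefficients. The sketch below rebuilds the same data and split so it runs on its own. The `new_student` frame and the `boundary` variable are names introduced here for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same toy data and split as the pass/fail example.
df = pd.DataFrame({
    "Hours_Studied": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Passed": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
})
X = df[["Hours_Studied"]]
y = df["Passed"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = LogisticRegression().fit(X_train, y_train)

# Passing a DataFrame with the training column name keeps feature names consistent.
new_student = pd.DataFrame({"Hours_Studied": [7.5]})
proba = model.predict_proba(new_student)[0]  # [P(fail), P(pass)]
print("P(pass | 7.5 hours):", proba[1])

# For one feature, the model predicts "pass" when w * hours + b > 0,
# so the decision boundary sits at hours = -b / w.
w = model.coef_[0][0]
b = model.intercept_[0]
boundary = -b / w
print("Approximate hours needed to be predicted as passing:", boundary)
```

The exact numbers depend on the scikit-learn version and its default regularization, so they are not listed here.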




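The same workflow applies to a second toy dataset that relates hours slept to being focused:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Step 1: Sample data (1 = focused, 0 = not focused)
data = {
    "Hours_Slept": [3, 4, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9],
    "Focused":     [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)

# Step 2: Features and label (capital X, the name train_test_split uses below)
X = df[["Hours_Slept"]]
y = df["Focused"]

# Step 3: Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Steps 4-5: Choose and train the model
model = LogisticRegression()
model.fit(X_train, y_train)

# Steps 6-7: Predict and evaluate
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("Accuracy:", accuracy)

print("Prediction for 5.5 hours slept:", model.predict([[5.5]]))
```

Output recorded in the original notebook run:

```
Accuracy: 1.0
Prediction for 5.5 hours slept: [1]
```

In that run the features were assigned to lowercase `x`, so `train_test_split` used the `X` left over from an earlier cell (presumably `Hours_Studied`). Rerunning the cell as written here may give different values.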
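
With only ten rows, a single 70/30 split leaves three test samples, so the accuracy can swing a lot depending on which rows land in the test set. Cross-validation is not part of the original notebook, but as a hedged sketch it gives a steadier estimate. Four stratified folds are an arbitrary choice made here so that each fold contains one of the four "not focused" rows.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Same toy sleep data as the example above.
df = pd.DataFrame({
    "Hours_Slept": [3, 4, 5, 6, 6.5, 7, 7.5, 8, 8.5, 9],
    "Focused":     [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
})
X = df[["Hours_Slept"]]
y = df["Focused"]

# Stratified folds keep both classes present in every train/test split.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)

# cross_val_score fits a fresh copy of the model on each fold
# and returns one accuracy value per fold.
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)

print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```

The mean across folds is usually a more reliable summary than the accuracy of any single split, although ten rows remain too few for firm conclusions.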


