# ✅ **1. BASIC IMPORTS (Remember these 5 only)**

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
```

These five imports cover most of the experiments; model-specific imports are listed in section 4.

---

# ✅ **2. UNIVERSAL DATA TEMPLATE (works for any faculty dataset)**

```python
df = pd.read_csv("data.csv")   # dataset from lab PC
X = df.iloc[:, :-1]            # features: every column except the last
y = df.iloc[:, -1]             # target: assumes the label is the last column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```
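
When the classes are unevenly sized, a plain random split can leave a class under-represented in the test set. The sketch below is one way to guard against that with `stratify=y`. It uses the bundled iris dataset as a stand-in for `data.csv` (an assumption, not the lab dataset).

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Stand-in for pd.read_csv("data.csv"): iris as a DataFrame, label in the last column
df = load_iris(as_frame=True).frame
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

# stratify=y keeps the class proportions the same in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
print(y_test.value_counts())
```

Stratifying only makes sense for classification targets; for a continuous target, drop the `stratify` argument.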

---

# ✅ **3. FEATURE SCALING TEMPLATE (use only if needed)**

**Scale for:**

* Logistic Regression
* SVM
* KNN
* K-Means (distance-based, so scaling usually helps)
* Neural Networks (MLP)
* PCA

**Not required for:**

* Decision Tree
* Random Forest
* Naive Bayes

```python
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```
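
The template fits the scaler on the training data only and reuses it on the test data. A `Pipeline` is one possible way to get the same behaviour in a single object, sketched here with the bundled wine dataset as a stand-in for the lab data (an assumption).

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset; replace with the X, y from the data template
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# fit() scales X_train and trains the SVM; score() reuses the fitted scaler on X_test
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)

acc = model.score(X_test, y_test)
print("Accuracy:", acc)
```

Because the scaler lives inside the pipeline, it never sees the test data during fitting.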

---

# ✅ **4. MODEL IMPORTS + ONE-LINE MODEL CREATION**

## **Classification Algorithms**

| Task                | Import                                                | Model                              |
| ------------------- | ----------------------------------------------------- | ---------------------------------- |
| Logistic Regression | `from sklearn.linear_model import LogisticRegression` | `model = LogisticRegression()`     |
| SVM                 | `from sklearn.svm import SVC`                         | `model = SVC()`                    |
| KNN                 | `from sklearn.neighbors import KNeighborsClassifier`  | `model = KNeighborsClassifier(n_neighbors=5)` |
| Decision Tree       | `from sklearn.tree import DecisionTreeClassifier`     | `model = DecisionTreeClassifier()` |
| Random Forest       | `from sklearn.ensemble import RandomForestClassifier` | `model = RandomForestClassifier()` |
| Naive Bayes         | `from sklearn.naive_bayes import GaussianNB`          | `model = GaussianNB()`             |
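
Since every classifier in the table shares the same `fit` / `predict` interface, they can be compared in one loop. This is a rough sketch on the bundled iris dataset (a stand-in for the lab data); the `models` and `results` dictionaries are illustrative names, and the scaling choices follow the rules in section 3.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Distance/gradient models get a scaler in front; tree and Naive Bayes models do not
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {results[name]:.3f}")
```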

---

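# ❗ **ID3 (Entropy) ≈ Decision Tree + criterion='entropy'**

```python
# scikit-learn builds CART-style binary trees; criterion='entropy' makes it
# split on information gain, which is what lab manuals usually mean by "ID3".
model = DecisionTreeClassifier(criterion='entropy')
```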
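
To see what the entropy criterion measures, entropy and information gain can be computed by hand. This is a minimal sketch with hypothetical helper names (`entropy`, `information_gain`) and a tiny made-up dataset; it is not how scikit-learn implements the criterion internally.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after splitting on each feature value."""
    values, counts = np.unique(feature, return_counts=True)
    weighted = sum(
        (c / len(labels)) * entropy(labels[feature == v])
        for v, c in zip(values, counts)
    )
    return entropy(labels) - weighted

# Tiny hypothetical dataset: should we play outside?
outlook = np.array(["sunny", "sunny", "rain", "rain"])
windy = np.array(["yes", "no", "yes", "no"])
play = np.array(["no", "no", "yes", "yes"])

print("Entropy of target:", entropy(play))
print("Gain from outlook:", information_gain(outlook, play))
print("Gain from windy:  ", information_gain(windy, play))
```

Here `outlook` separates the classes perfectly while `windy` tells us nothing, so an entropy-based tree would split on `outlook` first.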

---

# ✅ **5. UNIVERSAL TRAIN–PREDICT TEMPLATE (works for everything)**

```python
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
```

---

# ✅ **6. SIMPLEST CONFUSION MATRIX**

```python
from sklearn.metrics import ConfusionMatrixDisplay
ConfusionMatrixDisplay.from_predictions(y_test, y_pred, cmap="Blues")
plt.show()
```
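
If only the numbers are needed (for example, to copy into a lab record), `confusion_matrix` returns the same table as an array. The labels below are hypothetical and chosen just to illustrate how accuracy relates to the diagonal.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predictions for a 3-class problem
y_test = np.array([0, 0, 1, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 0, 2, 2, 1])

# Rows are the true classes, columns are the predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Correct predictions sit on the diagonal, so accuracy = diagonal sum / total
acc_from_cm = np.trace(cm) / cm.sum()
print(acc_from_cm, accuracy_score(y_test, y_pred))
```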

---

# ✅ **7. K-MEANS CLUSTERING**

**Import**

```python
from sklearn.cluster import KMeans
```

**Model**

```python
model = KMeans(n_clusters=3)
labels = model.fit_predict(X)
```

**Simple Visual (2D only)**

```python
plt.scatter(X.iloc[:, 0], X.iloc[:, 1], c=labels)  # X is a DataFrame; use X[:, 0] for a NumPy array
plt.show()
```
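
K-Means assigns points by distance, so features on larger scales can dominate the clusters. The sketch below scales first and then inspects the result. It assumes the bundled iris features as a stand-in for `X`; `n_init` and `random_state` are set only to make the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Stand-in feature table (a DataFrame, like X in the data template)
X = load_iris(as_frame=True).data
X_scaled = StandardScaler().fit_transform(X)

model = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = model.fit_predict(X_scaled)

print("Points per cluster:", np.bincount(labels))
print("Inertia (within-cluster sum of squares):", model.inertia_)
print("Centres shape:", model.cluster_centers_.shape)
```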

---

# ✅ **8. PCA (Dimensionality Reduction)**

**Import**

```python
from sklearn.decomposition import PCA
```

**Model**

```python
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
```

**Plot**

```python
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y)  # c=y needs numeric class labels
plt.show()
```
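
A common follow-up question is how much information the two components keep. One way to check is `explained_variance_ratio_`, sketched here on the bundled wine dataset (a stand-in for the lab data) with scaling applied first, as section 3 recommends for PCA.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print("Reduced shape:", X_pca.shape)
# Fraction of the total variance captured by each component
print("Per component:", pca.explained_variance_ratio_)
print("Total kept:   ", pca.explained_variance_ratio_.sum())
```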

---

# ✅ **9. Apriori & Association Rules**

**Imports**

```python
from mlxtend.frequent_patterns import apriori, association_rules
```

**Template**

```python
# df must be one-hot encoded: one row per transaction, one True/False column per item
freq = apriori(df, min_support=0.2, use_colnames=True)
rules = association_rules(freq, metric="lift", min_threshold=1)
print(rules)
```
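
`apriori` expects a one-hot table rather than raw item lists. If the lab dataset comes as lists of items, something along these lines could build that table with plain pandas; the `transactions` data and the `basket` name are hypothetical.

```python
import pandas as pd

# Hypothetical transactions: one list of items per basket
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["milk", "butter", "bread"],
    ["milk"],
]

# One row per basket, one boolean column per item
items = sorted({item for basket_items in transactions for item in basket_items})
basket = pd.DataFrame(
    [[item in basket_items for item in items] for basket_items in transactions],
    columns=items,
)
print(basket)

# Support of each single item = fraction of baskets that contain it
support = basket.mean()
print(support)

# basket can now be passed to apriori(basket, min_support=..., use_colnames=True)
```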

---

# ✅ **10. Train/Test WITHOUT SCALING (for Tree-based models)**

```python
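from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

model = DecisionTreeClassifier()
# or: model = RandomForestClassifier()
model.fit(X_train, y_train)   # unscaled X_train from section 2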

These models **do not need scaling**.

---

# ✅ **11. Train/Test WITH SCALING (for distance/gradient models)**

* Logistic Regression
* SVM
* KNN
* Neural Networks
* PCA

Scale these before training: fit the scaler on `X_train` only, then transform both sets (section 3).

---

# 🔥 **12. SUPER-SHORT “When to Scale” Memory Trick**

* If the algorithm uses **distance or gradient**, scale.
* If it uses **rules/trees**, don’t scale.

**Distance / Gradient → SCALE**

* Logistic Regression
* SVM
* KNN
* Neural Networks
* PCA

**Tree / Rule → NO SCALE**

* Decision Tree
* Random Forest
* Naive Bayes (optional)
* Bagging
* Boosting
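
The rule can be checked empirically by training one distance-based model and one tree with and without scaling. This is a rough sketch on the bundled wine dataset (a stand-in; its features have very different ranges); the `scores` dictionary is an illustrative name.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training data only, then apply it to both sets
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

scores = {}
for name, make_model in [
    ("KNN", lambda: KNeighborsClassifier(n_neighbors=5)),
    ("Decision Tree", lambda: DecisionTreeClassifier(random_state=42)),
]:
    raw = make_model().fit(X_train, y_train).score(X_test, y_test)
    scaled = make_model().fit(X_train_s, y_train).score(X_test_s, y_test)
    scores[name] = (raw, scaled)
    print(f"{name}: raw={raw:.3f}  scaled={scaled:.3f}")
```

On data like this, KNN accuracy typically shifts noticeably once the features are scaled, while the tree's stays essentially the same, because tree splits depend only on the ordering of values within each feature.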

---

# 🧠 **13. 6-Line Complete ML Template (can write from memory!)**

```python
df = pd.read_csv("data.csv")
X, y = df.iloc[:, :-1], df.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = SVC()   # change model here (import it first, see section 4)
model.fit(X_train, y_train)
print("Accuracy:", model.score(X_test, y_test))
```

This skeleton covers most lab experiments; add the scaling step from section 3 for distance- or gradient-based models such as `SVC`.

