## ✅ FEATURE SELECTION

🌟 What Is Feature Selection?

Feature = Column in your dataset.
Feature selection means:

👉 Choosing the useful columns
👉 Removing the useless columns

Just like studying for an exam:

You keep important topics

You ignore unimportant topics

Machine learning works the same way.

❓ Why Do We Need Feature Selection?

Because:

Some columns don’t help the model

Some columns confuse the model

Too many columns = the model trains more slowly and can become less accurate

✅ Fewer, well-chosen features usually mean:

Faster training

Often better accuracy

Easier to understand

Less overfitting

# 🌼 Simple Real-Life Example

Imagine you want to predict salary.

| Feature             | Useful?                | Why                                             |
| ------------------- | ---------------------- | ----------------------------------------------- |
| Name                | ❌ No                   | Name has no effect on salary                    |
| Gender              | ✅ Maybe                | If there is variation, model can learn patterns |
| Years of Experience | ✅ Yes                  | Strong factor in salary                         |
| Country             | ❌ No (if same for all) | No variation → no learning                      |


👉 A feature with no variation = totally useless.

🎯 Golden Rule

Models learn from patterns, not from random or constant values.

If a column has:

Same value everywhere → Remove it

Random/unrelated values → Remove it

Strong relation with target → Keep it
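
The golden rule above can be sketched in a few lines of NumPy. This is only an illustration on made-up salary data: the column names (`years_experience`, `country_code`, `lucky_number`) and all numbers are assumptions, and correlation is just one simple way to measure "relation with target".

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical salary data (column names and numbers are invented for illustration)
years_experience = rng.uniform(0, 20, size=n)
country_code = np.ones(n)  # same value for everyone -> no variation
lucky_number = rng.integers(0, 100, size=n).astype(float)  # unrelated to salary
salary = 30_000 + 2_500 * years_experience + rng.normal(0, 3_000, size=n)

features = {
    "years_experience": years_experience,
    "country_code": country_code,
    "lucky_number": lucky_number,
}

# Rule 1: a column with zero variance carries no information, so drop it
varying = {name: col for name, col in features.items() if np.var(col) > 0}

# Rules 2 and 3: measure how strongly each remaining column relates to the target
correlations = {
    name: abs(np.corrcoef(col, salary)[0, 1]) for name, col in varying.items()
}

for name, corr in sorted(correlations.items(), key=lambda kv: -kv[1]):
    print(f"{name}: |correlation with salary| = {corr:.2f}")
```

Here the constant column is dropped first, and the unrelated column ends up with a correlation close to zero, so it is the next candidate for removal.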

## 🧠 Parameters vs Hyperparameters

👋 Imagine You’re a Baker 🍞

Think of training a machine learning model like baking bread.

The recipe = your model (for example, Linear Regression, Random Forest, etc.)

The ingredients = data you feed it

The kneading, baking, and mixing process = training the model

Now, in this process, there are two kinds of adjustable things:

1️⃣ Model Parameters: The “Learned” Settings
🧩 Analogy:

When you bake bread, the dough texture automatically changes as you knead it.
You don’t set that texture by hand; it develops naturally during kneading.

Similarly, model parameters are the internal values the model learns by itself during training.

💡 Examples:

In Linear Regression with a single input feature, the model learns two numbers:

slope (m)

intercept (c)

So the line equation is:

y = mx + c

The model automatically figures out which m and c make the line fit the data best.

In a Neural Network, the weights and biases between layers are parameters.
They’re updated at every training step as the model sees more data.

🧠 Key idea:

Parameters are what the model learns to make predictions.
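
To make "learned from data" concrete, here is a tiny sketch that recovers m and c from points generated with known values. It uses NumPy's least-squares line fit (`np.polyfit`) as a stand-in for what a linear regression does; the data points are made up for the example.

```python
import numpy as np

# Made-up points that lie exactly on the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Fit a degree-1 polynomial; polyfit returns the slope first, then the intercept
m, c = np.polyfit(x, y, deg=1)

print(f"learned m = {m:.2f}, learned c = {c:.2f}")
```

We never typed in m or c ourselves: the fit worked them out from x and y, which is exactly what makes them parameters.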

2️⃣ Model Hyperparameters: The “Before Training” Settings

🧩 Analogy:

Before you start baking, you decide:

How hot the oven should be 🔥

How long to bake 🍞

How much yeast to use

These are things you choose beforehand, not something the dough decides.

In machine learning, hyperparameters are the settings you choose before training begins.
The model doesn’t learn them; you set them manually (or tune them using search methods).

💡 Examples:

Learning rate in Gradient Descent → how big a step the model takes on each update

Number of trees in a Random Forest

Number of clusters (k) in K-Means

Number of epochs in a Neural Network → how many times to train on the full dataset

⚠️ Important:

If you set a bad hyperparameter:

Learning rate too low → the model learns very slowly and may still be underfit when training stops

Learning rate too high → the model overshoots the minimum and may oscillate or diverge instead of converging
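
A small sketch can show both failure modes. It runs plain gradient descent on the toy function f(w) = w², whose minimum is at w = 0. The function, the starting point, and the three learning rates are assumptions chosen only to make the effect visible.

```python
def gradient_descent(learning_rate, steps=20, start=10.0):
    """Minimise f(w) = w**2 (gradient 2*w), starting from `start`."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * 2 * w  # step against the gradient
    return w


too_low = gradient_descent(0.001)    # barely moves away from the start
reasonable = gradient_descent(0.1)   # gets close to the minimum at 0
too_high = gradient_descent(1.1)     # each step overshoots further than the last

print("too low:   ", round(too_low, 3))
print("reasonable:", round(reasonable, 3))
print("too high:  ", round(too_high, 3))
```

With the same number of steps, only the middle setting ends up near 0: the small rate has hardly moved, and the large rate has jumped further away than where it started.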

🧮 sklearn Example

Let’s see this with a scikit-learn example.

Example: Linear Regression

In [3]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_regression

# Create sample data
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Split into train/test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train model
model = LinearRegression()  # no hyperparameters changed yet
model.fit(X_train, y_train)

print("Model Parameters:")
print("Slope (coefficient):", model.coef_)
print("Intercept:", model.intercept_)


Model Parameters:
Slope (coefficient): [44.24418216]
Intercept: 0.09922221422587718


What’s happening:

model.coef_ and model.intercept_ → these are parameters learned from the data.

LinearRegression() itself has some hyperparameters, e.g.:

In [4]:
LinearRegression(fit_intercept=True, copy_X=True)


These are set before training.
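
One way to see the difference in scikit-learn is to compare what exists before and after `fit()`. The sketch below uses a tiny made-up dataset; `get_params()` lists the hyperparameters, while `coef_` only appears once the model has been trained.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression(fit_intercept=True)

# Hyperparameters are available immediately, before any data is seen
hyperparams = model.get_params()
print("Hyperparameters:", hyperparams)

# Learned parameters do not exist yet
has_coef_before = hasattr(model, "coef_")

# Made-up data on the line y = 2x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
model.fit(X, y)

# After training, the learned parameters are stored on the model
has_coef_after = hasattr(model, "coef_")
print("coef_ before fit:", has_coef_before, "| after fit:", has_coef_after)
print("Learned slope:", model.coef_[0], "intercept:", model.intercept_)
```

By scikit-learn convention, attributes ending in an underscore (such as `coef_` and `intercept_`) are the ones learned during `fit()`.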

## ⚖️ Quick Comparison Table

| Aspect                          | Parameters                             | Hyperparameters                                   |
| ------------------------------- | -------------------------------------- | ------------------------------------------------- |
| **When set**                    | During training                        | Before training                                   |
| **Who sets them**               | Model learns automatically             | You (the user)                                    |
| **Example (Linear Regression)** | slope `m`, intercept `c`               | `fit_intercept=True`                              |
| **Example (Neural Network)**    | weights, biases                        | learning rate, epochs                             |
| **Example (Random Forest)**     | split features and thresholds per tree | number of trees, max depth                        |
| **Can be tuned automatically?** | Yes, learned via optimization          | Yes, tuned via GridSearchCV or RandomizedSearchCV |


🧭 In Short

| Concept             | Simple Description                                                                              |
| ------------------- | ----------------------------------------------------------------------------------------------- |
| **Parameters**      | Learned automatically from data                                                                 |
| **Hyperparameters** | Chosen before training, by hand or with a search method                                         |
| **Why they matter** | Both affect performance: parameters decide predictions, hyperparameters decide *how* it learns  |


✅ Summary Analogy:

| Baking Analogy                         | ML Concept                          |
| -------------------------------------- | ----------------------------------- |
| Dough texture changing while kneading  | Model learns parameters             |
| Oven temperature, baking time          | Hyperparameters set before training |
| Adjusting these improves bread quality | Tuning improves model performance   |


🧠 GOAL

We want to find the best hyperparameters for a model (here, a Random Forest Classifier) using GridSearchCV from scikit-learn.

💡 Analogy: “Trying All Recipes to Find the Best Cookies 🍪”

Imagine you’re baking cookies.

You try:

10 minutes, 15 minutes, 20 minutes (different baking times)

170°C, 180°C (different temperatures)

You bake batches with every combination of time and temperature
→ taste each batch → choose the one that tastes best.

That’s exactly what GridSearchCV does!
It tries all combinations of the hyperparameter values you list and picks the one with the best cross-validation score.

🧩 Step 1: Import Libraries

In [5]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier


Explanation:

load_iris() → gives us a simple flower dataset (used for classification).

train_test_split() → splits data into training and testing parts.

GridSearchCV → helps us try many combinations of hyperparameters.

RandomForestClassifier → the model we want to tune.

🧩 Step 2: Load the Sample Data

In [6]:
# Load example dataset
data = load_iris()
X = data.data       # features (flower measurements)
y = data.target     # labels (flower species)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)


What it means:

X → the input data (flower features: sepal length, petal width, etc.)

y → the target output (flower type)

train_test_split → we train on 70% of the data, test on 30%.

🧩 Step 3: Create a Parameter Grid

In [7]:
param_grid = {
    'n_estimators': [10, 50, 100],  # number of trees
    'max_depth': [None, 3, 5]       # how deep each tree can grow
}


Explanation:

This is like your list of recipes.
We’re telling GridSearchCV:

“Try all combinations of these values.”

That means it will train:

(10 trees, depth=None)

(10 trees, depth=3)

(10 trees, depth=5)

(50 trees, depth=None)

(50 trees, depth=3)

(50 trees, depth=5)

(100 trees, depth=None)

(100 trees, depth=3)

(100 trees, depth=5)

That’s 9 combinations total.
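
The list above can be reproduced with the standard library, which is a handy way to check how large a grid is before running it. This is only a sketch of the enumeration idea, not how GridSearchCV is implemented internally.

```python
from itertools import product

param_grid = {
    "n_estimators": [10, 50, 100],  # number of trees
    "max_depth": [None, 3, 5],      # how deep each tree can grow
}

# Pair every n_estimators value with every max_depth value
combinations = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]

for combo in combinations:
    print(combo)

print("Total combinations:", len(combinations))
```

Adding a third hyperparameter with 4 values would multiply the total to 36, which is why grids grow expensive quickly.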

🧩 Step 4: Create the GridSearchCV Object

In [8]:
grid = GridSearchCV(RandomForestClassifier(), param_grid, cv=3)


Explanation:

RandomForestClassifier() → the model we’re testing.

param_grid → all combinations of hyperparameters we want to try.

cv=3 → “3-fold cross-validation.”

🧩 What “cv=3” means:

It splits your training data into 3 parts:

Train on 2 parts

Validate on the remaining part

Repeat 3 times, so each part is used for validation once, and take the average accuracy

So each parameter combo is evaluated 3 times, which gives a more reliable score than a single split.
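
Here is a minimal sketch of that 3-way split on nine made-up samples, using scikit-learn's `KFold`. Note one simplification: for classifiers, GridSearchCV with an integer `cv` uses stratified folds (which keep class proportions similar in each part), while plain `KFold` is used here just to show the rotation.

```python
import numpy as np
from sklearn.model_selection import KFold

# Nine made-up samples, labelled 0..8 so the splits are easy to read
X = np.arange(9).reshape(-1, 1)

kfold = KFold(n_splits=3)

fold_sizes = []
validation_parts = []
for fold, (train_idx, val_idx) in enumerate(kfold.split(X), start=1):
    fold_sizes.append((len(train_idx), len(val_idx)))
    validation_parts.append(val_idx.tolist())
    print(f"Fold {fold}: train on {train_idx.tolist()}, validate on {val_idx.tolist()}")
```

Every sample lands in the validation part exactly once, so no single lucky or unlucky split decides the score.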

🧩 Step 5: Fit the Grid Search

In [9]:
grid.fit(X_train, y_train)


What happens under the hood:

GridSearchCV now:

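Takes your 9 combinations of hyperparameters.

For each one:

Trains the model using cross-validation (cv=3), so 9 × 3 = 27 fits in total

Measures accuracy on each validation fold and averages it

Keeps track of which combo works best.

Finally, it retrains the best combo on the whole training set (the default refit=True behaviour).

⏳ This may take time if your dataset is large or your parameter grid is big.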
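
If you want to look at the score of every combination, not just the winner, the fitted search object keeps them in `cv_results_`. The sketch below repeats the setup from the earlier steps so it runs on its own; `random_state=0` on the forest and the variable name `search` are additions made here so that repeated runs give the same numbers.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

param_grid = {"n_estimators": [10, 50, 100], "max_depth": [None, 3, 5]}

# random_state=0 is an added assumption so the scores are repeatable
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

# One entry per combination, with the accuracy averaged over the 3 folds
results = search.cv_results_
for params, mean_score in zip(results["params"], results["mean_test_score"]):
    print(params, f"mean CV accuracy = {mean_score:.3f}")

print("Best mean CV accuracy:", round(search.best_score_, 3))
```

On a small dataset like iris, many combinations tend to score almost the same, so the "best" one often wins by a very small margin.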

🧩 Step 6: Get the Best Parameters

In [10]:
print("Best Hyperparameters:", grid.best_params_)


Best Hyperparameters: {'max_depth': 3, 'n_estimators': 100}


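Output Example:

The winning combination can change from run to run, because RandomForestClassifier() is created without a fixed random_state. For example, you might instead see something like:

Best Hyperparameters: {'max_depth': 3, 'n_estimators': 50}


That would mean:

The model with 50 trees (n_estimators=50)

and depth 3 (max_depth=3)
gave the best average accuracy during cross-validation.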

🧩 Step 7: Evaluate the Best Model

In [11]:
best_model = grid.best_estimator_
print("Test Accuracy:", best_model.score(X_test, y_test))


Test Accuracy: 1.0


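So in this run, our model with the best hyperparameters correctly predicts 100% of the flower types in the test data (accuracy 1.0). The test set has only 45 flowers, so the exact figure can change from run to run.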
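
As a side note, there is more than one way to get the test accuracy, and they should agree. The sketch below is self-contained and uses a deliberately smaller grid plus a fixed `random_state` (both assumptions made here to keep it quick and repeatable).

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Smaller grid and fixed seed: assumptions for a fast, repeatable example
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [10, 50], "max_depth": [3, None]},
    cv=3,
)
search.fit(X_train, y_train)

# Three equivalent ways to measure accuracy on the held-out test set
via_estimator = search.best_estimator_.score(X_test, y_test)
via_search = search.score(X_test, y_test)  # uses the refit best model
via_metric = accuracy_score(y_test, search.predict(X_test))

print("Test samples:", len(y_test))
print("Accuracy:", via_estimator, via_search, via_metric)
```

The important part is that the test set was never used during the search, so this number is an honest check of the chosen model.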

🔁 Summary of What Happened

| Step | Description           | Analogy                                |
| ---- | --------------------- | -------------------------------------- |
| 1    | Import libraries      | Get your baking tools ready            |
| 2    | Load and split data   | Gather ingredients                     |
| 3    | Create `param_grid`   | Write down different recipes           |
| 4    | Create `GridSearchCV` | Prepare taste tester                   |
| 5    | `fit()`               | Bake and taste all combinations        |
| 6    | `best_params_`        | Pick the best-tasting cookies          |
| 7    | Test accuracy         | Serve to guests and check satisfaction |


🎓 In Plain English:

GridSearchCV = automatic hyperparameter tester

param_grid = dictionary of all settings you want to try

cv = number of folds used to split the training data and score each combination

best_params_ = gives you the winning combination

best_estimator_ = the ready-to-use best model