# Feature Engineering

**1. What is a parameter?**
   - In machine learning, a parameter refers to a variable that the model learns from the training data.

For example, in a linear regression model:

y = mx + b

- m (slope) and b (intercept) are parameters.
- These values are learned during training by minimizing the error between predictions and actual values.
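
As a minimal sketch of how these parameters are learned, the snippet below fits scikit-learn's `LinearRegression` to a few points that follow y = 2x + 1 and reads back the learned slope and intercept. The data is made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data generated from y = 2x + 1
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = 2 * X.ravel() + 1

model = LinearRegression()
model.fit(X, y)

m = model.coef_[0]      # learned slope
b = model.intercept_    # learned intercept
print(f"m = {m:.2f}, b = {b:.2f}")
```

With noise-free data like this, the learned values should come out very close to m = 2 and b = 1.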

Key points:
- Parameters are internal to the model.
- They are automatically optimized during training.
- Examples:
  - Weights in a neural network
  - Coefficients in linear or logistic regression
---
**2. What is correlation? What does negative correlation mean?**
  - Correlation is a statistical measure that shows how two variables are related to each other: whether they increase together, move in opposite directions, or are not related at all.
- It’s usually measured using Pearson’s correlation coefficient (r), which captures linear relationships and ranges from -1 to +1:
       - +1 means perfect positive correlation  
       - 0 means no linear correlation  
       - -1 means perfect negative correlation

- Negative correlation means that as one variable increases, the other decreases, and vice versa.

For example:
- As the number of hours watching TV increases, exam scores might decrease.
- As temperature drops, sales of jackets may increase.
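
To make the coefficient concrete, here is a small sketch that computes Pearson's r by hand and checks it against NumPy's `np.corrcoef`. The TV hours and exam scores are invented numbers, chosen only to show a negative correlation.

```python
import numpy as np

# Hypothetical data: hours of TV per day and exam scores
tv_hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
exam_scores = np.array([90.0, 85.0, 70.0, 65.0, 50.0])

# Pearson's r = sum of products of deviations / sqrt(product of summed squared deviations)
x_dev = tv_hours - tv_hours.mean()
y_dev = exam_scores - exam_scores.mean()
r_manual = (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())

# NumPy's built-in correlation matrix gives the same number
r_numpy = np.corrcoef(tv_hours, exam_scores)[0, 1]

print(round(r_manual, 3), round(r_numpy, 3))  # both negative and close to -1
```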
---
**3. Define Machine Learning. What are the main components in Machine Learning?**
 - Machine Learning is a branch of artificial intelligence that allows systems to learn from data and make decisions or predictions without being explicitly programmed.

- In other words, instead of writing fixed rules, the system learns patterns from data to perform tasks.



 - Main Components in Machine Learning:
- Data  
    - The foundation of any machine learning system. It can be labeled (for supervised learning) or unlabeled (for unsupervised learning).
- Model  
    - A mathematical representation that learns from the data and makes predictions or decisions.
- Features  
    - The input variables or characteristics used to train the model.
- Algorithm  
    - A method or procedure, like linear regression or decision trees, that the model uses to learn from data.
- Training  
    - The process where the model learns patterns from the training dataset.
- Evaluation  
    - Testing the model on unseen data to check how well it performs.
- Prediction  
    - Once trained, the model is used to make predictions on new data.

---
**4. How does loss value help in determining whether the model is good or not?**
- The loss value helps in understanding how well or poorly a machine learning model is performing during training or testing.


 - How it works:
- The loss is a number that represents the difference between the model’s prediction and the actual value. It is calculated using a loss function, such as Mean Squared Error or Cross-Entropy.
- A low loss value means the model's predictions are close to the actual results, so the model is doing well.  
- A high loss value means the predictions are far from the actual results, indicating poor performance.
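
As a rough illustration, the sketch below computes Mean Squared Error for two sets of predictions against the same actual values. All numbers are made up, and the helper `mean_squared_error` is defined here for illustration rather than taken from a library.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])       # actual values (made up)
good_preds = np.array([2.9, 5.1, 7.2, 8.8])   # close to the actual values
bad_preds = np.array([1.0, 8.0, 4.0, 12.0])   # far from the actual values

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return float(np.mean((actual - predicted) ** 2))

good_loss = mean_squared_error(y_true, good_preds)
bad_loss = mean_squared_error(y_true, bad_preds)
print(good_loss, bad_loss)
```

The predictions that sit closer to the actual values produce the much smaller loss.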


 - How it helps determine model quality:
- Monitors training progress  
   If the loss keeps decreasing during training, the model is fitting the training data better. Checking the loss on held-out data as well shows whether it is also generalizing.
- Helps in tuning  
   A consistently high loss suggests that the model may need better features, more data, or different hyperparameters.
- Guides optimization  
   Loss is what the training algorithm, like gradient descent, tries to minimize. A lower loss on unseen data generally means a better model, while a very low training loss paired with a high test loss is a sign of overfitting.
- Comparison tool  
   You can compare loss values across different models or configurations to choose the best one.

---
**5. What are continuous and categorical variables?**
  - Continuous variables are numerical values that can take any value within a range. They are often measured and can have decimals.

Examples:  
- Height  
- Weight  
- Temperature  
- Age (if measured precisely)

These values can be split infinitely (like 22.5, 22.55, 22.555, etc.).


 - Categorical variables represent categories or groups. They are not measured but labeled or classified. These can be:
- Nominal: No specific order (like colors: red, green, blue)  
- Ordinal: Have a meaningful order (like low, medium, high)

Examples:  
- Gender (male, female)  
- City (Mumbai, Delhi, Chennai)  
- Education level (High School, Graduate, Postgraduate)

---
**6. How do we handle categorical variables in Machine Learning? What are the common techniques?**
 - Handling categorical variables in machine learning is important because most algorithms require numerical inputs. There are several techniques for converting categorical data into a format that models can understand.


 - Common techniques to handle categorical variables:
- Label Encoding  
    - This technique assigns a unique integer value to each category. It’s simple but can introduce an implicit ordinal relationship where none exists.

   Example:  
   Colors: Red = 0, Green = 1, Blue = 2  
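   Because the integers imply an order, this method is best suited to ordinal variables, where there is a meaningful order. For nominal variables like colors, one-hot encoding is usually the safer choice.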
- One-Hot Encoding  
    - One-hot encoding creates new binary columns for each category. If the category exists, it’s marked with a 1; otherwise, it’s 0.

   Example:  
   Colors:  
   Red → [1, 0, 0]  
   Green → [0, 1, 0]  
   Blue → [0, 0, 1]  
   This method works well for nominal variables, where there's no natural order.
- Binary Encoding  
    - This method first converts categories into numbers, then encodes those numbers in binary format. It’s more memory-efficient than one-hot encoding for variables with many categories.

   Example:  
   If categories are Red, Green, and Blue, you convert them to 1, 2, 3 and then into binary:  
   Red → 01  
   Green → 10  
   Blue → 11
- Count (Frequency) Encoding  
    - In this technique, each category is replaced with the frequency or count of its occurrences in the dataset. This is helpful when some categories are more frequent than others.

   Example:  
   Colors:  
   Red → 5  
   Green → 10  
   Blue → 15
- Target Encoding (Mean Encoding)  
   - This method replaces categories with the mean of the target variable for that category. It can lead to better model performance but should be used carefully to avoid data leakage.

   Example:  
   If the target variable is sale amount, then the encoding for a color might be the average sales of that color.
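
Count encoding and target encoding can both be sketched with plain pandas. The DataFrame below is hypothetical, and the column names `color_count` and `color_target` are introduced only for this example.

```python
import pandas as pd

# Hypothetical training data
train = pd.DataFrame({
    "color": ["Red", "Green", "Blue", "Green", "Blue", "Blue"],
    "sales": [10.0, 20.0, 30.0, 40.0, 50.0, 70.0],
})

# Count (frequency) encoding: replace each category with how often it occurs
counts = train["color"].value_counts()
train["color_count"] = train["color"].map(counts)

# Target (mean) encoding: replace each category with the mean target value.
# The means are computed on the training rows only, to limit data leakage.
target_means = train.groupby("color")["sales"].mean()
train["color_target"] = train["color"].map(target_means)

print(train)
```

When encoding a test set, the same `counts` and `target_means` learned from the training data would be reused rather than recomputed.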

---
**7. What do you mean by training and testing a dataset?**
   -  Training and testing a dataset are two important steps in building and evaluating a machine learning model.


 - "Training" means fitting the model on a portion of the data (the training set). The model learns patterns, relationships, and structures from this data to make predictions.
- Example: If you're training a model to predict house prices, the training data would include house features like area, location, number of rooms, and their actual prices.


 - "Testing" means evaluating the trained model on a separate portion of the data (not seen during training) to see how well it performs on new, unseen data. This helps check if the model has really learned or just memorized the training data.
- Continuing the house price example: The testing data would have similar house features, and the model will try to predict the price. Then, we compare the predictions to the actual prices to measure the error.

---

**8. What is sklearn.preprocessing?**
  - "sklearn.preprocessing" is a module in the Scikit-learn library that provides tools for scaling, transforming, and encoding data before feeding it into a machine learning model.
- Since most models work better with properly formatted and scaled data, this module helps prepare your dataset in a way that improves model performance.


 - Some commonly used classes in "sklearn.preprocessing" :
- StandardScaler  
    - Scales features to have mean 0 and standard deviation 1.
- MinMaxScaler  
    - Scales features to a fixed range, usually 0 to 1.
- LabelEncoder  
    - Converts categorical labels (like "yes", "no") into numeric values (like 1, 0).
- OneHotEncoder  
    - Converts categorical features into a set of binary columns.
- OrdinalEncoder  
    - Assigns ordered numerical values to categories, useful for ordinal data.
- Binarizer  
    - Converts numerical values into binary (0 or 1) based on a threshold.
- PolynomialFeatures  
    - Generates polynomial combinations of features, useful for feature engineering.
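
A short sketch of three of these classes on a tiny made-up array may help show what each transformation does. The threshold and degree are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.preprocessing import Binarizer, MinMaxScaler, PolynomialFeatures

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # made-up values

# MinMaxScaler: each column is rescaled to the range [0, 1]
X_minmax = MinMaxScaler().fit_transform(X)

# Binarizer: values strictly greater than the threshold become 1, the rest 0
X_binary = Binarizer(threshold=15.0).fit_transform(X)

# PolynomialFeatures (degree 2): adds a bias column, squares, and pairwise products
X_poly = PolynomialFeatures(degree=2).fit_transform(X)

print(X_minmax)
print(X_binary)
print(X_poly.shape)  # (3, 6): 1, a, b, a^2, a*b, b^2
```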

---

**9. What is a Test set?**
  - A test set is a portion of your dataset that is used to evaluate the performance of your machine learning model after it has been trained.

- It is not used during training, so it acts like new, unseen data, helping you understand how well the model might perform in the real world.

Why we use a test set:

- To check if the model has truly learned patterns or just memorized the training data  
- To estimate how the model will perform on actual, future data  
- To compare different models or tuning settings fairly

Example:

Imagine you have 1,000 rows of data. You might split it like this:

- 800 rows for training  
- 200 rows for testing  

The model is trained on the 800 rows and then tested on the remaining 200 to check its accuracy, precision, recall, and so on.
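
The 800/200 example can be sketched as follows. The dataset is synthetic (generated with `make_classification`), and `LogisticRegression` is just one possible model; both are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset with 1,000 rows
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# 800 rows for training, 200 rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)       # the model only ever sees the training rows
y_pred = model.predict(X_test)    # evaluation uses the held-out rows

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print(accuracy, precision, recall)
```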

---
**10. How do we split data for model fitting (training and testing) in Python? How do you approach a Machine Learning problem?**
  - In Python, we commonly use Scikit-learn’s `train_test_split` function to split the dataset.

Example:

```python
from sklearn.model_selection import train_test_split

# X is the feature set, y is the target variable
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

test_size=0.2 means 20 percent of the data is reserved for testing, and 80 percent is used for training.  

random_state=42 ensures reproducibility so you get the same split every time you run the code.


 - A typical step-by-step approach to a Machine Learning problem:
- Understand the problem  
    - Clearly define the problem you are solving, like classification or regression.
- Collect and explore data  
    - Gather the dataset, explore it using summaries and visualizations, and check for missing values.
- Preprocess the data  
    - Handle missing values, encode categorical variables, scale features, and split the data.
- Choose a model  
    - Select a model depending on your problem, such as linear regression or decision tree.
- Train the model  
    - Fit the model using the training data.
- Evaluate the model  
    - Test it using the test set and calculate evaluation metrics such as accuracy, precision, recall, or RMSE.
- Tune the model  
    - Improve performance using cross-validation, grid search, or hyperparameter tuning.
- Deploy the model  
    - Once satisfied with performance, deploy it to make real-world predictions.
- Monitor and maintain  
    - Keep track of the model’s performance over time and retrain when necessary.
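
The middle steps of this workflow (split, preprocess, train, tune, evaluate) can be sketched in a few lines. The iris dataset, the logistic regression model, and the grid of `C` values are illustrative choices, not a recommendation for any particular problem.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Collect data (a built-in toy classification dataset stands in for real data)
X, y = load_iris(return_X_y=True)

# Split before any preprocessing so the test set stays unseen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocess and choose a model, bundled so scaling is fitted on training folds only
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Tune: 5-fold cross-validated grid search over the regularization strength C
search = GridSearchCV(pipeline, param_grid={"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Evaluate the best configuration on the held-out test set
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print("Best C:", search.best_params_["clf__C"])
print("Test accuracy:", test_accuracy)
```

Wrapping the scaler and the model in a `Pipeline` keeps the preprocessing inside each cross-validation fold, which avoids leaking information from validation data into training.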

---
**11. Why do we have to perform EDA before fitting a model to the data?**
  - EDA, or Exploratory Data Analysis, is a very important step before fitting a machine learning model because it helps you understand the data better and make informed decisions during preprocessing and modeling.


 - Why EDA is necessary before model fitting:
- Understand the structure of the data  
    - EDA gives you an overview of the dataset: number of rows, columns, data types, and feature distributions.
- Detect missing or inconsistent data  
    - You can find missing values, duplicates, or incorrect data entries that could affect model performance.
- Identify relationships between variables  
    - EDA helps reveal how features relate to each other and to the target variable. This helps in feature selection and engineering.
- Spot outliers or unusual patterns  
    - Outliers can negatively impact model accuracy. EDA helps detect and handle them appropriately.
- Choose the right encoding and scaling techniques  
    - By understanding variable types (categorical or continuous), you can decide how to encode or scale them.
- Avoid wrong assumptions  
    - Sometimes assumptions like "more data means better model" don’t hold if the data quality is bad. EDA helps verify quality.
- Decide modeling strategy  
    - Based on your insights from EDA, you can choose which models or techniques might work best.
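
A first pass at EDA with pandas might look like the sketch below. The DataFrame is hypothetical and deliberately contains one missing value, one duplicate row, and one outlier. The `numeric_only=True` argument assumes a reasonably recent pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with a missing value, a duplicate row, and an outlier
df = pd.DataFrame({
    "area":  [800, 950, 1200, 950, 1100, 9000],
    "rooms": [2, 3, 3, 3, np.nan, 4],
    "city":  ["Delhi", "Mumbai", "Delhi", "Mumbai", "Chennai", "Delhi"],
    "price": [50, 62, 80, 62, 75, 90],
})

print(df.shape)       # number of rows and columns
print(df.dtypes)      # data type of each column
print(df.describe())  # summary statistics for numeric columns

missing_per_column = df.isna().sum()          # missing values per column
duplicate_rows = int(df.duplicated().sum())   # fully duplicated rows

# Relationships between numeric features and the target
correlations = df.corr(numeric_only=True)

# Simple outlier check on "area" using the 1.5 * IQR rule
q1, q3 = df["area"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["area"] < q1 - 1.5 * iqr) | (df["area"] > q3 + 1.5 * iqr)]

print(missing_per_column, duplicate_rows, correlations, outliers, sep="\n\n")
```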

---

**12. What is correlation?**
  - Correlation is a statistical measure that describes the strength and direction of a relationship between two variables.

- In machine learning and data analysis, correlation helps us understand how two features move in relation to each other. It’s especially useful during feature selection to identify redundant or strongly related variables.


 - Correlation tells us:
- If two variables increase or decrease together, they have a positive correlation.  
- If one variable increases while the other decreases, they have a negative correlation.  
- If the variables have no predictable pattern, the correlation is close to zero.


  - Correlation values range between -1 and +1:
- +1 means a perfect positive relationship  
- -1 means a perfect negative relationship  
- 0 means no linear relationship


 - Example:  
- If study time and exam score have a correlation of 0.85, it means students who study more tend to score higher.  
- If age and interest in video games have a correlation of -0.60, it suggests older people are less likely to play video games.

---

**13. What does negative correlation mean?**
  - Negative correlation means that as one variable increases, the other decreases. In other words, the two variables move in opposite directions.


- Example:  
 - If the number of hours spent watching TV goes up, and grades in school go down, these two might have a negative correlation.  
 - If the temperature decreases and the sales of hot coffee increase, that's also a negative correlation.


 - Numeric value:  
- A negative correlation is represented by a correlation coefficient between 0 and -1.  
- A value close to -1 means a strong negative correlation.  
- A value close to 0 means the relationship is weak or there is no correlation.

 - So, when you see a negative correlation in your data, it means that higher values of one feature are generally associated with lower values of another.

---

**14. How can you find correlation between variables in Python?**
  - Basic Example using `pandas`:

```python
import pandas as pd

# Sample DataFrame
data = {
    'Hours_Studied': [2, 4, 6, 8, 10],
    'Exam_Score': [50, 60, 70, 80, 90],
    'TV_Watched': [8, 6, 4, 2, 1]
}

df = pd.DataFrame(data)

# Calculate correlation
correlation_matrix = df.corr()

print(correlation_matrix)
```

- Output will look like this (values rounded to three decimals):

```
               Hours_Studied  Exam_Score  TV_Watched
Hours_Studied         1.000      1.000     -0.994
Exam_Score            1.000      1.000     -0.994
TV_Watched           -0.994     -0.994      1.000
```

This shows:
- A strong "positive correlation" between `Hours_Studied` and `Exam_Score`
- A strong "negative correlation" between `TV_Watched` and both `Hours_Studied` and `Exam_Score`

---
**15. What is causation? Explain difference between correlation and causation with an example?**
- Causation means that one variable directly affects or causes a change in another.

- In other words, if variable A causes variable B to change, then there's causation between A and B.

- Difference Between Correlation and Causation:

| Feature       | Correlation                              | Causation                                  |
|---------------|------------------------------------------|---------------------------------------------|
| Meaning       | Two variables move together              | One variable causes the change in another   |
| Direction     | Can be positive, negative, or zero       | Always has a cause-effect direction         |
| Implication   | Does not imply cause and effect          | Implies direct influence or effect          |

- Example:

- Correlation (No Causation):  
  Ice cream sales and drowning cases are positively correlated.  
  But eating ice cream doesn’t cause drowning.  
  The real cause is hot weather, which increases both swimming and ice cream sales.

- Causation:  
  Smoking causes lung damage.  
  This is backed by scientific evidence, so smoking and lung disease have causation.
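
The ice cream example can be simulated to show how a hidden common cause produces correlation without causation. The sketch below uses invented numbers: temperature drives both quantities, and neither one affects the other. The helper `residuals` is defined here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated world: temperature drives both quantities; neither affects the other
temperature = rng.normal(25, 5, n)
ice_cream_sales = 10 * temperature + rng.normal(0, 20, n)
drownings = 0.5 * temperature + rng.normal(0, 1, n)

# Raw correlation is strong even though there is no causal link
raw_corr = np.corrcoef(ice_cream_sales, drownings)[0, 1]

def residuals(y, x):
    """Part of y that a straight-line fit on x cannot explain."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

# Correlation after removing the effect of temperature from both variables
partial_corr = np.corrcoef(
    residuals(ice_cream_sales, temperature),
    residuals(drownings, temperature),
)[0, 1]

print(f"raw: {raw_corr:.2f}, controlling for temperature: {partial_corr:.2f}")
```

Once temperature is accounted for, the remaining correlation is close to zero, which is what we would expect if the original relationship was driven entirely by the shared cause.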

---
**16. What is an Optimizer? What are different types of optimizers? Explain each with an example?**
  - An optimizer is a technique used in machine learning to adjust the model’s parameters (like weights and biases) in order to minimize the loss function. The goal is to make the model's predictions as accurate as possible.
- In simple terms, the optimizer helps the model learn by improving how well it performs with each step.


- Types of Optimizers:
1. Gradient Descent  
It is the most basic and widely used optimizer. It calculates the gradient (slope) of the loss function and updates the parameters in the direction that reduces the loss.

Formula:  
new_weight = old_weight - learning_rate * gradient

Example (in Keras, the `SGD` class performs full-batch gradient descent when `model.fit()` is given a `batch_size` equal to the dataset size):
```python
import tensorflow as tf
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
```
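
To see the update rule at work without a deep learning framework, here is a minimal NumPy sketch of batch gradient descent fitting y = mx + b. The data, learning rate, and number of steps are assumptions picked so that the loop converges.

```python
import numpy as np

# Made-up data that follows y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1

m, b = 0.0, 0.0          # parameters start at arbitrary values
learning_rate = 0.05

for _ in range(2000):
    error = (m * x + b) - y
    # Gradients of the mean squared error with respect to m and b
    grad_m = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # new_weight = old_weight - learning_rate * gradient
    m -= learning_rate * grad_m
    b -= learning_rate * grad_b

print(f"m = {m:.3f}, b = {b:.3f}")
```

After enough steps the parameters settle close to the values used to generate the data (m = 2, b = 1).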

2. Stochastic Gradient Descent (SGD)  
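Unlike batch gradient descent (which uses all data), SGD updates weights using one data point at a time. Each update is cheaper to compute, but the updates are noisier.

Example (in Keras, pure SGD corresponds to `batch_size=1` in `model.fit()`):
```python
import tensorflow as tf
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
```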

3. Mini-Batch Gradient Descent  
A combination of both batch and SGD: it updates weights using small chunks of data. This balances speed and stability.

This is what most training loops use in practice. In Keras, `model.fit()` defaults to `batch_size=32`.

4. Momentum  
Momentum adds a "memory" of past gradients to speed up training and avoid oscillations.

Example:
```python
import tensorflow as tf
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
```

5. RMSprop (Root Mean Square Propagation)  
It adapts the learning rate for each parameter by dividing the gradient by a moving average of past squared gradients. Works well for RNNs.

Example:
```python
import tensorflow as tf
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)
```

6. Adam (Adaptive Moment Estimation)  
One of the most popular optimizers. It combines Momentum and RMSprop. It adjusts learning rates for each parameter and maintains running averages of both gradients and their squares.

Example:
```python
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
```
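
The update rules behind Momentum and Adam can be sketched in plain Python on a toy one-parameter loss. This is a simplified illustration under assumed hyperparameters, not how Keras implements these optimizers internally.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, which is minimized at w = 3."""
    return 2 * (w - 3)

# SGD with momentum: a velocity term accumulates past gradients
w_momentum, velocity = 0.0, 0.0
lr, momentum = 0.1, 0.9
for _ in range(500):
    velocity = momentum * velocity - lr * grad(w_momentum)
    w_momentum += velocity

# Adam: running averages of the gradient (m) and squared gradient (v),
# with bias correction for the early steps
w_adam, m, v = 0.0, 0.0, 0.0
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 501):
    g = grad(w_adam)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    w_adam -= lr * m_hat / (math.sqrt(v_hat) + eps)

print(f"momentum: {w_momentum:.3f}, adam: {w_adam:.3f}")
```

Both loops should end near w = 3, the minimum of the toy loss.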

---

**17. What is sklearn.linear_model?**
   - `sklearn.linear_model` is a module in the scikit-learn library that provides classes and functions to implement linear models for regression and classification tasks.

- It includes machine learning models that try to draw a straight line (or a hyperplane) through the data to make predictions.

- Common Models in `sklearn.linear_model`:
- LinearRegression  :
Used for predicting continuous values using a straight-line relationship.  
Example: Predicting house prices based on size.

```python
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

- LogisticRegression  :
Used for binary or multiclass classification problems.  
Example: Predicting if an email is spam or not.

```python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

- Ridge Regression (Ridge)  :
Linear regression with L2 regularization (helps reduce overfitting).

```python
from sklearn.linear_model import Ridge

model = Ridge(alpha=1.0)
model.fit(X_train, y_train)
```

- Lasso Regression (Lasso) :
Linear regression with L1 regularization (can shrink some coefficients to 0, useful for feature selection).

```python
from sklearn.linear_model import Lasso

model = Lasso(alpha=0.1)
model.fit(X_train, y_train)
```
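
The practical difference between L2 and L1 regularization shows up in the learned coefficients. The sketch below uses synthetic data in which only the first two of five features matter. The `alpha` values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data: only the first two of five features influence the target
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print("Ridge coefficients:", np.round(ridge.coef_, 3))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
```

Ridge shrinks all coefficients but keeps them non-zero, while Lasso sets the coefficients of the three irrelevant features to exactly zero.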

---

**18. What does model.fit() do? What arguments must be given?**
   - The ".fit()" function in machine learning is used to train a model on the provided data. It allows the model to learn the relationship between the input features (X) and the target/output (y).

- Syntax :

```python
model.fit(X, y)
```

- Arguments:

- X – Input features (independent variables)  
   - Type: array-like (2D), shape: (n_samples, n_features)  
   - Example: data like age, income, number of hours studied, etc.

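- y – Target values (dependent variable)  
   - Type: array-like (1D or 2D depending on the task)  
   - Example: labels like price, category, or pass/fail  
   - Required for supervised models; unsupervised estimators (such as `KMeans`) and transformers (such as `StandardScaler`) are fitted on X alone.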

- Example:

```python
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4], [5]]
y = [2, 4, 6, 8, 10]

model = LinearRegression()
model.fit(X, y)
```

In this example, the model learns the pattern between the input X and the output y.

---

**19. What does model.predict() do? What arguments must be given?**
   - The `model.predict()` function is used after training the model (using `.fit()`) to make predictions on new or unseen data.

- It takes input features (X) and returns the predicted output (ŷ) based on the patterns the model has learned.

- Syntax:

```python
predictions = model.predict(X_new)
```

- Arguments:

- X_new – The input data for which you want to make predictions.  
  - Type: array-like or DataFrame  
  - Shape: (n_samples, n_features)  
  - It must have the same number of features as the data used during training.

- Example:

```python
from sklearn.linear_model import LinearRegression

X_train = [[1], [2], [3]]
y_train = [2, 4, 6]

X_test = [[4], [5]]

model = LinearRegression()
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print(predictions)  # Output: [ 8. 10.]
```


 - Output:
- Returns an array of predicted values
- Output shape depends on the model and type of task (regression or classification)

---
**20. What are continuous and categorical variables?**
  - In machine learning and statistics, variables (also called features) are typically classified into two main types: continuous and categorical.


 - Continuous Variables:
- These are numeric variables that can take any value within a range.
- The values are measured rather than counted, and in theory there are infinitely many possible values.
- Examples:
  - Height (e.g., 165.4 cm)
  - Temperature (e.g., 36.6°C)
  - Age, weight, income, time, etc.

Example:
```python
age = [23, 34, 45, 28]
price = [12000.5, 13499.9, 11000.0, 15000.0]
```


 - Categorical Variables:
- These are variables that represent groups or categories.
- The values are discrete and limited.
- Can be:
  - Nominal: No order (e.g., gender, city, color)
  - Ordinal: Ordered categories (e.g., low, medium, high)

Examples:
  - Gender (Male, Female)
  - Product category (Electronics, Furniture)
  - Education level (High School, Graduate, Postgraduate)

Example:
```python
gender = ['Male', 'Female', 'Female', 'Male']
city = ['Delhi', 'Mumbai', 'Chennai', 'Delhi']
```

---
**21. What is feature scaling? How does it help in Machine Learning?**
  - Feature scaling is a preprocessing technique in machine learning used to normalize or standardize the range of independent variables (features).

- Different features in a dataset may have different units or ranges — and this can negatively affect the performance of many machine learning models.


 - Why Feature Scaling is Important:
- Some models (like KNN, SVM, Logistic Regression, and Gradient Descent-based models) are sensitive to the scale of the data.
- Features with larger values can dominate the learning process and bias the model.
- It helps gradient-based training converge faster and often improves accuracy.


 - Common Techniques for Feature Scaling:
- Min-Max Scaling (Normalization) :  
   Scales all values between 0 and 1.  
   Formula:  
   \[
   X_{\text{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}
   \]
   Example:
   ```python
   from sklearn.preprocessing import MinMaxScaler

   scaler = MinMaxScaler()
   X_scaled = scaler.fit_transform(X)
   ```
- Standardization (Z-score Normalization) :
   Centers data around the mean with unit variance.  
   Formula:  
   \[
   X_{\text{scaled}} = \frac{X - \mu}{\sigma}
   \]
   Example:
   ```python
   from sklearn.preprocessing import StandardScaler

   scaler = StandardScaler()
   X_scaled = scaler.fit_transform(X)
   ```

 - When Not to Use Feature Scaling:
- Tree-based models like Decision Trees, Random Forest, and XGBoost do not require feature scaling because they split data based on thresholds, not distances.
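
Whichever scaler is used, a common practice is to fit it on the training data only and reuse the learned statistics on the test data. A minimal sketch, with made-up features on very different scales:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up features on very different scales (e.g. age in years, income in currency units)
X = np.array([[25, 30000], [32, 48000], [47, 52000], [51, 91000],
              [38, 61000], [29, 39000], [44, 75000], [56, 83000]], dtype=float)

X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from training data only
X_test_scaled = scaler.transform(X_test)        # reuse those statistics on the test data

print(X_train_scaled.mean(axis=0))  # close to 0 for each feature
print(X_train_scaled.std(axis=0))   # close to 1 for each feature
```

Calling `fit_transform` on the full dataset before splitting would let information from the test rows influence the scaling, which is a mild form of data leakage.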

---
**22. How do we perform scaling in Python?**
   - Scaling can be performed using the `sklearn.preprocessing` module, which provides the necessary classes and functions to apply various scaling techniques like Min-Max Scaling and Standardization.

- Min-Max Scaling (Normalization) :

This scales the data between a specific range, usually [0, 1].

- Example:

```python
from sklearn.preprocessing import MinMaxScaler

# Example dataset
X = [[1], [2], [3], [4], [5]]

# Initialize the MinMaxScaler
scaler = MinMaxScaler()

# Fit and transform the data
X_scaled = scaler.fit_transform(X)

print(X_scaled)
```

- Standardization (Z-score Normalization) :

This scales the data so that it has a mean of 0 and a standard deviation of 1.

- Example:

```python
from sklearn.preprocessing import StandardScaler

# Example dataset
X = [[1], [2], [3], [4], [5]]

# Initialize the StandardScaler
scaler = StandardScaler()

# Fit and transform the data
X_scaled = scaler.fit_transform(X)

print(X_scaled)
```

- Handling Multiple Features (Multiple Columns) :

Scaling can be applied to datasets with multiple features (columns).


 - Example for Multiple Features:

```python
from sklearn.preprocessing import MinMaxScaler

# Example dataset with multiple features
X = [[1, 200], [2, 300], [3, 400], [4, 500], [5, 600]]

# Initialize the MinMaxScaler
scaler = MinMaxScaler()

# Fit and transform the data
X_scaled = scaler.fit_transform(X)

print(X_scaled)
```

---
**23. What is sklearn.preprocessing?**
  - `sklearn.preprocessing` is a module in the scikit-learn library that provides a collection of functions and classes used for preprocessing data before training machine learning models.

- It helps in scaling, transforming, encoding, and normalizing data to make it suitable for machine learning algorithms.

- Common Tasks You Can Do with `sklearn.preprocessing`:

- Scaling/Normalizing Features :
   - `StandardScaler`: Standardizes features (mean = 0, standard deviation = 1)
   - `MinMaxScaler`: Scales features to a fixed range (usually 0 to 1)
   - `RobustScaler`: Scales features using median and IQR, useful when data has outliers

- Encoding Categorical Features :
   - `LabelEncoder`: Converts class labels (target) to numbers
   - `OneHotEncoder`: Converts categorical variables to one-hot encoded format (dummy variables)
   - `OrdinalEncoder`: Encodes categorical features with an ordinal relationship

- Handling Missing Values :
   - `SimpleImputer`: Fills missing values with strategies like mean, median, or a constant. Note that it is imported from `sklearn.impute`, not `sklearn.preprocessing`.

- Polynomial and Custom Feature Creation :
   - `PolynomialFeatures`: Creates polynomial and interaction features from existing features

- Example: Scaling and Encoding :

```python
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import numpy as np

# Scaling
data = np.array([[1.0, 20.0], [2.0, 30.0], [3.0, 40.0]])
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)

# One-hot encoding
encoder = OneHotEncoder()
encoded = encoder.fit_transform([['Male'], ['Female'], ['Female']]).toarray()
```
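
Imputation and outlier-robust scaling can be combined in the same way. In the sketch below, `SimpleImputer` comes from `sklearn.impute` and `RobustScaler` from `sklearn.preprocessing`. The array is made up and includes one missing value and one extreme value.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import RobustScaler

# Made-up data with one missing value and one extreme value in the second column
X = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0], [4.0, 1000.0]])

# Replace the missing entry with the column median
imputer = SimpleImputer(strategy="median")
X_imputed = imputer.fit_transform(X)

# Center each column on its median and divide by its interquartile range
X_robust = RobustScaler().fit_transform(X_imputed)

print(X_imputed)
print(X_robust)
```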

---

**24. How do we split data for model fitting (training and testing) in Python?**
  - In Python, you can split your dataset into training and testing sets using the `train_test_split` function from `sklearn.model_selection`. This step is essential in machine learning to evaluate how well your model performs on unseen data.


 - Split the Data :
- Training set: Used to train the machine learning model.
- Testing set: Used to evaluate the model’s performance on unseen data.



```python
from sklearn.model_selection import train_test_split

# Suppose X is your features and y is your target variable
# Example:
# X = [[1], [2], [3], [4], [5]]
# y = [10, 20, 30, 40, 50]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```


 - Parameters Explained:
- `X`: Feature data (independent variables)
- `y`: Target variable (dependent variable)
- `test_size`: Proportion of the dataset to include in the test split (e.g., 0.2 = 20% test, 80% train)
- `random_state`: Controls the shuffling process (helps with reproducibility)


 - Optional Parameter:
- `shuffle=True`: Shuffles the data before splitting (default is True)

- Example Output:

If you had 100 rows and used `test_size=0.2`, the function would split:
- 80 rows → Training data
- 20 rows → Testing data
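
The 100-row case can be checked directly. The sketch below also passes `stratify=y`, another optional parameter of `train_test_split`, so that both splits keep the same class proportions. The data is made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 100 rows, 3 features, and an imbalanced binary target (80 zeros, 20 ones)
X = np.arange(300).reshape(100, 3)
y = np.array([0] * 80 + [1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)    # (80, 3) (20, 3)
print(y_train.mean(), y_test.mean())  # share of class 1 in each split
```

Stratifying is mainly useful for classification problems with imbalanced classes, where a purely random split could leave the test set with too few examples of the rare class.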

---

**25. Explain data encoding?**
  - Data encoding is the process of converting categorical (non-numeric) data into numeric values so that machine learning algorithms can work with it. Most machine learning models require input features to be numerical, so encoding is an essential part of data preprocessing.

- Types of Data Encoding:

- Label Encoding  
Converts each category into a unique integer.  
Example: `["Red", "Green", "Blue"]` → `[2, 1, 0]`  
Simple, but may introduce an unintended order.

```python
from sklearn.preprocessing import LabelEncoder

data = ['Red', 'Green', 'Blue']
le = LabelEncoder()
encoded = le.fit_transform(data)
print(encoded)  # [2 1 0]
```

- One-Hot Encoding  
Converts each category into a new binary column.  
Avoids introducing ordinal relationships between categories.

```python
from sklearn.preprocessing import OneHotEncoder
import numpy as np

data = np.array([['Red'], ['Green'], ['Blue']])
encoder = OneHotEncoder(sparse_output=False)  # this argument was named `sparse` before scikit-learn 1.2
encoded = encoder.fit_transform(data)
print(encoded)
```

Output:
```
[[0. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]]
```

- Ordinal Encoding  
Similar to label encoding, but used for categories with a meaningful order.  
Example: `["Low", "Medium", "High"]` → `[0, 1, 2]`

```python
from sklearn.preprocessing import OrdinalEncoder

data = [['Low'], ['Medium'], ['High']]
encoder = OrdinalEncoder(categories=[['Low', 'Medium', 'High']])
encoded = encoder.fit_transform(data)
print(encoded)
```


 - When to Use Which:
- Label Encoding: For the target variable (`LabelEncoder` assigns integers in sorted, alphabetical order, so it does not preserve a custom ordering)  
- One-Hot Encoding: For unordered categorical variables  
- Ordinal Encoding: For ordered categorical variables

---