**Embedded Methods in Feature Selection**

**Definition**:  
Embedded methods are feature selection techniques built into the training process of a machine learning model. The model’s own learning algorithm decides which features matter, so selection and fitting happen in the same training run. This combines strengths of filter methods (low cost) and wrapper methods (selection guided by the model itself), and is typically much cheaper than a wrapper’s repeated retraining on candidate feature subsets.

---

**Key Techniques in Embedded Methods**

1. **LASSO (L1 Regularization)**:  
   - Adds a penalty proportional to the sum of the absolute values of the coefficients to the model’s loss function. This shrinks the coefficients of less important features to exactly zero, effectively removing them.
   - Commonly used in linear models (e.g., Logistic Regression, Linear Regression).

2. **Tree-Based Feature Selection**:  
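   - Decision tree-based models (e.g., Random Forest, Gradient Boosting) compute feature importance during training from how much each feature’s splits reduce an impurity measure such as Gini impurity or entropy (information gain). The importances give a ranking; features are then kept by applying a threshold or taking the top k.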

3. **Elastic Net Regularization**:  
   - Combines L1 (LASSO) and L2 (Ridge) penalties to balance between feature selection and coefficient shrinkage.

4. **Regularized Logistic Regression**:  
   - Logistic regression with regularization (L1 or Elastic Net) selects features while building the model.
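
As a rough illustration of the tree-based technique listed above, the sketch below trains a random forest on the Iris dataset and ranks features by impurity-based importance. The choice of `n_estimators=200`, `random_state=0`, and `k = 2` are assumptions made for illustration only, not recommended settings.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the Iris data: four numeric features, three classes
data = load_iris()
X, y = data.data, data.target

# Train a random forest; importances are a by-product of training
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances: non-negative and normalized to sum to 1
importances = forest.feature_importances_

# Rank features from most to least important
order = np.argsort(importances)[::-1]
ranked = [(data.feature_names[i], float(importances[i])) for i in order]
for name, score in ranked:
    print(f"{name}: {score:.3f}")

# Keep the top-k features (k is an illustrative choice, not a rule)
k = 2
top_k_features = [name for name, _ in ranked[:k]]
print("Top features:", top_k_features)
```

Unlike LASSO, the forest does not set any importance to zero on its own, so the cut-off (here a fixed `k`) has to be chosen separately.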

---

**How It Works (Step-by-Step)**

1. Select a machine learning algorithm with built-in feature selection capability (e.g., LASSO or tree-based models).  
2. Train the model on the dataset.  
3. The algorithm automatically calculates feature importance or removes irrelevant features during the training process.  
4. Extract the most important features based on the algorithm’s output.
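
The four steps above can be sketched with scikit-learn's `SelectFromModel`, which wraps an estimator and keeps the features it weights as important. This is a minimal sketch under stated assumptions: the breast cancer dataset, `C=0.1`, and the `liblinear` solver are illustrative choices not taken from the text, and the `penalty="l1"` spelling may differ between scikit-learn versions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Illustrative binary classification dataset with 30 features (an assumption)
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Step 1: choose a model with built-in selection (L1-penalized logistic regression)
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)

# Steps 2-3: training the model drives some coefficients to zero
selector = SelectFromModel(model)
selector.fit(X_train_s, y_train)

# Step 4: extract the features whose coefficients survived
mask = selector.get_support()
selected = [name for name, keep in zip(data.feature_names, mask) if keep]
print(f"Kept {len(selected)} of {len(mask)} features:")
print(selected)

# Reduce both splits to the selected columns for downstream use
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)
```

A smaller `C` means stronger regularization and usually fewer surviving features, so the selected set depends on that choice.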

---

**Code Example**

```python

# Import Libraries
from sklearn.datasets import load_iris
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import pandas as pd

# Load Dataset
data = load_iris()
X = data.data  # Features
y = data.target  # Target

# Standardize Features (LASSO works better with normalized data)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the Data
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Apply LASSO for Feature Selection
lasso = Lasso(alpha=0.1)  # Regularization strength
lasso.fit(X_train, y_train)

# Extract Feature Importance
feature_coefficients = pd.DataFrame({
    'Feature': data.feature_names,
    'Coefficient': lasso.coef_
})

# Selected Features
selected_features = feature_coefficients[feature_coefficients['Coefficient'] != 0]['Feature']

# Print Results
print("Feature Importance (LASSO):")
print(feature_coefficients)

print("\nSelected Features:")
print(selected_features.tolist())

```

Output:

```
Feature Importance (LASSO):
             Feature  Coefficient
0  sepal length (cm)     0.000000
1   sepal width (cm)    -0.000000
2  petal length (cm)     0.314092
3   petal width (cm)     0.375744

Selected Features:
['petal length (cm)', 'petal width (cm)']
```

---

**Explanation of Code**:
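1. **Dataset**: The Iris dataset is used, with four features (sepal length, sepal width, petal length, petal width).
2. **Standardization**: The L1 penalty treats all coefficients alike, so the features are standardized to a common scale first. For simplicity the scaler here is fit on the full dataset; in practice, fit it on the training split only to avoid leaking test-set statistics.
3. **Model**: LASSO regression is applied with `alpha=0.1` as the regularization strength. `Lasso` is a regression model, so the class labels 0, 1, 2 are treated as a numeric target here; for classification, L1-regularized logistic regression is the closer fit.
4. **Feature Importance**: Coefficients are read from `lasso.coef_`. A coefficient of exactly zero means the feature was dropped at this regularization strength.
5. **Output**: Both sepal features are shrunk to zero, leaving petal length and petal width as the selected features.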
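
Because the selected set depends on the regularization strength, it can help to see how the number of surviving features changes with `alpha`. The sketch below is only an illustration: the grid of `alpha` values and `max_iter=10000` are assumed choices, and the model is fit on the full standardized dataset rather than a train split to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Same dataset as the example above, standardized to a common scale
data = load_iris()
X = StandardScaler().fit_transform(data.data)
y = data.target

# Illustrative grid of regularization strengths (an assumption, not a recommendation)
alphas = [0.001, 0.01, 0.1, 1.0, 10.0]

n_selected = {}
for alpha in alphas:
    lasso = Lasso(alpha=alpha, max_iter=10000)
    lasso.fit(X, y)
    # Count the coefficients that were not shrunk to exactly zero
    n_selected[alpha] = int(np.count_nonzero(lasso.coef_))
    kept = [name for name, c in zip(data.feature_names, lasso.coef_) if c != 0]
    print(f"alpha={alpha}: {n_selected[alpha]} feature(s) kept {kept}")
```

A sufficiently large `alpha` drives every coefficient to zero, while a very small one keeps most or all features, so `alpha` is usually tuned (for example by cross-validation) rather than fixed by hand.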

---

**Advantages of Embedded Methods**  
- Computationally efficient compared to wrapper methods.  
- Integrated into the model’s training, reducing the need for external steps.  
- Accounts for feature interactions (especially with tree-based methods).  

**Disadvantages**  
- Dependent on the algorithm used (model-specific).  
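- Results can be unstable or biased on some datasets: LASSO tends to keep one feature from a group of highly correlated features somewhat arbitrarily, and impurity-based tree importances favor features with many distinct values.  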
