### **Composite Transformers**

**Composite transformers** in scikit-learn combine multiple transformation steps into a single estimator object, which makes complex preprocessing workflows easier to manage. The main tools for this are `Pipeline`, `FeatureUnion`, and `ColumnTransformer`.

Here’s a breakdown of the most common composite transformers and how to use them:

### 1. **Pipeline**

The **Pipeline** is a linear sequence of transformers, usually ending in a final estimator such as a classifier or regressor. It is the most basic way to chain steps: each step's output is passed as input to the next. Calling `fit` on the pipeline fits every transformer on the training data only, and `predict` reuses those fitted transformers on new data, which helps keep preprocessing and modeling consistent.

#### Example:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Create a pipeline that standardizes the data and then applies logistic regression
pipeline = Pipeline([
    ('scaler', StandardScaler()),    # Step 1: Scale the features
    ('classifier', LogisticRegression())  # Step 2: Fit logistic regression model
])

# Fit the pipeline (X_train and y_train are assumed to be defined already)
pipeline.fit(X_train, y_train)

# Predict with the pipeline
predictions = pipeline.predict(X_test)
```
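
The example above leaves the data undefined. A minimal runnable sketch follows. The synthetic dataset from `make_classification` and the value `C=0.5` are assumptions chosen only for illustration. It also shows two conveniences of `Pipeline`: setting a step's parameter with the `<step>__<parameter>` naming scheme, and looking up a fitted step through `named_steps`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used here only so the sketch runs on its own
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression()),
])

# Parameters of a step are addressed as <step name>__<parameter name>
pipeline.set_params(classifier__C=0.5)
pipeline.fit(X_train, y_train)

# Fitted steps can be looked up by name
scaler_means = pipeline.named_steps['scaler'].mean_
predictions = pipeline.predict(X_test)
```

The same double-underscore names are what tools such as `GridSearchCV` expect when tuning a pipeline.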

### 2. **FeatureUnion**

The **FeatureUnion** allows you to apply multiple transformers in parallel to the same dataset and concatenate their results. This is useful when you want to apply different transformations to the same data and combine their outputs.

#### Example:

```python
from sklearn.pipeline import FeatureUnion
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Define individual transformers
scaler = StandardScaler()
pca = PCA(n_components=2)

# Combine them using FeatureUnion
combined_transformer = FeatureUnion([
    ('scaler', scaler),  # Standardize the features
    ('pca', pca)         # Apply PCA for dimensionality reduction
])

# Apply the combined transformations
X_transformed = combined_transformer.fit_transform(X)
```

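In this case, the two transformers do not run one after the other. Each is fitted on the same original input `X`: `StandardScaler` produces the scaled features, and `PCA` produces two components computed from the unscaled data. The two outputs are then concatenated column-wise into a single array. To run PCA on scaled data instead, chain the two steps in a `Pipeline`.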
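
One way to check what `FeatureUnion` produces is to compare its output with each transformer run separately on the same input. The sketch below uses random data (an assumption made only so it runs on its own) with 4 features, so the combined output has 4 scaled columns plus 2 PCA columns.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

# Random data, used here only for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))

union = FeatureUnion([
    ('scaler', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_union = union.fit_transform(X)

# Run each transformer on its own, on the same original X
X_scaled = StandardScaler().fit_transform(X)
X_pca = PCA(n_components=2).fit_transform(X)

# The union's output is the two results stacked side by side
same_as_separate = np.allclose(X_union, np.hstack([X_scaled, X_pca]))
```

Here `X_union` has 6 columns, and `same_as_separate` should be `True`, since both branches receive the untransformed `X`.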

### 3. **ColumnTransformer**

The **ColumnTransformer** allows you to apply different transformations to different columns (features) of your dataset. This is especially useful when dealing with datasets that contain both numerical and categorical features, where each type of feature might need different preprocessing.

#### Example:

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Define transformations for different types of columns
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), [0, 1, 2]),  # Apply StandardScaler to numerical columns
        ('cat', OneHotEncoder(), [3])          # Apply OneHotEncoder to categorical column
    ])

# Apply the ColumnTransformer
X_transformed = preprocessor.fit_transform(X)
```

Here:
- Columns 0, 1, and 2 are numerical features that will be scaled using `StandardScaler`.
- Column 3 is a categorical feature that will be one-hot encoded using `OneHotEncoder`.

### 4. **Combining Pipelines and ColumnTransformer**

You can also nest pipelines inside a `ColumnTransformer` or use `FeatureUnion` inside pipelines, making it easier to create complex data preprocessing workflows.

#### Example:

```python
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegression

# Define numerical pipeline
num_pipeline = Pipeline([
    ('scaler', StandardScaler())  # Scale numerical data
])

# Define categorical pipeline
cat_pipeline = Pipeline([
    ('onehot', OneHotEncoder())  # One-hot encode categorical data
])

# Combine the pipelines into a ColumnTransformer
preprocessor = ColumnTransformer([
    ('num', num_pipeline, [0, 1, 2]),  # Apply num_pipeline to numerical columns
    ('cat', cat_pipeline, [3])         # Apply cat_pipeline to categorical columns
])

# Combine preprocessing with a classifier in a final pipeline
model = Pipeline([
    ('preprocessor', preprocessor),   # Preprocess the data
    ('classifier', LogisticRegression())  # Fit logistic regression
])

# Fit the model
model.fit(X_train, y_train)

# Predict with the model
predictions = model.predict(X_test)
```
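
The other direction mentioned above, a `FeatureUnion` inside a `Pipeline`, can be sketched in the same way. The synthetic data and the choice of `max_iter=1000` are assumptions made for illustration. The example also shows that nested parameters are addressed by chaining step names with double underscores.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used here only so the sketch runs on its own
X, y = make_classification(n_samples=150, n_features=6, random_state=0)

# Parallel feature extraction: scaled features plus PCA components
features = FeatureUnion([
    ('scaled', StandardScaler()),
    ('pca', PCA(n_components=2)),
])

model = Pipeline([
    ('features', features),
    ('classifier', LogisticRegression(max_iter=1000)),
])
model.fit(X, y)

# Nested parameters chain step names with double underscores
model.set_params(features__pca__n_components=3)
model.fit(X, y)

# The classifier now receives 6 scaled features + 3 PCA components
n_inputs = model.named_steps['classifier'].coef_.shape[1]
predictions = model.predict(X)
```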

### Summary of Composite Transformers:

1. **Pipeline:** Chains multiple steps together in sequence, ensuring each step’s output is passed to the next step. Useful for preprocessing and model fitting in one workflow.
   
2. **FeatureUnion:** Combines multiple transformers in parallel, allowing their outputs to be concatenated. Useful for applying different transformations to the same data in parallel.

3. **ColumnTransformer:** Applies different transformations to different subsets of the data (columns). Ideal for datasets with both numerical and categorical features that require different preprocessing.

4. **Combining Pipelines:** You can nest pipelines inside `ColumnTransformer` or other composite transformers, creating flexible, layered workflows.

Each of these composite transformers allows for flexible, reusable, and scalable preprocessing pipelines that fit seamlessly into the scikit-learn ecosystem.