### Q1: Designing a Pipeline for Feature Engineering and Missing Value Handling

To create a pipeline for handling numerical and categorical features, you can use the `Pipeline` and `ColumnTransformer` classes from `scikit-learn`. Here's a step-by-step approach:

#### 1. Import Necessary Libraries

```python
import pandas as pd
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
```

#### 2. Load the Dataset

```python
# Load the dataset into a DataFrame `df`
# (the file path and the 'target' column name below are placeholders)
df = pd.read_csv('path_to_dataset.csv')

# Separate features and target variable
X = df.drop('target', axis=1)
y = df['target']
```

#### 3. Identify Numerical and Categorical Features

```python
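# 'number' covers every numeric dtype (int32, float32, ...), not only int64/float64
numerical_features = X.select_dtypes(include='number').columns.tolist()
# Include pandas 'category' columns alongside plain object (string) columns
categorical_features = X.select_dtypes(include=['object', 'category']).columns.tolist()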
```
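A dtype filter that lists exact names such as `int64` or `object` can silently skip columns stored as `int32` or `category`, and `ColumnTransformer` drops any column that is not assigned to a transformer by default. The sketch below uses a hypothetical toy DataFrame (the column names are illustrative, not from the original dataset) to compare a narrow filter with a broader one:

```python
import pandas as pd

# Hypothetical toy frame mixing several dtypes
toy = pd.DataFrame({
    "age": pd.Series([25, 32, 47], dtype="int32"),
    "income": [40000.0, 52000.0, 61000.0],
    "city": pd.Series(["Paris", "Rome", "Paris"], dtype="category"),
    "plan": pd.Series(["basic", "pro", "basic"], dtype=object),
})

# Narrow filters: only exact dtype matches are returned
narrow_num = toy.select_dtypes(include=["int64", "float64"]).columns.tolist()
narrow_cat = toy.select_dtypes(include=["object"]).columns.tolist()

# Broader filters: any numeric dtype, plus object and category columns
broad_num = toy.select_dtypes(include="number").columns.tolist()
broad_cat = toy.select_dtypes(include=["object", "category"]).columns.tolist()

print("narrow:", narrow_num, narrow_cat)
print("broad: ", broad_num, broad_cat)

# Columns that neither narrow filter picked up
missed = set(toy.columns) - set(narrow_num) - set(narrow_cat)
print("missed by narrow filters:", sorted(missed))
```

Checking that the two lists together cover every column of `X` is a cheap safeguard before building the `ColumnTransformer`.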

#### 4. Create Numerical Pipeline

```python
numerical_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy='mean')),  # Impute missing values with mean
    ('scaler', StandardScaler())  # Scale numerical features
])
```

#### 5. Create Categorical Pipeline

```python
categorical_pipeline = Pipeline([
    ('imputer', SimpleImputer(strategy='most_frequent')),  # Impute missing values with the most frequent value
    ('onehot', OneHotEncoder(handle_unknown='ignore'))  # One-hot encode categorical features
])
```

#### 6. Combine Pipelines Using ColumnTransformer

```python
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_pipeline, numerical_features),
        ('cat', categorical_pipeline, categorical_features)
    ]
)
```

#### 7. Feature Selection and Model Pipeline

```python
pipeline = Pipeline([
    ('preprocessor', preprocessor),
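    # k='all' keeps every feature; set k to an integer to keep only the top-scoring ones
    ('feature_selection', SelectKBest(score_func=f_classif, k='all')),
    # Random Forest as the final model; fix random_state for reproducible results
    ('classifier', RandomForestClassifier(random_state=42))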
])
```
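Note that `k='all'` makes `SelectKBest` a pass-through: scores are computed, but no column is removed. The sketch below illustrates the difference on synthetic data from `make_classification` (an assumption for demonstration, not the original dataset):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data: 10 features, of which 3 are informative
X_demo, y_demo = make_classification(
    n_samples=200, n_features=10, n_informative=3, n_redundant=0, random_state=0
)

keep_all = SelectKBest(score_func=f_classif, k="all").fit(X_demo, y_demo)
keep_three = SelectKBest(score_func=f_classif, k=3).fit(X_demo, y_demo)

# Number of columns that survive each selector
n_all = keep_all.transform(X_demo).shape[1]
n_three = keep_three.transform(X_demo).shape[1]
print(n_all, n_three)

# Indices of the columns retained when k=3
print(keep_three.get_support(indices=True))
```

Inside a pipeline, the same choice can be exposed for tuning through the parameter name `feature_selection__k`.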

#### 8. Split the Dataset

```python
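# stratify=y keeps class proportions similar in both splits (assumes a classification target)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)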
```

#### 9. Train and Evaluate the Model

```python
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
```

#### 10. Interpretation and Improvements

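- **Interpretation**: The pipeline fits imputation, scaling, encoding, feature selection, and the classifier on the training split only, then applies the same fitted transforms to the test split. This avoids leaking test-set statistics into preprocessing. Accuracy on the held-out 30% estimates performance on unseen data, but it comes from a single split and can be misleading when classes are imbalanced.
- **Improvements**: With `k='all'`, `SelectKBest` keeps every feature, so set `k` to an integer (or tune it) to actually reduce the feature set. Consider tuning Random Forest hyperparameters such as `n_estimators` and `max_depth`, and use cross-validation for a more stable performance estimate.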
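The suggested improvements can be sketched as follows. This is a minimal example under stated assumptions: it uses synthetic data from `make_classification` and a simplified pipeline (scaler, `SelectKBest`, Random Forest) in place of the full preprocessor, and the grid values are arbitrary illustrations rather than recommended settings:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset
X_demo, y_demo = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)

demo_pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("feature_selection", SelectKBest(score_func=f_classif)),
    ("classifier", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# 5-fold cross-validation: the whole pipeline is refit inside each fold
cv_scores = cross_val_score(demo_pipeline, X_demo, y_demo, cv=5, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.2f} +/- {cv_scores.std():.2f}")

# Step parameters are addressed as <step name>__<parameter>
param_grid = {
    "feature_selection__k": [4, 8, "all"],
    "classifier__max_depth": [None, 5],
}
search = GridSearchCV(demo_pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_demo, y_demo)

print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.2f}")
```

Because the search wraps the entire pipeline, feature selection is refit within every fold, so the reported scores are not inflated by selecting features on validation data.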

### Q2: Building a Pipeline with Random Forest and Logistic Regression

To combine a Random Forest and Logistic Regression using a Voting Classifier, follow these steps:

#### 1. Import Libraries

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
```

#### 2. Load and Prepare the Iris Dataset

```python
iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```

#### 3. Create Classifiers and Voting Classifier

```python
rf_clf = RandomForestClassifier(n_estimators=100, random_state=42)
lr_clf = LogisticRegression(max_iter=1000, random_state=42)

voting_clf = VotingClassifier(
    estimators=[('rf', rf_clf), ('lr', lr_clf)],
    voting='soft'  # Soft voting averages the predicted class probabilities of the estimators
)
```
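With equal weights, soft voting predicts the class whose probability, averaged over the fitted estimators, is highest. The sketch below checks this by hand on the Iris data. It fits on the full dataset only to keep the example short, and the `soft_clf` and `manual_proba` names are introduced here for illustration:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = load_iris(return_X_y=True)

soft_clf = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        ("lr", LogisticRegression(max_iter=1000, random_state=42)),
    ],
    voting="soft",
).fit(X_demo, y_demo)

# Average the class probabilities of the fitted sub-estimators by hand
manual_proba = np.mean(
    [est.predict_proba(X_demo) for est in soft_clf.estimators_], axis=0
)

# The manual average should match the ensemble's own probabilities
same_proba = np.allclose(manual_proba, soft_clf.predict_proba(X_demo))

# The predicted label is the class with the highest averaged probability
manual_pred = soft_clf.classes_[manual_proba.argmax(axis=1)]
same_pred = np.array_equal(manual_pred, soft_clf.predict(X_demo))

print(same_proba, same_pred)
```

Soft voting requires every estimator to implement `predict_proba`; with `voting='hard'`, the ensemble takes a majority vote over predicted labels instead.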

#### 4. Train and Evaluate the Voting Classifier

```python
voting_clf.fit(X_train, y_train)
y_pred = voting_clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy of Voting Classifier: {accuracy:.2f}')
```
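To judge whether the ensemble actually helps, it is useful to score each base model on the same split. The following sketch is one possible variant, not the original setup: it wraps Logistic Regression in a `make_pipeline` with `StandardScaler` (an added assumption, since linear models often benefit from scaled inputs) and uses a stratified split:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
# Scaling lives inside the pipeline, so it is fit on training data only
lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

models = {
    "rf": rf,
    "lr": lr,
    # VotingClassifier clones its estimators, so rf and lr can be reused here
    "voting": VotingClassifier(estimators=[("rf", rf), ("lr", lr)], voting="soft"),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.2f}")
```

On a small, easy dataset such as Iris the three scores are usually close, so differences of a few test samples should not be over-interpreted.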
