# Example 2C: Backward Stepwise Selection

Backward Stepwise Selection (BwSS) starts from the full set of predictors and removes them one at a time, at each step dropping the predictor whose removal leaves the best-scoring model. This approach has the same complexity as FSS; it simply approaches the problem by removing variables rather than adding them. Below is pseudocode for this approach. Note that this code *excludes* the full model from the returned list of models, although this could easily be adjusted. How would you change the pseudocode below to include the full model in `models`?

**Function** `backward_stepwise_selection(data, response_variable)`

**Input:**
- `data`: Dataset containing predictor variables and response variable
- `response_variable`: Target variable for the regression model

**Output:**
- A list of models, each stored with its score, with one entry added per removal step

1. Initialize list `current_predictors` with all predictors in `data` (every column except `response_variable`).
2. Initialize an empty list `models`.

3. **For** `i` from 1 to the number of predictors in `data`, do the following:
    1. Set `best_candidate_score` to 0.
    2. Set `best_candidate` to None.

    3. **For** each `predictor` in `current_predictors`, do the following:
        - Create a copy of `current_predictors` excluding `predictor`.
        - Fit a regression model using the remaining predictors.
        - **If** the R-squared of the model is greater than `best_candidate_score`, then:
            1. Set `best_candidate_score` to the current R-squared.
            2. Set `best_candidate` to `predictor`.

    4. **If** `best_candidate` is not None, then:
        - Remove `best_candidate` from `current_predictors`.
        - Append a copy of `current_predictors` and `best_candidate_score` to `models`.

4. **Return** `models`
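
The nested loops above determine how many models get fit. As a rough sketch, assuming `p` predictors and one fit per candidate removal, the count is at most p + (p - 1) + ... + 1 = p(p + 1)/2. That is the same total as adding predictors one at a time, and far fewer than the 2^p subsets an exhaustive search would consider. The function names below are hypothetical and exist only for this illustration.

```python
def stepwise_fit_count(p):
    """Upper bound on the fits made by the nested loops with p predictors."""
    # Pass 1 tries p removals, pass 2 tries p - 1, and so on down to 1.
    return sum(range(1, p + 1))


def exhaustive_fit_count(p):
    """Number of predictor subsets an exhaustive search would consider."""
    return 2 ** p


for p in (5, 10, 20):
    print(p, stepwise_fit_count(p), exhaustive_fit_count(p))
# prints:
# 5 15 32
# 10 55 1024
# 20 210 1048576
```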


In [1]:
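def backward_stepwise_selection(data, response_variable):
    # Candidate predictors are every column except the response itself.
    current_predictors = [col for col in data.columns if col != response_variable]
    models = []

    # At most one predictor is removed per pass.
    for i in range(len(current_predictors)):
        best_candidate_score = 0
        best_candidate = None

        for predictor in current_predictors:
            # Try the model that results from dropping this predictor.
            trial_columns = [col for col in current_predictors if col != predictor]

            # fit_model is not defined in this section; it is assumed to be a
            # helper that fits a regression on the given columns and returns a
            # fitted model exposing an `rsquared` attribute.
            model = fit_model(trial_columns)
            rsquared = model.rsquared

            # Keep the removal that leaves the highest R-squared.
            if rsquared > best_candidate_score:
                best_candidate_score = rsquared
                best_candidate = predictor

        # Compare against None explicitly so a falsy column name (e.g. 0) still counts.
        if best_candidate is not None:
            current_predictors.remove(best_candidate)
            models.append((list(current_predictors), best_candidate_score))

    return models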
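
To try the procedure end to end without the `fit_model` helper, here is a minimal self-contained sketch. It assumes ordinary least squares with an intercept, computed with NumPy. The names `r_squared` and `backward_stepwise_sketch`, the `include_full_model` flag, and the synthetic data are all hypothetical, introduced only for illustration and not part of the original example. The flag shows one possible way to include the full model in the returned list.

```python
import numpy as np


def r_squared(data, columns, response_variable):
    """R-squared of a least-squares fit with an intercept (hypothetical helper)."""
    y = np.asarray(data[response_variable], dtype=float)
    # Design matrix: a column of ones for the intercept, then the chosen predictors.
    X = np.column_stack(
        [np.ones(len(y))] + [np.asarray(data[c], dtype=float) for c in columns]
    )
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ coef) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def backward_stepwise_sketch(data, response_variable, include_full_model=False):
    # Iterating a DataFrame (or a plain dict of columns) yields the column names.
    current = [c for c in data if c != response_variable]
    models = []
    if include_full_model:
        models.append((list(current), r_squared(data, current, response_variable)))

    # Drop one predictor per pass until a single predictor is left.
    while len(current) > 1:
        scores = {
            p: r_squared(data, [c for c in current if c != p], response_variable)
            for p in current
        }
        # Remove the predictor whose absence leaves the highest R-squared.
        drop = max(scores, key=scores.get)
        current.remove(drop)
        models.append((list(current), scores[drop]))
    return models


# Synthetic data, made up for illustration: y depends on x1 and x2 but not on noise.
rng = np.random.default_rng(0)
n = 200
data = {"x1": rng.normal(size=n), "x2": rng.normal(size=n), "noise": rng.normal(size=n)}
data["y"] = 3.0 * data["x1"] + 1.0 * data["x2"] + 0.1 * rng.normal(size=n)

for predictors, score in backward_stepwise_sketch(data, "y", include_full_model=True):
    print(predictors, round(score, 3))
```

Unlike the function above, this sketch stops once a single predictor remains, so it never fits an intercept-only model. Because it only iterates over column names and indexes by name, it should accept either a pandas DataFrame or a dict of equal-length columns.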

Results are similar to the previous cases. You are encouraged to copy and run the relevant code to see those results for yourself.