# Hyperband and Successive Halving

#### Successive Halving

- Successive Halving is a resource-efficient algorithm that gives every configuration the same small budget (e.g., time, data, or iterations), evaluates them all, and eliminates the worst-performing ones at each step.
- It starts by testing many configurations with minimal resources. After each round it keeps only the top fraction (1 / reduction factor) and gives the survivors a larger budget, so the most promising configurations receive the most resources.
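
To make the round-by-round arithmetic concrete, here is a minimal sketch of the schedule only, with no model evaluation. The function name `halving_schedule` and its parameters are illustrative assumptions, not part of any library.

```python
def halving_schedule(num_configs, min_resource, eta):
    """Return a list of (configs_alive, resource_per_config), one entry per round.

    Hypothetical helper: eta is the reduction factor, min_resource the
    budget each configuration receives in the first round.
    """
    schedule = []
    n, r = num_configs, min_resource
    while True:
        schedule.append((n, r))
        if n == 1:
            break
        n = max(1, n // eta)  # keep the top 1/eta of the configurations
        r *= eta              # survivors get eta times more resources
    return schedule


print(halving_schedule(27, 1, 3))
# [(27, 1), (9, 3), (3, 9), (1, 27)]
```

With 27 configurations and a reduction factor of 3, every round costs the same total budget (27 units), which is the sense in which the algorithm spreads resources evenly across rounds rather than across configurations.
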

#### Hyperband

- Hyperband extends Successive Halving by running it several times ("brackets"), each with a different trade-off between the number of configurations tested and the resources initially allocated to each configuration.
- It uses a parameter called R (maximum resources per configuration) together with a reduction factor. Brackets that start many configurations on a small budget favor exploration, while brackets that start a few configurations close to R favor exploitation.
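
The bracket layout can be sketched as follows. This follows the bracket formula as it is commonly stated for Hyperband (number of configurations `ceil((s_max + 1) * eta**s / (s + 1))`, starting budget `R / eta**s`); the function name `hyperband_brackets` and the dictionary keys are illustrative assumptions, and real libraries may round differently.

```python
def hyperband_brackets(max_resource, eta=3):
    """List the brackets Hyperband would run for a given R and reduction factor.

    Hypothetical helper for illustration; it only computes the schedule.
    """
    # s_max is the largest s with eta**s <= max_resource.
    # An integer loop avoids rounding problems with floating-point logs.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        numerator = (s_max + 1) * eta ** s
        num_configs = -(-numerator // (s + 1))            # integer ceiling
        min_resource = max(1, max_resource // eta ** s)   # starting budget
        brackets.append(
            {"s": s, "num_configs": num_configs, "min_resource": min_resource}
        )
    return brackets


for bracket in hyperband_brackets(27, eta=3):
    print(bracket)
# {'s': 3, 'num_configs': 27, 'min_resource': 1}
# {'s': 2, 'num_configs': 12, 'min_resource': 3}
# {'s': 1, 'num_configs': 6, 'min_resource': 9}
# {'s': 0, 'num_configs': 4, 'min_resource': 27}
```

The first bracket is the most exploratory (27 configurations at 1 unit each), and the last one is plain random search with 4 configurations trained at the full budget.
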

In [None]:
import torch

# Define the objective function (to minimize)
def objective_function(x):
    return (x - 2) ** 2  # Minimum at x = 2

# Successive Halving implementation
def successive_halving(configurations, max_resources, reduction_factor):
    num_configs = len(configurations)
    # Start with the smallest budget; it grows each round, capped at max_resources
    resource_per_config = 1
    iteration = 1

    while num_configs > 1:
        print(f"Iteration {iteration}: {num_configs} configurations with {resource_per_config} resources each")

        # Evaluate all configurations.
        # Note: this toy objective ignores the budget. In a real search,
        # resource_per_config would set e.g. the number of training epochs.
        scores = []
        for config in configurations:
            y = float(objective_function(config))  # Plain float so sorting is unambiguous
            scores.append(y)

        # Sort configurations by performance (lower is better)
        sorted_configs = sorted(zip(scores, configurations), key=lambda pair: pair[0])

        # Keep the top 1/reduction_factor of the configurations (at least one)
        num_keep = max(1, num_configs // reduction_factor)
        configurations = [config for _, config in sorted_configs[:num_keep]]

        # Give the survivors more resources in the next round
        resource_per_config = min(max_resources, resource_per_config * reduction_factor)
        num_configs = len(configurations)
        iteration += 1

    # Return the best configuration
    best_config = configurations[0]
    best_score = objective_function(best_config)
    return best_config, best_score

# Run Successive Halving
if __name__ == "__main__":
    initial_configs = [torch.rand(1) * 4.0 for _ in range(16)]  # 16 random configurations in [0, 4)
    max_resources = 27  # Maximum resources per configuration
    reduction_factor = 3  # Keep the top third of the configurations each iteration

    best_config, best_score = successive_halving(initial_configs, max_resources, reduction_factor)
    print(f"Best Configuration: x = {best_config.item():.4f}, f(x) = {best_score.item():.4f}")
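
The cell above covers a single Successive Halving run. As a rough sketch of how Hyperband wraps that idea, the example below runs one Successive Halving pass per bracket and keeps the overall best result. Everything here is an assumption made for illustration: `noisy_objective` is a hypothetical stand-in for "train with this budget and report the loss" (noise shrinks as the budget grows), the bracket sizes follow the commonly stated Hyperband formula, and plain `random` is used instead of `torch` to keep the block self-contained.

```python
import random


def noisy_objective(x, budget, rng):
    # Hypothetical budget-aware loss: the true loss (x - 2)^2 plus noise
    # whose standard deviation shrinks as the budget grows.
    return (x - 2) ** 2 + rng.gauss(0.0, 1.0 / budget)


def run_bracket(configs, min_resource, eta, max_resource, rng):
    """Successive Halving inside one bracket; returns (best_loss, best_x)."""
    budget = min_resource
    while True:
        scored = sorted((noisy_objective(x, budget, rng), x) for x in configs)
        # Stop when one configuration is left or the budget cap is reached
        if len(scored) == 1 or budget >= max_resource:
            return scored[0]
        keep = max(1, len(scored) // eta)
        configs = [x for _, x in scored[:keep]]
        budget = min(max_resource, budget * eta)


def hyperband(max_resource=27, eta=3, seed=0):
    rng = random.Random(seed)

    # Largest s with eta**s <= max_resource (integer loop, no float logs)
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    best = (float("inf"), None)
    for s in range(s_max, -1, -1):
        numerator = (s_max + 1) * eta ** s
        num_configs = -(-numerator // (s + 1))           # integer ceiling
        min_resource = max(1, max_resource // eta ** s)  # starting budget
        # Each bracket samples fresh random configurations in [0, 4]
        configs = [rng.uniform(0.0, 4.0) for _ in range(num_configs)]
        best = min(best, run_bracket(configs, min_resource, eta, max_resource, rng))
    return best


best_loss, best_x = hyperband()
print(f"Best configuration: x = {best_x:.4f}, noisy loss = {best_loss:.4f}")
```

With these settings every bracket finishes at the full budget of 27, so the winners of different brackets are compared at the same noise level. In a real search the budget would be passed to the training routine rather than to a noise term.
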


#### Advantages of Hyperband

1. Resource Efficiency:
   - Hyperband reduces wasted computation by stopping poor configurations early and spending most of the budget on the promising ones.
2. Scalability:
   - It copes with large hyperparameter spaces and expensive models, because most configurations only ever receive a small budget.
3. Exploration and Exploitation:
   - It balances testing many configurations cheaply (exploration) with training the best ones for longer (exploitation).

#### When to Use Hyperband

- When training is computationally expensive (e.g., deep learning models).
- When you want to explore a large hyperparameter space efficiently.
- When you need a dynamic trade-off between the number of configurations tested and the resources allocated.