## Finding the Maximum: Simulating Population Sampling

### Overview
This simulation draws samples from a population to estimate the population's maximum (true) value.

The simulation is conducted under the following conditions:
- The known maximum value of the population is `1`.
- The population distribution is a continuous uniform distribution, ranging from `0` to `1`.

### Key Parameters of the Simulation
The simulation is governed by two main parameters, which help us assess the effectiveness of our sampling approach:
1. **Stability Threshold (`stability_threshold`)**: 
   - The number of consecutive samples over which the maximum observed value must remain unchanged.
   - Sampling stops as soon as this stability is observed.
   - For instance, if this is set to 4 and the maximum observed value is unchanged across 4 consecutive samples, the simulation ends.
2. **Maximum Runs (`max_runs`)**: 
   - The upper bound on the total number of samples taken in a simulation. The set of values to try is passed as `max_runs_range`.
   - Sampling halts at this limit, regardless of whether the maximum value has stabilized.
   - For instance, if this is set to 10, the simulation ends after 10 samples whether or not the stability threshold criterion has been met.

The true population maximum is 1, so the closer the reported maximum is to 1, the better the sampling approach is performing.
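
The stopping rule described by these two parameters can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `sample_until_stable`, the optional `rng` argument, and the choice to also return the number of samples drawn are all hypothetical and introduced only for this example. It reads "stable" as "the running maximum was not raised by any of the last `stability_threshold` samples".

```python
import random

def sample_until_stable(max_runs, stability_threshold, rng=random):
    """Hypothetical helper: draw until the running maximum has gone
    unchanged for `stability_threshold` consecutive samples, or until
    `max_runs` samples have been drawn.

    Returns a tuple (running_max, samples_drawn).
    """
    running_max = float("-inf")
    unchanged = 0       # consecutive samples that did not raise the running max
    samples_drawn = 0
    for _ in range(max_runs):
        score = rng.uniform(0, 1)
        samples_drawn += 1
        if score > running_max:
            running_max = score
            unchanged = 0
        else:
            unchanged += 1
        if unchanged >= stability_threshold:
            break       # stability reached before the max_runs cap
    return running_max, samples_drawn

best, used = sample_until_stable(max_runs=10, stability_threshold=4)
print(f"max={best:.4f} after {used} samples")
```

Returning the sample count alongside the maximum makes it easy to see whether a run stopped because of stability or because it hit the `max_runs` cap.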

## Code

In [10]:
import random
import pandas as pd

def generate_random_scores(max_runs):
    """
    Generate a list of random scores between 0 and 1.

    Parameters:
    max_runs (int): The number of random scores to generate.

    Returns:
    list: A list of floating-point numbers representing random scores.
    """
    return [random.uniform(0, 1) for _ in range(max_runs)]

def is_stable(scores, stability_threshold):
    """
    Check whether the last 'n' scores are all equal, where 'n' is the stability threshold.

    Note: this compares the raw scores themselves, not the running maximum.
    Continuous uniform draws are almost never exactly equal, so for n >= 2
    this check is effectively always False and sampling runs to max_runs.

    Parameters:
    scores (list): A list of floating-point numbers representing scores.
    stability_threshold (int): The number of scores to check for stability.

    Returns:
    bool: True if the last 'n' scores are all equal, False otherwise.
    """
    if len(scores) < stability_threshold:
        return False

    last_n_scores = scores[-stability_threshold:]
    max_score = max(last_n_scores)
    return all(score == max_score for score in last_n_scores)

def simulate_run(max_runs, stability_threshold):
    """
    Simulate a single run of the sampling process.

    Parameters:
    max_runs (int): The maximum number of runs.
    stability_threshold (int): The threshold for stability in scores.

    Returns:
    float: The maximum score achieved in the simulation run.
    """
    scores = generate_random_scores(max_runs)
    for run in range(max_runs):
        if is_stable(scores[:run + 1], stability_threshold):
            return max(scores[:run + 1])
    return max(scores)

def simulation_results(num_simulations, max_runs, stability_threshold):
    """
    Calculate the average maximum score over a number of simulations.

    Parameters:
    num_simulations (int): The number of simulations to run.
    max_runs (int): The maximum number of runs in each simulation.
    stability_threshold (int): The threshold for stability in scores.

    Returns:
    float: The average maximum score over all simulations.
    """
    total_max_score = sum(simulate_run(max_runs, stability_threshold) for _ in range(num_simulations))
    return total_max_score / num_simulations

def run_simulations(num_simulations, max_runs_range, stability_threshold_range):
    """
    Run multiple simulations and record the results in a DataFrame.

    Parameters:
    num_simulations (int): The number of simulations to run for each parameter set.
    max_runs_range (iterable): An iterable of integers representing different max runs.
    stability_threshold_range (iterable): An iterable of integers representing different stability thresholds.

    Returns:
    pandas.DataFrame: A DataFrame with the results of the simulations, 
                      where rows represent stability thresholds and columns represent max runs.
    """
    # Using dictionary comprehension to calculate the results
    data = {max_runs: {stability_threshold: simulation_results(num_simulations, max_runs, stability_threshold)
                       for stability_threshold in stability_threshold_range}
            for max_runs in max_runs_range}

    # Creating the DataFrame
    results = pd.DataFrame(data)

    # Setting labels for rows and columns
    results.index.name = 'Stability Threshold'
    results.columns.name = 'Max Runs'

    return results



## Simulation of One Run

In [11]:
max_runs = 100
stability_threshold = 10
print("Maximum value found:")
simulate_run(max_runs, stability_threshold)

Maximum value found:


0.9989229733660068

## Simulation of Multiple Runs With a Range of Stability Thresholds and Maximum Runs

### Average maximum value found

In [12]:
num_simulations = 100
max_runs_range = range(5, 30, 5)  # 5, 10, 15, 20, 25
stability_threshold_range = range(2, 8)  # 2 through 7 (the end value 8 is excluded)
df_results = run_simulations(num_simulations, max_runs_range, stability_threshold_range)
print(df_results)

Max Runs                   5         10        15        20        25
Stability Threshold                                                  
2                    0.814337  0.913660  0.939050  0.953933  0.960969
3                    0.837133  0.900081  0.933646  0.956188  0.962902
4                    0.815836  0.897233  0.921017  0.961401  0.965908
5                    0.847609  0.904466  0.933269  0.945223  0.961862
6                    0.812723  0.925210  0.941523  0.955148  0.961865
7                    0.827760  0.896904  0.931840  0.954833  0.957815
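
As a rough cross-check on the table, the expected maximum of n independent draws from a continuous uniform distribution on [0, 1] is n / (n + 1), a standard result. The sketch below computes that benchmark for the `max_runs` values used above. The helper name `expected_uniform_max` is an assumption introduced for this example, and the benchmark only applies when all `max_runs` samples are actually drawn; a rule that stops sampling early would be expected to fall below it.

```python
def expected_uniform_max(n):
    """Expected maximum of n i.i.d. Uniform(0, 1) draws: n / (n + 1)."""
    return n / (n + 1)

# Same max_runs values as the simulation above: 5, 10, 15, 20, 25
for max_runs in range(5, 30, 5):
    print(max_runs, round(expected_uniform_max(max_runs), 4))
```

The benchmark values are 0.8333, 0.9091, 0.9375, 0.9524, and 0.9615. The simulated averages in each column sit close to these values for every stability threshold, which suggests that in these results the outcome is driven by `max_runs` rather than by `stability_threshold`.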
