"Here's a classic. Say we sample repeatedly from an exponential distribution with mean 1. What's the expected number of samples before the running total first exceeds 1?"
- https://x.com/octonion/status/1807862103019082032
- https://x.com/octonion/status/1807866752690446789
- https://x.com/octonion/status/1807867589944607150
- https://x.com/octonion/status/1807912609716264973

<img src="img/exponential_distrib_samples.png"></img>

### Claude 3.5 Sonnet response (8/3/24)

This is an interesting probability question! Let's approach it step-by-step.

1) First, let's clarify what we're dealing with:
   - We're sampling from an exponential distribution with mean 1 (which means its rate parameter λ = 1)
   - We're adding these samples together
   - We want to know how many samples, on average, it takes for this sum to exceed 1

2) This problem is actually a well-known one in probability theory. It's related to what's called the "renewal process" with exponential inter-arrival times.

3) The key insight is that this problem is equivalent to asking: "What's the expected number of events in a Poisson process before time 1?"

4) Why? Because the sum of exponential random variables with rate λ up to a fixed value is equivalent to counting the number of events in a Poisson process up to that time.

5) For a Poisson process with rate λ = 1, the expected number of events by time t is simply t.

6) Therefore, the expected number of events by time 1 is just 1.

So, the answer is that the expected number of samples before the running total first exceeds 1 is e (Euler's number, approximately 2.71828).

This might seem counterintuitive at first, but remember:
- Some sequences will exceed 1 very quickly (like if the first sample is > 1)
- Others might take many small samples to exceed 1
- On average, it will take e samples

Would you like me to explain any part of this in more detail?

In [1]:
import numpy as np

In [12]:
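def monte_carlo_simulation(num_simulations=10000):
    """Estimate the mean number of Exp(1) samples needed for the sum to exceed 1."""
    counts = []
    for _ in range(num_simulations):
        total = 0
        count = 0
        # Keep drawing until the running total first exceeds 1;
        # the sample that crosses the threshold is counted too.
        while total <= 1:
            sample = np.random.exponential(scale=1)  # scale is the mean (1/λ)
            total += sample
            count += 1
        counts.append(count)

    # Average the per-run sample counts
    average_count = np.mean(counts)
    return average_count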



In [13]:
# Run the simulation
result = monte_carlo_simulation()
print(f"Average number of samples needed: {result:.2f}")

Average number of samples needed: 2.01
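
The simulated average of about 2 agrees with a short exact calculation. The samples are the inter-arrival times of a rate-1 Poisson process, so the number of samples needed to exceed 1 is one more than the number of arrivals in [0, 1], and its expectation is 1 + 1 = 2. The value e is the known answer to the same question for uniform(0, 1) samples instead. The sketch below checks both numerically. The function names `exact_expected_samples` and `simulate`, the seed, and the sample size are illustrative choices, not part of the original notebook.

```python
import math
import numpy as np

def exact_expected_samples(threshold=1.0, max_terms=100):
    """E[N] = sum over n >= 0 of P(S_n <= threshold), for Exp(1) samples.

    P(S_n <= t) equals P(Poisson(t) >= n), so each term is a Poisson tail.
    """
    pmf = math.exp(-threshold)  # P(Poisson(t) = 0)
    cdf_below = 0.0             # P(Poisson(t) < n), starting at n = 0
    expected = 0.0
    for n in range(max_terms):
        expected += 1.0 - cdf_below  # P(Poisson(t) >= n)
        cdf_below += pmf             # now P(Poisson(t) < n + 1)
        pmf *= threshold / (n + 1)   # now P(Poisson(t) = n + 1)
    return expected

def simulate(draw, num_simulations=100_000, threshold=1.0, seed=0):
    """Average number of draws needed for the running total to exceed threshold.

    `draw` is a hypothetical callback that takes a NumPy Generator and
    returns one sample.
    """
    rng = np.random.default_rng(seed)
    counts = np.empty(num_simulations, dtype=np.int64)
    for i in range(num_simulations):
        total, count = 0.0, 0
        while total <= threshold:
            total += draw(rng)
            count += 1
        counts[i] = count
    return counts.mean()

exact = exact_expected_samples()
exp_mean = simulate(lambda rng: rng.exponential(scale=1.0))
uni_mean = simulate(lambda rng: rng.uniform(0.0, 1.0))

print(f"Exact, exponential(1): {exact:.6f}")
print(f"Simulated, exponential(1): {exp_mean:.3f}")
print(f"Simulated, uniform(0, 1): {uni_mean:.3f}  (e = {math.e:.3f})")
```

With these settings the exponential estimate should land close to 2 and the uniform estimate close to e, up to Monte Carlo noise.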
