### Example - Time between earthquakes

Suppose that earthquakes of a certain magnitude in a specific region can be modeled as a Poisson process with a rate of $\lambda = 4.5$ earthquakes per day. Let $X$ be the time (in days) until the third earthquake. It can be shown that $X$ has a Gamma distribution with shape $\alpha = 3$ (the number of events) and scale $\beta = \frac{1}{\lambda}=\frac{1}{4.5}$ (the average time between consecutive earthquakes), so the average time until the third earthquake is $\alpha\beta = \frac{3}{4.5} \approx 0.67$ days. We can use Python's `random.gammavariate` to simulate the distribution.

In [1]:
from functools import reduce  # needed by p_reduce below
from composable.strict import map, filter
from composable import pipeable

take = pipeable(lambda k, seq: [val for i, val in enumerate(seq) if i < k])

@pipeable
def p_reduce(func, xs, init = None):
    if init is None:
        return reduce(func, xs) # Uses first value as init
    else:
        return reduce(func, xs, init)

In [2]:
from random import gammavariate
?gammavariate

Signature: gammavariate(alpha, beta)
Docstring:
Gamma distribution.  Not the gamma function!

Conditions on the parameters are alpha > 0 and beta > 0.

The probability distribution function is:

            x ** (alpha - 1) * math.exp(-x / beta)
  pdf(x) =  --------------------------------------
              math.gamma(alpha) * beta ** alpha

The mean (expected value) and variance of the random variable are:

    E[X] = alpha * beta
    Var[X] = alpha * beta ** 2
File:      c:\users\ng0471lb\appdata\local\anaconda3\envs\polars\lib\random.py
Type:      method

In [3]:
from composable.sequence import head
N = 1000000
time_between_3_quakes = [gammavariate(3,1/4.5) for i in range(N)]
time_between_3_quakes >> take(5)

[0.2081906124531695,
 0.5284978653922688,
 0.6081465644700771,
 1.1018943749541557,
 0.12770658944777183]
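
As a sanity check on the Gamma claim, recall that the waiting time until the third event of a Poisson process is the sum of three independent exponential gaps, each with rate $\lambda$. The sketch below compares the sample mean from that construction with the sample mean from `gammavariate` and with the theoretical mean $\alpha\beta$. The names `RATE`, `N_EVENTS`, `N_SAMPLES`, and `time_until_third_quake`, along with the seed and the smaller sample size, are illustrative choices and not part of the original notebook.

```python
from random import expovariate, gammavariate, seed
from statistics import mean

seed(0)  # fixed seed so the sketch is reproducible

RATE = 4.5           # earthquakes per day (lambda)
N_EVENTS = 3         # wait for the third earthquake (alpha)
N_SAMPLES = 100_000  # smaller than the notebook's N to keep this quick


def time_until_third_quake():
    """Sum of three exponential gaps between consecutive earthquakes."""
    return sum(expovariate(RATE) for _ in range(N_EVENTS))


from_exponentials = [time_until_third_quake() for _ in range(N_SAMPLES)]
from_gamma = [gammavariate(N_EVENTS, 1 / RATE) for _ in range(N_SAMPLES)]

theoretical_mean = N_EVENTS / RATE  # E[X] = alpha * beta

print("theoretical mean:     ", theoretical_mean)
print("sum of exponentials:  ", mean(from_exponentials))
print("gammavariate:         ", mean(from_gamma))
```

Both sample means should land close to $3/4.5 \approx 0.667$ days, which is what we expect if the two constructions describe the same distribution.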

## Three `for` loop patterns

Almost all `for` loops reinvent one of the following patterns.

1. **Map**ping a function/transformation onto each value.
2. **Filter**ing the values by some boolean condition.
3. **Reduc**ing the values to one or more statistics.
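
To make the three patterns concrete before applying them to the simulated data, here is a minimal sketch using only Python's built-in `map` and `filter` and `functools.reduce`. The list `times` holds made-up values chosen so the arithmetic is exact; it is not the simulated data from this section.

```python
from functools import reduce

times = [0.25, 1.5, 0.5, 2.0]  # made-up waiting times in days

# Map: one output for every input
hours = list(map(lambda t: t * 24, times))

# Filter: keep only the inputs that pass a boolean test
short = list(filter(lambda t: t < 1, times))

# Reduce: collapse all of the inputs into a single value
longest = reduce(lambda best, t: max(best, t), times)

print(hours)    # [6.0, 36.0, 12.0, 48.0]
print(short)    # [0.25, 0.5]
print(longest)  # 2.0
```

Note the shapes: map returns as many values as it was given, filter returns at most as many, and reduce returns one.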

### Map example - Convert the times from days to hours.

In [4]:
# Loop solution
time_in_hours = []
for t in time_between_3_quakes:
    time_in_hours.append(t*24)
time_in_hours >> take(5)

[4.996574698876068,
 12.683948769414453,
 14.595517547281851,
 26.445464998899737,
 3.064958146746524]

In [5]:
# Comprehension solution
([t*24 for t in time_between_3_quakes]
 >> take(5)
)

[4.996574698876068,
 12.683948769414453,
 14.595517547281851,
 26.445464998899737,
 3.064958146746524]

In [6]:
# With pipeable functions
from composable.strict import map

(time_between_3_quakes
 >> map(lambda t: t*24)
 >> take(5)
)

[4.996574698876068,
 12.683948769414453,
 14.595517547281851,
 26.445464998899737,
 3.064958146746524]
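
For comparison, Python's built-in `map` returns a lazy iterator rather than a list, so values are only computed when something asks for them. The sketch below pairs it with `itertools.islice` to pull the first few results; the data and the names `times_in_days`, `lazy_hours`, and `first_two` are made up for illustration.

```python
from itertools import islice

times_in_days = [0.25, 1.5, 0.5, 2.0]  # made-up values

lazy_hours = map(lambda t: t * 24, times_in_days)  # nothing is computed yet
first_two = list(islice(lazy_hours, 2))            # computes only two values

print(first_two)  # [6.0, 36.0]
```

With a million simulated times, this kind of laziness avoids converting every value when only a handful are inspected.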

### Filter Example - Keep only the values less than 1 day.

In [7]:
# loop solution
less_than_1_day = []
for t in time_between_3_quakes:
    if t < 1:
        less_than_1_day.append(t)
less_than_1_day >> take(5)

[0.2081906124531695,
 0.5284978653922688,
 0.6081465644700771,
 0.12770658944777183,
 0.9590552633212068]

In [8]:
# comprehension solution
([t for t in time_between_3_quakes if t < 1]
 >> take(5)
)

[0.2081906124531695,
 0.5284978653922688,
 0.6081465644700771,
 0.12770658944777183,
 0.9590552633212068]

In [9]:
# pipeable functions

(time_between_3_quakes
 >> filter(lambda t: t < 1)
 >> take(5)
)

[0.2081906124531695,
 0.5284978653922688,
 0.6081465644700771,
 0.12770658944777183,
 0.9590552633212068]

### Reduce Example - Accumulating the maximum

#### Option 1 - Set a reasonable initial value

In [10]:
## Loop solution - Use an initial value of zero
max_time = 0 # safe since Gamma is non-negative
for t in time_between_3_quakes:
    max_time = max(max_time, t) # update step
max_time

4.533161898473759

In [11]:
# Functional solution - Initial value of zero
from functools import reduce

reduce(lambda max_time, t: max(max_time, t), time_between_3_quakes, 0)

4.533161898473759

In [12]:
# with init = 0
update_max = lambda m, t: max(m, t)

(time_between_3_quakes
 >> p_reduce(update_max, init = 0)
)

4.533161898473759

#### Option 2 - Use the first element as the initial value

This works because `max(xs) >= xs[0]` for any non-empty list `xs`, so the first element is always a valid starting value.

In [13]:
## Loop solution - Use the first element as the initial value
max_time = time_between_3_quakes[0] # valid start: the maximum is at least the first element
for t in time_between_3_quakes[1:]:
    max_time = max(max_time, t) # update step
max_time

4.533161898473759

#### By default, reduce uses the first element as init

In [14]:
# Functional solution - First element as the initial value
from functools import reduce

reduce(lambda max_time, t: max(max_time, t), time_between_3_quakes) # <--- no third init argument

4.533161898473759

In [15]:
# with init = first value
(time_between_3_quakes
 >> p_reduce(update_max)
)

4.533161898473759
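
The two options give the same answer here, but they are not interchangeable in general. The sketch below shows where each one breaks, using made-up data (`negatives` and an empty list) rather than the simulated waiting times: an initial value of zero is wrong when every value is negative, and omitting the initial value fails on an empty sequence.

```python
from functools import reduce

update_max = lambda m, t: max(m, t)

negatives = [-3.0, -1.5, -2.0]  # made-up data, not waiting times

wrong = reduce(update_max, negatives, 0)  # 0 is not in the data, yet it wins
right = reduce(update_max, negatives)     # starts from the first element

try:
    reduce(update_max, [])                # no first element to start from
except TypeError as err:
    empty_error = str(err)

fallback = reduce(update_max, [], 0)      # init is returned unchanged

print(wrong)     # 0
print(right)     # -1.5
print(fallback)  # 0
```

Zero is a safe initial value for the earthquake data only because Gamma-distributed times are never negative.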

### <font color="red"> Exercise 3.0.5 </font>

Use the reduce pattern to compute the total time as follows.

1. Use a `for` loop with an accumulator first, then
2. Refactor the code to use `reduce`, and finally
3. Discuss your (A) initial value and (B) update function and how they relate to the loop.

In [18]:
total_time = 0
for t in time_between_3_quakes:
    total_time += t
total_time

666251.9403404867

In [24]:
from functools import reduce

reduce(lambda total_time, t: total_time + t, time_between_3_quakes, 0)

666251.9403404867

<font color="orange">
    For both the for loop and the reduce call, the initial value is zero, which is the identity for addition. The update step is also the same in both: the for loop adds each time to total_time, while reduce uses a lambda that adds each time to the running total.
</font>
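
The correspondence between the loop and `reduce` can be made explicit by labelling (A) and (B) in both versions. This is a minimal sketch on made-up values (`times`), using `operator.add` in place of the lambda; the built-in `sum` is included only as a cross-check.

```python
from functools import reduce
from operator import add

times = [0.25, 1.5, 0.5, 2.0]  # made-up values with an exact total

# Loop version
total = 0                # (A) initial value: the identity for addition
for t in times:
    total = total + t    # (B) update step: combine accumulator and value

# reduce version: (B) is the function argument, (A) is the third argument
total_reduce = reduce(add, times, 0)

# Built-in shortcut for this particular reduction
total_builtin = sum(times)

print(total, total_reduce, total_builtin)  # 4.25 4.25 4.25
```

The first parameter of the update function plays the role of the loop's accumulator variable, and the second plays the role of the loop variable.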