## Part 1
With a per-game win probability of $p=0.65$ and $N=7$ games, the number of wins $w$ follows a binomial distribution, so its expectation and variance are given by

$$E(w) = N\cdot p = 4.55$$

and

$$Var(w) = Np(1-p) \approx 1.59$$

The PMF for winning $k$ games out of $N$ is given by the binomial distribution

$$pr(k) = \binom{N}{k} p^k(1-p)^{(N-k)} = \frac{N!}{k!(N - k)!}p^k(1-p)^{(N-k)}$$

For our particular values, we get the following PMF

<table>
    <tr>
        <th>k</th>
        <td>0</td>
        <td>1</td>
        <td>2</td>
        <td>3</td>
        <td>4</td>
        <td>5</td>
        <td>6</td>
        <td>7</td>
    </tr>
    <tr>
        <th>pr(k)</th>
        <td>0.0006</td>
        <td>0.0084</td>
        <td>0.0466</td>
        <td>0.1442</td>
        <td>0.2679</td>
        <td>0.2985</td>
        <td>0.1848</td>
        <td>0.0490</td>
    </tr>
</table>
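
As a cross-check, the table entries can be reproduced directly from the binomial formula. The following is a minimal sketch using only the standard library; the helper name `binom_pmf` is a hypothetical choice for illustration and is not part of the original solution.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent games (illustrative helper)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

N, p = 7, 0.65
pmf = [binom_pmf(k, N, p) for k in range(N + 1)]

# print k alongside pr(k), rounded to four decimals like the table
for k, prob in enumerate(pmf):
    print(f"{k:3d}  {prob:0.4f}")
```

The eight probabilities sum to 1 (up to floating-point rounding), which is a quick sanity check on the table.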




In [1]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

In [2]:
# Simulate the seven-game series

# define constants
nTrials = int(1e6)
N , p = 7 , 0.65

# do the actual simulation
nWins = stats.binom(n=N,p=p).rvs(nTrials)

# count how often each number of wins (0..N) occurs; bincount with
# minlength keeps pct[k] aligned with k even if some k never occurs
counts = np.bincount(nWins, minlength=N+1)

# ... and convert into relative frequencies
pct = counts/nTrials

# print results
print("Expected value simulated   = " + f"{nWins.mean():0.3f}")
print("               theoretical = " + f"{N*p:0.3f}" )

print("")

print("Variance       simulated   = " + f"{nWins.var():0.3f}")
print("               theoretical = " + f"{N*p*(1-p):0.3f}" )

print("")

print("PMF")
print("  k     sim     theoretical")
print("---------------------------")
for k in range(N+1):
    print(
        f"{k:3d}" + "\t" + 
        f"{pct[k]:0.3f}" + "\t" +
        f"{stats.binom(n=N,p=p).pmf(k):0.3f}"
    )


Expected value simulated   = 4.547
               theoretical = 4.550

Variance       simulated   = 1.593
               theoretical = 1.592

PMF
  k     sim     theoretical
---------------------------
  0	0.001	0.001
  1	0.008	0.008
  2	0.047	0.047
  3	0.144	0.144
  4	0.269	0.268
  5	0.298	0.298
  6	0.184	0.185
  7	0.049	0.049


## Part 2
The probability of winning the series is the probability of winning four or more of the $N=7$ games. Formally:

$$pr(win) = \sum_{k=4}^{7} \binom{N}{k} p^k(1-p)^{(N-k)}$$

We can compute this value using a loop:

In [3]:
pr_win_theoretical = 0
for k in range(4,8):
    pr_win_theoretical += stats.binom(n=N,p=p).pmf(k)

We can use our earlier simulation to determine the simulated probability:

In [4]:
pr_win_simulated = np.count_nonzero(nWins>=4) / nTrials

And finally we can print the results:

In [5]:
print("Likelihood of winning series theoretical " + f"{pr_win_theoretical:0.3f}")
print("                             simulated   " + f"{pr_win_simulated:0.3f}")

Likelihood of winning series theoretical 0.800
                             simulated   0.800
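
The calculation in this part treats all seven games as played, whereas a real series stops once one side has four wins. The two views give the same answer, because playing out the remaining games cannot change who reached four wins first. A minimal sketch of that cross-check, using only the standard library (the variable names are illustrative and do not come from the original notebook):

```python
from math import comb

N, p = 7, 0.65
need = 4  # wins required to take a best-of-seven series

# View 1: play all seven games and win at least four of them
pr_all_played = sum(
    comb(N, k) * p**k * (1 - p)**(N - k) for k in range(need, N + 1)
)

# View 2: stop at the fourth win. The series ends on a win after j losses,
# and the (need - 1 + j) games before that final win contain exactly j losses.
pr_stop_early = sum(
    comb(need - 1 + j, j) * p**need * (1 - p)**j for j in range(need)
)

print(f"all seven played : {pr_all_played:0.4f}")
print(f"stop at four wins: {pr_stop_early:0.4f}")
```

Both values agree with each other and with the 0.800 reported above.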


## Part 3
Find the PMF for the length of the longest run of consecutive heads in a sequence of six coin flips.

I started by defining a helper function that returns the length of the longest run of consecutive heads (encoded as ones):

In [6]:
# count the biggest number of consecutive ones 
# in the sequence
nFlips = 6

def count_repeats(x):
    # there is a less ugly way to do this but this is straightforward
    # y[i] is the length of the run of ones that ends at position i
    y = np.zeros( x.shape )
    y[0] = x[0]
    for i in range(1, len(x)):
        if x[i]==1:
            y[i] = y[i-1] + 1
    return y.max()
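
The comment in `count_repeats` mentions that a tidier approach exists. One possibility, sketched here under the assumption that the input is any sequence of 0s and 1s, uses `itertools.groupby` to split the sequence into blocks of equal values. The name `longest_run` is hypothetical and is not used elsewhere in this solution.

```python
from itertools import groupby

def longest_run(x, value=1):
    """Length of the longest run of `value` in the sequence x (0 if none)."""
    # groupby yields (key, group) for each maximal block of equal elements;
    # keep only the blocks made of `value` and take the longest one
    return max(
        (sum(1 for _ in group) for key, group in groupby(x) if key == value),
        default=0,
    )

print(longest_run([1, 1, 0, 1, 1, 1]))  # → 3
print(longest_run([0, 0, 0, 0, 0, 0]))  # → 0
```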

Determine the theoretical PMF by cycling through all $2^6 = 64$ equally likely combinations of heads and tails and finding the longest run of consecutive heads in each one. Then tally how often each run length occurs and divide by 64.

In [7]:
# compute theoretical PMF
# generate all equally-likely 64 outcomes
# count the frequency of occurrences of duplicates
# 
# alternatively you can do this by hand

y = []
for i in range(64):
    # generate all 64 combinations of 1's and 0's
    x = np.array([
        (i//32)%2,
        (i//16)%2,
        (i//8)%2,
        (i//4)%2,
        (i//2)%2,
        (i//1)%2
    ])
    
    # for each combination, count the max number
    # of consecutive ones
    y.append ( count_repeats(x) )

# y is 64 elements long and contains the max number of consecutive 1's in each
# of the input combinations

# convert to numpy array
y    = np.array(y)

# count the frequency of each number
vals = np.arange(nFlips+1)
cnt  = [np.count_nonzero(y==i) for i in range(nFlips+1)]

# normalize to get the probabilities for the PMF
theoretical = np.array(cnt) / 64
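
The comments in this cell note that the PMF can also be worked out by hand. One way to do so, offered here as an assumption rather than as the method used in this solution, is to count the sequences whose longest run of heads is at most $m$ with a simple recurrence and then take differences. The function name `count_at_most` is hypothetical.

```python
def count_at_most(n, m):
    """Number of length-n head/tail sequences whose longest run of heads is <= m."""
    # a[j] counts valid sequences of length j; every sequence is valid while j <= m
    a = [2**j for j in range(min(n, m) + 1)]
    for j in range(m + 1, n + 1):
        # a valid sequence starts with i heads (0 <= i <= m), then one tail,
        # then any valid sequence of length j - i - 1
        a.append(sum(a[j - i - 1] for i in range(m + 1)))
    return a[n]

nFlips = 6
cumulative = [count_at_most(nFlips, m) for m in range(nFlips + 1)]

# number of sequences whose longest run is exactly m
counts = [cumulative[0]] + [
    cumulative[m] - cumulative[m - 1] for m in range(1, nFlips + 1)
]
pmf = [c / 2**nFlips for c in counts]

print(counts)  # → [1, 20, 23, 12, 5, 2, 1]
```

Dividing these counts by 64 gives the same probabilities as the brute-force enumeration.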


Now conduct the actual simulation. Each coin flip is a Bernoulli random variable with $p=0.5$, where 1 represents heads.

In [8]:
# count duplicate adjacent Heads in a six-flip experiment

nTrials = int(1e5)

# create a 2D matrix of coin flips: nTrials rows x nFlips (6) columns
process = stats.bernoulli(p=0.5)
values  = process.rvs( (nTrials,nFlips) )

# find the longest run of consecutive heads in each of the nTrials rows
nDuplicates = [ count_repeats(values[i]) for i in range(nTrials) ]
nDuplicates = np.array(nDuplicates)

# tally how often each run length (0..nFlips) occurs across the trials
vals = np.arange(nFlips+1)
cnt  = np.array([np.count_nonzero(nDuplicates==i) for i in range(nFlips+1)])

# normalize to get relative frequencies
cnt = cnt / nTrials

# print results

print("PMF")
print("  k     sim     theoretical")
print("---------------------------")
for k in range(nFlips+1):
    print(
        f"{k:3d}" + "\t" + 
        f"{cnt[k]:0.3f}" + "\t" + 
        f"{theoretical[k]:0.3f}"
    )


PMF
  k     sim     theoretical
---------------------------
  0	0.015	0.016
  1	0.313	0.312
  2	0.359	0.359
  3	0.188	0.188
  4	0.079	0.078
  5	0.031	0.031
  6	0.015	0.016
