# A baseball player is to play in the World Series

# Based upon his season play, you estimate that if he comes to bat four times in a game the number of hits he will get has a distribution

# $p_{X}(x) = \begin{cases}0.4 & x = 0\\ 0.2 & x = 1\\ 0.2 & x = 2\\ 0.1 & x = 3\\ 0.1 & x = 4\end{cases}$

# Assume the player has four at bats in each game

## a) Let $X$ denote the number of hits that he gets in a series. Using the program *NFoldConvolution*, find the distribution of $X$ for each of the possible series lengths (4, 5, 6, or 7)

## b) Using the distribution found in a), find the probability that his batting average is over 0.400 in a four game series

## c) Given the distribution $p_{X}$, what is his long term batting average?

____

# a)

In [22]:
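def P_X(x):
    # Probability of exactly x hits in a single game (four at bats)
    if x == 0:
        return 0.4
    elif x in [1,2]:
        return 0.2
    elif x in [3,4]:
        return 0.1
    else:
        # More than four hits in one game is impossible
        return 0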

In [25]:
def NFoldConvolution(x, n):
    # P(X_1 + ... + X_n = x) for n independent games, each distributed as P_X
    if n == 1:
        return P_X(x)
    # Condition on i hits in one game; the other n-1 games must supply x-i hits
    total = 0
    for i in range(x+1):
        total += NFoldConvolution(x-i, n-1)*P_X(i)
    return total
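
As a cross-check on the recursion above, the same distributions can be built iteratively by repeated discrete convolution. This is only a sketch in plain Python: `convolve`, `n_fold_pmf`, and `series_pmfs` are hypothetical names introduced here, not part of the original notebook.

```python
# Single-game distribution: list index = number of hits (0..4)
p_single = [0.4, 0.2, 0.2, 0.1, 0.1]

def convolve(a, b):
    """Discrete convolution of two pmfs given as lists indexed by value."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

def n_fold_pmf(p, n):
    """pmf of the sum of n independent draws from p (hypothetical helper)."""
    pmf = [1.0]  # empty sum: zero hits with probability 1
    for _ in range(n):
        pmf = convolve(pmf, p)
    return pmf

# One pmf per possible series length; each has 4*n + 1 entries
series_pmfs = {n: n_fold_pmf(p_single, n) for n in (4, 5, 6, 7)}
for n, pmf in series_pmfs.items():
    print(n, len(pmf), round(sum(pmf), 10))
```

Each loop iteration costs time proportional to the current pmf length, so this avoids the repeated recomputation that the recursive version performs for larger `n`.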




In [40]:
dict_probabilities = {}

for n_games in [4, 5, 6, 7]:
    probs = []
    for n_hits in range(4*n_games+1):
        val = NFoldConvolution(n_hits, n_games)
        probs.append(val)
    dict_probabilities[n_games] = probs

In [28]:
import pandas as pd

In [41]:
df = pd.DataFrame(index = range(7*4+1))

for n_games in [4,5,6,7]:
    df.loc[:4*n_games, n_games] = dict_probabilities[n_games]

In [42]:
df

| Hits | 4 games | 5 games | 6 games | 7 games |
|---|---|---|---|---|
| 0 | 0.0256 | 0.01024 | 0.004096 | 0.0016384 |
| 1 | 0.0512 | 0.0256 | 0.012288 | 0.0057344 |
| 2 | 0.0896 | 0.0512 | 0.027648 | 0.014336 |
| 3 | 0.1152 | 0.0768 | 0.047104 | 0.0272384 |
| 4 | 0.1424 | 0.1056 | 0.071424 | 0.0451584 |
| 5 | 0.1408 | 0.12192 | 0.092928 | 0.0648704 |
| 6 | 0.1312 | 0.1296 | 0.110144 | 0.0844032 |
| 7 | 0.1056 | 0.1224 | 0.117504 | 0.0994688 |
| 8 | 0.0808 | 0.108 | 0.116352 | 0.1085056 |
| 9 | 0.0528 | 0.0856 | 0.105472 | 0.1092672 |


_____

# b)

# A four game series has 16 at bats, so a batting average over 0.400 requires more than $0.400 \times 16 = 6.4$ hits, i.e. at least 7 hits

In [44]:
df.loc[7:, 4].sum()

0.3040000000000001

# So $P(\text{Batting Avg.} > 0.400) = P(X \geq 7) = 0.304$
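
The tail probability can also be checked without the DataFrame. The sketch below rebuilds the four game distribution by direct convolution and finds the hit threshold with exact fractions, so no floating-point comparison against 0.400 is involved. The names `pmf`, `min_hits`, and `prob_over_400` are illustrative assumptions, not taken from the notebook.

```python
from fractions import Fraction

# Single-game distribution: list index = number of hits (0..4)
p_single = [0.4, 0.2, 0.2, 0.1, 0.1]
N_GAMES = 4
AT_BATS = 4 * N_GAMES  # 16 at bats in a four game series

# Build the distribution of total hits over four games
pmf = [1.0]
for _ in range(N_GAMES):
    new = [0.0] * (len(pmf) + len(p_single) - 1)
    for i, a in enumerate(pmf):
        for j, b in enumerate(p_single):
            new[i + j] += a * b
    pmf = new

# Smallest hit count whose average strictly exceeds 0.400 (= 2/5)
min_hits = next(h for h in range(AT_BATS + 1)
                if Fraction(h, AT_BATS) > Fraction(2, 5))

prob_over_400 = sum(pmf[min_hits:])
print(min_hits, round(prob_over_400, 6))
```

Because 6.4 hits is not attainable, "over 0.400" and "at least 0.400" describe the same event here.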

_____

# c)

# Let $S_{n}$ be the number of hits in $n$ games

# Then $S_{n} = X_{1} + X_{2} + \cdots + X_{n}$ where each $X_{i}$ represents the number of hits in a single game

# Then $E(S_{n}) = E(X_{1}) + E(X_{2}) + \cdots + E(X_{n})$

# $E(X_{i}) = (1)(0.2) + (2)(0.2) + (3)(0.1) + (4)(0.1)= 1.3$

# $\implies E(S_{n}) = n(1.3)$

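# Then, by linearity of expectation, $E(\text{Batting Avg. in }n\text{ games}) = E\left(\frac{S_{n}}{4n}\right) = \frac{E(S_{n})}{4n} = \frac{1.3n}{4n} = 0.325$

# This value does not depend on $n$, and by the law of large numbers $\frac{S_{n}}{4n}$ converges to it, so his long term batting average is 0.325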
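
The expectation can be reproduced numerically, and a quick simulation gives a rough sanity check on the long term claim. This is a sketch under stated assumptions: the seed and the number of simulated games are arbitrary choices, and the variable names are hypothetical.

```python
import random

hits = [0, 1, 2, 3, 4]
probs = [0.4, 0.2, 0.2, 0.1, 0.1]
AT_BATS_PER_GAME = 4

# Exact expectation: E(X) = sum of x * p(x), then divide by at bats per game
expected_hits = sum(h * p for h, p in zip(hits, probs))
expected_avg = expected_hits / AT_BATS_PER_GAME
print(expected_hits, expected_avg)

# Simulated long-run average over many independent games
random.seed(0)      # arbitrary seed, only for repeatability
n_games = 200_000   # arbitrary sample size
total_hits = sum(random.choices(hits, weights=probs, k=n_games))
simulated_avg = total_hits / (AT_BATS_PER_GAME * n_games)
print(simulated_avg)
```

The simulated value should land close to the exact one, with the gap shrinking as `n_games` grows.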