(activity1_solution)=
# Activity 1 Solutions: Jupyter and NumPy

**2025-01-30**

---

In [23]:
# import numpy
# the 'as' allows us to use np as a shorthand for numpy
import numpy as np

## 1. Replicating the combined contingency table

We can replicate the calculations we previously did by hand using numpy:

In [24]:
#                               Y=1  Y=0
contingency_overall = np.array([[273, 77],  # T=1
                               [289, 61]])  # T=0

In [25]:
contingency_overall.shape

(2, 2)

Elements of the contingency table are accessed by indexing `array[row, column]`. A bare `:` in either position selects every row or every column.


In [26]:
# select the first row, second column
print(contingency_overall[0, 1])
print("---------------")

# select the second column
print(contingency_overall[:, 1])

print("---------------")
# select the first row
print(contingency_overall[0, :])

77
---------------
[77 61]
---------------
[273  77]


In [27]:
# Compute the marginal probability of Y=1
# TODO your code here
# sum the column where Y=1, divided by the total sum of the entire table
# the :.2f format spec in the f-string rounds the printed value to two decimal places
print(f"{contingency_overall[:, 0].sum() / contingency_overall.sum():.2f}")

0.80


In [28]:
# Compute the conditional probability of Y=1 given T=1
# TODO your code here
print(contingency_overall[0, 0] / contingency_overall[0, :].sum())

0.78


In [29]:
# Compute the conditional probability of Y=1 given T=0
# TODO your code here
print(contingency_overall[1, 0] / contingency_overall[1, :].sum())

0.8257142857142857


**Optional extension**: if you're already comfortable with the coding environment, you can explore the [np.sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html) documentation, particularly the `axis` parameter, to see how you can equivalently compute these probabilities without slicing out the rows or columns.
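
As one possible take on this extension, the sketch below uses the `axis` parameter to get the row and column totals in a single call each. The names `column_totals`, `row_totals`, and `conditional` are illustrative and not part of the activity.

```python
import numpy as np

#                               Y=1  Y=0
contingency_overall = np.array([[273, 77],   # T=1
                                [289, 61]])  # T=0

# axis=0 sums down the rows, giving one total per column (per outcome Y)
column_totals = contingency_overall.sum(axis=0)
# axis=1 sums across the columns, giving one total per row (per treatment T)
row_totals = contingency_overall.sum(axis=1)

# marginal probability of Y=1: total of the Y=1 column over the grand total
p_y1 = column_totals[0] / contingency_overall.sum()

# keepdims=True keeps the row totals as a (2, 1) column so that each row
# of the table is divided by its own total
conditional = contingency_overall / contingency_overall.sum(axis=1, keepdims=True)

print(f"P(Y=1) = {p_y1:.2f}")
print(f"P(Y=1 | T=1) = {conditional[0, 0]:.2f}")
print(f"P(Y=1 | T=0) = {conditional[1, 0]:.2f}")
```

Each row of `conditional` sums to 1, so its first column holds the conditional probabilities of $Y=1$ for $T=1$ and $T=0$.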

---

## 2. Contingency tables for small and large stones


Below are the contingency tables for the two cases, stored as numpy arrays:

In [30]:
# contingency table for C=0, large stones
#                          Y=1  Y=0
contingency_C0 = np.array([[192, 71], # T=1
                          [55, 25]]) # T=0

# contingency table for C=1, small stones
#                          Y=1  Y=0
contingency_C1 = np.array([[81, 6], # T=1
                          [234, 36]]) # T=0

# prints (number of rows, number of columns)
print(contingency_C0.shape)

(2, 2)


We can verify that the two stratified tables add up, element by element, to the overall contingency table:

In [31]:
# np.isclose returns a boolean array indicating, element by element,
# whether the two arrays are equal (within a small tolerance)
np.isclose(contingency_C0 + contingency_C1, contingency_overall)


array([[ True,  True],
       [ True,  True]])

In [32]:
contingency_overall

array([[273,  77],
       [289,  61]])

In [33]:
contingency_C0 + contingency_C1

array([[273,  77],
       [289,  61]])
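
Because the tables hold integer counts, an exact comparison is also possible. The sketch below uses `np.array_equal`, which returns a single boolean rather than an element-wise array; the variable name `tables_match` is illustrative.

```python
import numpy as np

#                               Y=1  Y=0
contingency_overall = np.array([[273, 77],   # T=1
                                [289, 61]])  # T=0
contingency_C0 = np.array([[192, 71],        # T=1, large stones
                           [55, 25]])        # T=0, large stones
contingency_C1 = np.array([[81, 6],          # T=1, small stones
                           [234, 36]])       # T=0, small stones

# True only if both arrays have the same shape and every element matches exactly
tables_match = np.array_equal(contingency_C0 + contingency_C1, contingency_overall)
print(tables_match)
```

`np.isclose` is the more general choice when the arrays contain floating-point values, where exact equality can fail because of rounding.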

Compute the expected value of $Y$ given $T=1$ and $T=0$ for the case of large stones ($C=0$). Because $Y$ is binary, this is the same as the estimated probability of $Y=1$ given each treatment.
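
For a binary outcome, the conditional expectation and the conditional probability of $Y=1$ coincide, which can be sketched as follows. The count notation $n_{y,t,c}$ (the number of patients with $Y=y$, $T=t$, $C=c$) is introduced here for illustration and is not from the activity.

$$E[Y\;|\;T=t, C=c] = 1 \cdot P(Y=1\;|\;T=t, C=c) + 0 \cdot P(Y=0\;|\;T=t, C=c) = P(Y=1\;|\;T=t, C=c)$$

We estimate this from the contingency table as the fraction of successes within a row:

$$\hat{P}(Y=1\;|\;T=t, C=c) = \frac{n_{1,t,c}}{n_{1,t,c} + n_{0,t,c}}$$

For example, with $T=1$ and $C=0$ the estimate is $192 / (192 + 71) = 192/263 \approx 0.73$.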

In [34]:
# TODO your code
# we want Y=1, T=1, C=0
EY_given_T1_C0 = contingency_C0[0,0] / contingency_C0[0, :].sum()
EY_given_T0_C0 = contingency_C0[1,0] / contingency_C0[1, :].sum()

# the f-string allows us to insert the value of the variable into the string
# and the :.2f allows us to round the value to 2 decimal places
print(f"Estimated expected value of Y given T=1 and C=0: {EY_given_T1_C0:.2f}")
print(f"Estimated expected value of Y given T=0 and C=0: {EY_given_T0_C0:.2f}")


Estimated expected value of Y given T=1 and C=0: 0.73
Estimated expected value of Y given T=0 and C=0: 0.69


Which treatment appears to be more successful in the large stone case ($C=0$)?

**Your response:** Open surgery ($T=1$)



Now let's compute the expected value of $Y$ given $T=1$ and $T=0$ for the case of small stones ($C=1$).

In [35]:
# TODO your code
EY_given_T1_C1 = contingency_C1[0,0] / contingency_C1[0, :].sum()
EY_given_T0_C1 = contingency_C1[1,0] / contingency_C1[1, :].sum()

# the f-string allows us to insert the value of the variable into the string
# and the :.2f allows us to round the value to 2 decimal places
print(f"Estimated expected value of Y given T=1 and C=1: {EY_given_T1_C1:.2f}")
print(f"Estimated expected value of Y given T=0 and C=1: {EY_given_T0_C1:.2f}")


Estimated expected value of Y given T=1 and C=1: 0.93
Estimated expected value of Y given T=0 and C=1: 0.87


Which treatment appears to be more successful in the small stone case ($C=1$)?

**Your response:** Open surgery ($T=1$)


Note that we've split out the contingency tables for each stone case, but what we're effectively doing is computing the following conditional expectations:

$$E[Y\;|\;T=t, C=c]$$

which we read as "the expected value of $Y$ given that $T$ equals $t$ and $C$ equals $c$".
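
As a minimal sketch of this idea, the two tables can be stacked into a single three-dimensional array so that all four conditional expectations come out of one computation. The names `tables`, `conditional`, and `EY_given_T_C` are illustrative, and the axis ordering (stone size, then treatment, then outcome) is an assumption made for this example.

```python
import numpy as np

#                          Y=1  Y=0
contingency_C0 = np.array([[192, 71],    # T=1, large stones
                           [55, 25]])    # T=0, large stones
contingency_C1 = np.array([[81, 6],      # T=1, small stones
                           [234, 36]])   # T=0, small stones

# stack along a new leading axis: tables[c] is the table for C=c,
# its rows are T=1 then T=0, and its columns are Y=1 then Y=0
tables = np.stack([contingency_C0, contingency_C1])  # shape (2, 2, 2)

# divide every row by its own total (the sum over the outcome axis)
conditional = tables / tables.sum(axis=2, keepdims=True)

# keep only the Y=1 column: entry [c, row] estimates E[Y | T, C=c],
# where row 0 is T=1 and row 1 is T=0
EY_given_T_C = conditional[:, :, 0]

for c in (0, 1):
    for row, t in enumerate((1, 0)):
        print(f"E[Y | T={t}, C={c}] = {EY_given_T_C[c, row]:.2f}")
```

The printed values should match the per-table results computed earlier, since the arithmetic is the same and only the bookkeeping differs.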
