# Context

One day a team lead notices that some members of their team wear cool hats, and that these members tend to be less productive. Being data driven, the team lead starts to record whether or not each team member wears a cool hat (X=1 for a cool hat, X=0 for no cool hat) and whether or not they are productive (Y=1 for productive, Y=0 for unproductive).

In [15]:
import numpy as np
import pandas as pd

p_z = 0.5
p_x_z = [0.9, 0.1]
p_y_xz = [0.2, 0.4, 0.6, 0.8]
z = np.random.binomial(n=1, p=p_z, size=500)
p_x = np.choose(z, p_x_z)
x = np.random.binomial(n=1, p=p_x, size=500)
p_y = np.choose(x+2*z, p_y_xz)
y = np.random.binomial(n=1, p=p_y, size=500)

df = pd.DataFrame({"x": x, "y": y})
print(df.shape)
df.head()

(500, 2)


   x  y
0  0  1
1  1  0
2  0  1
3  0  0
4  1  0


ATE = P(Y=1|X=1) − P(Y=1|X=0)

In [16]:
def estimate_uplift(ds):
    """
    Estimate the difference in means between two groups.
    "estimated_effect" - the difference in mean values of $y$ for treated and untreated samples.
    "standard_error" - half-width of the 95% confidence interval around "estimated_effect" (1.96 standard errors)
    """
    base = ds[ds.x == 0]
    variant = ds[ds.x == 1]
    delta = variant.y.mean() - base.y.mean()
    delta_err = 1.96 * np.sqrt(variant.y.var() / variant.shape[0] + base.y.var() / base.shape[0])
    return {"estimated_effect": delta, "standard_error": delta_err}

estimate_uplift(df)

{'estimated_effect': -0.10809772956367297,
 'standard_error': 0.08728843545324436}

It looks like people with cool hats are less productive.

In [21]:
from scipy.stats import chi2_contingency

tmp = df.assign(placeholder=1).pivot_table(index='x', columns='y', values='placeholder', aggfunc='sum')
result = chi2_contingency(tmp.to_numpy(), lambda_='log-likelihood')
print(f'p-value: {result[1]:.4f}; <0.005: {result[1] < 0.005}')

p-value: 0.0198; <0.005: False


We can use this information to make statements about how likely someone is to be productive if we see them wearing a cool hat. As long as we believe that they are "drawn from the same distribution" as our previous observations, we expect the same correlations to exist.
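As a minimal sketch of such a statement, the cell below re-simulates the same generating process as the first cell and estimates P(Y=1|X=x) as the share of productive people in each hat group. The names `rng`, `sim`, and `p_productive_given_hat`, the seed, and the sample size are assumptions made for this illustration and are not part of the notebook above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 100_000                     # larger than the 500 rows above, to reduce noise

# Same generating process as the first cell: z drives both x and y
z = rng.binomial(n=1, p=0.5, size=n)
x = rng.binomial(n=1, p=np.choose(z, [0.9, 0.1]))
y = rng.binomial(n=1, p=np.choose(x + 2 * z, [0.2, 0.4, 0.6, 0.8]))
sim = pd.DataFrame({"x": x, "y": y})

# Empirical P(Y=1 | X=x): the share of productive people in each hat group
p_productive_given_hat = sim.groupby("x")["y"].mean()
print(p_productive_given_hat)
```

These conditional frequencies are predictions about people who are observed wearing (or not wearing) a hat; they say nothing yet about what would happen if someone were made to wear one.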

In [14]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures


def generate_dataset(n_samples=500, set_X=None, show_z=False):
    p_z = 0.5
    p_x_z = [0.9, 0.1]
    p_y_xz = [0.2, 0.4, 0.6, 0.8]
    z = np.random.binomial(n=1, p=p_z, size=n_samples)
    if set_X is not None:
        assert len(set_X) == n_samples
        x = set_X
    else:
        p_x = np.choose(z, p_x_z)
        x = np.random.binomial(n=1, p=p_x, size=n_samples)
    p_y = np.choose(x+2*z, p_y_xz)
    y = np.random.binomial(n=1, p=p_y, size=n_samples)
    if show_z:
        return pd.DataFrame({"x":x, "y":y, "z":z})
    return pd.DataFrame({"x":x, "y":y})


def run_ab_test(datagenerator, n_samples=10000, filter_=None):
    n_samples_a = int(n_samples / 2)
    n_samples_b = n_samples - n_samples_a
    set_X = np.concatenate([np.ones(n_samples_a), np.zeros(n_samples_b)]).astype(np.int64)
    ds = datagenerator(n_samples=n_samples, set_X=set_X)
    if filter_ is not None:
        ds = ds[filter_(ds)].copy()
    return estimate_uplift(ds)

run_ab_test(generate_dataset)

{'estimated_effect': 0.19799999999999995,
 'standard_error': 0.01921192118407518}
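
The A/B test estimate has the opposite sign from the observational estimate obtained earlier. One way to see where the gap comes from is to compute both differences exactly from the probabilities hard-coded in `generate_dataset`. The sketch below does that; the helper names `p_y_given_x_observed` and `p_y_given_x_assigned` are hypothetical and introduced only for this illustration.

```python
import numpy as np

# Parameters copied from generate_dataset above
p_z = 0.5                                 # P(Z=1)
p_x_z = np.array([0.9, 0.1])              # P(X=1 | Z=z) for z = 0, 1
p_y_xz = np.array([0.2, 0.4, 0.6, 0.8])   # P(Y=1 | X=x, Z=z), indexed by x + 2*z

pz = np.array([1 - p_z, p_z])             # P(Z=z) for z = 0, 1
z_values = np.arange(2)


def p_y_given_x_observed(x):
    """P(Y=1 | X=x) when x depends on z: weight each z by P(Z=z | X=x)."""
    p_x_given_z = p_x_z if x == 1 else 1 - p_x_z
    joint = p_x_given_z * pz              # P(X=x, Z=z)
    p_z_given_x = joint / joint.sum()     # Bayes' rule
    return float(np.sum(p_z_given_x * p_y_xz[x + 2 * z_values]))


def p_y_given_x_assigned(x):
    """P(Y=1) when x is set independently of z (as in run_ab_test): weight each z by P(Z=z)."""
    return float(np.sum(pz * p_y_xz[x + 2 * z_values]))


observational_diff = p_y_given_x_observed(1) - p_y_given_x_observed(0)
assigned_diff = p_y_given_x_assigned(1) - p_y_given_x_assigned(0)
print(f"observational difference: {observational_diff:.3f}")  # -0.120
print(f"assigned (A/B) difference: {assigned_diff:.3f}")      # 0.200
```

Under these parameters, the two simulated estimates above (about -0.11 and about 0.20) are consistent with the exact values: within each level of `z`, wearing a hat raises the probability of being productive by 0.2, but hat wearers are mostly drawn from the `z=0` group, whose baseline productivity is lower.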