
# Item Response Theory (IRT) — 1PL (Rasch) with Bonus 2PL — Python Notebook

**When to Use**  
- Measuring a **latent trait** (ability, brand affinity, satisfaction) from item responses (correct/incorrect, agree/disagree).  
- Building **short, adaptive questionnaires** that are comparable across populations and time.

**Best Application**  
- Psychometrics, **brand loyalty/sentiment scales**, and survey measurement where items have varying difficulty.  
- **Computerized adaptive testing (CAT)** or respondent scoring from partial item sets.

**When Not to Use**  
- When items do not measure a **single underlying construct** (unidimensionality violated). Use factor analysis or multidimensional IRT.  
- With purely **continuous outcomes**; consider latent-variable SEM/GLMs instead.

**How to Interpret Results**  
- **Person ability (θ)**: latent trait level per respondent on a standardized scale (mean≈0).  
- **Item difficulty (b)**: trait level where P(correct/agree)=0.5 (higher b = harder/stricter item).  
- **Item discrimination (a)** (2PL): slope/steepness; higher a = more information around b.  
- Use **item characteristic curves (ICC)** and **test information** to judge measurement precision by trait level.
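
As a quick numeric illustration of these definitions, the sketch below evaluates the 2PL response function P(θ) = 1 / (1 + exp(−a(θ − b))) and the corresponding item information a²·P·(1 − P). The function names `icc_2pl` and `item_info_2pl` and the parameter values are illustrative assumptions, not part of the fitted model in this notebook.

```python
import numpy as np
from scipy.special import expit

def icc_2pl(theta, a, b):
    """Probability of a correct/agree response under the 2PL model."""
    return expit(a * (theta - b))

def item_info_2pl(theta, a, b):
    """Fisher information of one 2PL item at ability theta."""
    p = icc_2pl(theta, a, b)
    return a**2 * p * (1.0 - p)

# Hypothetical item: discrimination 1.5, difficulty 0.8
a, b = 1.5, 0.8
p_at_b = icc_2pl(b, a, b)            # 0.5 by definition of b
info_at_b = item_info_2pl(b, a, b)   # peak information, a**2 / 4

# Locate the information peak on a grid; it should sit at theta = b
theta_grid = np.linspace(-3, 3, 601)
theta_peak = theta_grid[np.argmax(item_info_2pl(theta_grid, a, b))]
```

Setting `a = 1` recovers the Rasch case, where every item peaks at an information of 0.25.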


In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.special import expit
from scipy.optimize import minimize

pd.set_option('display.max_columns', 200)
plt.rcParams['figure.figsize'] = (8,4)
rng = np.random.default_rng(123)


### Data: Simulate respondent abilities and item parameters (2PL ground truth)

In [None]:

N = 500
J = 15

theta_true = rng.normal(0, 1, size=N)
a_true = rng.lognormal(mean=0.1, sigma=0.2, size=J)
b_true = rng.normal(0, 1, size=J)

P = expit((theta_true[:,None] - b_true[None,:]) * a_true[None,:])
Y = rng.binomial(1, P)
Y[:5,:5]


## Fit 1PL (Rasch) via Joint MLE with Identifiability Constraints

In [None]:

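def pack(theta, b):
    # Free parameters: all N abilities plus the first J-1 difficulties.
    return np.r_[theta, b[:-1]]

def unpack(w, N, J):
    # Rasch probabilities depend only on theta - b, so a single location
    # constraint (sum(b) = 0) identifies the model. Forcing sum(theta) = 0
    # as well would over-constrain the fit.
    theta = w[:N]
    b_free = w[N:]
    b = np.r_[b_free, -b_free.sum()]
    return theta, b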

def nll_rasch(w, Y, l2=1e-3):
    N, J = Y.shape
    theta, b = unpack(w, N, J)
    eta = theta[:,None] - b[None,:]
    p = expit(eta)
    eps = 1e-9
    p = np.clip(p, eps, 1-eps)
    ll = Y*np.log(p) + (1-Y)*np.log(1-p)
    pen = l2*(np.sum(theta**2) + np.sum(b**2))
    return -ll.sum() + pen

theta0 = (Y.mean(axis=1) - 0.5) * 2.0
b0 = -(Y.mean(axis=0) - 0.5) * 2.0
w0 = pack(theta0, b0)

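# Gradients are approximated by finite differences (about N+J function calls each),
# so raise L-BFGS-B's function-evaluation cap above its default of 15000.
res = minimize(nll_rasch, w0, args=(Y, 1e-3), method='L-BFGS-B', options={'maxfun': 500_000})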
theta_hat, b_hat = unpack(res.x, N, J)

res.success, float(res.fun)
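
With N + J parameters, a finite-difference gradient costs roughly one likelihood evaluation per parameter at every step. A hedged sketch of the same penalized Rasch objective with an analytic gradient follows. It is self-contained and uses its own small simulated data set; `rasch_nll_grad`, `N_demo`, `J_demo`, and the choice to leave the parameters unconstrained during optimization (relying on the L2 penalty, then shifting so that mean b = 0 afterwards) are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def rasch_nll_grad(w, Y, l2=1e-3):
    """Penalized Rasch negative log-likelihood and its gradient.

    w stacks N abilities followed by J difficulties (no constraint applied here).
    """
    N, J = Y.shape
    theta, b = w[:N], w[N:]
    eta = theta[:, None] - b[None, :]
    # -[y*log(p) + (1-y)*log(1-p)] written stably as log(1 + e^eta) - y*eta
    nll = np.sum(np.logaddexp(0.0, eta) - Y * eta) + l2 * np.sum(w**2)
    resid = expit(eta) - Y                      # d nll / d eta
    g_theta = resid.sum(axis=1) + 2 * l2 * theta
    g_b = -resid.sum(axis=0) + 2 * l2 * b       # eta decreases as b increases
    return nll, np.r_[g_theta, g_b]

# Small simulated Rasch data set (hypothetical sizes, separate from the notebook's Y)
rng_demo = np.random.default_rng(0)
N_demo, J_demo = 200, 10
theta_sim = rng_demo.normal(0, 1, N_demo)
b_sim = rng_demo.normal(0, 1, J_demo)
Y_demo = rng_demo.binomial(1, expit(theta_sim[:, None] - b_sim[None, :]))

# jac=True tells minimize that the objective returns (value, gradient)
w_start = np.zeros(N_demo + J_demo)
fit = minimize(rasch_nll_grad, w_start, args=(Y_demo,), jac=True, method='L-BFGS-B')

# Probabilities depend only on theta - b, so shift both to put mean(b) at 0
shift = fit.x[N_demo:].mean()
theta_fit = fit.x[:N_demo] - shift
b_fit = fit.x[N_demo:] - shift
```

Because the gradient comes back in the same call as the objective, each optimizer step needs one pass over the response matrix instead of one pass per parameter.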


### Rasch Diagnostics: Person/Item summaries and calibration plots

In [None]:

theta_df = pd.DataFrame({'theta_hat': theta_hat})
item_df = pd.DataFrame({'b_hat': b_hat, 'a_assumed': 1.0, 'p_correct': Y.mean(axis=0)})
theta_df.describe(), item_df.describe()


In [None]:

P_hat_rasch = expit(theta_hat[:,None] - b_hat[None,:])
obs = Y.mean(axis=0)
pred = P_hat_rasch.mean(axis=0)

plt.scatter(obs, pred)
plt.plot([0,1],[0,1],'k--')
plt.xlabel('Observed mean correct (per item)')
plt.ylabel('Predicted mean (Rasch)')
plt.title('Item Calibration: Rasch')
plt.show()


### Item Characteristic Curves (ICC)

In [None]:

theta_grid = np.linspace(-3, 3, 200)
plt.figure()
for j in range(J):
    p = expit(theta_grid - b_hat[j])
    plt.plot(theta_grid, p, alpha=0.5)
plt.xlabel('Ability θ')
plt.ylabel('P(correct)')
plt.title('Rasch ICCs (a=1)')
plt.show()


### Test Information (Rasch)

In [None]:

def info_rasch(theta, b):
    p = expit(theta - b)
    return p*(1-p)

info = np.zeros_like(theta_grid)
for j in range(J):
    info += info_rasch(theta_grid, b_hat[j])
plt.plot(theta_grid, info)
plt.xlabel('Ability θ')
plt.ylabel('Information (sum over items)')
plt.title('Test Information (Rasch)')
plt.show()
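
Test information translates directly into measurement precision: the asymptotic standard error of an ability estimate is SE(θ) = 1/√I(θ). The sketch below shows the conversion on a hypothetical difficulty vector `b_demo` (not the fitted `b_hat`); the helper names `rasch_test_info` and `se_theta` are assumptions for illustration.

```python
import numpy as np
from scipy.special import expit

def rasch_test_info(theta, b):
    """Sum of Rasch item informations p*(1-p) at each ability in theta."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    p = expit(theta[:, None] - np.asarray(b, dtype=float)[None, :])
    return (p * (1.0 - p)).sum(axis=1)

def se_theta(theta, b):
    """Asymptotic standard error of the ability estimate: 1 / sqrt(information)."""
    return 1.0 / np.sqrt(rasch_test_info(theta, b))

# Hypothetical 4-item test with all difficulties at 0
b_demo = np.zeros(4)
se_center = se_theta(0.0, b_demo)[0]   # each item contributes 0.25, so SE = 1.0
se_tail = se_theta(3.0, b_demo)[0]     # far from the items, precision drops
```

Plotting `se_theta(theta_grid, b_hat)` next to the information curve would show where on the trait scale the item set measures well and where more items are needed.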


### Scoring: Estimate θ for New Respondents Given Fixed Items

In [None]:

new_Y = np.array([
    [1,1,1,1,1, 1,0,1,0,0, 1,0,1,1,0],
    [0,0,0,0,0, 0,0,1,0,0, 1,0,0,0,0],
    [1,0,1,0,1, 0,1,0,1,0, 1,0,1,0,1],
], dtype=float)  # float dtype so unanswered items can be coded as np.nan

def estimate_theta_fixed_items(y_row, b, l2=1e-3):
    mask = ~np.isnan(y_row)
    y = y_row[mask]
    bj = b[mask]
    def nll(theta):
        p = expit(theta - bj)
        eps=1e-9
        p = np.clip(p, eps, 1-eps)
        return -(y*np.log(p) + (1-y)*np.log(1-p)).sum() + l2*theta**2
    r = minimize(lambda th: nll(th[0]), x0=np.array([0.0]), method='L-BFGS-B')
    return r.x[0]

thetas_new = np.array([estimate_theta_fixed_items(row, b_hat) for row in new_Y])
thetas_new
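
Scoring from a partial item set works the same way once skipped items are coded as NaN and dropped from the likelihood. The following is a minimal sketch under assumptions: `score_partial`, the three-item bank `b_bank`, and the search interval of ±6 logits are hypothetical, and the bounded scalar optimizer is swapped in here as one reasonable choice for a one-parameter problem.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit

def score_partial(y_row, b, l2=1e-3, bounds=(-6.0, 6.0)):
    """Penalized ML ability estimate using only the answered items (NaN = skipped)."""
    y_row = np.asarray(y_row, dtype=float)
    answered = ~np.isnan(y_row)
    y, bj = y_row[answered], np.asarray(b, dtype=float)[answered]

    def nll(theta):
        eta = theta - bj
        # Stable Bernoulli negative log-likelihood plus a light ridge penalty
        return np.sum(np.logaddexp(0.0, eta) - y * eta) + l2 * theta**2

    return minimize_scalar(nll, bounds=bounds, method='bounded').x

# Hypothetical 3-item bank; the middle item was not administered
b_bank = np.array([-1.0, 0.0, 1.0])
theta_mixed = score_partial([1, np.nan, 0], b_bank)   # easy item right, hard item wrong
theta_high = score_partial([1, np.nan, 1], b_bank)    # both answered items right
```

In the first case the two answered items are symmetric around 0, so the estimate lands near 0. In the second case the response pattern is perfect, and only the ridge penalty keeps the estimate finite.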


## Bonus: 2PL (a, b) via Alternating Optimization (compact demo)

In [None]:

def nll_items_ab(params, theta, Y, l2a=1e-2, l2b=1e-3):
    J = Y.shape[1]
    a = params[:J]
    b = params[J:]
    p = expit((theta[:,None] - b[None,:]) * a[None,:])
    eps=1e-9
    p = np.clip(p, eps, 1-eps)
    ll = Y*np.log(p) + (1-Y)*np.log(1-p)
    pen = l2a*np.sum((a-1.0)**2) + l2b*np.sum(b**2) + 100*np.sum((a<0)*(-a))
    return -(ll.sum() - pen)

def nll_theta(theta, a, b, Y, l2=1e-3):
    p = expit((theta[:,None] - b[None,:]) * a[None,:])
    eps=1e-9
    p = np.clip(p, eps, 1-eps)
    ll = Y*np.log(p) + (1-Y)*np.log(1-p)
    pen = l2*np.sum(theta**2)
    return -(ll.sum() - pen)

a_hat = np.ones(J)
b2_hat = b_hat.copy()
theta2 = theta_hat.copy()

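for it in range(5):
    # Item step: update (a, b) with abilities held fixed; bounds keep a positive.
    x0 = np.r_[a_hat, b2_hat]
    bounds = [(0.05, None)] * J + [(None, None)] * J
    res_ab = minimize(nll_items_ab, x0, args=(theta2, Y), method='L-BFGS-B', bounds=bounds)
    a_hat, b2_hat = res_ab.x[:J], res_ab.x[J:]
    # Person step: update abilities with items held fixed.
    res_th = minimize(nll_theta, theta2, args=(a_hat, b2_hat, Y), method='L-BFGS-B',
                      options={'maxfun': 200_000})
    # Fix the origin by shifting theta and b together: a*(theta - b) is unchanged,
    # whereas centering each one separately would alter the fitted probabilities.
    shift = res_th.x.mean()
    theta2 = res_th.x - shift
    b2_hat = b2_hat - shift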

P_hat_2pl = expit((theta2[:,None] - b2_hat[None,:]) * a_hat[None,:])
obs = Y.mean(axis=0)
pred_rasch = expit(theta_hat[:,None] - b_hat[None,:]).mean(axis=0)
pred_2pl = P_hat_2pl.mean(axis=0)

cal = pd.DataFrame({'obs': obs, 'rasch': pred_rasch, 'twopl': pred_2pl})
cal
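
The 2PL linear predictor a·(θ − b) is unchanged when θ and b are shifted by the same constant, or when θ and b are multiplied by a factor s while a is divided by s. The sketch below checks this numerically on hypothetical parameter values and then standardizes θ to mean 0 and SD 1, a common convention for pinning down the scale. The helper `standardize_2pl` and all values are illustrative assumptions.

```python
import numpy as np
from scipy.special import expit

def standardize_2pl(theta, a, b):
    """Rescale so theta has mean 0 and SD 1 without changing a*(theta - b)."""
    mu, sd = theta.mean(), theta.std()
    return (theta - mu) / sd, a * sd, (b - mu) / sd

rng_demo = np.random.default_rng(7)
theta_raw = rng_demo.normal(0.4, 1.7, size=50)   # arbitrary origin and scale
a_raw = rng_demo.uniform(0.5, 2.0, size=6)
b_raw = rng_demo.normal(0.4, 1.7, size=6)

theta_std, a_std, b_std = standardize_2pl(theta_raw, a_raw, b_raw)

# Response probabilities before and after the transform
P_raw = expit(a_raw[None, :] * (theta_raw[:, None] - b_raw[None, :]))
P_std = expit(a_std[None, :] * (theta_std[:, None] - b_std[None, :]))
max_diff = np.abs(P_raw - P_std).max()   # zero up to floating-point rounding
```

Because the fit is identical under these transforms, estimated a and b values are only comparable across runs or samples after they have been put on a common θ scale.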


In [None]:

plt.scatter(cal['obs'], cal['rasch'], label='Rasch', alpha=0.7)
plt.scatter(cal['obs'], cal['twopl'], label='2PL (alt opt)', alpha=0.7)
plt.plot([0,1],[0,1],'k--')
plt.xlabel('Observed mean correct (per item)')
plt.ylabel('Predicted mean')
plt.title('Calibration: Rasch vs 2PL')
plt.legend(); plt.show()



---

### Practical Guidance
- Start with **Rasch (1PL)** for stability and comparability; move to **2PL** if items differ in discrimination.  
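- Fix the origin of the scale for identifiability by centering one set of parameters (for example, mean b = 0) or by shifting θ and b by the same amount; in 2PL, also fix the scale of θ. Regularize lightly.  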
- For production use with priors and missing data, consider **Bayesian IRT** (PyMC/Stan) and **CAT** logic for adaptive testing.
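
The CAT logic mentioned above usually centers on one step: pick the next item that is most informative at the current ability estimate. A minimal sketch for Rasch items follows, assuming a hypothetical item bank `b_bank` and helper `next_item_max_info`; a production CAT would add stopping rules and exposure control, which are omitted here.

```python
import numpy as np
from scipy.special import expit

def next_item_max_info(theta_hat, b, administered):
    """Index of the unadministered Rasch item with the most information at theta_hat."""
    b = np.asarray(b, dtype=float)
    p = expit(theta_hat - b)
    info = p * (1.0 - p)
    for j in administered:        # exclude items already shown
        info[j] = -np.inf
    return int(np.argmax(info))

# Hypothetical state: item 2 already administered, current estimate 0.2
b_bank = np.array([-2.0, -0.5, 0.3, 1.5])
choice = next_item_max_info(0.2, b_bank, administered={2})
```

For Rasch items this rule reduces to choosing the remaining item whose difficulty is closest to the current θ estimate, which here is the item with b = −0.5.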

