# Test Generator

Read in the candidate and item data, then generate a randomised set of test responses from them.

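We assume the 1PL model, in which the probability of a correct response depends only on the candidate's latent ability $\theta$ and the item's difficulty $b$: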

$$
\Pr(X=1) = \frac{\exp(\theta-b)}{1 + \exp(\theta-b)}
$$
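
To make the formula concrete, here is a minimal sketch that evaluates it for a couple of ability/difficulty pairs. The helper name `p_correct` is hypothetical and is not part of the notebook's code.

```python
import math


def p_correct(theta: float, b: float) -> float:
    """1PL probability of a correct response (hypothetical helper)."""
    # Algebraically equal to exp(theta - b) / (1 + exp(theta - b)).
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Ability equal to difficulty gives an even chance of success.
print(p_correct(0.0, 0.0))   # 0.5
# Higher ability relative to difficulty raises the probability.
print(p_correct(1.0, 0.0) > p_correct(0.0, 0.0))   # True
```

Only the difference `theta - b` matters, so shifting both values by the same amount leaves the probability unchanged.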

## Data Ingest

There are two files in the `data` folder that we need: `items.csv` and `candidates.csv`. From these we generate a randomised test.

In [1]:
import numpy as np
from numpy.random import seed
from typing import List, Tuple
from csv import reader
import pandas as pd


def getDataAsList(datafile: str) -> List[Tuple]:
    with open(datafile, 'r', encoding='utf-8-sig') as fs:
        csv_reader = reader(fs)
        row_list = list(map(tuple, csv_reader))
        return row_list[1:]    # ignore the header row
    

# convert the raw data into a simple pair of ( ucid, theta )
def getCandidates() -> List[Tuple]:
    candidates = getDataAsList('data/candidates.csv')
    new_list = [(c[0], float(c[1])) for c in candidates]
    return new_list
    

# convert the raw data into a simple triple of ( uiid, a, b );
# only b is used when generating responses under the 1PL model
def getItems() -> List[Tuple]:
    items = getDataAsList('data/items.csv')
    new_list = [(i[0], float(i[1]), float(i[2])) for i in items]
    return new_list

In [2]:
items = getItems()
candidates = getCandidates()
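
The loaders assume a header row followed by one record per line, with `ucid, theta` columns for candidates and `uiid, a, b` columns for items. The sketch below applies the same parsing approach to an in-memory sample; the header names and values are made up for illustration and are not taken from the real data files.

```python
from csv import reader
from io import StringIO

# Hypothetical file contents: header names and values are illustrative only.
sample_items_csv = "uiid,a,b\nI001,1.0,-1.5\nI002,1.0,0.25\n"

rows = list(map(tuple, reader(StringIO(sample_items_csv))))[1:]  # skip the header row
sample_items = [(r[0], float(r[1]), float(r[2])) for r in rows]

print(sample_items)
# [('I001', 1.0, -1.5), ('I002', 1.0, 0.25)]
```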

## Item Response Generation

The `getItemResponse()` function generates a randomised response, correct (1) or incorrect (0), for a given candidate taking an item.

In [3]:
def getItemResponse(b: float, theta: float) -> str:
    """Gets a randomised item response for a given candidate
    according to the 1PL model:

    P(X=1) = e^(theta-b) / (1 + e^(theta-b))

    :param b: the difficulty parameter for the item
    :param theta: the latent ability of the candidate
    :return: '0' = incorrect, '1' = correct
    """
    rng = np.random.default_rng()
    rv = rng.random()

    p1 = np.exp(theta - b) / (1 + np.exp(theta - b))
    p0 = 1 - p1

    assert 0.0 <= p0 <= 1.0
    assert 0.0 <= p1 <= 1.0

    rLookup = {
        '0': [0.00, p0],
        '1': [p0, 1.00]
    }
    # half-open intervals so that rv falls in exactly one bucket (rv is in [0, 1))
    r = {k: v for (k, v) in rLookup.items() if v[0] <= rv < v[1]}
    rKey = list(r.keys())

    assert rKey[0] == '1' or rKey[0] == '0'

    return rKey[0]
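
A more compact way to draw the same kind of response is to compare a single uniform draw against the probability of success. The sketch below is an alternative, not the notebook's function: `draw_response` and its `rng` parameter are hypothetical names, and passing in a seeded generator is an assumption made so that results can be reproduced.

```python
import numpy as np


def draw_response(b: float, theta: float, rng: np.random.Generator) -> int:
    """Return 1 (correct) with probability P(X=1) under the 1PL model, else 0."""
    p1 = 1.0 / (1.0 + np.exp(-(theta - b)))
    # rng.random() is uniform on [0, 1), so this event has probability p1.
    return int(rng.random() < p1)


rng = np.random.default_rng(12345)  # fixed seed, chosen arbitrarily
draws = [draw_response(b=0.0, theta=1.0, rng=rng) for _ in range(10_000)]

# The observed proportion should be close to the model probability (about 0.73).
print(sum(draws) / len(draws))
```

Reusing one generator across calls also avoids constructing a new one for every response.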


We iterate through the data and generate item responses for each candidate. Each candidate takes a test comprising every item, with a simulated response generated for each one.

In [4]:
def GenerateRandomTests():
    test_responses = [] # a list of lists

    # generate a header row for the results
    header = []
    header.append('ucid')
    for i in items:
        header.append(i[0])

    # now create the simulated test responses
    for c in candidates:
        test = []
        test.append(c[0])
        for i in items:
            r = int(getItemResponse(i[2], c[1]))
            test.append(r)
        test_responses.append(test)
        
    return test_responses, header
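
For larger data sets, the nested loops can be replaced by a single broadcast computation. The following is a sketch of that idea rather than a drop-in replacement: `generate_response_matrix` is a hypothetical name, it takes the candidate and item lists and a generator as explicit arguments, and the demo inputs are made up.

```python
import numpy as np
import pandas as pd


def generate_response_matrix(candidates, items, rng):
    """Simulate all responses at once (hypothetical vectorised variant)."""
    theta = np.array([c[1] for c in candidates])[:, None]  # shape (n_candidates, 1)
    b = np.array([i[2] for i in items])[None, :]           # shape (1, n_items)
    p1 = 1.0 / (1.0 + np.exp(-(theta - b)))                # broadcasts to (n_candidates, n_items)
    responses = (rng.random(p1.shape) < p1).astype(int)    # 1 = correct, 0 = incorrect
    frame = pd.DataFrame(responses, columns=[i[0] for i in items])
    frame.insert(0, 'ucid', [c[0] for c in candidates])
    return frame


# Made-up inputs in the same (ucid, theta) and (uiid, a, b) layout.
demo_candidates = [('C001', 0.5), ('C002', -1.0)]
demo_items = [('I001', 1.0, -2.0), ('I002', 1.0, 0.0), ('I003', 1.0, 2.0)]

demo_df = generate_response_matrix(demo_candidates, demo_items, np.random.default_rng(0))
print(demo_df.shape)   # (2, 4)
```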

In [8]:
t, header = GenerateRandomTests()

df = pd.DataFrame(t, columns=header)

df

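|   | ucid | I001 | I002 | I003 | I004 | I005 | I006 |
|---|------|------|------|------|------|------|------|
| 0 | C001 | 1 | 1 | 1 | 1 | 0 | 0 |
| 1 | C002 | 1 | 1 | 0 | 1 | 0 | 0 |
| 2 | C003 | 1 | 1 | 1 | 1 | 1 | 0 |
| 3 | C004 | 1 | 1 | 1 | 1 | 1 | 0 |
| 4 | C005 | 1 | 0 | 0 | 0 | 0 | 0 |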


## Next Steps
You can call the `GenerateRandomTests()` function as many times as you like; each call draws a fresh set of random responses.
Add items and candidates to the data files to generate larger tests.
When you are happy with the results, you can write them out to a results CSV file like this:

In [6]:
df.to_csv('data/results.csv', index=False)
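
To confirm that a results file can be read back cleanly, a round trip along these lines may help. This is a sketch using made-up responses and a temporary directory rather than the real `data` folder.

```python
import os
import tempfile

import pandas as pd

# Made-up responses in the same layout as the results table.
demo = pd.DataFrame(
    [['C001', 1, 0], ['C002', 0, 0]],
    columns=['ucid', 'I001', 'I002'],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'results.csv')
    demo.to_csv(path, index=False)   # index=False avoids writing an extra unnamed index column
    loaded = pd.read_csv(path)

print(list(loaded.columns))   # ['ucid', 'I001', 'I002']
```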