# Quiz Theory

## Introduction
This Python project models how one can learn to memorise simple associations (such as the names of countries and their capitals) through quizzing and feedback, a process I call Drilled Association Training. The dynamical model in this notebook simulates an individual learner's behavior through the quiz, with the aim of gaining insight into the optimal values of important quiz parameters, such as the number of options per quiz item and the number of items to be learned in a drill.

The future aims of this project are:

1. Collect performance data for association drills with arbitrary stimuli acting as quiz items, and obtain empirical estimates of optimal parameters.

2. Develop training apps for different domains.

3. Expand the applicability of the dynamical model for different learning design problems.
 

## Formulation
The problem is as follows: there are two classes of items, standing in for the questions and the answers; let's call one $Q$ and the other $A$. Let us assume that every item in $Q$ has a unique correct association in $A$, i.e. $q_{i} \leftrightarrow a_{i} \; \forall i$. Let $N$ denote the number of items in each class (and thus the total number of associations to be learned).

> **_Design Assumption:_** Every item in $Q$ has a unique correct association in $A$.

The goal of the learner is to learn all $N$ associations through an automated quizzing intervention with feedback, starting from complete ignorance.

The learning cycle is assumed to be as follows:

1. Every learner starts from complete ignorance, i.e. initially every Question item is equally likely to be associated with every Answer item.

2. On the $t^{th}$ trial the learner is presented with a Question item $q_{i}(t)$ (selected randomly or as per some other algorithm from $Q$) and must choose from a set of $m$ options $O(t) = \{o^{(k)}(t)\}$ drawn from $A$, such that at least one option $o^{(correct)}(t)$ satisfies $o^{(correct)}(t) = a_{i} \leftrightarrow q_{i}$.
(Here $k$ indexes the $1^{st}$, $2^{nd}$, etc. option as presented in the trial.)


> **_Design Assumption:_** Let $q_{i}(t)$ be selected randomly from $Q$.

> **_Design Assumption:_** Let $m$ be uniform across the whole drilling exercise.
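
As a minimal sketch of how the option set in step 2 could be drawn, assume items are represented by integer indices, so that answer $i$ is the correct one for question $i$. The function name `draw_options` and its parameters are illustrative and not part of the notebook's code.

```python
import random

def draw_options(q_index, n_items, m, rng=random):
    """Return m shuffled answer indices that include the correct one (q_index).

    Assumes answers are identified by integer indices 0..n_items-1 and that
    the correct answer for question i is answer i.
    """
    if not 1 <= m <= n_items:
        raise ValueError("m must be between 1 and n_items")
    # Distractors are drawn without replacement from the incorrect answers.
    distractors = rng.sample([a for a in range(n_items) if a != q_index], m - 1)
    options = [q_index] + distractors
    rng.shuffle(options)  # the correct option's position should carry no information
    return options

options = draw_options(q_index=3, n_items=20, m=4)
print(options)  # four distinct indices, one of which is 3
```

Drawing the distractors without replacement keeps the options distinct, so under the unique-association assumption exactly one presented option is correct.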

3. The learner selects from the option set $O(t)$ using a guessing function $G$ applied to an Evidence function $E(q_{i}, a_{j})$. The evidence function stores some precursor likelihood of association between every $q$ and every $a$, while the guessing function converts it into a probability distribution over $O(t)$, so that only one of the presented options is chosen as the guess. (At this stage I am not looking to model 'none of these' or any other kind of special response, since these are less relevant for training.)
 
> **_Cognitive Assumption:_** Let $E(q_{i}, a_{j})$ be an unbounded function that accumulates evidence for every possible association pair. 

> **_Cognitive Assumption:_** Let $G$ simply select the option with the highest $E$ for the question-option pair, and select randomly between closely contested options (whose difference in $E$ might be below some threshold).
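
The guessing rule above can be sketched as a standalone function, assuming $E$ is stored as a NumPy array indexed by `[question, answer]`. The name `guess`, the `threshold` parameter and its default value are illustrative assumptions; options within `threshold` of the maximum evidence are treated as closely contested.

```python
import random
import numpy as np

def guess(E, q_index, options, threshold=0.01, rng=random):
    """Pick the option with the highest evidence for q_index.

    Options whose evidence is within `threshold` of the maximum are treated
    as tied, and one of them is chosen uniformly at random.
    """
    evidence = np.array([E[q_index, a] for a in options])
    best = evidence.max()
    contenders = [a for a, e in zip(options, evidence) if best - e < threshold]
    return rng.choice(contenders)

E = np.zeros((5, 5))           # complete ignorance: all associations equal
E[2, 2] = 0.3                  # some evidence that q_2 goes with a_2
print(guess(E, 2, [4, 2, 0]))  # 2: it clearly leads the other options
print(guess(E, 1, [4, 1, 0]))  # any of 4, 1, 0: all are tied at zero
```

Under complete ignorance every option is tied, so this rule reduces to uniform random guessing with a $1/m$ chance of being correct.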

4. At the end of every trial the learner accumulates some evidence for the correct association, and (possibly) some evidence against the presented incorrect associations. One could conceive of the evidence being stronger or weaker for some options depending on what the learner actually selected or guessed, or one could simplify by assuming that the evidence gained for or against each option with respect to the question item does not depend on what the learner selected.
(Note: another reasonable assumption is that evidence against an option is only obtained if that wrong option is selected.)

> **_Cognitive Assumption:_** Let $\varDelta E(q_{i}(t), o^{(k)}(t))$ on trial $t$ not depend on the user's selection.

> **_Cognitive Assumption:_** Let $\varDelta E(q_{i}, o^{(k)}(t)) = \delta_{mismatch}$ for all incorrect options and $\varDelta E(q_{i}, o^{(correct)}(t)) = \delta_{match}$ for the correct option.

> **_Cognitive Assumption:_** Let $\delta_{mismatch}$ and $\delta_{match}$ be constants such that $\delta_{mismatch} \leq 0$, $\delta_{match} > 0$, and $|\delta_{mismatch}| \leq \frac{\delta_{match}}{m}$.
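
A small worked example of the update rule in step 4, assuming $E$ is a NumPy array indexed by `[question, answer]` and that the evidence change does not depend on the learner's selection. The function name `update_evidence` is illustrative, and the specific value $\delta_{mismatch} = -\delta_{match}/m$ is an assumption chosen to mirror the parameter values used in this notebook.

```python
import numpy as np

def update_evidence(E, q_index, options, delta_match, delta_mismatch):
    """Apply one trial's feedback to the evidence matrix E in place."""
    for a in options:
        # The correct answer shares its index with the question.
        E[q_index, a] += delta_match if a == q_index else delta_mismatch

m = 4
delta_match = 0.3
delta_mismatch = -delta_match / m  # assumed value with magnitude delta_match / m

E = np.zeros((20, 20))
update_evidence(E, q_index=3, options=[8, 3, 16, 5],
                delta_match=delta_match, delta_mismatch=delta_mismatch)
print(E[3, 3])  # 0.3
print(E[3, 8])  # -0.075
print(E[3, 0])  # 0.0, since answer 0 was not presented on this trial
```

Only the row of the presented question changes, and only for the answers that appeared as options; evidence for all other pairs is untouched by the trial.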

5. Evidence decays over time (trials), but can be consolidated through repetition. One way to model this is to have a decay rate $\gamma(q, a)$ which gets (This is the part of the formulation that I am most uncertain about, since it has multiple built in assumptions about the dynamics. Perhaps we can go with a decay-savings model. )

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import random

# Parameters
N = 20
matchWeightage = 0.3
optionArraySize = 4
mismatchWeightage = matchWeightage/optionArraySize
EDT = 0.01 #Evidence difference Threshold: evidence difference below this will register as equivalent for guessing
maxTrials = 100

# Operational Globals
Q = []
A = []
E = np.zeros((N, N))   #Evidence Array. First dimension is Q index, second is A index
TrialSet = []    #Stores all trials

In [231]:
# Defining the classes.
class Item:
    # A quiz item (either a question or an answer).
    # Should eventually hold metadata as well as item-specific properties such as an image.
    def __init__(self, index_, itemType_="Q"):
        self.index = index_
        self.itemType = itemType_

    def __str__(self):
        return f"This is {self.itemType} #{self.index}"

class Trial:
    # Every trial presents one question item (q) with a set of options, one of which is the correct answer
    def __init__(self, index_, q_, optionArray_):
        #q_ is a single question Item
        #optionArray_ is an array of answer items, one of which must be correct (ie have matching index with q)
        self.q = q_
        self.optionArray = optionArray_
        self.index = index_
#         print(f"Created Trial #"+ str(self.index) + "with question #" + str(self.q.index))
    def __str__(self):
        theString = f"Trial #{self.index} with Q#{self.q.index}:  "
        for i in range(len(self.optionArray)):
            theString += "Option " + chr(i + 97) + " is A#" + str(self.optionArray[i].index) + ".   "
        return theString
    def guess(self):
        # Guessing function: theoretically the most important method here, along with updateEvidence()
        # Strategy: of all the options, pick the one with the highest E[q, :].

        # First find the option with the maximum E for q and store its index
        bestOptionIndex = 0
        for i in range(len(self.optionArray)):
            if E[self.q.index, self.optionArray[i].index] > E[self.q.index, self.optionArray[bestOptionIndex].index]:
                bestOptionIndex = i

        # Now collect every option whose E for q is within EDT of the maximum
        # (this always includes the maximum itself)
        bestE = E[self.q.index, self.optionArray[bestOptionIndex].index]
        bestOptionsIndicesArray = []
        for i in range(len(self.optionArray)):
            if bestE - E[self.q.index, self.optionArray[i].index] < EDT:
                bestOptionsIndicesArray.append(i)

        # If there are multiple closely contested options, randomize the guess
        if len(bestOptionsIndicesArray) > 1:
            bestOptionIndex = random.choice(bestOptionsIndicesArray)

        return bestOptionIndex
            
        
    def checkMatch(self, inputAnswer):
        # An answer is correct when its index matches the question's index
        return inputAnswer.index == self.q.index
    def updateEvidence(self):
        global E
        for i in range(len(self.optionArray)):
            if self.optionArray[i].index == self.q.index:
                # Match - increase evidence of association
                E[self.q.index, self.optionArray[i].index] += matchWeightage
            else:
                # Mismatch - decrease evidence of association
                E[self.q.index, self.optionArray[i].index] -= mismatchWeightage
        


In [232]:


#defining functions
def initialize():
    #first create N items in each Q and A with id assigned
    createQAndA()
#     print(len(A))
    createTrialSet()
    
    
# def runSim():
    
    
def createQAndA():
    global Q
    global A
    Q = []
    A = []
    for i in range(N):
        newQ = Item(i, "Q")
#         print("Adding " + str(i) + " to Q.")
        Q.append(newQ)
        newA = Item(i, "A")
#         print("Adding " + str(i) + " to A.")
        A.append(newA)
#         print(f"Length of A is {len(A)}")

    
        
def createTrialSet():
    global TrialSet  # rebind the module-level list instead of creating a local one
    TrialSet = []
    for trialNum in range(maxTrials):
        #Sample from Q one by one with replacement, since questions might be repeated.
        new_q = random.choice(Q)
        
        #Sample options by including the right answer, drawing the wrong options w/o replacement, then shuffling
        new_optionArray = createOptionArray(new_q)
        newTrial = Trial(trialNum, new_q, new_optionArray)
        print(newTrial)
        TrialSet.append(newTrial)
    
def createOptionArray(q_):
    # Build the option list for question q_: the correct answer plus random wrong answers

    #initialize array with first element as correct answer
    optionArray = [a for a in A if a.index == q_.index]
#     print(f"The option array precursor length ought to be {len(optionArray)} with correct option {optionArray[0].index}")
    
    #randomly assign wrong options !SITE FOR ADAPTIVE DIFFICULTY MODIFICATION
    wrongAnswerSet = [a for a in A if a.index != q_.index]
    wrongAnswerSet = random.sample(wrongAnswerSet, optionArraySize - 1)
    
    optionArray += wrongAnswerSet
    
    #Shuffle array
    random.shuffle(optionArray)
    
#     print("After shuffling:")
#     [print(option.index) for option in optionArray]
    
    return optionArray
    
    

    
    
# def EtoP(e):
#     #get probability from evidence value
#     return 1 / (1 + np.exp(-e))

In [233]:
initialize()
# runSim()

Trial #0 with Q#0:  Option a is A#8.   Option b is A#16.   Option c is A#0.   Option d is A#5.   
Trial #1 with Q#11:  Option a is A#19.   Option b is A#1.   Option c is A#11.   Option d is A#7.   
Trial #2 with Q#12:  Option a is A#9.   Option b is A#12.   Option c is A#11.   Option d is A#0.   
Trial #3 with Q#1:  Option a is A#1.   Option b is A#16.   Option c is A#4.   Option d is A#14.   
Trial #4 with Q#13:  Option a is A#13.   Option b is A#18.   Option c is A#1.   Option d is A#0.   
Trial #5 with Q#9:  Option a is A#0.   Option b is A#1.   Option c is A#9.   Option d is A#11.   
Trial #6 with Q#2:  Option a is A#16.   Option b is A#2.   Option c is A#4.   Option d is A#12.   
Trial #7 with Q#9:  Option a is A#15.   Option b is A#13.   Option c is A#4.   Option d is A#9.   
Trial #8 with Q#1:  Option a is A#15.   Option b is A#1.   Option c is A#8.   Option d is A#13.   
Trial #9 with Q#10:  Option a is A#13.   Option b is A#2.   Option c is A#10.   Option d is A#14.   
Trial #

In [172]:
# ISSUE: after createQAndA() we somehow get a 50-item list instead of 10. Cause not yet identified.


4