# Adaptive Learning System Quiz MockUp
## Based on **Item Response Theory**

**Item Response Theory** (IRT) can be broadly divided into two categories based on the approach taken:
1. Classical Item Response Theory
    - ML (maximum likelihood) estimator with maximum-information item selection
2. Bayesian Item Response Theory
    - MAP (maximum a posteriori) estimator with item selection from the posterior
    
**Item Response Theory** can also be broadly divided into three categories based on the model used:
- 3PL-model (3-parameter logistic model)
  
  >Parameters a, b, c indicating discrimination, difficulty and guessing respectively
  
  >Good model for MCQ tests where guessing creates a lot of unexpected errors in estimation


- 2PL-model (2-parameter logistic model)
  
  >Parameters a,b indicating discrimination and difficulty
  
  >Good model for tests where guessing is highly unlikely, like in Fill in the Blanks tests


- 1PL-model (1-parameter logistic model or Rasch model)
  
  >Parameter b indicating difficulty
  
  >Good model for tests where all items are assumed to discriminate equally well between students, so that only the difficulty varies from item to item
    
**The IRT Equation:**

$$p_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}$$

*For 2PL-models, c_i = 0*

*For 1PL-models, a_i = 1, c_i = 0*
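
The equation and its two special cases can be written as a small NumPy sketch. The function names `irf_3pl`, `irf_2pl` and `irf_1pl` are illustrative and are not part of the project code shown later, which uses its own scaled 1PL form.

```python
import numpy as np

def irf_3pl(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: ability, a: discrimination, b: difficulty, c: guessing probability.
    """
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def irf_2pl(theta, a, b):
    # 2PL is the 3PL model with the guessing parameter fixed at 0
    return irf_3pl(theta, a=a, b=b, c=0.0)

def irf_1pl(theta, b):
    # 1PL (Rasch) additionally fixes the discrimination at 1
    return irf_3pl(theta, a=1.0, b=b, c=0.0)
```

With `c = 0` a subject whose ability equals the item difficulty has a 50% chance of answering correctly; a non-zero `c` raises that value to `(1 + c) / 2`.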
    
Briefly going into the terms used above:
- Score Estimate (theta): the estimated ability of a subject from the sample
- Difficulty (b): the difficulty of an item; it shifts the **Item Response Function** to the left or right along the theta axis
- Discrimination (a): how well an item discriminates between subjects of the sample with different score estimates; it determines the slope of the **Item Response Function** around theta = b
- Guessing Probability (c): the probability that a subject of very low ability answers the item correctly by guessing; it raises the lower asymptote of the **Item Response Function** to c
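
As a worked illustration of these roles, evaluating the 3PL function and its slope at theta = b gives (a short derivation from the IRT equation above, with the item index dropped for readability):

$$p(b) = c + \frac{1 - c}{2} = \frac{1 + c}{2}, \qquad \left.\frac{dp}{d\theta}\right|_{\theta = b} = \frac{a\,(1 - c)}{4}, \qquad \lim_{\theta \to -\infty} p(\theta) = c$$

For example, with the illustrative values a = 2, b = 0 and c = 0.25, the curve passes through p = 0.625 at theta = 0 with slope 0.375, and never drops below 0.25.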
    
Generally the design principles governing any **Computer Adaptive Test** using **IRT** are:
1. Creating an item bank based on the model of IRT being used
2. Calibrating the item bank so that its difficulty, discrimination and guessing parameters are close to reality
3. Starting the test and initial ability estimation
4. Selecting questions based on current ability estimate
5. Interim ability estimation based on item responses
6. Stopping rule
7. Final ability estimation for displaying scores
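
Steps 3 to 7 can be sketched as a single loop. This is a minimal simulation under stated assumptions, not the implementation used below: it replaces PyMC with a grid search over theta, uses every response so far rather than only the latest one, represents each item by its difficulty only, and takes a hypothetical `answer_fn` callback in place of user input.

```python
import numpy as np

GRID = np.linspace(0.001, 0.999, 999)  # candidate ability values in (0, 1)

def p_correct(theta, b):
    # 1PL response probability; theta in [0, 1] is scaled by 10 to match difficulties 0-9
    return 1.0 / (1.0 + np.exp(-(10.0 * theta - b)))

def information(theta, b):
    # item information of the scaled 1PL model
    p = p_correct(theta, b)
    return 100.0 * p * (1.0 - p)

def estimate_ability(asked, responses):
    # grid-search MAP under a flat prior, using all responses collected so far
    log_post = np.zeros_like(GRID)
    for b, r in zip(asked, responses):
        p = p_correct(GRID, b)
        log_post += np.log(p) if r else np.log(1.0 - p)
    return float(GRID[np.argmax(log_post)])

def run_cat(bank, answer_fn, quiz_length, start_theta=0.5):
    remaining = list(bank)
    asked, responses = [], []
    theta = start_theta                                          # step 3: initial estimate
    while remaining and len(asked) < quiz_length:                # step 6: stopping rule
        b = max(remaining, key=lambda d: information(theta, d))  # step 4: selection
        remaining.remove(b)
        asked.append(b)
        responses.append(bool(answer_fn(b)))
        theta = estimate_ability(asked, responses)               # step 5: interim estimate
    return theta, asked, responses                               # step 7: final estimate

# simulated learner who answers every item of difficulty <= 5 correctly
theta, asked, responses = run_cat(range(10), lambda b: b <= 5, quiz_length=5)
```

Starting from the middle of the ability range instead of the minimum is a design choice of this sketch; it lets the first response move the estimate in either direction.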

In the following code, the specifications mentioned below have been followed:
* 1-PL model having only item difficulty
* Initial ability estimation uses a Beta(1, 1) distribution, which is uniform on [0, 1], as the prior
* A mixed approach for item selection and ability estimation
  - MAP estimator for ability estimation (as mentioned in the PyMC reference prescribed)
  - Maximum-information item selection (this is a classical approach)
  - The reasons for deviating from Bayesian modelling for item selection
    + The resource did not explain item selection from the posterior distribution
    + The PyMC docs are not very readable or understandable for people with no statistical background
* The stopping rule is determined by the quizLength attribute of the Quiz class, which is the number of questions to be administered before stopping
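
For the 1PL form used by `evalIRF` in the Question class below, where theta is multiplied by 10 to match the 0-9 difficulty scale, the maximum-information rule has a simple closed form. The following is a short derivation under that assumption:

$$p(\theta) = \frac{1}{1 + e^{-(10\theta - b)}}, \qquad \frac{dp}{d\theta} = 10\,p(\theta)\,\bigl(1 - p(\theta)\bigr)$$

$$I(\theta) = \frac{\left(dp/d\theta\right)^2}{p(\theta)\,\bigl(1 - p(\theta)\bigr)} = 100\,p(\theta)\,\bigl(1 - p(\theta)\bigr)$$

The information peaks at p = 1/2, that is at 10 theta = b, where it equals 25. Maximum-information selection therefore picks the unanswered item whose difficulty is closest to ten times the current ability estimate.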

A few points worth mentioning:
+ The quiz starts at the minimum ability. If the first question is answered wrong, no less difficult item exists to display, so the quiz carries on with questions of higher difficulty, increasing in steps determined by its adaptiveness.
+ If the first question is answered correctly, the quiz jumps to the hardest question in the question bank, because the learner's theta value jumps to 1.0. If the toughest question is also answered correctly, the score stays at 1.0 and, since nothing tougher exists in the item bank, easier questions are displayed. This could be a good place to introduce a 2PL model and a variable quiz length.
+ For average students the test should work reasonably well, since it can be expected that almost everyone (~99%) would answer the easiest question correctly and many would not answer the toughest question correctly.
+ Final score estimation has not been done in this project, as that would most likely be application specific. Since the score estimate ranges between 0 and 1, it can simply be multiplied by 100 to get a percentage score.
+ Not everything is efficient complexity-wise; the time complexity can be revisited later where improvements are possible.
+ Lastly, the questions.txt file contains four sets of the same 5 questions with varied levels of difficulty for demonstration purposes. The answers to those questions have been marked with an asterisk so that the user can decide whether to answer correctly or not. Moreover, the item bank is not calibrated, and the difficulty values for the items have simply been assigned randomly.
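
The jump to 1.0 (or the drop to 0.0) after the first answer follows from estimating theta from a single response under a flat prior: the posterior is then proportional to a monotone likelihood, so its maximum sits on the boundary. A minimal NumPy sketch of that effect, where the function name and the grid are illustrative and not taken from the project code:

```python
import numpy as np

def single_response_map(difficulty, correct, grid=None):
    """MAP estimate of theta in [0, 1] from one response under a flat Beta(1, 1) prior."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 1001)
    # scaled 1PL response probability for every candidate theta
    p = 1.0 / (1.0 + np.exp(-(10.0 * grid - difficulty)))
    # with a flat prior the posterior is proportional to the likelihood
    likelihood = p if correct else 1.0 - p
    return float(grid[np.argmax(likelihood)])

after_correct = single_response_map(0.0, correct=True)   # boundary at the top of the range
after_wrong = single_response_map(0.0, correct=False)    # boundary at the bottom of the range
```

Combining several responses in one likelihood, or using a non-flat prior, would pull the estimate away from the boundaries.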

In [14]:
# difficulty has ten levels (0-9) and can be a floating-point number
# questions can return their information for a given theta via the function evalIIF
# questions will be selected for an ability level based on the maximum information of an item

# Question Class

In [1]:
import numpy as np
import pymc as pm
import random

class Question:
    def __init__(self, i, question, answers, correct_answer, difficulty):
        self.id = i
        self.question = question
        self.answers = answers
        self.correct_answer = correct_answer
        self.difficulty = difficulty
        
    def display(self):
        print(self.question+"\nDifficulty: "+str(self.difficulty)+"\na. "+self.answers[0]+"\nb. "+self.answers[1]+"\nc. "+self.answers[2]+"\nd. "+self.answers[3])
    
    def validateAnswer(self, answer):
        if(answer.upper()==self.correct_answer.upper()):
            return True
        return False
    
    def writeFormat(self):
        return "{0}=>{1}=>{2[0]}=>{2[1]}=>{2[2]}=>{2[3]}=>{3}=>{4}".format(self.id,self.question,self.answers,self.correct_answer,self.difficulty)
    
    def evalIRF(self,theta):
        return 1/(1+np.exp(-1*(theta*10-self.difficulty)))
    
    def evaldIRF(self,theta):
        # derivative of the IRF with respect to theta; the factor 10 comes from theta*10
        z = np.exp(-1*(theta*10-self.difficulty))
        return 10*z/np.power(1+z,2)
    
    def evalIIF(self, theta):
        dp = self.evaldIRF(theta)
        p = self.evalIRF(theta)
        return np.power(dp,2)/(p*(1-p))
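
A quick way to sanity-check the derivative used by `evalIIF` is to compare the analytic form with a central finite difference. The sketch below uses standalone functions (`irf`, `d_irf` and `numeric_derivative` are illustrative names) rather than the Question class itself, and assumes the same scaled 1PL form.

```python
import numpy as np

def irf(theta, difficulty):
    # scaled 1PL item response function
    return 1.0 / (1.0 + np.exp(-(theta * 10 - difficulty)))

def d_irf(theta, difficulty):
    # analytic derivative with respect to theta
    z = np.exp(-(theta * 10 - difficulty))
    return 10.0 * z / (1.0 + z) ** 2

def numeric_derivative(f, theta, difficulty, h=1e-6):
    # central finite difference approximation of df/dtheta
    return (f(theta + h, difficulty) - f(theta - h, difficulty)) / (2.0 * h)

gap = abs(d_irf(0.3, 4.0) - numeric_derivative(irf, 0.3, 4.0))
```

The two values should agree to several decimal places; a noticeably larger gap would point to an error in the analytic expression.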

# Learner Class

In [2]:
import numpy as np
import pymc as pm
import random

class Learner:
    def __init__(self,name):
        self.name = name
        self.answered = list()
    
    def addQuestionToAnswered(self,question):
        self.answered.append(question.id)
        
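    def updateAbility(self,question,data,theta):
        # uses the PyMC 2 style API (pm.deterministic, pm.MAP)
        @pm.deterministic
        def p(theta=theta):
            return 1.0/(1+np.exp(-1*(theta*10-question.difficulty)))
        # note: only the most recent response (data[-1]) is observed in this model
        x = pm.Bernoulli('x',p,value=data[-1],observed=True)
        model = pm.Model([theta, p, x])
        m = pm.MAP(model)
        m.fit()
        # ability is stored as a formatted string; callers convert it back with float()
        self.ability = '%.4f'%m.get_node('theta').value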
        
    def viewScore(self):
        print(self.name+" scored "+self.ability)

# Fetch the different Questions into a Question Bank

In [3]:
def fetchQuestionBank(filepath):
    qBank = list()
    # each line: id=>question=>answer a=>answer b=>answer c=>answer d=>correct answer=>difficulty
    with open(filepath,mode = "r",encoding="utf-8") as file:
        for line in file:
            if not line.strip():
                continue  # skip blank lines
            questionDetails = line.rstrip("\n").split('=>')
            questionId = int(questionDetails[0])
            question = questionDetails[1]
            answers = questionDetails[2:6]
            correct_answer = questionDetails[6]
            difficulty = float(questionDetails[7])
            q = Question(questionId,question,answers,correct_answer,difficulty)
            qBank.append(q)
    return qBank
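
The line format read by `fetchQuestionBank` matches the one produced by `Question.writeFormat`: eight fields separated by `=>`. The sketch below parses one such line into a dictionary. The sample line is made up for illustration, and storing the correct answer as an option letter is an assumption based on how `validateAnswer` compares it with the typed input.

```python
SEPARATOR = "=>"
FIELD_COUNT = 8  # id, question, four answers, correct answer, difficulty

def parse_question_line(line):
    fields = line.rstrip("\n").split(SEPARATOR)
    if len(fields) != FIELD_COUNT:
        raise ValueError("expected %d fields, got %d" % (FIELD_COUNT, len(fields)))
    return {
        "id": int(fields[0]),
        "question": fields[1],
        "answers": fields[2:6],
        "correct_answer": fields[6],
        "difficulty": float(fields[7]),
    }

# hypothetical line, not taken from questions.txt
sample = "1=>Sample question?=>A=>B=>C*=>D=>c=>1.0\n"
parsed = parse_question_line(sample)
```

Checking the field count up front turns a malformed line into a clear error instead of an `IndexError` deep inside the loader.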

## Just Displaying the Questions 

In [4]:
questionBank = fetchQuestionBank("questions.txt")
for question in questionBank:
    question.display()

Which Indian-origin boy has been crowned as the UK’s ‘Child Genius’ in a popular television quiz competition?
Difficulty: 3.0
a. Rahul Doshi*
b. Naveen Jain
c. Preethi Singh
d. Kalpana Chauhan
Which state government has launched an exclusive 24×7 helpline ‘181’ for women?
Difficulty: 4.0
a. Uttar Pradesh
b. Telangana*
c. Assam
d. Jharkhand
Which city is hosting the 8th World Renewable Energy Technology Congress?
Difficulty: 1.0
a. Pune
b. Guwahati
c. New Delhi*
d. Bhopal
MK Damodaran, who passed away recently, was the former Advocate General of which state?
Difficulty: 5.0
a. Odisha
b. Uttar Pradesh
c. Haryana
d. Kerala*
Which state government will launch a new scheme for Compassionate Family Pension (CFP) in lieu of compassionate appointment?
Difficulty: 2.0
a. Assam*
b. Manipur
c. Arunachal Pradesh
d. Jammu & Kashmir
Which Indian-origin boy has been crowned as the UK’s ‘Child Genius’ in a popular television quiz competition?
Difficulty: 9.0
a. Rahul Doshi*
b. Naveen Jain
c. Preethi S

# Quiz Class To Simulate the Adaptive Test

In [5]:
import numpy as np
import pymc as pm
import random

class Quiz:
    def __init__(self,questionBank,learner,quizLength):
        self.questionBank = questionBank
        self.learner = learner
        self.count = 0
        self.responses = list()
        self.quizLength = quizLength
        
    def fetchMostRelevantQuestion(self):
        ability = float(self.learner.ability)
        unAnswered = [question for question in self.questionBank if question.id not in self.learner.answered]
        information = [question.evalIIF(ability) for question in unAnswered]
        mostRelevantQuestion = unAnswered[information.index(max(information))]
        return mostRelevantQuestion
    
    def runQuiz(self):
        a=1
        b=1
        theta = pm.Beta('theta',a,b)
        
        difficulties = [question.difficulty for question in self.questionBank]
        minDifficulty = min(difficulties)
        
        qList = [question for question in self.questionBank if (question.difficulty==minDifficulty)]
        index = random.randint(0,len(qList)-1)
        mostRecentQuestion = qList[index]
        print("\n"+str(self.count+1)+"::")
        mostRecentQuestion.display()
        
        response = input('Please Enter Your Answer: ')
        self.learner.addQuestionToAnswered(mostRecentQuestion)
        self.count += 1
        
        if(mostRecentQuestion.validateAnswer(response)):
            self.responses.append(1)
        else:
            self.responses.append(0)
        
        self.learner.updateAbility(mostRecentQuestion,self.responses,theta)
        
        while(self.count<self.quizLength):
            print("\n"+str(self.count+1)+"::")
            mostRecentQuestion = self.fetchMostRelevantQuestion()
            mostRecentQuestion.display()
            
            response = input('Please Enter Your Answer: ')
            self.learner.addQuestionToAnswered(mostRecentQuestion)
            self.count += 1
            
            if(mostRecentQuestion.validateAnswer(response)):
                self.responses.append(1)
            else:
                self.responses.append(0)
            
            self.learner.updateAbility(mostRecentQuestion,self.responses,theta)
        
        self.learner.viewScore()
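
The fixed `quizLength` stopping rule could be complemented by a variable-length rule. One common option, sketched here as an assumption and not as part of the Quiz class, is to stop once the standard error of the ability estimate, approximated as one over the square root of the summed item information, falls below a target. The names `standard_error`, `should_stop` and the default `se_target` are hypothetical.

```python
import numpy as np

def item_information(theta, difficulty):
    # information of the scaled 1PL model used in this notebook
    p = 1.0 / (1.0 + np.exp(-(10.0 * theta - difficulty)))
    return 100.0 * p * (1.0 - p)

def standard_error(theta, administered_difficulties):
    # approximate standard error from the total information of administered items
    total = sum(item_information(theta, b) for b in administered_difficulties)
    return float("inf") if total == 0 else float(1.0 / np.sqrt(total))

def should_stop(theta, administered_difficulties, max_items, se_target=0.1):
    # stop at the hard cap, or earlier when the estimate is precise enough
    if len(administered_difficulties) >= max_items:
        return True
    return standard_error(theta, administered_difficulties) <= se_target
```

A suitable `se_target` depends on the scale of theta and on the size of the item bank, so the default here is only a placeholder.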

In [6]:
learner = Learner("Baladitya Swaika")

In [7]:
quiz = Quiz(questionBank,learner,5)

In [8]:
quiz.runQuiz()


1::
Which state government has launched an exclusive 24×7 helpline ‘181’ for women?
Difficulty: 0.0
a. Uttar Pradesh
b. Telangana*
c. Assam
d. Jharkhand
Please Enter Your Answer: b

2::
Which Indian-origin boy has been crowned as the UK’s ‘Child Genius’ in a popular television quiz competition?
Difficulty: 9.0
a. Rahul Doshi*
b. Naveen Jain
c. Preethi Singh
d. Kalpana Chauhan
Please Enter Your Answer: c
Cannot calculate AIC: float division by zero

3::
Which state government will launch a new scheme for Compassionate Family Pension (CFP) in lieu of compassionate appointment?
Difficulty: 4.5
a. Assam*
b. Manipur
c. Arunachal Pradesh
d. Jammu & Kashmir
Please Enter Your Answer: a

4::
MK Damodaran, who passed away recently, was the former Advocate General of which state?
Difficulty: 7.0
a. Odisha
b. Uttar Pradesh
c. Haryana
d. Kerala*
Please Enter Your Answer: c

5::
Which Indian-origin boy has been crowned as the UK’s ‘Child Genius’ in a popular television quiz competition?
Difficulty:

In [9]:
learner = Learner("Baladitya Swaika")
quiz = Quiz(questionBank,learner,5)
quiz.runQuiz()


1::
Which state government has launched an exclusive 24×7 helpline ‘181’ for women?
Difficulty: 0.0
a. Uttar Pradesh
b. Telangana*
c. Assam
d. Jharkhand
Please Enter Your Answer: b

2::
Which Indian-origin boy has been crowned as the UK’s ‘Child Genius’ in a popular television quiz competition?
Difficulty: 9.0
a. Rahul Doshi*
b. Naveen Jain
c. Preethi Singh
d. Kalpana Chauhan
Please Enter Your Answer: c
Cannot calculate AIC: float division by zero

3::
Which state government will launch a new scheme for Compassionate Family Pension (CFP) in lieu of compassionate appointment?
Difficulty: 4.5
a. Assam*
b. Manipur
c. Arunachal Pradesh
d. Jammu & Kashmir
Please Enter Your Answer: d

4::
Which state government will launch a new scheme for Compassionate Family Pension (CFP) in lieu of compassionate appointment?
Difficulty: 2.0
a. Assam*
b. Manipur
c. Arunachal Pradesh
d. Jammu & Kashmir
Please Enter Your Answer: a

5::
MK Damodaran, who passed away recently, was the former Advocate General 

## Conclusion
In conclusion, the adaptive test shown above demonstrates how the method of selecting items and estimating scores provides adaptiveness in the system. This system works best for the average student (with theta between 0.2 and 0.8). This restriction is due to a few drawbacks in the implementation:
- a very small item bank which is not properly calibrated
- items have not been modelled to have a discrimination parameter

Both of these can be solved with very few changes in the code, because only the mathematics of the IRF (Item Response Function) and the IIF (Item Information Function) changes, which is easy to write and implement. Making the item bank large is easy; calibrating it could take some time if it is done with the help of a prior test that imitates the type of items in the item pool. The item bank can also be calibrated with data sampled from statistical software, but that is a weaker approach because such samples can often be more error-prone than data collected manually. It is better to launch a linear (non-adaptive) version of the test first, which would provide the data needed to parameterize the questions by difficulty and discrimination.

It is also worth mentioning that a 3PL model could be used, but that would have very little effect on the adaptiveness of the system. It is recommended to have many questions of the same difficulty but with different discrimination values.
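
To illustrate why items of equal difficulty but different discrimination are useful, the 2PL item information can be sketched as follows. This uses the general form of the IRT equation with c = 0, where theta and b share one scale, and not the scaled 1PL form of the Question class; the function names are illustrative.

```python
import numpy as np

def irf_2pl(theta, a, b):
    # 2PL item response function
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def iif_2pl(theta, a, b):
    # 2PL item information: a^2 * p * (1 - p)
    p = irf_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

# two hypothetical items with the same difficulty but different discrimination
low = iif_2pl(0.0, a=1.0, b=0.0)
high = iif_2pl(0.0, a=2.0, b=0.0)
```

At theta = b the information equals a squared over four, so doubling the discrimination quadruples the information there, and maximum-information selection would prefer the more discriminating item.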

All in all, the system models the functional part of the adaptive testing system.

### Thank You!
#### **Baladitya Swaika**