## Automatic Test Generation

In this project, we build an automatic test generation and grading platform.
All we have to provide is a topic, the number of questions, and the number of options for each question.
Based on this information, a suitable test is generated, presented to the user, and graded automatically!

## Imports

In [22]:
import os
import openai


## OpenAI API


### Set up the OpenAI API key


In [23]:
# Read the key from the OPENAI_API_KEY environment variable instead of hard-coding it in the notebook.
openai.api_key = os.getenv("OPENAI_API_KEY")

### Tell GPT how to generate the test

We tell GPT to create a multiple-choice quiz. To do so, we specify the topic, the number of questions, and the number of possible answers per question.
To enable automatic grading later, GPT also needs to include the correct answer for each question, marked with a fixed prefix that we can search for.


In [24]:
def create_test_prompt(topic, num_questions, num_possible_answers):
    prompt = f"Create a multiple choice quiz on the topic of {topic} consisting of {num_questions} questions. " \
                 + f"Each question should have {num_possible_answers} options. "\
                 + f"Also include the correct answer for each question using the starting string 'Correct Answer: '."
    return prompt

In [67]:
create_test_prompt("Korean history", 4, 4)

"Create a multiple choice quiz on the topic of Korean history consisting of 4 questions. Each question should have 4 options. Also include the correct answer for each question using the starting string 'Correct Answer: '."

### OpenAI API Call
Let's use text-davinci-003 for general text generation.

In [55]:
response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=create_test_prompt("Korean history", 4, 4),
    max_tokens=256,
    temperature=0.7,
)

In [56]:
response

<OpenAIObject text_completion id=cmpl-7ZdQbXpyUAT0kQ9HXz2I7JwLong8i at 0x2799dbf9b80> JSON: {
  "id": "cmpl-7ZdQbXpyUAT0kQ9HXz2I7JwLong8i",
  "object": "text_completion",
  "created": 1688727397,
  "model": "text-davinci-003",
  "choices": [
    {
      "text": "\n\nQ1. Who was the first ruler of the Korean peninsula?\n\nA. Kim Il-sung\nB. King Gojong\nC. King Sejong\nD. King Dongmyeong\n\nCorrect Answer: D. King Dongmyeong\n\nQ2. When did the Korean War begin?\n\nA. 1950\nB. 1951\nC. 1952\nD. 1953\n\nCorrect Answer: A. 1950\n\nQ3. When did Korea become divided into North and South?\n\nA. 1945\nB. 1948\nC. 1950\nD. 1953\n\nCorrect Answer: B. 1948\n\nQ4. When did South Korea become a democracy?\n\nA. 1945\nB. 1948\nC. 1987\nD. 1993\n\nCorrect Answer: C. 1987",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 40,
    "completion_tokens": 170,
    "total_tokens": 210
  }
}

In [57]:
response["choices"][0]["text"]

'\n\nQ1. Who was the first ruler of the Korean peninsula?\n\nA. Kim Il-sung\nB. King Gojong\nC. King Sejong\nD. King Dongmyeong\n\nCorrect Answer: D. King Dongmyeong\n\nQ2. When did the Korean War begin?\n\nA. 1950\nB. 1951\nC. 1952\nD. 1953\n\nCorrect Answer: A. 1950\n\nQ3. When did Korea become divided into North and South?\n\nA. 1945\nB. 1948\nC. 1950\nD. 1953\n\nCorrect Answer: B. 1948\n\nQ4. When did South Korea become a democracy?\n\nA. 1945\nB. 1948\nC. 1987\nD. 1993\n\nCorrect Answer: C. 1987'
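Because the model returns free text, it can be worth checking that the output has the expected shape before parsing it. With `max_tokens=256`, a longer quiz could be cut off, in which case `finish_reason` would be `"length"` rather than `"stop"`. The following is a minimal sketch of such a check; the helper name `looks_complete` is hypothetical, and it assumes that options are written as `A. ...`, `B. ...` on their own lines, as in the output above.

```python
def looks_complete(test_text, num_questions, num_possible_answers):
    """Return True if the raw quiz text has the expected number of lines of each kind."""
    lines = [line.strip() for line in test_text.split("\n")]
    # One "Correct Answer:" line is expected per question.
    answer_lines = [l for l in lines if l.startswith("Correct Answer:")]
    # Assumption: option lines start with one uppercase letter followed by a period.
    option_lines = [l for l in lines if len(l) > 2 and l[0].isupper() and l[1] == "."]
    return (len(answer_lines) == num_questions
            and len(option_lines) == num_questions * num_possible_answers)


# A small hand-written sample in the same format as the model output.
sample = ("\n\nQ1. Who?\n\nA. x\nB. y\n\nCorrect Answer: A. x"
          "\n\nQ2. When?\n\nA. 1\nB. 2\n\nCorrect Answer: B. 2")
print(looks_complete(sample, 2, 2))
```

If the check fails, one option is simply to request a new completion, possibly with a larger `max_tokens` value.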

### Q/A Extraction

We now need to separate the questions from the answers: the questions are presented to the students later, and the answers are kept for grading.

In [58]:
def create_student_view(test, num_questions):
    # Collect each question's text (without its answer line), keyed by question number.
    student_view = {1: ""}
    question_number = 1
    for line in test.split("\n"):
        if not line.startswith("Correct Answer:"):
            student_view[question_number] += line + "\n"
        elif question_number < num_questions:
            # An answer line marks the end of the current question.
            question_number += 1
            student_view[question_number] = ""
    return student_view

In [59]:
create_student_view(response["choices"][0]["text"], 4)

{1: '\n\nQ1. Who was the first ruler of the Korean peninsula?\n\nA. Kim Il-sung\nB. King Gojong\nC. King Sejong\nD. King Dongmyeong\n\n',
 2: '\nQ2. When did the Korean War begin?\n\nA. 1950\nB. 1951\nC. 1952\nD. 1953\n\n',
 3: '\nQ3. When did Korea become divided into North and South?\n\nA. 1945\nB. 1948\nC. 1950\nD. 1953\n\n',
 4: '\nQ4. When did South Korea become a democracy?\n\nA. 1945\nB. 1948\nC. 1987\nD. 1993\n\n'}

In [60]:
def extract_answers(test, num_questions):
    # Collect the "Correct Answer: ..." line of each question, keyed by question number.
    answers = {1: ""}
    question_number = 1
    for line in test.split("\n"):
        if line.startswith("Correct Answer:"):
            answers[question_number] += line + "\n"
            if question_number < num_questions:
                question_number += 1
                answers[question_number] = ""
    return answers



In [61]:
extract_answers(response["choices"][0]["text"], 4)

{1: 'Correct Answer: D. King Dongmyeong\n',
 2: 'Correct Answer: A. 1950\n',
 3: 'Correct Answer: B. 1948\n',
 4: 'Correct Answer: C. 1987\n'}
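The two functions above each walk over the text once and rely on the caller passing the right `num_questions`. As an alternative, the sketch below parses questions, options, and answer letters in a single pass with regular expressions. The function name `parse_test` and the returned dictionary layout are assumptions made for illustration, and the parser assumes the same line format as the output shown earlier (options as `A. ...`, answers as `Correct Answer: X. ...`).

```python
import re


def parse_test(test_text):
    """Split raw quiz text into a list of question dictionaries."""
    questions = []
    current = None
    for raw_line in test_text.split("\n"):
        line = raw_line.strip()
        if not line:
            continue
        answer_match = re.match(r"Correct Answer:\s*([A-Z])\b", line)
        option_match = re.match(r"([A-Z])\.\s*(.+)", line)
        if current is not None and answer_match:
            # Keep only the option letter of the correct answer.
            current["answer"] = answer_match.group(1)
        elif current is not None and option_match:
            current["options"][option_match.group(1)] = option_match.group(2)
        elif not answer_match and not option_match:
            # Any other non-empty line is treated as the start of a new question.
            current = {"question": line, "options": {}, "answer": None}
            questions.append(current)
    return questions


sample = ("\n\nQ1. Who?\n\nA. x\nB. y\n\nCorrect Answer: A. x"
          "\n\nQ2. When?\n\nA. 1\nB. 2\n\nCorrect Answer: B. 2")
parsed = parse_test(sample)
print(parsed[0]["question"], parsed[0]["options"], parsed[0]["answer"])
```

Storing only the answer letter means the grader no longer depends on the exact length of the `Correct Answer: ` prefix.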

### Exam simulation
Based on the extracted questions, we can now simulate the exam.

In [62]:
def take(student_view):
    answers = {}
    for question, question_view in student_view.items():
        print(question_view)
        answer = input("Enter your answer: ")
        answers[question] = answer
    return answers
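Since `take` calls `input` directly, it can only be run interactively. A small variation that accepts the reading function as a parameter makes it possible to replay scripted answers, for example when trying out the grading step. This is only a sketch; the name `take_with` and the `read_answer` parameter are hypothetical.

```python
def take_with(student_view, read_answer=input):
    """Like take(), but the function used to read an answer can be swapped out."""
    answers = {}
    for question, question_view in student_view.items():
        print(question_view)
        answers[question] = read_answer("Enter your answer: ")
    return answers


# Replay a fixed list of answers instead of typing them in.
demo_view = {1: "Q1. Who?\n\nA. x\nB. y\n", 2: "Q2. When?\n\nA. 1\nB. 2\n"}
scripted = iter(["A", "B"])
demo_answers = take_with(demo_view, lambda prompt: next(scripted))
```

Called without a second argument, `take_with` behaves like `take` and prompts the user.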


In [None]:
student_answers = take(create_student_view(response["choices"][0]["text"], 4))



Q1. Who was the first ruler of the Korean peninsula?

A. Kim Il-sung
B. King Gojong
C. King Sejong
D. King Dongmyeong




### Automatic Grading
Based on the student's answers and correct answers, we can now grade the test!

In [65]:
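def grade(correct_answer_dict, answers):
    # Each stored answer looks like "Correct Answer: D. ...", so the option
    # letter is the first character after the prefix.
    prefix_length = len("Correct Answer: ")
    correct_answers = 0
    for question, answer in answers.items():
        correct_letter = correct_answer_dict[question][prefix_length].upper()
        if answer.strip().upper() == correct_letter:
            correct_answers += 1
    # Named "score" so that it does not shadow the function name.
    score = 100 * correct_answers / len(answers)

    passed = "Passed!" if score >= 60 else "Not passed!"
    return f"{correct_answers} out of {len(answers)} correct! You achieved: {score} % : {passed}"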


In [66]:
grade(extract_answers(response["choices"][0]["text"], 4), student_answers)

'3 out of 4 correct! You achieved: 75.0 % : Passed!'
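The comparison in `grade` expects the student to type exactly one option letter. Students may well type `d.`, `(c)`, or the full answer text instead. The sketch below shows one possible way to reduce such input to a single letter before grading; `normalize_choice` is a hypothetical helper, and it deliberately returns an empty string when the input does not start with a single-letter token.

```python
import string


def normalize_choice(raw_answer):
    """Reduce input such as ' d.' or '(c)' to one uppercase letter, or '' if unclear."""
    tokens = raw_answer.split()
    if not tokens:
        return ""
    # Drop surrounding punctuation such as '.', ')' or '(' from the first token.
    token = tokens[0].strip(string.punctuation).upper()
    return token if len(token) == 1 and token.isalpha() else ""


for raw in [" d.", "(c)", "D) King Dongmyeong", "King Dongmyeong", ""]:
    print(repr(raw), "->", repr(normalize_choice(raw)))
```

Applying this to each entry of `student_answers` before calling `grade` would make the grading tolerant of minor formatting differences while still rejecting answers that are not option letters.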