# **MCQ Generator Using the spaCy Library**

# Problem Statement:
The goal is to develop a solution that automatically generates objective questions with multiple correct answers from a given chapter of a subject. The questions should test the reader's understanding of the chapter, and allowing more than one correct answer makes them more complex and challenging. The objective is to create engaging and challenging assessments for students.

# Potential Application of the Problem Statement:
- Education Sector: This solution can be applied in the education sector to assist educators in generating a wide range of objective questions for assessments, quizzes, and exams.
- E-Learning Platforms: E-learning platforms can use this solution to automatically create interactive quizzes and assessments for online courses.
- Publishing: Educational publishers can use this to automatically generate question banks for textbooks.
- Tutoring and Test Preparation: Companies offering tutoring and test preparation services can use this to create customized practice tests for their students.

# Type of Problem:
This is primarily a Natural Language Processing (NLP) problem. It involves text processing and question generation from a given text. A solution may combine elements of supervised and unsupervised learning with text generation techniques.

# Data:
The data required for this problem includes text from chapters in different subjects, and this text can be used as the source material for generating questions. You would need a diverse and extensive corpus of subject-specific texts.

# Problems with Data and How They Are Solved:
- **Availability of Diverse Texts:** One challenge is to have access to a diverse set of texts for different subjects. You may need to collect or compile texts from various sources and subjects.

# Stepwise Solution (Flowchart):
1. **Data Collection:** Gather a diverse set of text data from different subjects and chapters.
2. **Text Preprocessing:** Clean and preprocess the text data, including tokenization, removal of stop words, and other text normalization steps.
3. **Text Analysis:** Use NLP techniques to analyze the text, identify important concepts, keywords, and relevant sections in the text.
4. **Question Generation:** Develop algorithms or models that can automatically generate objective questions based on the analyzed text. The questions should be designed to have multiple correct answers, encouraging critical thinking and exploration.
5. **Answer Generation:** Generate multiple correct answers for each question, ensuring that they align with the content of the chapter and the question itself.
6. **Validation:** Implement a validation process to ensure the generated questions and answers are accurate and aligned with the content.
7. **User Interface:** Develop a user interface or API that allows educators to input the chapter and receive a set of generated questions.
8. **Deployment:** Deploy the solution for use in educational settings.
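
Steps 2 and 3 above can be illustrated with a minimal, standard-library sketch. It is not the spaCy-based code used later in this document: the regular-expression tokenizer, the tiny `STOP_WORDS` set, and the helper names `preprocess` and `top_keywords` are all assumptions made for illustration, and word frequency is only one simple way to pick candidate keywords.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real system would use a fuller list, such as one shipped with an NLP library.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "was", "were"}


def preprocess(text: str) -> list[str]:
    """Lowercase the text, split it into alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


def top_keywords(text: str, k: int = 3) -> list[str]:
    """Return the k most frequent remaining tokens as candidate keywords."""
    counts = Counter(preprocess(text))
    return [word for word, _ in counts.most_common(k)]


sample = "The Company traded in the Indian Ocean. The Company bought goods."
print(preprocess(sample))
print(top_keywords(sample, k=1))
```

Keywords found this way could serve as candidates for the blanked word instead of picking a token at random.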

# Final Solution:
The final solution would be a robust NLP-based system that takes a chapter from a subject as input and automatically generates objective questions with multiple correct answers. Educators can use this solution to create engaging and challenging assessments for their students. The system should be accurate, efficient, and user-friendly, enabling easy integration into various educational platforms and workflows. It should encourage critical thinking and exploration by offering questions with multiple correct answers, thereby testing the reader's comprehension and encouraging them to explore different perspectives and possibilities.

The code below generates multiple-choice questions (MCQs) from a given context paragraph using the spaCy library. Each MCQ is a fill-in-the-blank sentence with more than one answer option marked as correct. The generated MCQs are then displayed to the user.

### **Import necessary libraries**

The script begins by importing two libraries: `spacy` for natural language processing and `random` for generating random choices.

In [1]:
import spacy
import random

### **Load English language model**

The small English pipeline `en_core_web_sm` is loaded with `spacy.load("en_core_web_sm")`. It is used to tokenize the text and split it into sentences. The model must be installed beforehand, for example with `python -m spacy download en_core_web_sm`.

In [2]:
nlp = spacy.load("en_core_web_sm")

### **Define function**

A function named `get_mca_questions` is defined, which takes two arguments: `context` (the text from which questions will be generated) and `num_questions` (the number of questions to be generated).

In [3]:
def get_mca_questions(context: str, num_questions: int):
    # Parse the context into a spaCy Doc (tokens and sentences)
    doc = nlp(context)

### **Define MCQ generation function**

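A nested function named `generate_mcq_with_multiple_correct` is defined within `get_mca_questions`. It assembles a single MCQ from a `question` (the question text), `correct_answers` (a list of correct answer options), `other_options` (a list of incorrect answer options), and an optional `num_options` argument (default 4, currently unused). It shuffles the combined options and returns the result as a dictionary.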

In [4]:
def generate_mcq_with_multiple_correct(question, correct_answers, other_options, num_options=4):
    # Combine correct and incorrect options, then shuffle their order
    options = correct_answers + other_options
    random.shuffle(options)

    mcq = {
        "question": question,
        "options": options,
        "correct_answers": correct_answers
    }

    return mcq
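
The `num_options` parameter above is accepted but never used, so the number of options depends entirely on the lists passed in. The following is a hedged sketch of one way the limit could be enforced while also removing duplicate options. The name `build_mcq` and the choice to keep all correct answers and trim only the distractors are assumptions, not part of the original code.

```python
import random


def build_mcq(question, correct_answers, other_options, num_options=4):
    """Assemble an MCQ dict with at most `num_options` unique options."""
    # Keep every correct answer once, preserving order.
    correct = list(dict.fromkeys(correct_answers))
    # Drop distractors that repeat a correct answer or each other.
    distractors = [opt for opt in dict.fromkeys(other_options) if opt not in correct]
    # Fill the remaining slots with distractors; never a negative count.
    slots = max(num_options - len(correct), 0)
    options = correct + distractors[:slots]
    random.shuffle(options)
    return {"question": question, "options": options, "correct_answers": correct}


mcq = build_mcq("The ______ is blue.", ["sky"], ["sky", "sea", "grass", "sun", "rock"])
print(mcq)
```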

### **Generate a variety question**

A second nested function, `generate_variety_question`, is defined within `get_mca_questions`. It builds one fill-in-the-blank MCQ from a random sentence of the given context.
   - It selects a random sentence from `doc` and a random non-punctuation token within that sentence.
   - It replaces every occurrence of the selected token's text in the sentence with "______" to create the question.
   - It marks the blanked word, plus one or two words drawn at random from the text, as correct answers, and samples further words from the text as the remaining options.

In [9]:
def generate_variety_question():
    # Randomly select a sentence from the context
    sentence = random.choice(list(doc.sents))

    # Randomly choose a non-punctuation token from the sentence as the blank word
    blank_word = random.choice([token for token in sentence if not token.is_punct])

    # Create the question text; str.replace blanks every occurrence of the word
    question_text = sentence.text.replace(blank_word.text, "______")

    # Start the list of correct answers with the blank word
    correct_answers = [blank_word.text]

    # Collect the other alphabetic words in the text as candidate options
    other_options = [token.text for token in doc if token.is_alpha and token.text != correct_answers[0]]

    # Randomly decide how many extra options to mark as correct (1 or 2)
    num_correct_options = random.randint(1, 2)

    # Add that many randomly chosen words to the list of correct answers
    correct_answers.extend(random.sample(other_options, num_correct_options))

    # Number of remaining options to sample
    num_other_options = min(4 - num_correct_options, len(other_options))
    other_options = random.sample(other_options, num_other_options)

    # Build the final MCQ
    mcq = generate_mcq_with_multiple_correct(question_text, correct_answers, other_options)
    return mcq
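
Because `str.replace` substitutes every match, a short word such as "the" ends up blanked several times in one sentence, as Q1 in the sample output later in this document shows. Below is a minimal sketch of blanking only the chosen occurrence by slicing around its character span. It uses a regular expression instead of spaCy so that it runs on its own, and the name `blank_one_word` is an assumption. With spaCy tokens, the same idea could use the token's character offset (`token.idx`) relative to the sentence's `start_char`.

```python
import random
import re


def blank_one_word(sentence: str, rng: random.Random) -> tuple[str, str]:
    """Replace one randomly chosen word with a blank; return (question, answer)."""
    # Each match records where a word starts and ends in the sentence.
    words = list(re.finditer(r"[A-Za-z]+", sentence))
    if not words:
        raise ValueError("sentence contains no words to blank out")
    chosen = rng.choice(words)
    # Slice around the chosen span so other occurrences of the word stay intact.
    question = sentence[:chosen.start()] + "______" + sentence[chosen.end():]
    return question, chosen.group()


question, answer = blank_one_word("By the time the ships sailed.", random.Random(0))
print(question)
print(answer)
```

Whichever word is picked, the question contains exactly one blank, and filling it with the answer restores the original sentence.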

### **Generate and format questions**

- The `generate_variety_question` function is called `num_questions` times, and the generated MCQs are stored in the `questions` list.
   - Each question is given a number, and its options are listed on numbered lines.
   - The positions of the correct answers among the options are shown as letters, so `(a)` refers to option 1, `(b)` to option 2, and so on.
   - The formatted strings are collected in `mca_questions` and returned.

In [None]:
# Generate the requested number of questions
questions = [generate_variety_question() for _ in range(num_questions)]

# Create an empty list to store the formatted multiple-choice questions
mca_questions = []

# Iterate over the questions, numbering them from 1
for i, question in enumerate(questions, start=1):

    # Build the line with the question number and question text
    question_str = f"Q{i}: {question['question']}\n"

    # Create an empty string to collect the options for the current question
    options_str = ""

    # Append each option on its own numbered line
    for j, option in enumerate(question['options']):
        options_str += f"{j+1}. {option}\n"

    # Convert the positions of the correct answers to letters: (a), (b), ...
    correct_options_formatted = " & ".join([f"({chr(97+question['options'].index(ans))})" for ans in question['correct_answers']])

    # Build the "Correct Options" line
    correct_options_str = f"Correct Options: {correct_options_formatted}"

    # Combine the question, options, and answer key, then store the result
    mca_question = f"{question_str}{options_str}{correct_options_str}\n"
    mca_questions.append(mca_question)

# Return the formatted MCQs
return mca_questions
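
The formatting above numbers the options 1, 2, 3 and so on, but reports the correct ones as letters. A hedged sketch of a formatter that uses letters in both places is shown below. The function name `format_mcq` and the example dictionary (modeled on Q3 of the sample output) are assumptions for illustration.

```python
def format_mcq(number: int, mcq: dict) -> str:
    """Render one MCQ with lettered options and a matching answer key."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    lines = [f"Q{number}: {mcq['question']}"]
    for letter, option in zip(letters, mcq["options"]):
        lines.append(f"({letter}) {option}")
    # Sort the key so it reads in option order.
    key = sorted(letters[mcq["options"].index(ans)] for ans in mcq["correct_answers"])
    lines.append("Correct Options: " + " & ".join(f"({k})" for k in key))
    return "\n".join(lines)


example = {
    "question": "It could sell goods at higher ______.",
    "options": ["prices", "not", "so", "the"],
    "correct_answers": ["the", "prices"],
}
print(format_mcq(1, example))
```

Like the original code, this relies on `list.index`, so it assumes the options contain no duplicates.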

### **Print Questions**

- The user is prompted for a paragraph and for the number of questions. `get_mca_questions` is called with these inputs, and each formatted question is printed.

In [None]:
# Read the paragraph from the user
context = input("Enter the paragraph: ")

# Read the number of questions the user wants to generate
num_questions = int(input("Enter the number of questions: "))

# Call the function to generate the MCQs, then print each one
mca_questions = get_mca_questions(context, num_questions)
for question in mca_questions:
    print(question)
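
`int(input(...))` raises a `ValueError` on non-numeric input, and a zero or negative count silently produces no questions. A small sketch of validating the count before calling `get_mca_questions` follows. The helper name `parse_question_count` and the upper limit of 50 are assumptions, not part of the original script.

```python
def parse_question_count(raw: str, maximum: int = 50) -> int:
    """Convert user input to a question count, rejecting invalid values."""
    try:
        count = int(raw.strip())
    except ValueError:
        raise ValueError(f"expected a whole number, got {raw!r}") from None
    if not 1 <= count <= maximum:
        raise ValueError(f"number of questions must be between 1 and {maximum}")
    return count


print(parse_question_count(" 5 "))
```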

## **MCQ Source Code**

In [13]:
import spacy
import random

# Load English language model
nlp = spacy.load("en_core_web_sm")

def get_mca_questions(context: str, num_questions: int):
    doc = nlp(context)

    def generate_mcq_with_multiple_correct(question, correct_answers, other_options, num_options=4):
        options = correct_answers + other_options
        random.shuffle(options)

        mcq = {
            "question": question,
            "options": options,
            "correct_answers": correct_answers
        }

        return mcq

    def generate_variety_question():
        sentence = random.choice(list(doc.sents))
        blank_word = random.choice([token for token in sentence if not token.is_punct])

        question_text = sentence.text.replace(blank_word.text, "______")
        correct_answers = [blank_word.text]

        other_options = [token.text for token in doc if token.is_alpha and token.text != correct_answers[0]]
        num_correct_options = random.randint(1, 2)  # Generate 1 or 2 correct options
        correct_answers.extend(random.sample(other_options, num_correct_options))

        num_other_options = min(4 - num_correct_options, len(other_options))
        other_options = random.sample(other_options, num_other_options)

        mcq = generate_mcq_with_multiple_correct(question_text, correct_answers, other_options)
        return mcq

    questions = [generate_variety_question() for _ in range(num_questions)]

    mca_questions = []
    for i, question in enumerate(questions, start=1):
        question_str = f"Q{i}: {question['question']}\n"
        options_str = ""
        for j, option in enumerate(question['options']):
            options_str += f"{j+1}. {option}\n"

        correct_options_formatted = " & ".join([f"({chr(97+question['options'].index(ans))})" for ans in question['correct_answers']])
        correct_options_str = f"Correct Options: {correct_options_formatted}"

        mca_question = f"{question_str}{options_str}{correct_options_str}\n"
        mca_questions.append(mca_question)

    return mca_questions

context = input("Enter the paragraph: ")
num_questions = int(input("Enter the number of questions: "))
mca_questions = get_mca_questions(context, num_questions)
for question in mca_questions:
    print(question)


Q1: By ______ early seventeenth  century, ______ Dutch too were exploring ______ possibilities  of trade in ______ Indian Ocean.
1. possibilities
2. Vasco
3. the
4. so
5. competition
Correct Options: (c) & (e) & (b)

Q2: In fact, it was Vasco da  Gama, a Portuguese explorer, ______ had discovered this  sea route to India in 1498.
1. Portuguese
2. first
3. the
4. who
5. early
Correct Options: (d) & (e) & (b)

Q3: With  this charter, the Company  could venture across the  oceans, looking for new lands  from which it could buy goods  at a cheap price, and carry them back to Europe to  sell at higher ______.
1. prices
2. not
3. so
4. the
5. and
Correct Options: (a) & (d)

Q4: ______ India Company  Comes ______ In 1600, the ______ India  Company acquired a charter  from the ruler of England,  Queen Elizabeth I, granting  it the sole right to trade with  the ______.
1. goods
2. By
3. East
4. competition
5. excluding
Correct Options: (c) & (d)

Q5: By the time the ______ English ships sailed 

## **Conclusion**

Finally, the script prints the generated MCQs, displaying each question along with answer options and the correct answer options.

In short, the code takes a paragraph as input and generates fill-in-the-blank multiple-choice questions from it. The answer options are selected at random from the same text, providing a variety of MCQs for practice or assessment.
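
Because the options are sampled at random, a generated question can contain repeated options. A lightweight check in the spirit of the validation step listed earlier could flag such cases before questions are shown. The sketch below is one possible set of checks on the MCQ dictionary returned by `generate_mcq_with_multiple_correct`. The function name `validate_mcq` and the specific rules are assumptions.

```python
def validate_mcq(mcq: dict) -> list[str]:
    """Return the problems found in an MCQ dict; an empty list means it passed."""
    problems = []
    options = mcq.get("options", [])
    correct = mcq.get("correct_answers", [])
    if "______" not in mcq.get("question", ""):
        problems.append("question has no blank")
    if len(set(options)) != len(options):
        problems.append("options contain duplicates")
    if not correct:
        problems.append("no correct answer listed")
    missing = [ans for ans in correct if ans not in options]
    if missing:
        problems.append(f"correct answers missing from options: {missing}")
    return problems


good = {"question": "The ______ is blue.", "options": ["sky", "sea"], "correct_answers": ["sky"]}
bad = {"question": "The ______ is blue.", "options": ["sky", "sky"], "correct_answers": ["sky"]}
print(validate_mcq(good))
print(validate_mcq(bad))
```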