# MCQ Generation

![AI](https://cdn.classpoint.io/wp-content/uploads/generate-quiz.jpg)

This is a simple multiple-choice question generator using Google's Generative AI (LLM) and LangChain.

## 00. Load Gemini API Keys

In [31]:
import os 
import json
from dotenv import load_dotenv

load_dotenv()

# Get gemini_key from .env
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")

## 01. Setup Gemini LLM

In [32]:
from langchain_google_genai import ChatGoogleGenerativeAI
from tqdm import tqdm

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)

In [33]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain

## 02. Prompt Templates & LLM Chaining

In [34]:
RESPONSE_JSON = {
    "1": {
        "mcq": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here"
        },
        "correct": "correct answer"
    },
    "2": {
        "mcq": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here"
        },
        "correct": "correct answer"
    },
    "3": {
        "mcq": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here"
        },
        "correct": "correct answer"
    }
}

In [35]:
# Creating a zero-shot prompt template
TEMPLATE=""""
Text: {text}
You are an expert MCQ maker. Given the above text, it is your job to \
    create a quiz of {number} multiple choice questions for {subject} students in {tone} tone.
    Make sure the questions are not repeated and check all the questions to be conforming the text as well.
    Make sure to format your response like RESPONSE_JSON below with appropriate json structure and use it as a guide. \
    Ensure to make {number} MCQs

    {response_json}
"""

In [36]:
quiz_generation_prompt = PromptTemplate(
    input_variables=["text", "number", "subject", "tone", "response_json"],
    template=TEMPLATE
)

In [37]:
# Setting up the LLMChain
quiz_chain = LLMChain(llm=llm, prompt=quiz_generation_prompt, output_key="quiz", verbose=True)

In [38]:
TEMPLATE2="""
You are an expert english grammarian and writer. Given a Multiple Choice Quiz for {subject} students.\
You need to evaluate the complexity of the question and give a complete analysis of the quiz. Only use at max 50 words for complexity analysis. 
if the quiz is not at per with the cognitive and analytical abilities of the students,\
update the quiz questions which needs to be changed and change the tone such that it perfectly fits the student abilities
Quiz_MCQs:
{quiz}

Check from an expert English Writer of the above quiz:
"""

In [39]:
quiz_evaluation_prompt = PromptTemplate(input_variables=["subject", "quiz'"], template=TEMPLATE2)

In [40]:
review_chain = LLMChain(llm=llm, prompt=quiz_evaluation_prompt, output_key="review", verbose=True)

In [41]:
# Sequential Chain
generate_evaluate_chain = SequentialChain(
    chains=[quiz_chain, review_chain],
    input_variables=["text", "number", "subject", "tone", "response_json"],
    output_variables=["quiz", "review"],
    verbose=True
)

## 04. Read Data File

In [42]:
# Read data from .txt file
file_path = "../data.txt"

try:
    with open(file_path, 'r') as file:
        TEXT = file.read()
except:
    print("data.txt NOT FOUND")

In [43]:
print(TEXT)

Deep Learning is a specialized area within the broader field of machine learning that focuses on algorithms modeled after the human brain’s architecture. It employs deep neural networks, which consist of multiple layers of interconnected nodes, to process data and identify intricate patterns and representations. Unlike traditional machine learning methods, which often rely on manually engineered features, deep learning models automatically learn hierarchical features directly from raw data. This capability makes them particularly effective in handling and analyzing large volumes of unstructured data such as images, audio, and text. The training of deep learning models is computationally intensive and typically requires powerful hardware like GPUs or TPUs to handle the large-scale data and complex calculations involved. Advances in deep learning have led to significant improvements in areas such as speech recognition, image classification, and autonomous systems, driving innovation acro

In [44]:
# Serialize the python dicitonary into a JSON formatted string
json.dumps(RESPONSE_JSON)

'{"1": {"mcq": "multiple choice question", "options": {"a": "choice here", "b": "choice here", "c": "choice here", "d": "choice here"}, "correct": "correct answer"}, "2": {"mcq": "multiple choice question", "options": {"a": "choice here", "b": "choice here", "c": "choice here", "d": "choice here"}, "correct": "correct answer"}, "3": {"mcq": "multiple choice question", "options": {"a": "choice here", "b": "choice here", "c": "choice here", "d": "choice here"}, "correct": "correct answer"}}'

## 05. Generate MCQs

In [45]:
NUM = 5                             # No of MCQ to generate
SUB = "Deep Learning"               # Subject
TONE = "Mid"                        # Simplicity Level: Easy, Mid, Hard, Very-Hard

In [46]:
response = generate_evaluate_chain(
    {
        "text": TEXT,
        "number": NUM,
        "subject": SUB,
        "tone": TONE,
        "response_json": json.dumps(RESPONSE_JSON)
    }
)



[1m> Entering new SequentialChain chain...[0m


[1m> Entering new LLMChain chain...[0m
Prompt after formatting:
[32;1m[1;3m"
Text: Deep Learning is a specialized area within the broader field of machine learning that focuses on algorithms modeled after the human brain’s architecture. It employs deep neural networks, which consist of multiple layers of interconnected nodes, to process data and identify intricate patterns and representations. Unlike traditional machine learning methods, which often rely on manually engineered features, deep learning models automatically learn hierarchical features directly from raw data. This capability makes them particularly effective in handling and analyzing large volumes of unstructured data such as images, audio, and text. The training of deep learning models is computationally intensive and typically requires powerful hardware like GPUs or TPUs to handle the large-scale data and complex calculations involved. Advances in deep learning have


[1m> Finished chain.[0m


[1m> Entering new LLMChain chain...[0m
Prompt after formatting:
[32;1m[1;3m
You are an expert english grammarian and writer. Given a Multiple Choice Quiz for Deep Learning students.You need to evaluate the complexity of the question and give a complete analysis of the quiz. Only use at max 50 words for complexity analysis. 
if the quiz is not at per with the cognitive and analytical abilities of the students,update the quiz questions which needs to be changed and change the tone such that it perfectly fits the student abilities
Quiz_MCQs:
```json
{"1": {"mcq": "What is the key advantage of deep learning over traditional machine learning methods in handling unstructured data?", "options": {"a": "Deep learning models require less data for training.", "b": "Deep learning models automatically learn features from raw data.", "c": "Deep learning models are faster to train.", "d": "Deep learning models are more accurate for structured data."}, "correct": "b"},

In [47]:
response

{'text': 'Deep Learning is a specialized area within the broader field of machine learning that focuses on algorithms modeled after the human brain’s architecture. It employs deep neural networks, which consist of multiple layers of interconnected nodes, to process data and identify intricate patterns and representations. Unlike traditional machine learning methods, which often rely on manually engineered features, deep learning models automatically learn hierarchical features directly from raw data. This capability makes them particularly effective in handling and analyzing large volumes of unstructured data such as images, audio, and text. The training of deep learning models is computationally intensive and typically requires powerful hardware like GPUs or TPUs to handle the large-scale data and complex calculations involved. Advances in deep learning have led to significant improvements in areas such as speech recognition, image classification, and autonomous systems, driving innov

In [48]:
quiz = response.get("quiz")
print(quiz)

```json
{"1": {"mcq": "What is the key advantage of deep learning over traditional machine learning methods in handling unstructured data?", "options": {"a": "Deep learning models require less data for training.", "b": "Deep learning models automatically learn features from raw data.", "c": "Deep learning models are faster to train.", "d": "Deep learning models are more accurate for structured data."}, "correct": "b"}, "2": {"mcq": "Which of the following is NOT a key application area where deep learning has significantly improved performance?", "options": {"a": "Speech recognition", "b": "Image classification", "c": "Database management", "d": "Autonomous systems"}, "correct": "c"}, "3": {"mcq": "What type of neural network is specifically designed for processing and analyzing image data?", "options": {"a": "Recurrent Neural Networks (RNNs)", "b": "Convolutional Neural Networks (CNNs)", "c": "Generative Adversarial Networks (GANs)", "d": "Multilayer Perceptrons (MLPs)"}, "correct": "b

## 06. Clean LLM Response

In [49]:
# Since the response is in markdown format, removing the markers
def extract_json_from_markdown(markdown_string):
    start_marker = '```json'
    end_marker = '```'
    start = markdown_string.find(start_marker) + len(start_marker)
    end = markdown_string.find(end_marker, start)
    return markdown_string[start:end].strip()

cleaned_json = extract_json_from_markdown(quiz)

In [50]:
print(cleaned_json)

{"1": {"mcq": "What is the key advantage of deep learning over traditional machine learning methods in handling unstructured data?", "options": {"a": "Deep learning models require less data for training.", "b": "Deep learning models automatically learn features from raw data.", "c": "Deep learning models are faster to train.", "d": "Deep learning models are more accurate for structured data."}, "correct": "b"}, "2": {"mcq": "Which of the following is NOT a key application area where deep learning has significantly improved performance?", "options": {"a": "Speech recognition", "b": "Image classification", "c": "Database management", "d": "Autonomous systems"}, "correct": "c"}, "3": {"mcq": "What type of neural network is specifically designed for processing and analyzing image data?", "options": {"a": "Recurrent Neural Networks (RNNs)", "b": "Convolutional Neural Networks (CNNs)", "c": "Generative Adversarial Networks (GANs)", "d": "Multilayer Perceptrons (MLPs)"}, "correct": "b"}, "4":

In [51]:
data = json.loads(cleaned_json)
print(data)

{'1': {'mcq': 'What is the key advantage of deep learning over traditional machine learning methods in handling unstructured data?', 'options': {'a': 'Deep learning models require less data for training.', 'b': 'Deep learning models automatically learn features from raw data.', 'c': 'Deep learning models are faster to train.', 'd': 'Deep learning models are more accurate for structured data.'}, 'correct': 'b'}, '2': {'mcq': 'Which of the following is NOT a key application area where deep learning has significantly improved performance?', 'options': {'a': 'Speech recognition', 'b': 'Image classification', 'c': 'Database management', 'd': 'Autonomous systems'}, 'correct': 'c'}, '3': {'mcq': 'What type of neural network is specifically designed for processing and analyzing image data?', 'options': {'a': 'Recurrent Neural Networks (RNNs)', 'b': 'Convolutional Neural Networks (CNNs)', 'c': 'Generative Adversarial Networks (GANs)', 'd': 'Multilayer Perceptrons (MLPs)'}, 'correct': 'b'}, '4':

## 07. Create & Save MCQ DataFrame

In [52]:
# Prepare lists to store the data and return 
mcqs = []
choices = []
correct_answers = []

# Loop through each question in the data
for question in data.values():
    mcqs.append(question['mcq'])
    choices.append(question['options'])
    correct_answers.append(question['correct'])

In [56]:
import pandas as pd

# Create a DataFrame
quiz_df = pd.DataFrame({
    "MCQ": mcqs,
    "CHOICES": choices,
    "CORRECT ANSWER": correct_answers
})

In [57]:
quiz_df

Unnamed: 0,MCQ,CHOICES,CORRECT ANSWER
0,What is the key advantage of deep learning ove...,{'a': 'Deep learning models require less data ...,b
1,Which of the following is NOT a key applicatio...,"{'a': 'Speech recognition', 'b': 'Image classi...",c
2,What type of neural network is specifically de...,"{'a': 'Recurrent Neural Networks (RNNs)', 'b':...",b
3,Which of the following NLP techniques excels a...,"{'a': 'Support Vector Machines (SVMs)', 'b': '...",c
4,What is a major challenge associated with trai...,"{'a': 'Limited availability of data', 'b': 'Hi...",b


In [58]:
# Save the dataframe in csv file
quiz_df.to_csv('DeepLearning.csv',index=False)

In [60]:
# Display basic MCQ Format
def print_mcq(data):
    for question, details in data.items():
        print(f"Question: {details['mcq']}")
        
        # Print choices
        for key, value in details['options'].items():
            print(f"  {key}: {value}")
        
        # Print correct answer
        correct_key = details['correct']
        correct_answer = details['options'][correct_key]
        print(f"\nCorrect Answer: {correct_key}\n")

# Call the function with the data
print_mcq(data)

Question: What is the key advantage of deep learning over traditional machine learning methods in handling unstructured data?
  a: Deep learning models require less data for training.
  b: Deep learning models automatically learn features from raw data.
  c: Deep learning models are faster to train.
  d: Deep learning models are more accurate for structured data.

Correct Answer: b

Question: Which of the following is NOT a key application area where deep learning has significantly improved performance?
  a: Speech recognition
  b: Image classification
  c: Database management
  d: Autonomous systems

Correct Answer: c

Question: What type of neural network is specifically designed for processing and analyzing image data?
  a: Recurrent Neural Networks (RNNs)
  b: Convolutional Neural Networks (CNNs)
  c: Generative Adversarial Networks (GANs)
  d: Multilayer Perceptrons (MLPs)

Correct Answer: b

Question: Which of the following NLP techniques excels at capturing the sequential and co