[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1D_1eC7cZ9LrWx9TifsxYk7tnzW_K7UgO?usp=sharing)

## Generate MCQs from Data using [Educhain](https://github.com/satvik314/educhain)


This notebook demonstrates how to generate Multiple Choice Questions (MCQs) from web pages, plain text, and PDF files using the Educhain package.

Key Features:
- Generate MCQs from PDF files, web pages, and plain text
- Customize difficulty levels and learning objectives
- Export questions to CSV, JSON, or PDF formats
- Leverage advanced language models for question generation

Educhain is aimed at educators, content creators, and e-learning developers who want to automate their question creation process.



In [1]:
# First, install the Educhain package
!pip install -qU educhain

In [2]:
# Set up your OpenAI API key (read from Colab's user secrets)
import os
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
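
The cell above depends on `google.colab.userdata`, which is only available inside Colab. Below is a minimal sketch of a fallback for other environments, assuming the key is either already exported as an environment variable or can be typed at a prompt. The helper name `resolve_api_key` is hypothetical and not part of Educhain.

```python
import os
from getpass import getpass


def resolve_api_key(name="OPENAI_API_KEY", env=None, prompt=getpass):
    """Return the key from `env` if present, otherwise ask for it and store it.

    Hypothetical helper for running outside Colab; not an Educhain function.
    """
    env = os.environ if env is None else env
    key = env.get(name)
    if not key:
        # Nothing exported: prompt without echoing the key to the screen.
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} is empty")
        env[name] = key
    return key
```

Calling `resolve_api_key()` once before any generation call should leave `os.environ["OPENAI_API_KEY"]` populated for the rest of the session.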

### Generate MCQs from a URL

In [4]:
from educhain import qna_engine

url_mcqs = qna_engine.generate_mcqs_from_data(
        source="https://www.buildfastwithai.com/genai-course",
        source_type="url",
        num=5,
        learning_objective="schedule of the course",
        difficulty_level="Easy"
    )

url_mcqs.show()

Question 1:
Question: When does the 5th Cohort of the Crash Course on Generative AI start?
Options:
  A. August 3rd
  B. August 10th
  C. August 17th
  D. August 24th

Correct Answer: August 10th
Explanation: The 5th Cohort of the Crash Course on Generative AI starts on August 10th, as mentioned in the schedule section of the content.

Question 2:
Question: Who is the lead mentor for the course?
Options:
  A. Satvik
  B. Zoya
  C. Gita
  D. Krishna

Correct Answer: Satvik
Explanation: Satvik is mentioned as the founder of Build Fast with AI and the lead mentor for the course in the Mentor section of the content.

Question 3:
Question: What is the topic of Week 2 in the course schedule?
Options:
  A. Introduction to Gen AI
  B. Building AI Chatbots & Chat with Data
  C. Fine Tuning Models on your Data
  D. AI Agents + Multimodal models + Image Models

Correct Answer: Building AI Chatbots & Chat with Data
Explanation: Week 2 in the course schedule covers the topic of Building AI Chatbots

In [27]:
from educhain import qna_engine

mcqs_from_url = qna_engine.generate_mcqs_from_data(
        source="https://en.wikipedia.org/wiki/Butterfly_effect",
        source_type="url",
        num=3,
        learning_objective="Impact of the Butterfly Effect",
    )

mcqs_from_url.show()

Question 1:
Question: Who is closely associated with the concept of the butterfly effect?
Options:
  A. Isaac Newton
  B. Edward Norton Lorenz
  C. Albert Einstein
  D. Marie Curie

Correct Answer: Edward Norton Lorenz
Explanation: The butterfly effect is closely associated with the work of mathematician and meteorologist Edward Norton Lorenz.

Question 2:
Question: What is the butterfly effect also known as?
Options:
  A. Sensitive dependence on initial conditions
  B. Random effect theory
  C. Chaotic system principle
  D. Deterministic nonlinearity

Correct Answer: Sensitive dependence on initial conditions
Explanation: The butterfly effect is also known as sensitive dependence on initial conditions, where small changes can lead to significant differences in outcomes.

Question 3:
Question: Which mathematical framework provides a simple example demonstrating the butterfly effect?
Options:
  A. Logistic map
  B. Exponential map
  C. Horseshoe map
  D. Standard map

Correct Answer: Lo

### Generate MCQs from Text

In [5]:
text_content = """
    The Big Mac Index, introduced by The Economist in 1986, is a lighthearted way to measure purchasing power parity (PPP) between different currencies.
    It compares the price of a McDonald's Big Mac burger across various countries, using the idea that a widely available, uniform product should cost the same in different nations when adjusted for exchange rates.
    This index suggests that, in the long run, exchange rates should adjust so that a basket of goods (represented by the Big Mac) costs the same in different countries.
    While not a precise economic tool, the Big Mac Index has gained popularity for its simplicity in explaining complex economic concepts.
    It's often used as a starting point for discussions about currency valuation and economic disparities between nations.
    The index has even inspired similar comparisons using other products, like the "iPad index" or the "Starbucks latte index".
    """

text_mcqs = qna_engine.generate_mcqs_from_data(
        source=text_content,
        source_type="text",
        num=3,
    )

text_mcqs.show()

Question 1:
Question: What is the main purpose of the Big Mac Index?
Options:
  A. To compare the nutritional value of fast food items
  B. To measure purchasing power parity between different currencies
  C. To determine the most popular fast food chain worldwide
  D. To analyze the marketing strategies of McDonald's

Correct Answer: To measure purchasing power parity between different currencies
Explanation: The Big Mac Index is used to compare the price of a McDonald's Big Mac burger across different countries to measure purchasing power parity.

Question 2:
Question: Which year was the Big Mac Index introduced by The Economist?
Options:
  A. 1999
  B. 2005
  C. 1986
  D. 2010

Correct Answer: 1986
Explanation: The Big Mac Index was introduced by The Economist in 1986 as a lighthearted way to measure purchasing power parity.

Question 3:
Question: What other product comparison index has been inspired by the Big Mac Index?
Options:
  A. Pizza index
  B. Coca-Cola index
  C. iPad inde
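
The feature list mentions CSV export, but this notebook does not demonstrate it. Below is a minimal standard-library sketch, assuming each question has been converted to a plain dict with the keys `question`, `options`, `answer`, and `explanation` (the shape of the raw JSON printed in the "Using different models" section). `mcqs_to_csv` is a hypothetical helper, not an Educhain function, and how to obtain such dicts from the object returned by `generate_mcqs_from_data` is left open here.

```python
import csv
import io


def mcqs_to_csv(questions):
    """Serialize a list of MCQ dicts to CSV text, one row per question."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["question", "option_a", "option_b", "option_c",
                     "option_d", "answer", "explanation"])
    for q in questions:
        # Pad or trim to exactly four option columns so rows stay aligned.
        options = (list(q["options"]) + [""] * 4)[:4]
        writer.writerow([q["question"], *options, q["answer"], q["explanation"]])
    return buffer.getvalue()


# Sample data taken from Question 2 of the output above.
sample = [{
    "question": "Which year was the Big Mac Index introduced by The Economist?",
    "options": ["1999", "2005", "1986", "2010"],
    "answer": "1986",
    "explanation": "The Big Mac Index was introduced by The Economist in 1986.",
}]
print(mcqs_to_csv(sample))
```

Writing to an in-memory buffer keeps the sketch free of file paths; the returned string can be saved with `open(..., "w")` wherever it is needed.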

### Generate MCQs from PDF

In [6]:
pdf_mcqs = qna_engine.generate_mcqs_from_data(
        source="/content/CrashCourse_Info_Cohort5.pdf",
        source_type="pdf",
        num=5
    )

pdf_mcqs.show()

Question 1:
Question: Who is the lead mentor for the course?
Options:
  A. Satvik
  B. John
  C. Emily
  D. Michael

Correct Answer: Satvik
Explanation: Satvik is the founder of Build Fast with AI and the lead mentor for the course.

Question 2:
Question: What is the fee for the comprehensive 4-week course?
Options:
  A. Rs 10,000
  B. Rs 15,000
  C. Rs 20,000
  D. Rs 25,000

Correct Answer: Rs 20,000
Explanation: The fee for the course is Rs 20,000, covering all instructional materials, hands-on sessions, and projects.

Question 3:
Question: Which module involves creating an AI Writer application?
Options:
  A. Week 1: Introduction to GenAI & Crafting your First AI Application
  B. Week 2: Building AI Chatbots & Chat with Data (RAG)
  C. Week 3: Fine Tuning Models on your Data + Intro to Audio Models
  D. Week 4: AI Agents + Multimodal Models + Image Models

Correct Answer: Week 3: Fine Tuning Models on your Data + Intro to Audio Models
Explanation: The module in Week 3 involves build
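
Generated questions are worth a structural sanity check before they reach learners. Below is a minimal sketch, assuming each question is available as a plain dict with the keys `question`, `options`, `answer`, and `explanation` (the shape of the raw JSON printed in the "Using different models" section). `find_invalid_mcqs` is a hypothetical helper, not part of Educhain. It checks only that the answer is one of the options and that the options are well formed. It cannot tell whether the answer is factually correct.

```python
def find_invalid_mcqs(questions, expected_options=4):
    """Return (index, reason) pairs for questions failing basic structural checks."""
    problems = []
    for i, q in enumerate(questions):
        options = q.get("options", [])
        if len(options) != expected_options:
            problems.append(
                (i, f"expected {expected_options} options, got {len(options)}")
            )
        if len(set(options)) != len(options):
            problems.append((i, "duplicate options"))
        if q.get("answer") not in options:
            problems.append((i, "answer is not one of the options"))
    return problems


# Sample data taken from Question 2 of the output above.
sample = [{
    "question": "What is the fee for the comprehensive 4-week course?",
    "options": ["Rs 10,000", "Rs 15,000", "Rs 20,000", "Rs 25,000"],
    "answer": "Rs 20,000",
    "explanation": "The fee for the course is Rs 20,000.",
}]
print(find_invalid_mcqs(sample))  # prints []
```

Questions flagged here can be regenerated, while factual accuracy still needs a human review against the source document.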

### Using different models

In [7]:
!pip install -qU langchain-google-genai

In [17]:
from langchain_google_genai import ChatGoogleGenerativeAI
from google.colab import userdata

# Initialize Google's Gemini 1.5 Flash and Gemini 1.5 Pro models
gemini_flash = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-latest",
    google_api_key=userdata.get('GOOGLE_API_KEY'),
)

gemini_pro = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro-latest",
    google_api_key=userdata.get('GOOGLE_API_KEY'),
)
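
Some chat models wrap their JSON reply in a Markdown code fence (three backticks followed by `json`). Passing such a reply straight to `json.loads` fails with `Expecting value: line 1 column 1 (char 0)`, because the first character is a backtick rather than `{`. Below is a minimal sketch of stripping the fence before parsing. `parse_fenced_json` is a hypothetical helper, and nothing here is a claim about how Educhain parses model output internally.

```python
import json
import re

FENCE = "`" * 3  # the Markdown code-fence marker, built without typing it literally

# Optional language tag after the opening fence, then the payload, then the closing fence.
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z0-9_-]*\s*\n(.*?)\n?\s*" + FENCE + r"\s*$",
    re.DOTALL,
)


def parse_fenced_json(raw):
    """Parse JSON that may or may not be wrapped in a Markdown code fence."""
    match = _FENCED.match(raw)
    payload = match.group(1) if match else raw
    return json.loads(payload)


reply = FENCE + 'json\n{"questions": [{"answer": "Availability heuristic"}]}\n' + FENCE
print(parse_fenced_json(reply)["questions"][0]["answer"])
```

Unfenced replies go through `json.loads` unchanged, so the same helper can be used regardless of which model produced the text.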

In [23]:
from educhain import qna_engine

url_mcqs = qna_engine.generate_mcqs_from_data(
        source="https://en.wikipedia.org/wiki/Butterfly_effect",
        source_type="url",
        num=3,
        learning_objective="real life examples",
        llm=gemini_flash
    )

url_mcqs.show()

Error parsing output: Expecting value: line 1 column 1 (char 0)
Raw output:
```json
{
  "questions": [
    {
      "question": "Which of the following is a real-life example of the butterfly effect?",
      "options": [
        "A small change in the initial conditions of a weather model leads to a significantly different weather forecast.",
        "A butterfly flapping its wings causes a tornado in a distant location.",
        "A single mutation in a gene can lead to a new species.",
        "A drop of water falling into a pond creates ripples that spread outward."
      ],
      "answer": "A small change in the initial conditions of a weather model leads to a significantly different weather forecast.",
      "explanation": "This is the most accurate representation of the butterfly effect. The other options are not directly related to the concept of sensitive dependence on initial conditions."
    },
    {
      "question": "The butterfly effect is often used to explain why:",
     

In [19]:
from educhain import qna_engine

url_mcqs = qna_engine.generate_mcqs_from_data(
        source="https://en.wikipedia.org/wiki/List_of_cognitive_biases",
        source_type="url",
        num=3,
        learning_objective="Types of Biases",
        llm=gemini_flash
    )

url_mcqs.show()

Error parsing output: Expecting value: line 1 column 1 (char 0)
Raw output:
```json
{
  "questions": [
    {
      "question": "Which cognitive bias describes the tendency to overestimate the likelihood of events that are easily recalled, often due to their recency or emotional impact?",
      "options": [
        "Anchoring bias",
        "Availability heuristic",
        "Confirmation bias",
        "Egocentric bias"
      ],
      "answer": "Availability heuristic",
      "explanation": "The availability heuristic is a mental shortcut where we judge the likelihood of events based on how easily examples come to mind. This often leads to overestimating the frequency or probability of events that are more vivid or readily available in our memory."
    },
    {
      "question": "The tendency to interpret ambiguous information in a way that supports pre-existing beliefs is known as:",
      "options": [
        "Anchoring bias",
        "Confirmation bias",
        "Framing effect",
   

In [24]:
# DeepSeek chat model, accessed through OpenRouter's OpenAI-compatible endpoint
from langchain_openai import ChatOpenAI

deepseek = ChatOpenAI(
    model="deepseek/deepseek-chat",
    openai_api_key=userdata.get("OPENROUTER_API_KEY"),
    openai_api_base="https://openrouter.ai/api/v1",
)

In [25]:
from educhain import qna_engine

url_mcqs = qna_engine.generate_mcqs_from_data(
        source="https://en.wikipedia.org/wiki/List_of_cognitive_biases",
        source_type="url",
        num=3,
        learning_objective="Types of Biases",
        llm=deepseek
    )

url_mcqs.show()

Error parsing output: Expecting value: line 1 column 1 (char 0)
Raw output:
```json
{
  "questions": [
    {
      "question": "What is the 'Anchoring Bias'?",
      "options": [
        "A tendency to rely too heavily on the first piece of information encountered",
        "A tendency to ignore new information",
        "A tendency to overestimate the importance of small details",
        "A tendency to remember past events as more positive than they actually were"
      ],
      "answer": "A tendency to rely too heavily on the first piece of information encountered",
      "explanation": "Anchoring bias refers to the common human tendency to rely too heavily on the first piece of information (the 'anchor') when making decisions."
    },
    {
      "question": "Which cognitive bias involves the tendency to perceive meaningful connections between unrelated things?",
      "options": [
        "Apophenia",
        "Confirmation Bias",
        "Hindsight Bias",
        "Optimism Bias"
 

In [31]:
from educhain import qna_engine

mcqs_from_url = qna_engine.generate_mcqs_from_data(
        source="https://en.wikipedia.org/wiki/Butterfly_effect",
        source_type="url",
        num=5
    )

mcqs_from_url.show()

Question 1:
Question: Who is closely associated with the concept of the butterfly effect?
Options:
  A. Isaac Newton
  B. Edward Norton Lorenz
  C. Albert Einstein
  D. Stephen Hawking

Correct Answer: Edward Norton Lorenz
Explanation: The butterfly effect is closely associated with the work of mathematician and meteorologist Edward Norton Lorenz.

Question 2:
Question: Which mathematical framework exhibits sensitive dependence on initial conditions?
Options:
  A. Logistic map
  B. Linear equation
  C. Quadratic formula
  D. Trigonometric function

Correct Answer: Logistic map
Explanation: The logistic map provides a simple mathematical framework that demonstrates sensitive dependence on initial conditions, also known as the butterfly effect.

Question 3:
Question: What did Jacques Hadamard note in 1898 regarding trajectories in spaces of negative curvature?
Options:
  A. Convergence of trajectories
  B. General divergence of trajectories
  C. Circular trajectories
  D. Stable trajecto