[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1xV9PZiEFTwTZJUtttk2bEvX6NKIGJzBd?usp=sharing)

## Generate MCQs from Data using [Educhain](https://github.com/satvik314/educhain)


This notebook demonstrates how to generate Multiple Choice Questions (MCQs) from several kinds of data sources using the Educhain package. ✅

Key Features:
- Generate MCQs from PDF files, web pages, and plain text
- Customize difficulty levels and learning objectives
- Leverage advanced language models for question generation

Aimed at educators, content creators, and e-learning developers who want to automate and speed up question creation.



In [1]:
!pip install -qU educhain langchain-google-genai


### Initializing Educhain with Gemini 1.5 Pro 002

In [6]:
from langchain_google_genai import ChatGoogleGenerativeAI
from educhain import Educhain, LLMConfig
from google.colab import userdata

# Create the Gemini chat model; the API key is read from Colab's secrets store.
gemini = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro-002",
    api_key=userdata.get("GOOGLE_API_KEY"))

# Wrap the model in an Educhain config so Educhain uses Gemini.
gemini_config = LLMConfig(custom_model=gemini)

client = Educhain(gemini_config)
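
The cell above reads the key through `google.colab.userdata`, which is only available inside Colab. As a minimal sketch for running the same setup elsewhere, the hypothetical helper below (`get_google_api_key` is not part of Educhain or LangChain) tries Colab's secrets first and falls back to an environment variable of the same name.

```python
import os


def get_google_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from Colab secrets if available, else from the environment.

    Hypothetical helper for illustration; not part of Educhain.
    """
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(name)
            if key:
                return key
        except Exception:
            # Secret missing or notebook access not granted; fall through.
            pass

    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in Colab secrets or the environment")
    return key
```

The returned value could then be passed as `api_key=get_google_api_key()` when constructing `ChatGoogleGenerativeAI`.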

### Generating MCQs from a PDF

In [9]:
!wget https://arxiv.org/pdf/2306.05499.pdf

--2024-09-24 21:49:06--  https://arxiv.org/pdf/2306.05499.pdf
Resolving arxiv.org (arxiv.org)... 151.101.131.42, 151.101.3.42, 151.101.195.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.131.42|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://arxiv.org/pdf/2306.05499 [following]
--2024-09-24 21:49:06--  http://arxiv.org/pdf/2306.05499
Connecting to arxiv.org (arxiv.org)|151.101.131.42|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 718844 (702K) [application/pdf]
Saving to: ‘2306.05499.pdf’


2024-09-24 21:49:06 (16.2 MB/s) - ‘2306.05499.pdf’ saved [718844/718844]



In [10]:
%%time
mcqs_from_pdf = client.qna_engine.generate_questions_from_data(
    source="2306.05499.pdf",
    source_type="pdf",
    num=10
)

mcqs_from_pdf.show()

Question 1:
Question: What is HOUYI in the context of this paper?
Options:
  A. A large language model.
  B. A type of LLM-integrated application.
  C. A novel black-box prompt injection attack technique.
  D. A defense mechanism against prompt injection.

Correct Answer: A novel black-box prompt injection attack technique.
Explanation: HOUYI is the name of the proposed attack method designed to exploit vulnerabilities in LLM-integrated applications.

Question 2:
Question: What is the primary security risk addressed in this paper?
Options:
  A. LLM hallucination.
  B. Data poisoning in LLMs.
  C. Prompt injection attacks against LLM-integrated applications.
  D. Denial-of-service attacks against LLM providers.

Correct Answer: Prompt injection attacks against LLM-integrated applications.
Explanation: The paper focuses on the vulnerabilities of LLM-integrated applications to malicious prompt injections.

Question 3:
Question: What are the three key components of the HOUYI attack?
Option

### Generating MCQs from URLs and Text
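
Since `generate_questions_from_data` takes a `source_type` alongside the `source`, a small helper can pick the type from the source string. The sketch below is a hypothetical convenience function, not part of Educhain. It assumes that `"text"` is the accepted value for plain-text input (the notebook itself only shows `"pdf"` and `"url"`) and that any HTTP(S) address should be treated as a web page.

```python
from urllib.parse import urlparse


def infer_source_type(source: str) -> str:
    """Guess a source_type value for a source string (hypothetical helper).

    Assumption: "text" is the value used for plain-text sources.
    """
    parsed = urlparse(source)
    # Anything with an http(s) scheme and a host is treated as a web page.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Local paths ending in .pdf are treated as PDF files.
    if source.lower().endswith(".pdf"):
        return "pdf"
    # Everything else is assumed to be raw text.
    return "text"
```

The result could be passed straight through, for example `source_type=infer_source_type(source)`.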

In [8]:
mcqs_from_url = client.qna_engine.generate_questions_from_data(
    source="https://en.wikipedia.org/wiki/Butterfly_effect",
    source_type="url",
    num=5
)

mcqs_from_url.show()

Question 1:
Question: In chaos theory, what does the butterfly effect describe?
Options:
  A. The life cycle of a butterfly
  B. The migration patterns of butterflies
  C. Sensitive dependence on initial conditions
  D. The impact of butterflies on weather patterns

Correct Answer: Sensitive dependence on initial conditions
Explanation: The butterfly effect highlights how a small change in a system's initial state can lead to drastically different outcomes over time.

Question 2:
Question: Who is primarily associated with the development of the butterfly effect concept?
Options:
  A. Henri Poincaré
  B. Alan Turing
  C. Edward Norton Lorenz
  D. Norbert Wiener

Correct Answer: Edward Norton Lorenz
Explanation: While others like Poincaré and Wiener contributed, Lorenz's work solidified the concept's connection to chaos theory.

Question 3:
Question: What was the initial example Lorenz used to illustrate the butterfly effect?
Options:
  A. A butterfly causing a tornado
  B. A tornado cau