<h1>Installing Dependencies</h1>

In [1]:
! pip install spacy



In [2]:
! python -m spacy download en_core_web_sm

Collecting en-core-web-sm==3.7.1
  Using cached https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')


<h1>Importing Dependencies</h1>

In [3]:
import spacy
import random
from collections import Counter


In [4]:
# input text
text = """
The Greek historian knew what he was talking about. The Nile River fed Egyptian civilization for hundreds of years. 
The Longest River the Nile is 4,160 miles long—the world’s longest river. It begins near the equator in Africa and 
flows north to the Mediterranean Sea. In the south the Nile churns with cataracts. A cataract is a waterfall. Near the 
sea the Nile branches into a delta. A delta is an area near a river’s mouth where the water deposits fine soil called silt. 
In the delta, the Nile divides into many streams. The river is called the upper Nile in the south and the lower Nile in the
north. For centuries, heavy rains in Ethiopia caused the Nile to flood every summer. The floods deposited rich soil along the 
Nile’s shores. This soil was fertile, which means it was good for growing crops. Unlike the Tigris and Euphrates,
the Nile River flooded at the same time every year, so farmers could predict when to plant their crops.
"""

num_questions = 5

In [5]:
# Load the spacy model
nlp = spacy.load('en_core_web_sm')

<h1>Building the Concept</h1>

In [6]:
#text
nlp(text)


The Greek historian knew what he was talking about. The Nile River fed Egyptian civilization for hundreds of years. 
The Longest River the Nile is 4,160 miles long—the world’s longest river. It begins near the equator in Africa and 
flows north to the Mediterranean Sea. In the south the Nile churns with cataracts. A cataract is a waterfall. Near the 
sea the Nile branches into a delta. A delta is an area near a river’s mouth where the water deposits fine soil called silt. 
In the delta, the Nile divides into many streams. The river is called the upper Nile in the south and the lower Nile in the
north. For centuries, heavy rains in Ethiopia caused the Nile to flood every summer. The floods deposited rich soil along the 
Nile’s shores. This soil was fertile, which means it was good for growing crops. Unlike the Tigris and Euphrates,
the Nile River flooded at the same time every year, so farmers could predict when to plant their crops.

In [7]:
# Process the text with spaCy
doc =  nlp(text)

# Extract the sentences from the text (displayed one per line)
sentences = [sent.text for sent in doc.sents] 
sentences

# len([sent for sent in doc.sents])


['\nThe Greek historian knew what he was talking about.',
 'The Nile River fed Egyptian civilization for hundreds of years. \n',
 'The Longest River the Nile is 4,160 miles long—the world’s longest river.',
 'It begins near the equator in Africa and \nflows north to the Mediterranean Sea.',
 'In the south the Nile churns with cataracts.',
 'A cataract is a waterfall.',
 'Near the \nsea the Nile branches into a delta.',
 'A delta is an area near a river’s mouth where the water deposits fine soil called silt. \n',
 'In the delta, the Nile divides into many streams.',
 'The river is called the upper Nile in the south and the lower Nile in the\nnorth.',
 'For centuries, heavy rains in Ethiopia caused the Nile to flood every summer.',
 'The floods deposited rich soil along the \nNile’s shores.',
 'This soil was fertile, which means it was good for growing crops.',
 'Unlike the Tigris and Euphrates,\nthe Nile River flooded at the same time every year, so farmers could predict when to plant t
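
The extracted sentences above still carry the line breaks and trailing spaces of the original triple-quoted string. A minimal sketch of one way to tidy them before building questions follows; `normalize_whitespace` is a hypothetical helper name, not part of the notebook or of spaCy.

```python
def normalize_whitespace(sentence):
    """Collapse runs of spaces and newlines into single spaces and trim the ends."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces
    return " ".join(sentence.split())


# Two of the raw sentences from the output above
raw = [
    '\nThe Greek historian knew what he was talking about.',
    'Near the \nsea the Nile branches into a delta.',
]
clean = [normalize_whitespace(s) for s in raw]
print(clean)
```

Applying this to each `sent.text` would keep stray `\n` characters out of the question stems, at the cost of losing the original line layout.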

In [8]:
# Randomly select sentences to form questions
selected_sentences=random.sample(sentences,min(num_questions,len(sentences)))
selected_sentences
# min(num_questions,len(sentences))

['In the south the Nile churns with cataracts.',
 'The floods deposited rich soil along the \nNile’s shores.',
 '\nThe Greek historian knew what he was talking about.',
 'Near the \nsea the Nile branches into a delta.',
 'Unlike the Tigris and Euphrates,\nthe Nile River flooded at the same time every year, so farmers could predict when to plant their crops.\n']

In [9]:
mcqs = []
# Generate MCQs for each selected sentence
for sentence in selected_sentences:
  sentence = sentence.lower()  # lowercase first; otherwise capitalized nouns do not show up (likely because they are tagged PROPN)
  # Process the sentence with spaCy
  sent_doc = nlp(sentence)
  # Extract the nouns from the sentence
  nouns = [token.text for token in sent_doc if token.pos_ == "NOUN"]  # pos_ = part-of-speech tag
  
  # Skip sentences with fewer than two nouns: there would be nothing to use as a distractor
  if len(nouns) < 2:
    continue

  # Count the occurrence of each noun
  noun_counts = Counter(nouns)
  # print(noun_counts)

 
  # Use the most frequent noun as the answer
  if noun_counts:
    subject = noun_counts.most_common(1)[0][0]
    # print(subject)

    answer_choices = [subject]
   
    # Turn the sentence into a fill-in-the-blank question
    question_stem = sentence.replace(subject,"_________")
    # print(question_stem)

    # Note: random.choice draws with replacement, so the same distractor can appear more than once
    for _ in range(3):
      distractor = random.choice(list(set(nouns) - {subject}))  # the set removes duplicate nouns
      answer_choices.append(distractor)
    # print(answer_choices)

    random.shuffle(answer_choices)  # so the correct answer is not always listed first
   
    correct_answer = chr(64 + answer_choices.index(subject) + 1)  # convert index to letter: 0 -> 'A', 1 -> 'B', ...

    mcqs.append((question_stem,answer_choices,correct_answer))





  

In [10]:
mcqs

[('in the _________ the nile churns with cataracts.',
  ['cataracts', 'south', 'cataracts', 'cataracts'],
  'B'),
 ('the _________ deposited rich soil along the \nnile’s shores.',
  ['floods', 'soil', 'soil', 'soil'],
  'A'),
 ('near the \n_________ the nile branches into a delta.',
  ['branches', 'delta', 'sea', 'branches'],
  'C'),
 ('unlike the _________ and euphrates,\nthe nile river flooded at the same time every year, so farmers could predict when to plant their crops.\n',
  ['year', 'year', 'tigris', 'year'],
  'C')]
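
Several option lists in the output above repeat the same distractor (for example `'cataracts'` three times), because `random.choice` is called three times and draws with replacement. The sketch below shows one possible way to draw distinct distractors instead; `pick_distractors` and its `placeholder` parameter are hypothetical names introduced for illustration only.

```python
import random


def pick_distractors(nouns, subject, k=3, placeholder="[Distractor]"):
    """Return k distractors: distinct nouns first, then placeholders if needed."""
    # Candidate pool: every distinct noun except the answer (sorted for a stable order)
    candidates = sorted(set(nouns) - {subject})
    # random.sample draws without replacement, so no real noun is repeated
    chosen = random.sample(candidates, min(k, len(candidates)))
    # Pad with a placeholder when the sentence has too few nouns
    chosen += [placeholder] * (k - len(chosen))
    return chosen


options = pick_distractors(["south", "cataracts", "river", "delta", "sea"], "south")
print(options)
```

When a sentence has fewer than three other nouns, the placeholder fills the gap, so every question still gets four options.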

<h1>Wrapping the MCQ Logic in a Function</h1>

In [11]:

def generate_mcqs(text, num_questions=5):
    # text = clean_text(text)
    if text is None:
        return []

    # Process the text with spaCy
    doc = nlp(text)

    # Extract sentences from the text
    sentences = [sent.text for sent in doc.sents]

    # Randomly select sentences to form questions
    selected_sentences = random.sample(sentences, min(num_questions, len(sentences)))

    # Initialize list to store generated MCQs
    mcqs = []

    # Generate MCQs for each selected sentence
    for sentence in selected_sentences:
        # Process the sentence with spaCy
        sent_doc = nlp(sentence)

        # Extract entities (nouns) from the sentence
        nouns = [token.text for token in sent_doc if token.pos_ == "NOUN"]

        # Ensure there are enough nouns to generate MCQs
        if len(nouns) < 2:
            continue

        # Count the occurrence of each noun
        noun_counts = Counter(nouns)

        # Select the most common noun as the subject of the question
        if noun_counts:
            subject = noun_counts.most_common(1)[0][0]

            # Generate the question stem
            question_stem = sentence.replace(subject, "_______")

            # Generate answer choices
            answer_choices = [subject]

            # Use the other nouns in the sentence as distractors
            distractors = list(set(nouns) - {subject})

            # Ensure there are at least three distractors
            while len(distractors) < 3:
                distractors.append("[Distractor]") # Placeholder for missing distractors
           
            random.shuffle(distractors)
            for distractor in distractors[:3]:
                answer_choices.append(distractor)

            # Shuffle the answer choices
            random.shuffle(answer_choices)

            # Append the generated MCQ to the list
            correct_answer = chr(64 + answer_choices.index(subject) + 1)  # Convert index to letter
            mcqs.append((question_stem, answer_choices, correct_answer))

    return mcqs
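
One caveat with `sentence.replace(subject, "_______")` is that `str.replace` matches substrings, so a short answer such as "group" would also be blanked inside a longer word such as "groups". A possible alternative, sketched here with the standard `re` module, blanks whole words only; `blank_out` is a hypothetical helper, and the sketch assumes the answer starts and ends with a letter or digit, since `\b` only matches at word-character boundaries.

```python
import re


def blank_out(sentence, subject, blank="_______"):
    """Replace whole-word occurrences of subject with a blank."""
    # re.escape guards against regex metacharacters in the subject;
    # \b restricts matches to whole words
    pattern = rf"\b{re.escape(subject)}\b"
    return re.sub(pattern, blank, sentence)


example = "A functional group sits among other groups."
stem = blank_out(example, "group")
print(stem)
```

With plain `str.replace`, the same example would also turn "groups" into "_______s", hinting at the answer.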
    
    
    

<h1>Testing the Function</h1>

In [12]:
text = '''Organic chemistry is a branch of chemistry that focuses on the study of carbon-containing compounds and their properties, structures, and reactions. Carbon's unique ability to form stable bonds with other carbon atoms and a wide variety of elements, including hydrogen, oxygen, nitrogen, and halogens, results in an immense diversity of organic compounds. These compounds range from simple molecules like methane (CH₄) to complex macromolecules such as proteins and DNA, which are essential to life. A central concept in organic chemistry is the functional group, which is a specific grouping of atoms within a molecule that imparts characteristic chemical reactivity and properties. Common functional groups include hydroxyl groups (-OH), carboxyl groups (-COOH), amino groups (-NH₂), and many others. Organic reactions often involve the making and breaking of bonds within these functional groups, leading to the synthesis of new compounds. The study of organic chemistry is crucial for many industries, including pharmaceuticals, petrochemicals, and agriculture. It plays a key role in the development of new medications, synthetic fibers, plastics, and other materials. Organic chemists employ a variety of techniques, such as spectroscopy and chromatography, to analyze and characterize compounds, allowing for the precise manipulation of molecular structures to achieve desired chemical outcomes. This field continues to evolve, driving innovations that impact many aspects of modern life.'''




<h2>Method 1</h2>

In [15]:
results = generate_mcqs(text,num_questions=7)
for i, mcq in enumerate(results):
  question_stem, answer_choices, correct_answer = mcq
  print(f'Q{i+1}:{question_stem}')
  print('                              ')

  # Label the options A, B, C, D
  for j, choice in enumerate(answer_choices):
    print(f'{chr(64+j+1)}:{choice}')

  print(f'Correct Answer is: {correct_answer}')
  print('___________________________________________________________________')

 


  

Q1:Organic _______ is a branch of _______ that focuses on the study of carbon-containing compounds and their properties, structures, and reactions.
                              
A:study
B:properties
C:structures
D:chemistry
Correct Answer is: D
___________________________________________________________________
Q2:These _______ range from simple molecules like methane (CH₄) to complex macromolecules such as proteins and DNA, which are essential to life.
                              
A:life
B:DNA
C:compounds
D:proteins
Correct Answer is: C
___________________________________________________________________
Q3:Organic _______ often involve the making and breaking of bonds within these functional groups, leading to the synthesis of new compounds.
                              
A:groups
B:bonds
C:reactions
D:breaking
Correct Answer is: C
___________________________________________________________________
Q4:Organic _______ employ a variety of techniques, such as spectroscopy and chromato

<h2>Method 2</h2>

In [14]:
mcqs = generate_mcqs(text, num_questions=10)  # Pass the selected number of questions
# Pair each MCQ (question_stem, answer_choices, correct_answer) with a 1-based question number
mcqs_with_index = [(i + 1, mcq) for i, mcq in enumerate(mcqs)]

for question in mcqs_with_index:
    print("Question", question[0], ":", question[1][0])
    print("Options:")
    options = question[1][1]
    for i, option in enumerate(options):
        print(f"{chr(97 + i)}) {option}")
    print("Correct Answer:", question[1][2])
    print("\n")

Question 1 : This _______ continues to evolve, driving innovations that impact many aspects of modern life.
Options:
a) life
b) innovations
c) field
d) aspects
Correct Answer: C


Question 2 : It plays a key _______ in the development of new medications, synthetic fibers, plastics, and other materials.
Options:
a) medications
b) fibers
c) development
d) role
Correct Answer: D


Question 3 : Organic _______ often involve the making and breaking of bonds within these functional groups, leading to the synthesis of new compounds.
Options:
a) making
b) compounds
c) reactions
d) synthesis
Correct Answer: C


Question 4 : Organic _______ is a branch of _______ that focuses on the study of carbon-containing compounds and their properties, structures, and reactions.
Options:
a) compounds
b) chemistry
c) reactions
d) carbon
Correct Answer: B


Question 5 : Common functional _______ include hydroxyl _______ (-OH), carboxyl _______ (-COOH), amino _______ (-NH₂), and many others.
Options:
a) groups