# **Student Performance Indicator**

<h1 style="font-family: 'poppins'; font-weight: bold; color: Blue;">Author: Muhammad Adil Naeem</h1>

[![GitHub](https://img.shields.io/badge/GitHub-Profile-green?style=for-the-badge&logo=github)](https://github.com/muhammadadilnaeem) [![Twitter/X](https://img.shields.io/badge/Twitter-Profile-red?style=for-the-badge&logo=twitter)](https://twitter.com/adilnaeem0) [![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-blue?style=for-the-badge&logo=linkedin)](https://www.linkedin.com/in/muhammad-adil-naeem-26878b2b9/)  

----

# **Experimentation for End to end Generative AI Project: MCQ Generator using OpenAI, Langchain Streamlit**

---------

#### **Importing Required Libraries**

In [14]:
import pandas as pd
import PyPDF2
import json
import traceback
import os
from dotenv import load_dotenv

from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.chains import SequentialChain
from langchain.callbacks import get_openai_callback 


#### **Setting up OPENAI API KEY**

In [15]:
load_dotenv() # load environment variables from .env file 

openai_api_key = os.getenv('OPENAI_API_KEY') # get api key from environment variable

#### **Create an instance of LLM with api key and model name**

In [16]:
llm = ChatOpenAI(openai_api_key=KEY, model_name="gpt-3.5-turbo",temperature=0.5) 

- The provided code snippet initializes an instance of the ChatOpenAI class from the openai library, configured with an OpenAI API key, the "gpt-3.5-turbo" model, and a temperature parameter of 0.5.

#### **Define Prompt Template**

In [21]:
# sample response json for prompt template to generate quiz

RESPONSE_JSON = { 
    "1": {
        "mcq": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here",
        },
        "correct": "correct answer",
    },
    "2": {
        "mcq": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here",
        },
        "correct": "correct answer",
    },
    "3": {
        "mcq": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here",
        },
        "correct": "correct answer",
    },
}

#### **Prompt template to generate quiz from text**

In [22]:
TEMPLATE = """
Text:{text}
You are an expert MCQ maker. Given the above text, it is your job to \
create a quiz  of {number} multiple choice questions for {subject} students in {tone} tone. 
Make sure the questions are not repeated and check all the questions to be conforming the text as well.
Make sure to format your response like  RESPONSE_JSON below  and use it as a guide. \
Ensure to make {number} MCQs
### RESPONSE_JSON
{response_json}

"""

In [23]:

quiz_generation_prompt = PromptTemplate(
    input_variables = ["text", "number", "subject", "tone", "response_json"],
    template = TEMPLATE
)

#### **Define llm chain to generate quiz**

In [24]:
quiz_chain = LLMChain(
    llm = llm,
    prompt = quiz_generation_prompt,
    output_key = "quiz",
    verbose = True
)

#### **Prompt template2 to generate quiz from text**

In [25]:
# prompt template2 to generate quiz from text

TEMPLATE2 = """
You are an expert english grammarian and writer. Given a Multiple Choice Quiz for {subject} students.\
You need to evaluate the complexity of the question and give a complete analysis of the quiz. Only use at max 50 words for complexity analysis. 
if the quiz is not at per with the cognitive and analytical abilities of the students,\
update the quiz questions which needs to be changed and change the tone such that it perfectly fits the student abilities
Quiz_MCQs:
{quiz}

Check from an expert English Writer of the above quiz:
"""

#### **This prompt template is used to evaluate the complexity of the quiz**

In [26]:
quiz_evaluation_prompt = PromptTemplate(input_variables=["subject", "quiz"], template=TEMPLATE) 

#### **Review chain to evaluate the quiz and give feedback**

In [27]:
review_chain = LLMChain(llm=llm, prompt=quiz_evaluation_prompt, output_key="review", verbose=True) 

#### **Sequential chain to generate quiz and evaluate it**

In [29]:
generate_evaluate_chain = SequentialChain(chains=[quiz_chain, review_chain], input_variables=["text", "number", "subject", "tone", "response_json"],
                                        output_variables=["quiz", "review"], verbose=True)  

#### **Defining Path of file that we will use to generate mcqs**

In [31]:
file_path = r"E:\Generative Ai Project\mcq_training_data.txt"

#### **Read text from file_path**

In [34]:
with open(file_path, 'r') as file:
    TEXT = file.read() 

print(TEXT)

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions.[1] Recently, artificial neural networks have been able to surpass many previous approaches in performance.[2]

ML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. When applied to business problems, it is known under the name predictive analytics. Although not all machine learning is statistically based, computational statistics is an important source of the field's methods.

The mathematical foundations of ML are provided by mathematical optimization (mathematical programming) methods. Data mining is a related (parallel) field of study, focusing on exploratory data analysis (EDA) through unsupervised learning.

From a theoretical viewpoint, p

#### **Serialize the Python dictionary into a JSON-formatted string**

In [35]:
json.dumps(RESPONSE_JSON)

'{"1": {"mcq": "multiple choice question", "options": {"a": "choice here", "b": "choice here", "c": "choice here", "d": "choice here"}, "correct": "correct answer"}, "2": {"mcq": "multiple choice question", "options": {"a": "choice here", "b": "choice here", "c": "choice here", "d": "choice here"}, "correct": "correct answer"}, "3": {"mcq": "multiple choice question", "options": {"a": "choice here", "b": "choice here", "c": "choice here", "d": "choice here"}, "correct": "correct answer"}}'

#### **This code will define tone of the quiz**

In [37]:
NUMBER=5 
SUBJECT="machine learning"
TONE="simple" # You can choose between "simple", "formal", "professional"

#### **How to setup Token Usage Tracking in LangChain**

In [38]:

# https://python.langchain.com/docs/modules/model_io/llms/token_usage_tracking

with get_openai_callback() as cb:
    response = generate_evaluate_chain(
        {
            "text": TEXT,
            "number": NUMBER,
            "subject":SUBJECT,
            "tone": TONE,
            "response_json": json.dumps(RESPONSE_JSON)
        }
        )



[1m> Entering new SequentialChain chain...[0m


[1m> Entering new LLMChain chain...[0m
Prompt after formatting:
[32;1m[1;3m
Text:Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions.[1] Recently, artificial neural networks have been able to surpass many previous approaches in performance.[2]

ML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. When applied to business problems, it is known under the name predictive analytics. Although not all machine learning is statistically based, computational statistics is an important source of the field's methods.

The mathematical foundations of ML are provided by mathematical optimization (mathematical programming) methods. Data mining is a relat

#### **Cost of above operation**

In [39]:
print(f"Total Tokens:{cb.total_tokens}")
print(f"Prompt Tokens:{cb.prompt_tokens}")
print(f"Completion Tokens:{cb.completion_tokens}")
print(f"Total Cost:{cb.total_cost}")

Total Tokens:6974
Prompt Tokens:6058
Completion Tokens:916
Total Cost:0.010918999999999998


#### **Response**

In [40]:
response

{'text': 'Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions.[1] Recently, artificial neural networks have been able to surpass many previous approaches in performance.[2]\n\nML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. When applied to business problems, it is known under the name predictive analytics. Although not all machine learning is statistically based, computational statistics is an important source of the field\'s methods.\n\nThe mathematical foundations of ML are provided by mathematical optimization (mathematical programming) methods. Data mining is a related (parallel) field of study, focusing on exploratory data analysis (EDA) through unsupervised learning.\n\nFrom a theoret

#### **Response in JSON**

In [52]:
quiz_str = response.get("quiz")

#### **Use json.loads on the quiz_str to convert it from a JSON string to a dictionary.**

In [53]:
quiz = json.loads(quiz_str)

#### **This code is used to convert the quiz to a table**

In [54]:

quiz_table_data = []
for key, value in quiz.items():
    mcq = value["mcq"]
    options = " | ".join(
        [
            f"{option}: {option_value}"
            for option, option_value in value["options"].items()
        ]
    )
    correct = value["correct"]
    quiz_table_data.append({"MCQ": mcq, "Choices": options, "Correct": correct}) 

In [55]:
quiz_table_data

[{'MCQ': 'What is the main goal of machine learning?',
  'Choices': 'a: To perform tasks with explicit instructions | b: To learn from data and generalize to unseen data | c: To ignore data and focus on theoretical frameworks | d: To memorize data without understanding it',
  'Correct': 'b'},
 {'MCQ': "Who coined the term 'machine learning' in 1959?",
  'Choices': 'a: Donald Hebb | b: Walter Pitts | c: Arthur Samuel | d: Warren McCulloch',
  'Correct': 'c'},
 {'MCQ': 'What is the difference between machine learning and data mining?',
  'Choices': 'a: Machine learning focuses on prediction, while data mining focuses on discovering unknown properties in data | b: Machine learning uses unsupervised learning, while data mining uses supervised learning | c: Machine learning is used for image processing, while data mining is used for speech recognition | d: Machine learning is a subset of data mining',
  'Correct': 'a'},
 {'MCQ': 'What is the goal of generalization in machine learning?',
  '

#### **Make a dataframe of the quiz**

In [59]:
pd.DataFrame(quiz_table_data)

Unnamed: 0,MCQ,Choices,Correct
0,What is the main goal of machine learning?,a: To perform tasks with explicit instructions...,b
1,Who coined the term 'machine learning' in 1959?,a: Donald Hebb | b: Walter Pitts | c: Arthur S...,c
2,What is the difference between machine learnin...,"a: Machine learning focuses on prediction, whi...",a
3,What is the goal of generalization in machine ...,a: To memorize the training data perfectly | b...,b
4,What is the relationship between machine learn...,a: Machine learning and statistics have differ...,a


In [56]:
quiz=pd.DataFrame(quiz_table_data)

#### **Save it in csv file**

In [57]:
quiz.to_csv("machine_learning_quiz.csv",index=False)

In [58]:
from datetime import datetime
datetime.now().strftime('%m_%d_%Y_%H_%M_%S')

'07_06_2024_22_34_59'