## Steps
1. Get answers for the question paper
  - Take a JPEG of the question paper page
  - Provide a Pydantic model of question_paper("number_or_position", "instruction", "question", "number_of_points")
  - Use GPT to answer each question and save the answers in a JSON file/object
2. Evaluate the learner's answers:
  - Provide GPT-V with the JSON object of the questions in the paper, together with the answers
  - Ask GPT to evaluate whether each answer is correct, and collect the results


## Issues:
1. Potential cut-off when a page contains multiple questions.
2. For matching questions, if a learner scribbles, i.e. draws multiple lines, the answer is marked correct simply because lines were drawn.
3. Multiple nested models increase the time GPT takes, thereby increasing cost.
Goals: Secondary, Math, Kiswahili
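
Issue 2 could possibly be mitigated by checking the extracted matches before awarding marks. The following is a minimal sketch, assuming the model can be asked to return each drawn line as a (left item, right item) pair. The `score_match_question` helper and the pair format are hypothetical and not part of this notebook.

```python
from collections import Counter

def score_match_question(drawn_lines, answer_key):
    """Award one point per left item that has exactly one drawn line, and that line is correct.

    drawn_lines: list of (left_item, right_item) pairs read from the answer sheet (assumed format).
    answer_key: dict mapping each left item to its correct right item.
    """
    lines_per_left = Counter(left for left, _ in drawn_lines)
    score = 0
    for left, right in drawn_lines:
        # A scribble shows up as several lines leaving the same item; give it no credit.
        if lines_per_left[left] == 1 and answer_key.get(left) == right:
            score += 1
    return score

answer_key = {"cat": "meow", "dog": "bark", "cow": "moo"}
# "dog" is connected to every option, so it earns nothing even though one line is right.
drawn = [("cat", "meow"), ("dog", "bark"), ("dog", "moo"), ("dog", "meow"), ("cow", "bark")]
print(score_match_question(drawn, answer_key))
```

Counting lines per source item keeps the rule simple: a learner only earns the mark when they commit to a single match.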

In [19]:
from dotenv import load_dotenv
load_dotenv()
import os
import pandas as pd

In [1]:
from dotenv import load_dotenv
import os
# Load the API tokens from the .env file into the environment
load_dotenv()

OPENAI_API_TOKEN = os.getenv('OPENAI_API_TOKEN')
REPLICATE_API_TOKEN = os.getenv('REPLICATE_API_TOKEN')
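
`os.getenv` returns `None` for a missing variable, so a misconfigured `.env` file would only surface later as an authentication error. A small sketch of an explicit check follows; `require_env` is a hypothetical helper, not something the notebook or `python-dotenv` provides.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

In this notebook it could be called as `require_env("OPENAI_API_TOKEN")` right after `load_dotenv()`.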


In [102]:
from pydantic import BaseModel
from typing import List

from llama_index.program.openai import OpenAIPydanticProgram
from llama_index.llms.openai import OpenAI

In [110]:
from pydantic import BaseModel, parse_obj_as
from typing import Optional
from enum import Enum

class QuestionType(str, Enum):
    multiple_choice = 'multiple_choice'
    fill_in_the_blank = 'fill_in_the_blank'
    short_answer = 'short_answer'
    long_answer = 'long_answer'
    draw = 'draw'
    match = 'match'
    other = 'other'
    
class Question(BaseModel):
    """
    A model representing a question with various details including 
    the position, instruction, question detail, and the correct answer.
  
    position_of_question can be: 1, 1a, 1.a.i, 1.a.ii, 3, etc.
    instruction is the instruction for the question, e.g. 'Answer the following questions'. Instructions can be repeated for nested questions, e.g. 'Answer the following questions about the passage'.
    question_detail is the question itself, e.g. 'What is the capital of France?'
    max_number_of_points is the maximum number of points that can be awarded for the question. It can be indicated as 4mks, 4 points, etc.
    question_type can be: 'multiple_choice', 'fill_in_the_blank', 'short_answer', 'long_answer', 'draw', 'match', 'other'. \
        'match' questions involve drawing lines to match the correct answers. The lines may overlap; what matters is the direction from an item in one column to an item in the other column. \
            For 'draw' questions, the student is required to draw a diagram, with or without colours, based on the description.
    correct_answer: evaluate the question and find the correct answer, which will be used to evaluate the answers learners provide.
    """
    position_of_question: str
    instruction: str
    question_detail: str
    max_number_of_points: str
    question_type: QuestionType
    correct_answer: str
    # The provided answer, points allocated and confidence score are optional,
    # as they are not available for an unanswered question.
    provided_answer: Optional[str] = None
    points_allocated_for_provided_answer: Optional[str] = None
    confidence_score: Optional[int] = None

class AnsweredQuestion(Question):
    """
    A model representing an answered question, inheriting from the Question model.
    Contains additional details specific to an answered question, like the provided answer
    and evaluation metrics.
    provided_answer is the answer provided in the answer sheet image. If it's a drawing, describe the drawing in text.
    points_allocated_for_provided_answer is the number of points awarded for the provided answer.
    explanation is the explanation for the points_allocated_for_provided_answer given the answer provided.
    confidence_score is the confidence score between 0 to 10 where 0 is no confidence and 10 is high confidence.
    """
    # For answered questions, these fields are required.
    provided_answer: str
    points_allocated_for_provided_answer: str
    explanation: str
    confidence_score: int
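
Because the point fields are strings such as '4mks' or '4 points', totalling them needs a conversion step. A minimal sketch, assuming the first number in the string is the point value; `parse_points` is a hypothetical helper and not part of the models above.

```python
import re

def parse_points(raw):
    """Extract the first number from a points string such as '4mks' or '4 points'.

    Returns None when no number is found, so callers can flag the question for review.
    """
    match = re.search(r"\d+(?:\.\d+)?", raw or "")
    return float(match.group()) if match else None

print(parse_points("4mks"))      # 4.0
print(parse_points("2 points"))  # 2.0
print(parse_points("n/a"))       # None
```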


In [21]:
# class BaseQuestion(BaseModel):
#     """
#     A question could contain a question, a position of the question, an instruction, a max_number_of_points.
#     ie. 
#     position_of_question: 
#     instruction: "Make words"
#     question_detail: "c + a + t = "
    
#     question_types can be: 'multiple_choice', 'fill_in_the_blank', 'short_answer', 'long_answer','draw','other'
#     provided_answer is the answer the user has provided by GPT by actually answering the question.
#     points_allocated_for_provided_answer is the answer the image contains and how many points have they gotten for that answer.
#     correct_answer_by_gpt: evaluate the question and find the correct answer for it.
#     confidence_score: how confident the model (GPT) is in evaluating the answer, between 1 and 10, with 1 being low confidence and 10 high confidence

#     """
#     position_of_question: str
#     instruction: str
#     question_detail: str
#     max_number_of_points: str
#     provided_answer: str
#     correct_answer_by_gpt: str
#     points_allocated_for_provided_answer: str
#     question_type: str
#     confidence_score: int

In [111]:
class QuestionPaper(BaseModel):
    """
    A question paper page is one page that contains multiple questions. Subject and grade, if not found, can be inferred.
    """
    questions: List[Question]
    subject: Optional[List[str]] = None
    grade: Optional[str] = None

In [107]:
class AnsweredQuestionPaper(QuestionPaper):
    """
    An answered question paper page is one page that contains multiple answered questions. Subject and grade, if not found, can be inferred.
    student_name: the name of the student who answered the question paper
    """
    questions: List[AnsweredQuestion]
    subject: Optional[List[str]] = None
    grade: Optional[str] = None
    student_name: Optional[str] = None

In [112]:
from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index.core import SimpleDirectoryReader

In [113]:
# put your local directory here
image_documents = SimpleDirectoryReader("/home/njui/kn_workspace/curriculum_taxonomy_extractor/data/raw/integrated/intergrated_learning_areas_learning_original/pg_4").load_data()

openai_mm_llm = OpenAIMultiModal(
    model="gpt-4-vision-preview", api_key=OPENAI_API_TOKEN, max_new_tokens=4096, image_detail='high'
)

In [114]:
question_prompt_template_str = """\
  Please examine the following images and extract the textual information regarding the questions.\
    Identify the position of the question, the instruction given, the question details, the maximum number of points,\
      and the type of question. Summarize the information in a structured format suitable for a Question object as defined in our model.\
  """
  
from llama_index.core.program import MultiModalLLMCompletionProgram
from llama_index.core.output_parsers import PydanticOutputParser


openai_program_question = MultiModalLLMCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(QuestionPaper),
    image_documents=image_documents,
    temperature=0.3,
    prompt_template_str=question_prompt_template_str,
    multi_modal_llm=openai_mm_llm,
    verbose=True,
)

response_question = openai_program_question()

[1;3;38;2;90;149;237m> Raw output: ```json
{
  "questions": [
    {
      "position_of_question": "1",
      "instruction": "Answer the following questions",
      "question_detail": "What is tannwin?",
      "max_number_of_points": "1mk",
      "question_type": "short_answer",
      "correct_answer": ""
    },
    {
      "position_of_question": "2",
      "instruction": "Answer the following questions",
      "question_detail": "Surah Al Fatiha has ______ verses.",
      "max_number_of_points": "1mk",
      "question_type": "fill_in_the_blank",
      "correct_answer": "7"
    },
    {
      "position_of_question": "3",
      "instruction": "Answer the following questions",
      "question_detail": "We recite surah ______ for protection.",
      "max_number_of_points": "1mk",
      "question_type": "fill_in_the_blank",
      "correct_answer": "Nas, Fatiha"
    },
    {
      "position_of_question": "4",
      "instruction": "Answer the following questions",
      "question_detail": "

In [115]:
# Serialize the response from the question extraction to JSON
questions_json = response_question.json()
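
Since `.json()` on a Pydantic model already returns a `str`, the result can be placed into a prompt as it is. The sketch below shows what encoding that string a second time would do; the sample string is a stand-in with an assumed shape, not real output.

```python
import json

# Stand-in for the string returned by response_question.json() (assumed shape).
questions_json = '{"questions": [{"position_of_question": "1"}]}'

# Encoding again wraps the whole string in quotes and escapes the inner quotes.
double_encoded = json.dumps(questions_json)
print(double_encoded)

# Parsing the double-encoded value once gives back a str, not a dict.
assert isinstance(json.loads(double_encoded), str)
assert isinstance(json.loads(questions_json), dict)
```

The escaped version still reads as text to the model, but it spends extra tokens on backslashes and is harder to inspect in verbose logs.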

In [116]:
# Now prepare the prompt for extracting answers using the serialized questions
question_ans_prompt_template_str = """\
  Now that we have the questions extracted, please analyze the images containing students' answered questions.\
    For each question detailed below, identify the provided answer, allocate points according to the correctness of the answer,\
      provide an explanation for the points allocated, and assign a confidence score based on your evaluation.\
        Summarize this information in a structured format suitable for an AnsweredQuestion object as defined in our model.
        
        Extracted Questions: {question_paper}
"""


In [117]:
from llama_index.core.program import MultiModalLLMCompletionProgram
from llama_index.core.output_parsers import PydanticOutputParser

image_documents_ans = SimpleDirectoryReader("/home/njui/kn_workspace/curriculum_taxonomy_extractor/data/raw/integrated/intergrated_learning_areas_martin/pg_4").load_data()

openai_ans_program = MultiModalLLMCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(AnsweredQuestionPaper),
    image_documents=image_documents_ans,
    temperature=0.3,
    prompt_template_str=question_ans_prompt_template_str,
    multi_modal_llm=openai_mm_llm,
    verbose=True,
)
# questions_json is already a JSON string, so pass it to the prompt as-is
response_ans = openai_ans_program(question_paper=questions_json)

[1;3;38;2;90;149;237m> Raw output: ```json
{
  "questions": [
    {
      "position_of_question": "1",
      "instruction": "Answer the following questions",
      "question_detail": "What is tannwin?",
      "max_number_of_points": "1mk",
      "question_type": "short_answer",
      "correct_answer": "",
      "provided_answer": "it is tannwin",
      "points_allocated_for_provided_answer": "0",
      "confidence_score": 10,
      "explanation": "The provided answer is a repetition of the question and does not define what 'tannwin' is."
    },
    {
      "position_of_question": "2",
      "instruction": "Answer the following questions",
      "question_detail": "Surah Al Fatiha has ______ verses.",
      "max_number_of_points": "1mk",
      "question_type": "fill_in_the_blank",
      "correct_answer": "7",
      "provided_answer": "10",
      "points_allocated_for_provided_answer": "0",
      "confidence_score": 10,
      "explanation": "The correct answer is 7 verses, and the stude

In [63]:
# prompt_template_str

"    can you extract all the questions in the image of the question paper    the questions and answers from the image are in this json_input shared here {'questions': [{'position_of_question': '1.', 'instruction': 'Colour the picture.', 'question_detail': 'Colour the picture of the tree.', 'max_number_of_points': '4', 'correct_answer': 'check if the tree is coloured within the bounds. Mark ranges is for creativity. i.e. different shades, different colours etc.', 'question_type': 'draw'}, {'position_of_question': '2.', 'instruction': 'Draw and colour Kenyan flag.', 'question_detail': 'Draw and colour Kenyan flag.', 'max_number_of_points': '2', 'correct_answer': 'A drawing of the Kenyan flag consists of black, red, and green colors with white fimbriations and a Maasai shield and two spears in the center.', 'question_type': 'draw'}, {'position_of_question': '3.', 'instruction': 'Name three materials used to make a kite.', 'question_detail': 'a. __________\\nb. __________\\nc. __________',

In [70]:
# response = openai_program()
response = openai_program(questions_json=json.dumps(json_input))
# for res in response:
    ## Save res to a file as json

[1;3;38;2;90;149;237m> Raw output: ```json
{
  "questions": [
    {
      "position_of_question": "1.",
      "instruction": "Colour the picture.",
      "question_detail": "Colour the picture of the tree.",
      "max_number_of_points": "4",
      "provided_answer": "The tree is coloured within the bounds.",
      "correct_answer_by_gpt": "check if the tree is coloured within the bounds. Mark ranges is for creativity. i.e. different shades, different colours etc.",
      "points_allocated_for_provided_answer": "4",
      "question_type": "draw",
      "confidence_score": 10
    },
    {
      "position_of_question": "2.",
      "instruction": "Draw and colour Kenyan flag.",
      "question_detail": "Draw and colour Kenyan flag.",
      "max_number_of_points": "2",
      "provided_answer": "A drawing of the Kenyan flag consists of black, red, and green colors without the white fimbriations and a Maasai shield and two spears in the center.",
      "correct_answer_by_gpt": "A drawing of

In [19]:
# Serialize the response to a JSON string
response.json()

'{"questions":[{"position_of_question":"1.a","instruction":"Count and write.","question_detail":"Image of 3 soccer balls","number_of_points":"1","question_type":"short_answer"},{"position_of_question":"1.b","instruction":"Count and write.","question_detail":"Image of 2 faucets","number_of_points":"1","question_type":"short_answer"},{"position_of_question":"1.c","instruction":"Count and write.","question_detail":"Image of 5 chickens","number_of_points":"1","question_type":"short_answer"},{"position_of_question":"2.a","instruction":"Name the shapes.","question_detail":"Image of a triangle","number_of_points":"1","question_type":"short_answer"},{"position_of_question":"2.b","instruction":"Name the shapes.","question_detail":"Image of a circle","number_of_points":"1","question_type":"short_answer"},{"position_of_question":"2.c","instruction":"Name the shapes.","question_detail":"Image of a rectangle","number_of_points":"1","question_type":"short_answer"},{"position_of_question":"2.d","inst

In [54]:
# Serialize the instance to a JSON string
json_string = response.json()

# To save this JSON string to a file, you can use the following:
with open('question_paper_creative_activities_grade_3.json', 'w') as f:
    f.write(json_string)
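
Once graded output has been saved as JSON, it can be loaded back and totalled. The following is a rough sketch, assuming the JSON follows the AnsweredQuestionPaper shape and that points are plain numeric strings such as "1"; the `summarise` helper and the review threshold of 7 are hypothetical choices.

```python
import json

def summarise(graded_json, review_below=7):
    """Total the awarded points and list questions whose confidence is below a threshold."""
    paper = json.loads(graded_json)
    total = 0.0
    needs_review = []
    for q in paper["questions"]:
        # Points are stored as strings in the schema, so convert before summing.
        total += float(q["points_allocated_for_provided_answer"])
        if q["confidence_score"] < review_below:
            needs_review.append(q["position_of_question"])
    return total, needs_review

# Made-up sample data in the assumed shape, for illustration only.
sample = json.dumps({"questions": [
    {"position_of_question": "1", "points_allocated_for_provided_answer": "0", "confidence_score": 10},
    {"position_of_question": "2", "points_allocated_for_provided_answer": "1", "confidence_score": 5},
]})
print(summarise(sample))  # (1.0, ['2'])
```

Listing low-confidence questions separately gives a teacher a short queue to check by hand instead of re-marking the whole paper.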