# Research CoPilot
## Multimodal RAG with Code Execution (RAG-CE)

This notebook demonstrates the capabilities of the solution. In the cells below, we ingest the PDF documents located in the `cases` directory, then generate questions based on **all** the information extracted from the two case documents. Finally, using our multimodal search, supported by the Code Interpreter, we generate answers to these questions and rate each generated answer against its ground-truth answer.

This notebook is for demo purposes only, so we knowingly accept a bias in the evaluation below: because the questions are generated from the extracted contents of the documents, they are limited to what the solution has already ingested.

### Ingest the Directory
Import Python packages, and ingest the PDF directory.

In [2]:
%load_ext autoreload
%autoreload 2


import sys
sys.path.append('..\\code')


import json  # used explicitly by later cells (json.dumps)
import os
from dotenv import load_dotenv
load_dotenv()

from IPython.display import display, Markdown, HTML
from PIL import Image
from doc_utils import *
from processor import *


def show_img(img_path, width=None):
    """Display an image inline; use an HTML <img> tag when a width is given."""
    if width is not None:
        display(HTML(f'<img src="{img_path}" width="{width}">'))
    else:
        display(Image.open(img_path))




The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [3]:
doc_path = r"sample_data\minion-tech.pdf"

ingestion_params_dict = {
        "index_name" : 'indexer_test',
        "delete_existing_output_dir": True,
        "processing_mode_pdf" : 'hybrid',
        "doc_path" : doc_path,
        'num_threads': 1
    }

pdf1 = PdfProcessor(ingestion_params_dict)
pdf1.ingest_document()


doc_path = r"sample_data\wile_e_coyote.pdf"

ingestion_params_dict = {
        "index_name" : 'indexer_test',
        "delete_existing_output_dir": False,
        "processing_mode_pdf" : 'hybrid',
        "doc_path" : doc_path,
        'num_threads': 1
    }

pdf2 = PdfProcessor(ingestion_params_dict)
pdf2.ingest_document()

Dirname ..
Doc Path:  sample_data\minion-tech.pdf
Doc Proc Directory:  ../indexer_test\minion-tech.pdf
Ingestion Directory:  ../indexer_test
Basename:  minion-tech
Extension:  .pdf
PDF Path:  ../indexer_test\minion-tech.pdf\minion-tech.pdf
Writing file to full path: d:\PROJECTS\COMPANY_PROJECTS\NTT_DATA\multimodal-rag-code-execution\indexer_test\minion-tech.pdf\minion-tech.processing_plan.txt
Writing file to full path: d:\PROJECTS\COMPANY_PROJECTS\NTT_DATA\multimodal-rag-code-execution\indexer_test\minion-tech.pdf\minion-tech.processing_plan.txt
Writing file to full path: d:\PROJECTS\COMPANY_PROJECTS\NTT_DATA\multimodal-rag-code-execution\indexer_test\minion-tech.pdf\stages\minion-tech.create_pdf_chunks.dict.txt
Writing file to full path: d:\PROJECTS\COMPANY_PROJECTS\NTT_DATA\multimodal-rag-code-execution\indexer_test\minion-tech.pdf\minion-tech.dict.txt
Writing file to full path: d:\PROJECTS\COMPANY_PROJECTS\NTT_DATA\multimodal-rag-code-execution\indexer_test\minion-tech.pdf\minion-te
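
The two ingestion calls above differ only in `doc_path` and `delete_existing_output_dir`. The sketch below builds the same parameter dictionaries in a loop. It assumes (the cell above suggests this, but the source does not state it) that only the first document should clear the existing output directory, so that later documents are added to the same index. `build_ingestion_params` is a hypothetical helper, not part of the repository; the dictionary keys mirror the ones used above.

```python
from pathlib import PureWindowsPath


def build_ingestion_params(doc_paths, index_name="indexer_test",
                           mode="hybrid", num_threads=1):
    """Return one parameter dict per document (hypothetical helper).

    Assumption: only the first document deletes the existing output
    directory, so later documents are added to the same index.
    """
    params = []
    for i, doc_path in enumerate(doc_paths):
        params.append({
            "index_name": index_name,
            "delete_existing_output_dir": i == 0,
            "processing_mode_pdf": mode,
            "doc_path": doc_path,
            "num_threads": num_threads,
        })
    return params


all_params = build_ingestion_params([
    r"sample_data\minion-tech.pdf",
    r"sample_data\wile_e_coyote.pdf",
])

# Show which document would reset the output directory.
for p in all_params:
    print(PureWindowsPath(p["doc_path"]).name, p["delete_existing_output_dir"])
```

Each dictionary could then be passed to `PdfProcessor(...)` followed by `ingest_document()`, exactly as in the cell above.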

### Generate questions based on the ingested output of the case documents

The cell below uses just five prompts (general, specialized, numerical, table, and image) to generate question/answer pairs, whose answers are treated as the "ground truth".
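
Judging from how the results are consumed later in this notebook (`qna_pairs['qna_pairs']`, `qna_pair['question']`, `qna_pair['answer']`), each prompt is expected to return JSON of roughly the shape checked below. This is an assumption based on those accesses, not the repository's documented schema. `validate_qna_pairs` is a hypothetical helper, and the sample payload is a placeholder rather than real model output.

```python
import json


def validate_qna_pairs(payload):
    """Return the list of Q&A dicts, raising ValueError on an unexpected shape."""
    pairs = payload.get("qna_pairs") if isinstance(payload, dict) else None
    if not isinstance(pairs, list):
        raise ValueError("expected a dict with a 'qna_pairs' list")
    for i, pair in enumerate(pairs):
        if not isinstance(pair, dict):
            raise ValueError(f"pair {i} is not a dict")
        missing = {"question", "answer"} - set(pair)
        if missing:
            raise ValueError(f"pair {i} is missing keys: {sorted(missing)}")
    return pairs


# Placeholder payload illustrating the assumed structure.
sample = json.loads(
    '{"qna_pairs": [{"question": "<question text>", '
    '"answer": "<ground-truth answer>"}]}'
)
pairs = validate_qna_pairs(sample)
```

A check like this could run right after each LLM call, so that a malformed response fails early instead of surfacing as a `KeyError` in the rating step.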

In [4]:
path_1 = r"..\indexer_test\minion-tech.pdf\minion-tech.txt"
path_2 = r"..\indexer_test\wile_e_coyote.pdf\wile_e_coyote.txt"

text_template = """
Gru's Document: Business Analysis Document for Minion Tech
## START OF GRU'S DOCUMENT
{gru_doc}
## END OF GRU'S DOCUMENT



Wile E. Coyote's Document: Investment Proposal for Gru's Enterprises
## START OF WILE E. COYOTE'S DOCUMENT
{wile_doc}
## END OF WILE E. COYOTE'S DOCUMENT

"""

text = text_template.format(gru_doc = read_asset_file(path_1)[0] , wile_doc = read_asset_file(path_2)[0])

past_questions = []

# One (label, template) entry per question type. past_questions accumulates
# across prompts and is passed to each subsequent prompt.
prompt_templates = [
    ("General Q&A", general_prompt_template),
    ("Specialized Q&A", specialized_prompt_template),
    ("Numerical Q&A", numerical_prompt_template),
    ("Tables Q&A", table_prompt_template),
    ("Images Q&A", image_prompt_template),
]

for label, template in prompt_templates:
    prompt = template.format(context=text, past_questions=past_questions)
    qna_pairs = recover_json(ask_LLM_with_JSON(prompt, temperature=0.5))
    past_questions.extend(qna_pairs['qna_pairs'])
    logc(label, json.dumps(qna_pairs, indent=4))



Calling OpenAI APIs with 2 messages - Model: gpt-4o - Endpoint: https://ds-openai-eastus.openai.azure.com//openai/
Messages: [{'role': 'system', 'content': 'You are a helpful assistant, who helps the user with their query. You are designed to output JSON.'}, {'role': 'user', 'content': '\n\nContext:\n## START OF CONTEXT\n\nGru\'s Document: Business Analysis Document for Minion Tech\n## START OF GRU\'S DOCUMENT\n<!-- PageHeader="2024-01-06" -->\n<!-- PageHeader="minion-tech.md" -->\n\n\n# Business Analysis Document\n\nGru\'s Enterprises: Innovation in Cartoonish "Evil" Weaponry\n\n\n<figure>\n\nGRUI\'S\nENTERPIIRSES\n\n</figure>\n\n\nAddress:\n\n123 Villain Street, Villainville, EV123\n\nContact Information:\n\n· Phone: 123-456-7890\n\n· Email: contact@grusenterprises.com\n\n· Website: www.grusenterprises.com\n\nPrepared by:\n\nGru\'s Enterprises Financial Analysis Team\n\nDate of Preparation:\n\nJanuary 6, 2024\n\n\n## Confidential and Proprietary Information\n\nThis document con

### Generate answers

Using our multimodal search, which may call on the Code Interpreter when computation is needed, we generate an answer to each question and rate the generated answer against the ground-truth answer.

In [8]:
for index, qna_pair in enumerate(past_questions):
    gen_answer, references, _, _ = search(
        qna_pair['question'], 
        top=3, 
        computation_approach = "LocalPythonExec",  ## options: "NoComputationTextOnly", "Taskweaver", "AssistantsAPI", or "LocalPythonExec"
        computation_decision = "LLM", 
        index_name = "indexer_test",
        vision_support = True,  
        count = False, 
        verbose = False)

    rating = recover_json(ask_LLM_with_JSON(rate_answers_prompt_template.format(question=qna_pair['question'], ground_truth_answer=qna_pair['answer'], generated_answer=gen_answer)))
    rating = int(rating['rating'])
    past_questions[index]['rating'] = rating
    logc("Rating for question " + str(index) + " is --> " + str(rating), 
         f"\nQuestion: {qna_pair['question']}\n{bc.OKCYAN}Ground truth answer: {qna_pair['answer']}\n{bc.MAGENTA}Generated answer: {gen_answer}\n{bc.LIGHT_RED}References: {json.dumps(references, indent=4)}\n\n\n")




Calling OpenAI APIs with 2 messages - Model: gpt-4o - Endpoint: https://ds-openai-eastus.openai.azure.com//openai/



RetryError: RetryError[<Future at 0x2052b96b1d0 state=finished raised HTTPError>]
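
The run above stopped with a `RetryError`, which aborts the whole loop. The sketch below shows one way to keep going when a single question fails and to summarize the ratings afterwards. `answer_fn` and `rate_fn` are hypothetical stand-ins for the `search(...)` call and the LLM rating call used above. The stub functions and the rating value `"7"` are arbitrary placeholders, and none of this is part of the repository.

```python
from statistics import mean


def evaluate(qna_pairs, answer_fn, rate_fn):
    """Rate each pair in place; a failure is stored as rating None plus an error."""
    for pair in qna_pairs:
        try:
            generated = answer_fn(pair["question"])
            rating = rate_fn(pair["question"], pair["answer"], generated)
            pair["rating"] = int(rating)
        except Exception as exc:  # e.g. a retry wrapper giving up on an HTTP error
            pair["rating"] = None
            pair["error"] = repr(exc)
    return qna_pairs


def summarize(qna_pairs):
    """Count rated and failed questions and average the available ratings."""
    ratings = [p["rating"] for p in qna_pairs if p.get("rating") is not None]
    return {
        "rated": len(ratings),
        "failed": len(qna_pairs) - len(ratings),
        "mean_rating": mean(ratings) if ratings else None,
    }


# Stubs standing in for the real search and rating calls.
def fake_answer(question):
    if "fail" in question:
        raise RuntimeError("simulated API failure")
    return "stub answer"


def fake_rate(question, ground_truth, generated):
    return "7"  # arbitrary placeholder rating


demo = [
    {"question": "ok question", "answer": "a"},
    {"question": "please fail", "answer": "b"},
]
summary = summarize(evaluate(demo, fake_answer, fake_rate))
print(summary)
```

Recording failures per question, rather than letting the exception propagate, means a transient API error costs one data point instead of the whole evaluation run.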