# RAG Strategy

## Strategy of the RAG System

## Data Ingestion

**Objective**: Efficiently extract text and detailed descriptions from documents filled with charts, diagrams, and images using Vision LLM, optimizing for time, cost, and accuracy.

As the input document has a complex, Power-Point like a layout. So i decided to use the Vision LLMs to extract data instead of OCR, Currently OCR's are unable to fetch data in correct layout, they fetch text line by line, which missed the exact layout or sequence of text.

1. **Conversion to Images**:
   - Convert each page of the document into an image format. This is essential for processing visual elements such as charts and diagrams effectively.
   
2. **Batch Processing**:
  - Process these images in batches to reduce time and cost. By batching the images, the system can handle multiple pages simultaneously, reducing the overall processing time.
  - Utilize asynchronous function calls to run the LLM concurrently, extracting data from each batch simultaneously. This further reduces latency. Asynchronous processing allows multiple requests to be handled at the same time, speeding up the extraction process significantly.

3. **Detailed Descriptions**:
  - Request the LLM to generate detailed descriptions for each image. These descriptions provide additional context and details about the visual content.
  - These descriptions help refine extracted data, especially for complex elements like tables which might not be correctly formatted initially. If the initial extraction misses certain details or formats, the detailed descriptions can provide the necessary context to correct and enhance the extracted data.


#### Splitting Document into Smaller Chunks

**Objective**: Maintain context within chunks while ensuring they are of a manageable size for embedding generation and further processing.

1. **Recursive Character Text Splitter**:
   - Utilize Langchain’s RecursiveCharacterTextSplitter with a token limit and priority separators ["\n\n", ".", "\n"].
   - This ensures chunks are split at logical points, maintaining context and coherence within each chunk.
   - Through experimentation, a chunk size of 100 tokens was selected. This size was found to balance the need for context with the need to manage chunk size for embedding purposes. It helps in maintaining some context in each chunk, making it easier for the model to understand and process the information.

#### Embedding Generation and Upserting Vectors into Vector DB

**Objective**: Generate and store contextual embeddings for each chunk, ensuring efficient and accurate retrieval during inference.

1. **Embedding Generation**:
   - Generate embeddings for each chunk using the "voyage-large-2-instruct" model from voyageai, known for its performance in general-purpose tasks. This model provides high-quality embeddings that capture the context and meaning of the text.
   
2. **Metadata Storage**:
   - Store chunk_id, page_no, chunk_text, and page_description as metadata for each chunk. This metadata is essential for organizing and retrieving the chunks efficiently during the inference stage.
   
3. **Vector Database**:
   - Used Pinecone for vector storage due to its low latency and hybrid search functionality. Pinecone’s capabilities ensure that the embeddings can be retrieved quickly and accurately, supporting real-time query processing.
   
4. **Future Improvements**:
   - Generate sparse embeddings of the chunks and contextual embeddings of image descriptions. This dual embedding approach can enhance the retrieval accuracy by providing multiple perspectives on the data.
  - Experiment with different combinations to determine the best retrieval accuracy. This iterative process will help in finding the optimal embedding strategy for various types of data.




# Inference

**Objective**: Efficiently retrieve relevant information in response to a query and generate accurate answers using the contextual data.

1. **Query Embedding**:
   - Generate embeddings for the input query using same embedding mode. This step ensures that the query is represented in the same vector space as the chunks, facilitating accurate similarity calculations.
   
2. **Chunk Retrieval**:
   - Retrieve the top 5 relevant chunks using cosine similarity for similarity calculations, optimized for voyageai models.
  - Filter out relevant chunks by comparing the similarity score with a pre-selected threshold (68% similarity). This threshold, determined through experimentation, ensures that only highly relevant chunks are considered for the final answer.
   
3. **Metadata Extraction**:
   - For the top 3 vectors that pass the similarity score filter, extract chunk text, page number, page description, and the image of the most relevant chunks. This comprehensive extraction ensures that all necessary context is available for answer generation.
   
4. **Structured Input for Model**:
   - Pass all this information in a well-structured format to the prompt.
   - Ensure the model receives comprehensive details related to the query, allowing it to derive context from both the chunk and the associated image and description.
   
5. **Answer Generation**:
   - Use GPT-4O for generating answers from the relevant context. This model has been selected for its superior performance in generating detailed and accurate responses based on the provided context.




# Evaluation Strategy

#### Objective
To ensure the generated answers are accurate, relevant, and free from hallucinations, a two-level evaluation strategy is employed. This strategy utilizes both automated similarity checks and manual review through a judge LLM.

#### Two-Level Evaluation

**1. Cosine Similarity Check**
- **Embedding Generation**:
  - Generate embeddings for both the generated answer and the retrieved relevant context. These embeddings capture the semantic meaning of the text, allowing for an accurate comparison.
  
- **Similarity Measurement**:
  - Measure the cosine similarity between the embeddings of the generated answer and the relevant context. Cosine similarity provides a metric for how closely the two vectors align, indicating the relevance of the generated answer to the context.
  
- **Threshold Check**:
  - Compare the similarity score against a pre-defined threshold. This threshold is determined through experimentation to balance precision and recall. If the similarity score exceeds the threshold, the answer is considered relevant and is passed to the end user.
  
- **Answer Re-generation**:
  - If the similarity score does not meet the threshold, tweak the inference prompt slightly and add more context. This iterative approach helps in refining the answer by providing additional information to the model.
  
**2. Judge LLM Verification**
- **Manual Review by Judge LLM**:
  - If the first evaluation check fails, pass the generated answer to a judge LLM. This LLM is a smaller, efficient model specifically designed for evaluating the relevance and accuracy of the generated answers.
  
- **Context and Image Verification**:
  - Feed the judge LLM with the generated answer and the relevant context, potentially including images. This comprehensive input ensures the judge LLM has all necessary information to evaluate the answer accurately.
  
- **Hallucination and Relevance Check**:
  - The judge LLM checks the answer for hallucinations and verifies its relevance to the provided context. It ensures that the answer is not only accurate but also directly related to the user's query and the given context.
  
- **Final Decision**:
  - Based on the judge LLM’s evaluation, decide whether to pass the answer to the end user or to re-generate the answer with additional context. This step provides an additional layer of assurance, ensuring high-quality responses.


###Install dependencies

In [101]:
#Install all dependencies
!pip install pdfplumber langchain pinecone tiktoken voyageai

In [2]:
import os
import time
import requests
import pdfplumber
import json
import base64
import logging
from langchain_text_splitters import RecursiveCharacterTextSplitter
import tiktoken
import uuid
from pinecone import Pinecone
import asyncio
import nest_asyncio
import aiohttp
import voyageai
from google.colab import userdata

###To run this notebook, you need to set the following API keys in Google Colab Secrets:

- OpenAI API Key
- VoyageAI API Key
- Pinecone API Key

In [87]:
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
PINECONE_API_KEY = userdata.get('PINECONE_API_KEY')
VOYAGEAI_API_KEY = userdata.get('VOYAGEAI_API_KEY')

#OR
# OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY')
# PINECONE_API_KEY = os.environ.get('PINECONE_API_KEY')
# VOYAGEAI_API_KEY = os.environ.get('VOYAGEAI_API_KEY')


In [30]:
#To access the asyncio functionality in notebook
nest_asyncio.apply()

#Initializing the logger
logger = logging.getLogger(__name__)

In [4]:
#Initializing the pinecone
PINECONE_API_KEY = PINECONE_API_KEY
PINECONE_ENV = "us-east-1"
pc = Pinecone(api_key=PINECONE_API_KEY)

In [84]:
#selecting the index
index_name = "tifin-voyage"
# connect to index
index = pc.Index(index_name)
# view index stats
index.describe_index_stats()

{'dimension': 1024,
 'index_fullness': 0.0,
 'namespaces': {'Investment_Case_For_Disruptive_Innovation': {'vector_count': 91},
                'investment_case_for_disruptive_innovation': {'vector_count': 100},
                'tifin-voyageai': {'vector_count': 100}},
 'total_vector_count': 291}

# Data Ingestion

In [7]:
def remove_duplicates_preserve_order(input_list):
    seen = set()
    unique_list = [x for x in input_list if not (x in seen or seen.add(x))]
    return unique_list

## Data Ingestion Utils

In [9]:
def read_text_file(file_path):
    """
    Read a text file
    """
    try:
        with open(file_path, "r") as file:
            content = file.read()
        return content
    except FileNotFoundError:
        logger.error(f"Error: File '{file_path}' not found.")
    except Exception:
        logger.error("An error occurred while reading the file.")

# Function to encode the image
def encode_image(image_path):
    try:
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode('utf-8')
    except (FileNotFoundError, PermissionError, IOError) as e:
        # Handle the error or log it
        logger.error(f"Error reading file {image_path}: {e}")
        return None

In [13]:
async def LLM_call(images_payload):
    """
    Makes an asynchronous call to the OpenAI API with the given payload and processes the response.

    Args:
        images_payload (str): The payload containing the input images to be sent to the OpenAI API.

    Returns:
        str: The processed response from the OpenAI API. If an error occurs, returns "Failed to Transcribe".
    """

    url = "https://api.openai.com/v1/chat/completions"
    # Model to be used for the API call
    model = "gpt-4o-mini"
    # Fetch the OpenAI API key from user data
    api_key = OPENAI_API_KEY

    # Prepare the payload for the API call
    payload = {
        "model": model,
        "response_format": {"type": "json_object"},
        "messages": [{"role": "user", "content": images_payload}],
        "max_tokens": 4000
    }

    # Set the headers for the API request
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }

    # Make an asynchronous POST request to the OpenAI API
    async with aiohttp.ClientSession() as session:
        async with session.post(url, headers=headers, json=payload) as response:
            response_data = await response.json()  # Get the response data in JSON format
            try:
                # Extract the response content
                resp = response_data['choices'][0]['message']['content']
                # Clean the response content
                if 'json' in resp or '```' in resp:
                    resp = resp.replace('json\n', '').replace('```\n', "").replace(
                        '\n```', "").replace('```', '').replace('json', '').replace('<<emoji>>', '')
                    logger.info("Lease doc response content cleaned")
            except Exception as error:
                # Log any errors that occur during processing
                logger.error(f"Error while cleaning: {error}")
                resp = "Failed to Transcribe"

    return resp


In [31]:
#Prompt to extract textual data from the images and also generate the description of the image of each page
extract_data_from_images_prompt = """You are a highly capable OCR model that pays great attention to details. Your task is to extract all the text from the input image while maintaining the original layout and also generate a description of the input image if it contains any graph or chart. Output should be in JSON format. Follow these instructions carefully:

1. Preserve the original structure, especially in cases when is text arranged in boxes or blocks.
2. Ensure all paragraphs are extracted from each section without missing any.
3. Do not miss any blocks if the image contains a block layout.
4. If the input contains any graph or chart, please explain that under the description key.
5. The description must cover all important aspects of the graph or charts or tables.
6. Description should not be longer than 250 tokens
7. Take your time to analyze your output step by step.
8. Output the complete extracted text and graph description.
9. Format the output as follows:
   json
   {
     "page1": {
       "text": "all extracted text",
       "image_description": "description"
     },
     "page2": {
       "text": "all extracted text",
       "image_description": "description"
     }
     ...
   }
"""

In [15]:
#Process a document asynchronously by converting its pages to images, transcribing the content using LLM, and storing the content into the json file
async def process_input_document(input_doc_path, extract_data_from_images_prompt,images_folder_path):
    """
    Process a document by converting its pages to images, transcribing the content using OpenAI's GPT model,
    and aggregating the transcriptions.

    Args:
        doc_path (str): Path to the PDF document.
        full_prompt (str): Prompt text to be included in the request to the OpenAI API.

    Returns:
        tuple: A tuple containing the full transcription as a string and a JSON object with batch-wise transcriptions.
    """
    start = time.time()
    transcription_for_pages = ""
    transcription_for_pages_json = {}

    doc_name = input_doc_path.split('/')[-1].split(".")[0].replace(" ","_").lower()

    try:

        # Check if the upload folder exists, if not, create it
        images_folder_path = f"{images_folder_path}/{doc_name}"
        os.makedirs(images_folder_path, exist_ok=True)
        # os.makedirs(output_data_file_name,exist_ok=True)


        print("Procesing the input document asynchronously in the batchs...")
        images_payload = [{"type": "text", "text": extract_data_from_images_prompt}]
        batch = 1
        tasks = []
        with pdfplumber.open(input_doc_path) as pdf:
            for i, page in enumerate(pdf.pages):
                # if i<4:
                k = i + 1
                # Convert PDF page to image
                image = page.to_image(100)
                # Save image to output folder
                image_path = f"{images_folder_path}/page_{k}.png"
                image.save(image_path)

                # Getting the base64 string
                base64_image = encode_image(image_path)
                req_content = {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{base64_image}",
                        "detail": "high"
                    }
                }

                images_payload.append(req_content)

                # Process in batches of 2 images
                if k % 2 == 0:

                    print(f"Processing batch#: {batch}")
                    # print("Successfully converted PDF to images", flush=True)

                    tasks.append(asyncio.create_task(LLM_call(images_payload)))
                    images_payload = [{"type": "text", "text": extract_data_from_images_prompt}]

                    batch += 1

        final_resp = await asyncio.gather(*tasks)
        for batch,resp in enumerate(final_resp):
            transcription_for_pages += "\n\n" + resp

            try:
                transcription_for_pages_json[f"batch{batch}"] = json.loads(resp)
            except Exception as error:
                logger.error(f"Error while parsing JSON: {error}")


    except Exception as error:
        logger.error(f"Error: {error}")
        resp = "Failed to Transcribe"

    transcription_for_pages_json_sorted = {}
    page = 1

    for pages in transcription_for_pages_json.values():
        for i in range(1, 3):
            transcription_for_pages_json_sorted[f"page_{page}"] = pages[f'page{i}']
            page += 1


    try:
      output_data_file_name = f"{doc_name}.json"
      # Open the file in write mode and store the JSON data
      with open(output_data_file_name, 'w') as json_file:
          json.dump(transcription_for_pages_json_sorted, json_file, indent=4)
          print(f"Extracted Data is stored in {output_data_file_name} file")
    except Exception as error:
      logger.error(f"Error while saving the document. Error: {error}")

    logger.info(f"Lease agreement processing time: {round(float(time.time() - start), 2)} secs")
    return transcription_for_pages, transcription_for_pages_json_sorted


In [16]:
# Call the process process_input_document with the specified attributes
input_doc_path = "/content/Investment Case For Disruptive Innovation.pdf"
images_folder_path = "images"
doc_name = input_doc_path.split('/')[-1].split(".")[0].replace(" ","_").lower()
final_transcription,final_transcription_json = await process_input_document(input_doc_path, extract_data_from_images_prompt,images_folder_path)


Procesing the input document asynchronously in the batchs...
Processing batch#: 1
Processing batch#: 2
Processing batch#: 3
Processing batch#: 4
Processing batch#: 5
Processing batch#: 6
Processing batch#: 7
Processing batch#: 8
Processing batch#: 9
Processing batch#: 10
Processing batch#: 11
Extracted Data is stored in investment_case_for_disruptive_innovation.json file


### Chunking

In [34]:
#json reading
output_data_file_name = f"{doc_name}.json"
with open(output_data_file_name, 'r') as file:
    data = json.load(file)

In [35]:
#Intializing the recursive character Text splitter
# Create the text splitter
enc = tiktoken.get_encoding("cl100k_base")
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-4",
    chunk_size=100,
    chunk_overlap=0,
    is_separator_regex=["\n\n", ".", "\n"],
)

In [68]:
total_no_chunks = 0
documents = []
metadata = []
ids = []
for page_no, content in data.items():
    page_no = page_no.split("_")[-1]
    if page_no == '1':
        continue
    # Split the text
    text = content['text']
    description = content["image_description"]

    text_to_embed = text
    chunks = text_splitter.split_text(text_to_embed)
    print(f"\nPage#{page_no}: total no of chunks#:{len(chunks)}")
    total_no_chunks = total_no_chunks + len(chunks)
    for c_id,chunk in enumerate(chunks):
        print(f"  Total tokens in chunk#{c_id+1}: {len(enc.encode(chunk))}")
        documents.append(chunk)
        chunk_metadata = {"chunk_id":c_id+1,
                          "chunk_text":chunk,
                          "text":text,
                          "description":description,
                          "page_no":page_no}
        metadata.append(chunk_metadata)
        ids.append(str(uuid.uuid4()))



Page#2: total no of chunks#:4
  Total tokens in chunk#1: 11
  Total tokens in chunk#2: 94
  Total tokens in chunk#3: 91
  Total tokens in chunk#4: 39

Page#3: total no of chunks#:16
  Total tokens in chunk#1: 14
  Total tokens in chunk#2: 3
  Total tokens in chunk#3: 99
  Total tokens in chunk#4: 9
  Total tokens in chunk#5: 3
  Total tokens in chunk#6: 98
  Total tokens in chunk#7: 37
  Total tokens in chunk#8: 2
  Total tokens in chunk#9: 99
  Total tokens in chunk#10: 36
  Total tokens in chunk#11: 2
  Total tokens in chunk#12: 99
  Total tokens in chunk#13: 8
  Total tokens in chunk#14: 4
  Total tokens in chunk#15: 99
  Total tokens in chunk#16: 15

Page#4: total no of chunks#:5
  Total tokens in chunk#1: 49
  Total tokens in chunk#2: 98
  Total tokens in chunk#3: 3
  Total tokens in chunk#4: 99
  Total tokens in chunk#5: 20

Page#5: total no of chunks#:3
  Total tokens in chunk#1: 82
  Total tokens in chunk#2: 99
  Total tokens in chunk#3: 50

Page#6: total no of chunks#:3
  Tot

## Upsert chunk vectors into pinecone

In [71]:

def generate_embeddings(documents,input_type):

    # Initialize Voyageai
    voyageai.api_key = VOYAGEAI_API_KEY
    vo = voyageai.Client()

    # Generate embeddings
    batch_size = 128
    embeddings = []

        # logger.info(f"Generating embeddings of {len(documents)} documents...")
    for i in range(0, len(documents), batch_size):
        embeddings += vo.embed(
            documents[i:i + batch_size], model="voyage-large-2-instruct", input_type=input_type
        ).embeddings

    return embeddings

['investment_case_for_disruptive_innovation',
 'tifin-voyageai',
 'Investment_Case_For_Disruptive_Innovation']

In [96]:
#VoyageAI embedding push
def data_insertion_to_VectorDB(documents,metadata,ids,doc_name):
    #generating the data embeddings
    voyage_embs = generate_embeddings(documents,"document")

    #deleting if namespace already created
    namespaces = index.describe_index_stats()['namespaces']
    namespaces = list(namespaces.keys())
    if doc_name in namespaces:
      print("First deleting all vectors in the namespace")
      index.delete(delete_all=True, namespace=doc_name)

    print(f"Now inserting vectors into the {doc_name} namespace")
    batch_size = 256
    for i in range(0, len(voyage_embs), batch_size):
        i_end = min(i+batch_size, len(voyage_embs))
        batch_ids,batch_embeddings,batch_metadata = ids[i:i_end],voyage_embs[i:i_end],metadata[i:i_end]
        to_upsert = list(zip(batch_ids, batch_embeddings, batch_metadata))
        index.upsert(vectors=to_upsert, namespace=doc_name)
        print(index.describe_index_stats())

In [None]:
print(index.describe_index_stats())

In [97]:
#To delete vectors from the index from specific namespace
# index.delete(delete_all=True, namespace=doc_name)

In [98]:
#Generating embs of the chunks and storing them into Vector DB
data_insertion_to_VectorDB(documents,metadata,ids,doc_name)

deleting all vectors in the namespace
{'dimension': 1024,
 'index_fullness': 0.0,
 'namespaces': {'Investment_Case_For_Disruptive_Innovation': {'vector_count': 91},
                'tifin-voyageai': {'vector_count': 100}},
 'total_vector_count': 191}


In [99]:
index.describe_index_stats()

{'dimension': 1024,
 'index_fullness': 0.0,
 'namespaces': {'Investment_Case_For_Disruptive_Innovation': {'vector_count': 91},
                'investment_case_for_disruptive_innovation': {'vector_count': 100},
                'tifin-voyageai': {'vector_count': 100}},
 'total_vector_count': 291}

# Inference

In [79]:
def inference(query, QA_prompt, images_folder_path, doc_name, pages, descriptions, chunk_limits):
    """
    Performs inference by sending a query and associated image data to the OpenAI API and returns the response.

    Args:
        query (str): The user's specific question.
        QA_prompt (str): The prompt template for generating the query.
        images_folder_path (str): The folder path where images are stored.
        doc_name (str): The name of the document containing the images.
        pages (list): A list of page numbers to be processed.
        descriptions (list): A list of descriptions for each image.
        chunk_limits (list): A list of chunk limits to be included in the prompt.

    Returns:
        str: The processed response content from the OpenAI API.
    """

    # Merge all image descriptions into a single string
    merged_description = ""
    for i, desc in enumerate(descriptions):
        merged_description = merged_description + f"\nImage_{i+1} description:\n{desc}\n\n"

    # Replace placeholders in the QA prompt with actual query and descriptions
    QA_prompt = QA_prompt.replace("__query__", query).replace("__description__", merged_description).replace("__chunks__", "\n\n".join(chunk_limits))

    # Get the API key from user data
    api_key = OPENAI_API_KEY
    model = "gpt-4o"
    # Set the headers for the API request
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }

    # Set the path to the folder containing the images
    images_folder_path = f"{images_folder_path}/{doc_name}"

    # Initialize the payload with the QA prompt
    images_payload = [
        {
            "type": "text",
            "text": f"{QA_prompt}"
        }
    ]

    # Process each page, convert it to base64, and add it to the payload
    for page_no in pages:
        image_path = f"{images_folder_path}/page_{page_no}.png"
        base64_image = encode_image(image_path)
        req_content = {
            "type": "image_url",
            "image_url": {
                "url": f"data:image/png;base64,{base64_image}",
                "detail": "high"
            }
        }
        # print("Successfully converted PDF to images", flush=True)
        images_payload.append(req_content)

    # Prepare the final payload for the API request
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": images_payload
            }
        ],
        "max_tokens": 1000
    }

    # Make the API request
    response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)
    # print(f"Uncleaned Response:\n\n{response}", flush=True)

    # Extract and return the content of the response
    return response.json()['choices'][0]['message']['content']


In [80]:
def retriever(query,doc_name,verbose):
    query_emb = generate_embeddings([query],"query")
    resp = index.query(namespace=doc_name,
                            vector=query_emb[0],
                            top_k=5,
                            include_metadata=True,
                            include_values=False
                        )
    len(resp['matches'])
    # print(resp["matches"])

    #Vector Database response cleaning
    pages = []
    page_text = []
    chunk_texts = []
    descriptions = []
    chunk_ids = []

    for match  in resp['matches']:
        if match['score']>0.680:
            metadata = match["metadata"]
            pages.append(metadata['page_no'])
            page_text.append(metadata['text'])
            descriptions.append(metadata['description'])
            chunk_texts.append(metadata['chunk_text'])
            chunk_ids.append(int(metadata['chunk_id']))

    limit = 3
    print(f"Relevant all pages: {pages}")
    pages = remove_duplicates_preserve_order(pages)[:limit]
    descriptions = remove_duplicates_preserve_order(descriptions)[:limit]
    total_inputs_vectors = len(pages)
    print(f"Length of input vectors: {total_inputs_vectors}")
    print(f"Unique Relevant pages: {pages}")
    merged_chunk_texts = '\n\n'.join(chunk_texts)
    merged_descriptions =  '\n\n'.join(descriptions)
    if verbose:
        print(f"\nRelevant Chunks:\n {merged_chunk_texts}\n\n")
        print(f"\nRelevant Descriptions:\n {merged_descriptions}\n\n")
    return pages,page_text,chunk_texts[:limit],descriptions,chunk_ids

In [81]:
QA_prompt = """You are a helpful assistant, your task is to respond to user queries in a clear, and helpful manner, following a U.S. language style. You will be provided with key details like **context**, **query**, **images_descriptions** and **images** to craft your responses, focusing specifically on what the user has asked. Follow the guidelines while responding to user queries.

### Information Provided:
- **Context**: Relevant information.
- **Query**: The user’s specific question.
- **images**: images containing relevant information to query.
- **images_descriptions**: Detailed description of the images

Instructions to follow:

1. Take your time, carefully analyze the input images and the accompanying description and context to extract necessary details.
2. If the required information is not present in either the input images or the description, refrain from answering the query just say something like provided content not enough information to give answer to the query.
3. Ensure your answer is concise and directly addresses the query.
4. Do not include any additional information or context outside of what is provided in the input images and description and context.
5. Take your time to analyze carefully all the input information especially images.
6. Analyze all your steps before giving the final answer.

Input:

    Query: __query__

    Description: __description__

    Context: __chunks__

Output:

    Answer:"""

In [82]:
questions = [
    "What is the core objective of investing in disruptive innovation according to ARK?",
    "What are the significant risks associated with investing in innovation as highlighted by ARK?",
    "Can you list the converging innovation platforms identified by ARK?",
    "How does ARK describe the impact of Artificial Intelligence on technology's integration into economic sectors?",
    "What transformative potential does Multiomic Sequencing hold according to ARK?",
    "What are the implications of declining battery technology costs as outlined by ARK?",
    "How is the field of Robotics anticipated to evolve with the advancements in AI?",
    "What does the ARK's Convergence Scoring Framework illustrate about innovation platforms?",
    "How do neural networks serve as a catalyst for other technologies?",
    "What unique view does ARK have towards Autonomous Mobility and its market potential?",
    "How do AI Chatbots contribute to the development of robotaxis?",
    "What are breakthroughs in DNA Sequencing, particularly with neural networks?",
    "How does the application of AI language models in robotics enhance general task completion rates?",
    "In what ways are battery advances critical to the future of intelligent devices and augmented reality?",
    "How do reusable rockets contribute to global connectivity?",
    "What economic implications do disruptive innovations have according to ARK?",
    "What are the top 10 holdings of ARK Innovation ETF (ARKK)?",
    "What thematic strategies do ARK ETFs focus on?",
    "What is ARK's strategy for capturing the benefits of disruptive innovation in its investment approach?",
    "How does ARK ensure its investment strategies align with reality of disruptive innovation trends?"
]
query = questions[9]
print(f"Question: {query}")

Question: What unique view does ARK have towards Autonomous Mobility and its market potential?


In [83]:

verbose = False   # if you want he input relevant chunks and input image descriptions then set verbose=True
pages,page_text,chunk_texts,descriptions,chunk_ids = retriever(query,doc_name,verbose)
answer = inference(query,QA_prompt,images_folder_path,doc_name,pages,descriptions,chunk_texts)
print(f"\n\nQuestion:\n{query}\n\nAnswer:\n{answer}")

Relevant all pages: ['3', '6', '5', '11', '2']
Length of input vectors: 3
Unique Relevant pages: ['3', '6', '5']


Question:
What unique view does ARK have towards Autonomous Mobility and its market potential?

Answer:
ARK's unique view on Autonomous Mobility highlights that declining costs of advanced battery technology will enable a significant expansion in form factors, facilitating systems that drastically reduce the cost of transporting people and goods. This cost decline is expected to unlock micro-mobility and aerial systems, such as flying taxis, transforming city landscapes. ARK anticipates that autonomy will reduce the costs for taxi, delivery, and surveillance services by an order of magnitude, promoting frictionless transport, boosting e-commerce speed, and reducing individual car ownership.
