### Querying a

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

False
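
The `False` above indicates that `load_dotenv()` did not load any variables, typically because no `.env` file was found. A minimal sketch of a guard that fails early in that case is shown below. It assumes the Groq client reads its key from an environment variable named `GROQ_API_KEY`; the helper `missing_env_vars` is a hypothetical name introduced only for this example.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# Assumption: the Groq LLM used later expects GROQ_API_KEY to be set.
missing = missing_env_vars(["GROQ_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running a check like this before building the index avoids discovering a missing key only after the embedding step has finished.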

In [2]:
import nest_asyncio

nest_asyncio.apply()
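
As a rough illustration of why `nest_asyncio.apply()` is called in a notebook: Jupyter already runs an event loop, and plain `asyncio` refuses to start a second one inside it. The sketch below reproduces that refusal with the standard library only; `inner` and `outer` are hypothetical names, and `outer` merely stands in for code running inside the notebook's loop.

```python
import asyncio


async def inner():
    return "done"


async def outer():
    # We are already inside a running loop here, as in a notebook cell.
    coro = inner()
    try:
        asyncio.run(coro)  # nested call, rejected without nest_asyncio
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```

`nest_asyncio` patches the loop so that nested calls of this kind are allowed, which matters when a library starts its own loop from inside a notebook cell.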

In [3]:

from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import QueryEngineTool
from llama_index.core import SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core import Settings
from llama_index.llms.groq import Groq

# bge-base embedding model
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")

# Llama 3.1 70B served through Groq
llm = Groq(model="llama-3.1-70b-versatile")
Settings.llm = llm

def get_doc_tools(file_path: str) -> QueryEngineTool:
    """Build a vector-search query tool over the document at `file_path`."""

    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()

    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)

    # summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes)

    # summary_query_engine = summary_index.as_query_engine(
    #     response_mode="tree_summarize",
    #     use_async=True,
    # )
    vector_query_engine = vector_index.as_query_engine()

    # summary_tool = QueryEngineTool.from_defaults(
    #     query_engine=summary_query_engine,
    #     description=(
    #         "Useful for summarization questions related to paul_graham_essay"
    #     ),
    # )

    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=(
            f"Useful for retrieving specific context from {file_path}."
        ),
    )

    # return(vector_tool, summary_tool)
    return vector_tool
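
Conceptually, the vector query engine built above embeds the question, compares it with the embedded chunks, and passes the closest chunks to the LLM. The following is a minimal sketch of that ranking step, not LlamaIndex's actual implementation: the two-dimensional vectors are toy stand-ins for bge embeddings, and `cosine`, `top_k`, and `k=2` are illustrative choices.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, chunk_vecs, k=2):
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy embeddings standing in for three document chunks.
chunks = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
ranked = top_k([1.0, 0.1], chunks)
print([idx for idx, _ in ranked])
```

Here the query vector points almost along the first axis, so chunk 0 ranks first and chunk 1 second, while chunk 2 is dropped.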

In [4]:
from llama_index.core.agent import AgentRunner, FunctionCallingAgentWorker

vector_tool = get_doc_tools("data/codet_avaliacao.pdf")

agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool], 
    llm=llm, 
    verbose=True
)

agent = AgentRunner(agent_worker)

In [5]:
response = agent.chat(
    """
    Answer the following questions about CodeT:
    - What is the problem CodeT is solving?
    - Why is solving this problem important?
    - What is CodeT's contribution?
    - Explain CodeT method
    - How does CodeT perform compared to other frameworks?
    """
)

response

Added user message to memory: 
    Answer the following questions about CodeT:
    - What is the problem CodeT is solving?
    - Why is solving this problem important?
    - What is CodeT's contribution?
    - Explain CodeT method
    - How does CodeT perform compared to other frameworks?
    
=== Calling Function ===
Calling function: query_engine_tool with args: {"input": "What is the problem CodeT is solving?"}
=== Function Output ===
CodeT is solving the problem of code generation and test case generation for programming problems, with the goal of improving the accuracy and efficiency of code generation models.
=== Calling Function ===
Calling function: query_engine_tool with args: {"input": "Why is solving the problem CodeT is addressing important?"}
=== Function Output ===
Solving the problem that CodeT is addressing is important because it can improve the performance of code generation models in solving difficult programming problems. The problem of code generation is significan

AgentChatResponse(response="CodeT is solving the problem of code generation and test case generation for programming problems, with the goal of improving the accuracy and efficiency of code generation models. Solving the problem that CodeT is addressing is important because it can improve the performance of code generation models in solving difficult programming problems. CodeT's contribution is that it can significantly improve the performance of code generation models by distinguishing correct code solutions, especially when the models have not fully understood the semantics of the example input-output cases provided in the contexts. CodeT method is based on test-driven execution agreement, which takes test case information into consideration. It uses a dual execution agreement approach to improve the performance of pre-trained language models. CodeT consistently outperforms other frameworks, such as AlphaCode-C, on various benchmarks, including HumanEval, MBPP, APPS, and CodeContest

In [6]:
print(response.source_nodes[0].get_content(metadata_mode="all"))

page_label: 9
file_name: codet_avaliacao.pdf
file_path: data\codet_avaliacao.pdf
file_type: application/pdf
file_size: 1272933
creation_date: 2024-10-29
last_modified_date: 2024-10-29

Published as a conference paper at ICLR 2023
of import statements, while the remaining problems are attributed to the failure of the model to
understand the problem descriptions. Figure 5b shows an error case caused by ambiguity. The
correct understanding of the description “sum(first index value, last index value)” is to add the first
and last values, while the code solutions that sum all values from the first to the last are ranked top
1. More real cases can be found in Appendix J. And hope the error analysis can provide inspiration
for future studies on improving code generation for more difficult programming problems.
5 R ELATED WORK
Code Generation with Large Models Recently, a number of large pre-trained language mod-
els have been proposed for code generation. Benefiting from billions of trainable
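
The cell above prints only the first source node. A small sketch for listing every retrieved chunk with its file, page, and similarity score follows. It assumes each item in `response.source_nodes` exposes `.score` and `.node.metadata`, and that the metadata uses the `file_name` and `page_label` keys seen in the output above. `summarize_sources` is a hypothetical helper, and the stand-in object with a made-up score exists only so the example runs without the notebook state.

```python
from types import SimpleNamespace


def summarize_sources(source_nodes):
    """Return (file_name, page_label, score) for each retrieved chunk."""
    rows = []
    for item in source_nodes:
        meta = item.node.metadata
        rows.append((meta.get("file_name"), meta.get("page_label"), item.score))
    return rows


# Stand-in for response.source_nodes; the score is a placeholder, not a real result.
fake_nodes = [
    SimpleNamespace(
        score=0.5,
        node=SimpleNamespace(
            metadata={"file_name": "codet_avaliacao.pdf", "page_label": "9"}
        ),
    )
]
for row in summarize_sources(fake_nodes):
    print(row)
```

In the notebook itself, the call would be `summarize_sources(response.source_nodes)`.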