In [1]:
from llama_index.core import StorageContext, load_index_from_storage
from constants import embed_model

storage_context = StorageContext.from_defaults(persist_dir="index/")
index = load_index_from_storage(storage_context, embed_model=embed_model)

In [2]:
from llama_index.core.tools import QueryEngineTool
from constants import llm_model

# Retrieve the 5 most similar chunks per query and answer with llm_model
query_engine = index.as_query_engine(llm=llm_model, similarity_top_k=5)
rag_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="research_paper_query_engine_tool",
    description="A RAG engine with recent research papers.",
)
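`similarity_top_k=5` asks the retriever for the five chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea with plain cosine similarity over toy vectors. It is not LlamaIndex's implementation, and the names `cosine` and `top_k` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=5):
    """Return the k (chunk_id, score) pairs most similar to query_vec, best first."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A larger `k` gives the LLM more context per query at the cost of longer prompts.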

In [3]:
from IPython.display import Markdown, display

def display_prompt_dict(prompts_dict):
    for key, prompt in prompts_dict.items():
        display(Markdown(f"**Prompt key**: {key}"))
        print(prompt.get_template())

In [4]:
prompts_dict = query_engine.get_prompts()
display_prompt_dict(prompts_dict)

**Prompt key**: response_synthesizer:text_qa_template

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 


**Prompt key**: response_synthesizer:refine_template

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer: 
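The two templates printed above are plain strings with named placeholders. The sketch below shows one way they could be chained: the first retrieved chunk goes through the QA template, and each later chunk is folded in through the refine template. It uses `str.format` instead of LlamaIndex's prompt classes, handles one chunk per call (the library's default mode is understood to pack several chunks into each prompt, which is omitted here), and `synthesize` and the `llm` callable are hypothetical stand-ins.

```python
TEXT_QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

REFINE_TEMPLATE = (
    "The original query is as follows: {query_str}\n"
    "We have provided an existing answer: {existing_answer}\n"
    "We have the opportunity to refine the existing answer (only if needed) "
    "with some more context below.\n"
    "------------\n"
    "{context_msg}\n"
    "------------\n"
    "Given the new context, refine the original answer to better answer the query. "
    "If the context isn't useful, return the original answer.\n"
    "Refined Answer: "
)


def synthesize(query_str, chunks, llm):
    """Answer from the first chunk, then refine once per remaining chunk.

    `llm` is any callable mapping a prompt string to a completion string.
    Returns the final answer and the list of prompts that were sent.
    """
    prompts = []
    answer = None
    for chunk in chunks:
        if answer is None:
            prompt = TEXT_QA_TEMPLATE.format(context_str=chunk, query_str=query_str)
        else:
            prompt = REFINE_TEMPLATE.format(
                query_str=query_str, existing_answer=answer, context_msg=chunk
            )
        prompts.append(prompt)
        answer = llm(prompt)
    return answer, prompts
```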


In [5]:
from tools import download_pdf, fetch_arxiv_papers
from llama_index.core.tools import FunctionTool

download_pdf_tool = FunctionTool.from_defaults(
    download_pdf,
    name="download_pdf_file_tool",
    description="Downloads a PDF file from a URL and saves it under the given file name."
)

fetch_arxiv_tool = FunctionTool.from_defaults(
    fetch_arxiv_papers,
    name="fetch_from_arxiv",
    description="Fetches the `max_results` most recent arXiv papers on the topic given by `title`."
)
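The `tools` module imported in this cell is not shown in the notebook. The following is a hypothetical sketch of what its two functions could look like, using only the standard library. The parameter names `pdf_url`, `output_file_name`, `title` and `max_results` come from the tool description and the agent's Action Input in this notebook. The `output_dir` parameter, the helper names, and the use of the arXiv Atom API at `export.arxiv.org` are assumptions.

```python
import os
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the arXiv feed
ARXIV_API = "http://export.arxiv.org/api/query"


def download_pdf(pdf_url, output_file_name, output_dir="papers"):
    """Download a PDF into output_dir and return a status message for the agent."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, output_file_name)
    try:
        with urllib.request.urlopen(pdf_url, timeout=30) as resp, open(path, "wb") as out:
            out.write(resp.read())
    except OSError as exc:  # URLError and timeouts are OSError subclasses
        return f"Failed to download PDF: {exc}"
    return f"PDF downloaded successfully and saved as '{path}'."


def build_arxiv_query_url(title, max_results=5):
    """Build a query URL for the newest submissions matching `title`."""
    params = {
        "search_query": f"all:{title}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_arxiv_feed(xml_text):
    """Extract title, summary, authors and PDF link from an arXiv Atom feed."""
    papers = []
    for entry in ET.fromstring(xml_text).iter(f"{ATOM}entry"):
        pdf_links = [
            link.get("href")
            for link in entry.iter(f"{ATOM}link")
            if link.get("title") == "pdf"
        ]
        papers.append({
            # Collapse the line breaks the feed inserts into long titles and abstracts
            "title": " ".join(entry.findtext(f"{ATOM}title", "").split()),
            "summary": " ".join(entry.findtext(f"{ATOM}summary", "").split()),
            "authors": [a.findtext(f"{ATOM}name", "") for a in entry.iter(f"{ATOM}author")],
            "pdf_url": pdf_links[0] if pdf_links else None,
        })
    return papers


def fetch_arxiv_papers(title, max_results=5):
    """Query arXiv over the network and return a list of paper dicts."""
    url = build_arxiv_query_url(title, max_results)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_arxiv_feed(resp.read().decode("utf-8"))
```

Returning a status string rather than raising lets the agent read a failure as an Observation and decide what to do next.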

In [6]:
from llama_index.core.agent import ReActAgent

agent = ReActAgent.from_tools([rag_tool, download_pdf_tool, fetch_arxiv_tool], llm=llm_model, verbose=True)

In [7]:
query_template = """I am interested in {topic}
Find papers in your knowledge database related to this topic.
Use the following template to query research_paper_query_engine_tool tool: 'Provide title, summary, authors and link to download for papers related to {topic}'.
If there are none, could you fetch the recent ones from arxiv?
"""

In [8]:
answer = agent.chat(query_template.format(topic="Multi-Modal Models"))

> Running step 70fde91c-6fd4-44d6-b53a-956e6b040fc8. Step input: I am interesting in Multi-Modal Models
Find papers in your knowledge database related to this topic.
Use the following template to query research_paper_query_engine_tool tool: 'Provide title, summary, authors and link to download for papers related to Multi-Modal Models'.
If there are not, could you fetch the recent one from arxiv?

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: research_paper_query_engine_tool
Action Input: {'input': 'Provide title, summary, authors and link to download for papers related to Multi-Modal Models'}
Observation: Title: OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation  
Summary: In this report, we present OpenUni, a simple, lightweight, and fully open-source baseline for unifying multimodal understanding and generation.  
Authors: Size Wu, Zhonghua Wu, Zerui Gong, Qin

In [9]:
Markdown(answer.response)

Here are some recent papers related to Multi-Modal Models:

1. **Title**: OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation  
   **Summary**: In this report, we present OpenUni, a simple, lightweight, and fully open-source baseline for unifying multimodal understanding and generation.  
   **Authors**: Size Wu, Zhonghua Wu, Zerui Gong, Qingyi Tao, Sheng Jin, Qinyue Li, Wei Li, Chen Change Loy  
   **PDF URL**: [Download PDF](http://arxiv.org/pdf/2505.23661v1)

2. **Title**: VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos  
   **Summary**: MLLMs have been widely studied for video question answering recently. However, most existing assessments focus on natural videos, overlooking synthetic videos, such as AI-generated content (AIGC).  
   **Authors**: Tingyu Song, Tongyan Hu, Guo Gan, Yilun Zhao  
   **PDF URL**: [Download PDF](http://arxiv.org/pdf/2505.23693v1)

3. **Title**: Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence  
   **Summary**: Recent advancements in Multimodal Large Language Models (MLLMs) have significantly enhanced performance on 2D visual tasks. However, improving their spatial intelligence remains a challenge.  
   **Authors**: Diankun Wu, Fangfu Liu, Yi-Hsin Hung, Yueqi Duan  
   **PDF URL**: [Download PDF](http://arxiv.org/pdf/2505.23747v1)

4. **Title**: Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model  
   **Summary**: Unified generation models aim to handle diverse tasks across modalities -- such as text generation, image generation, and vision-language reasoning -- within a single architecture and decoding paradigm.  
   **Authors**: Qingyu Shi, Jinbin Bai, Zhuoran Zhao, Wenhao Chai, Kaidong Yu, Jianzong Wu, Shuangyong Song, Yunhai Tong, Xiangtai Li, Xuelong Li, Shuicheng Yan  
   **PDF URL**: [Download PDF](http://arxiv.org/pdf/2505.23606v1)

5. **Title**: A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis  
   **Summary**: Endoscopic procedures are essential for diagnosing and treating internal diseases, and multi-modal large language models (MLLMs) are increasingly applied to assist in endoscopy analysis.  
   **Authors**: Shengyuan Liu, Boyun Zheng, Wenting Chen, Zhihao Peng, Zhenfei Yin, Jing Shao, Jiancong Hu, Yixuan Yuan  
   **PDF URL**: [Download PDF](http://arxiv.org/pdf/2505.23601v1)

In [10]:
answer = agent.chat("Download all the papers you mentioned.")

> Running step ce2eeba5-c0a0-4185-9d2b-0c27ee03d72d. Step input: Download all the papers you mentioned.
Thought: I need to download multiple PDF files from the provided URLs. I will do this one by one for each paper.
Action: download_pdf_file_tool
Action Input: {'pdf_url': 'http://arxiv.org/pdf/2505.23661v1', 'output_file_name': 'OpenUni.pdf'}
Observation: PDF downloaded successfully and saved as 'papers/OpenUni.pdf'.
> Running step f064c543-b706-4bcf-a41b-805a65b52021. Step input: None
Thought: (Implicit) I can answer without any more tools!
Answer: Action: download_pdf_file_tool  
Action Input: {'pdf_url': 'http://arxiv.org/pdf/2505.23693v1', 'output_file_name': 'VF-Eval.pdf'}

In [11]:
Markdown(answer.response)

Action: download_pdf_file_tool  
Action Input: {'pdf_url': 'http://arxiv.org/pdf/2505.23693v1', 'output_file_name': 'VF-Eval.pdf'}
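The response above is not a real answer: the agent emitted its next tool call as final text, so only the first paper was downloaded. One way to catch this case is sketched below. It assumes the leaked text keeps the `Action:` / `Action Input:` layout seen in this run, and the helper names and the `tools` dictionary (tool name to plain Python callable) are hypothetical.

```python
import ast
import re

# Matches a tool name followed by a dict literal, as printed in the agent log
ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>\S+)\s*Action Input:\s*(?P<args>\{.*\})", re.DOTALL
)


def parse_unexecuted_action(response_text):
    """Return (tool_name, kwargs) if the text is a leaked tool call, else None."""
    match = ACTION_RE.search(response_text)
    if match is None:
        return None
    try:
        # The log prints a Python dict with single quotes, so json.loads would fail
        kwargs = ast.literal_eval(match.group("args"))
    except (ValueError, SyntaxError):
        return None
    if not isinstance(kwargs, dict):
        return None
    return match.group("tool"), kwargs


def run_leaked_action(response_text, tools):
    """Execute a leaked tool call by hand; returns None if nothing was leaked."""
    parsed = parse_unexecuted_action(response_text)
    if parsed is None:
        return None
    name, kwargs = parsed
    tool = tools.get(name)
    return tool(**kwargs) if tool is not None else None
```

For example, `run_leaked_action(answer.response, {"download_pdf_file_tool": download_pdf})` would perform the download the agent skipped. A simpler alternative is to ask the agent for one download per chat turn.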

In [12]:
answer = agent.chat(query_template.format(topic="The history of Soccer"))

> Running step b3a77f3e-3fb2-4942-b1ca-8969a6d40e32. Step input: I am interesting in The history of Soccer
Find papers in your knowledge database related to this topic.
Use the following template to query research_paper_query_engine_tool tool: 'Provide title, summary, authors and link to download for papers related to The history of Soccer'.
If there are not, could you fetch the recent one from arxiv?

Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: research_paper_query_engine_tool
Action Input: {'input': 'Provide title, summary, authors and link to download for papers related to The history of Soccer'}
Observation: Title: Not Found
Summary: Not Found
Authors: Not Found
PDF URL: Not Found
> Running step b5062d05-ef37-4a50-8613-ec070c5327fe. Step input: None
Thought: It seems there are no papers related to "The history of Soccer" in the knowledge database. I will now fetch 

In [13]:
Markdown(answer.response)

Unfortunately, I couldn't find any research papers related to "The history of Soccer" in the available databases.