# Building an AI Research Assistant with OpenAI and LlamaIndex

Recently, I've been exploring how AI can help with my research. In this article, I'll show you how I built a small AI research assistant that searches arXiv for scientific papers and summarizes what it finds.

## The Problem I Wanted to Solve

As someone who tries to stay current with scientific research, I often find myself overwhelmed by the volume of papers published daily. I wanted a tool that could:
1. Search for relevant papers on a topic
2. Summarize the key findings
3. Answer my specific questions about a research area

## My Solution: An AI Research Agent

I built the tool from two pieces: an OpenAI language model and LlamaIndex's agent and tool integrations. Here's how it works:

## Import necessary libraries

In [1]:
import openai
from dotenv import load_dotenv
from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.arxiv.base import ArxivToolSpec
from llama_index.llms.openai import OpenAI
import os

## Setup

In [2]:
load_dotenv(override=True)  
openai.api_key = os.getenv("OPENAI_API_KEY")
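
If the key is missing from `.env`, `os.getenv` quietly returns `None`, and the resulting error may only surface later, far from its cause. One way to fail early is a small check like the sketch below. The helper name `require_env` is my own invention for illustration and is not part of `python-dotenv` or the OpenAI SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper: treats both a missing and an empty variable as an error.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

Calling `require_env("OPENAI_API_KEY")` right after `load_dotenv(...)` would stop the notebook at the setup cell instead of at the first chat request.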

## Configure LLM

In [3]:
llm = OpenAI(model="gpt-4o-mini")

## Create arXiv tool

In [4]:
arxiv_tool = ArxivToolSpec()

## Build agent

In [5]:
agent = OpenAIAgent.from_tools(
    arxiv_tool.to_tool_list(),
    llm=llm,
    verbose=True,
)

## Ask a research question

In [6]:
print(agent.chat("Whats going on with the superconductor lk-99"))

Added user message to memory: Whats going on with the superconductor lk-99
=== Calling Function ===
Calling function: arxiv_query with args: {"query":"lk-99 superconductor","sort_by":"recent"}
Got output: [Document(id_='9280e9e3-e80e-4ec6-a683-8606477a27fd', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, metadata_template='{key}: {value}', metadata_separator='\n', text_resource=MediaResource(embeddings=None, data=None, text='http://arxiv.org/pdf/2504.16885v1: High Field Magnet Programme -- European Strategy Input\nIn this submission, we describe research goals, implementation, and timelines\nof the High Field Magnet Programme, hosted by CERN. The programme pursues\naccelerator-magnet R&D with low-temperature- and high-temperature\nsuperconductor technology with a main focus on the FCC-hh. Following a long\ntradition of magnet R&D for high-energy particle colliders, HFM R&D fosters\nimportant societal impact through synergi
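
Judging from the log above, each result's text appears to follow the pattern `<pdf url>: <title>`, then a newline, then the abstract. Assuming that layout holds, a minimal sketch for splitting one result into fields could look like this. `PaperHit` and `parse_hit` are hypothetical names, and the sample string is a shortened excerpt of the logged output.

```python
from dataclasses import dataclass


@dataclass
class PaperHit:
    url: str
    title: str
    abstract: str


def parse_hit(text: str) -> PaperHit:
    """Split one result string of the assumed form '<url>: <title>\\n<abstract>'."""
    header, _, abstract = text.partition("\n")
    # The URL contains "://", so split on the first colon followed by a space.
    url, _, title = header.partition(": ")
    # Abstracts arrive hard-wrapped; join the lines back into one paragraph.
    return PaperHit(url=url, title=title, abstract=" ".join(abstract.split()))


# Shortened excerpt of the text shown in the log above.
sample = (
    "http://arxiv.org/pdf/2504.16885v1: High Field Magnet Programme -- European Strategy Input\n"
    "In this submission, we describe research goals, implementation, and timelines\n"
    "of the High Field Magnet Programme, hosted by CERN."
)
hit = parse_hit(sample)
print(hit.url)
print(hit.title)
```

Pulling out the titles also makes it easier to notice when a search has drifted: this particular hit is about accelerator magnets, not LK-99.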

With this code, I've created an agent that can:
- Connect to the arXiv database of scientific papers
- Use GPT-4o mini (a smaller, lower-cost model in OpenAI's GPT-4o family)
- Search for relevant papers on any topic I ask about
- Generate a coherent summary of the findings

## Real-World Example

I was curious about LK-99, a material that created buzz in the scientific community as a potential room-temperature superconductor. Instead of spending hours reading dozens of papers, I simply asked my agent: "What's going on with the superconductor LK-99?"

The agent then:
1. Searched arXiv for relevant papers
2. Read the titles and abstracts that the search returned
3. Provided me with a comprehensive summary of the current state of LK-99 research
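
The first of these steps is visible in the verbose log: the model picks a function name and a JSON string of arguments, and the agent runs the matching tool. Here is a simplified sketch of that dispatch step. It is not LlamaIndex's actual implementation, and `fake_arxiv_query`, `TOOLS`, and `dispatch` are hypothetical names, with a stub in place of the real network search.

```python
import json
from typing import Callable, Dict, List


def fake_arxiv_query(query: str, sort_by: str = "relevance") -> List[str]:
    """Hypothetical stand-in for a search tool; returns canned text, no network."""
    return [f"(stub result for {query!r}, sorted by {sort_by})"]


# Registry mapping the tool names the model may request to Python callables.
TOOLS: Dict[str, Callable[..., List[str]]] = {"arxiv_query": fake_arxiv_query}


def dispatch(name: str, raw_args: str) -> List[str]:
    """Decode the model's JSON arguments and call the matching tool."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(raw_args)
    return TOOLS[name](**kwargs)


# Same tool name and arguments as in the verbose log.
results = dispatch("arxiv_query", '{"query":"lk-99 superconductor","sort_by":"recent"}')
print(results)
```

In a real agent, the tool's output is sent back to the model, which then writes the summary or requests another call.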

## Why This Matters

This simple integration shows how AI can speed up research workflows. A first pass over a topic, which used to take me hours of manual searching and reading, now takes minutes.

The best part is that this approach is extensible: in my notebook, I've also created similar agents for Wikipedia searches, and you could add tools for other data sources in the same way.
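
As a rough illustration of what "another data source" might look like, here is a plain Python function over a made-up, in-memory collection of notes. Everything in it (`NOTES`, `search_notes`, the placeholder text) is hypothetical; a real source might be a folder of files or a database.

```python
from typing import Dict, List

# Hypothetical in-memory "data source" with placeholder content.
NOTES: Dict[str, str] = {
    "lk-99": "Placeholder note about LK-99.",
    "magnets": "Placeholder note about accelerator magnets.",
}


def search_notes(query: str) -> List[str]:
    """Return notes whose key or body mentions any word of the query (case-insensitive)."""
    words = query.lower().split()
    return [
        body
        for key, body in NOTES.items()
        if any(word in key or word in body.lower() for word in words)
    ]


print(search_notes("LK-99"))
```

A function with type hints and a docstring like this should be wrappable with LlamaIndex's `FunctionTool.from_defaults(fn=search_notes)`, and the resulting tool could be appended to `arxiv_tool.to_tool_list()` before building the agent. I haven't shown that wiring here, so treat it as a starting point to verify against the LlamaIndex docs.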

Building this has changed how I approach staying up-to-date with scientific research, and I hope it helps you too!