# OpenAI Agent with LlamaIndex

## Install Dependencies

In [3]:
from openai import OpenAI
from dotenv import load_dotenv
from os import environ

load_dotenv()
OPENROUTER_API_KEY = environ["OPENROUTER_API_KEY"]

client = OpenAI(
  base_url="https://openrouter.ai/api/v1",
  api_key=OPENROUTER_API_KEY,
)

completion = client.chat.completions.create(
  extra_headers={
    "HTTP-Referer": "<YOUR_SITE_URL>", # Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", # Optional. Site title for rankings on openrouter.ai.
  },
  extra_body={},
  model="deepseek/deepseek-r1-0528:free",
  messages=[
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ]
)
print(completion.choices[0].message.content)

That's one of humanity's oldest and most profound questions! **There isn't a single, universally agreed-upon answer,** as the meaning of life is deeply personal, philosophical, and often spiritual. Here's a breakdown of different perspectives:

1.  **Philosophical Perspectives:**
    *   **Existentialism (e.g., Sartre, Camus):** Argues there is *no inherent, pre-defined meaning* given by a god, nature, or the universe. **Meaning is created by the individual** through their choices, actions, commitments, and how they live authentically in the face of an indifferent or absurd world. "Life has whatever meaning *you* give it."
    *   **Hedonism:** Suggests the meaning of life is to **maximize pleasure and minimize pain**.
    *   **Utilitarianism:** Proposes meaning comes from **maximizing well-being and happiness for the greatest number of people**.
    *   **Humanism:** Focuses on human potential, reason, and ethics. Meaning is found in **human flourishing, empathy, progress, creativity

In [None]:
# !pip install uv
# !uv pip install --system -qU llama-index==0.11.6 llama-index-llms-openai llama-index-readers-file llama-index-embeddings-openai "openinference-instrumentation-llama-index>=2" arize-phoenix python-dotenv

## Setup API Keys


In [4]:
from os import environ
from dotenv import load_dotenv

load_dotenv()

# OPENAI_API_KEY = environ["OPENAI_API_KEY"]
OPENAI_API_KEY = environ["OPENROUTER_API_KEY"]


## Import libraries and setup LlamaIndex

In [17]:
!pip install -U llama-index

Collecting llama-index
  Downloading llama_index-0.12.44-py3-none-any.whl.metadata (12 kB)
Collecting llama-index-agent-openai<0.5,>=0.4.0 (from llama-index)
  Downloading llama_index_agent_openai-0.4.11-py3-none-any.whl.metadata (439 bytes)
Collecting llama-index-cli<0.5,>=0.4.2 (from llama-index)
  Downloading llama_index_cli-0.4.3-py3-none-any.whl.metadata (1.4 kB)
Collecting llama-index-embeddings-openai<0.4,>=0.3.0 (from llama-index)
  Downloading llama_index_embeddings_openai-0.3.1-py3-none-any.whl.metadata (684 bytes)
Collecting llama-index-llms-openai<0.5,>=0.4.0 (from llama-index)
  Downloading llama_index_llms_openai-0.4.7-py3-none-any.whl.metadata (3.0 kB)
Collecting llama-index-multi-modal-llms-openai<0.6,>=0.5.0 (from llama-index)
  Downloading llama_index_multi_modal_llms_openai-0.5.1-py3-none-any.whl.metadata (440 bytes)
Collecting llama-index-program-openai<0.4,>=0.3.0 (from llama-index)
  Downloading llama_index_program_openai-0.3.2-py3-none-any.whl.metadata (473 bytes

In [20]:
# 方式二（如果方式一报错，试试这个）
from llama_index.llms.generic.base import GenericLLM

ModuleNotFoundError: No module named 'llama_index.llms.generic'

In [18]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI
# 替换 LLM 为 GenericLLM 支持 OpenRouter deepseek
from llama_index.llms.generic import GenericLLM
# from dotenv import load_dotenv
# from os import environ

# load_dotenv()
# OPENROUTER_API_KEY = environ["OPENROUTER_API_KEY"]

# Create an llm object to use for the QueryEngine and the ReActAgent
# llm = OpenAI(model="gpt-4") // Uncomment this line if you want to use GPT-4, otherwise use the following line
llm = GenericLLM(
    api_base="https://openrouter.ai/api/v1",
    api_key=OPENROUTER_API_KEY,
    model="deepseek/deepseek-r1-0528:free",
    is_chat_model=True,
)

ModuleNotFoundError: No module named 'llama_index.llms.generic'

# Set up Phoenix

In [6]:
import phoenix as px
session = px.launch_app()

  from .autonotebook import tqdm as notebook_tqdm
  next(self.gen)
  next(self.gen)
  next(self.gen)
  next(self.gen)


🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📖 For more information on how to use Phoenix, check out https://arize.com/docs/phoenix


In [4]:
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from phoenix.otel import register

tracer_provider = register()
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)

🔭 OpenTelemetry Tracing Details 🔭
|  Phoenix Project: default
|  Span Processor: SimpleSpanProcessor
|  Collector Endpoint: localhost:4317
|  Transport: gRPC
|  Transport Headers: {'user-agent': '****'}
|  
|  Using a default SpanProcessor. `add_span_processor` will overwrite this default.
|  
|  `register` has set this TracerProvider as the global OpenTelemetry default.
|  To disable this behavior, call `register` with `set_global_tracer_provider=False`.



## Load Documents

In [7]:
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

    index_loaded = True
except:
    index_loaded = False

This is the point we create our vector indexes, by calculating the embedding vectors for each of the chunks. You only need to run this once.

In [12]:
# 安装 HuggingFace embedding 支持
!pip install llama-index-embeddings-huggingface

Collecting llama-index-embeddings-huggingface
  Downloading llama_index_embeddings_huggingface-0.5.5-py3-none-any.whl.metadata (458 bytes)
Collecting huggingface-hub>=0.19.0 (from huggingface-hub[inference]>=0.19.0->llama-index-embeddings-huggingface)
  Downloading huggingface_hub-0.33.1-py3-none-any.whl.metadata (14 kB)
Collecting llama-index-core<0.13,>=0.12.0 (from llama-index-embeddings-huggingface)
  Downloading llama_index_core-0.12.44-py3-none-any.whl.metadata (2.5 kB)
Collecting sentence-transformers>=2.6.1 (from llama-index-embeddings-huggingface)
  Downloading sentence_transformers-4.1.0-py3-none-any.whl.metadata (13 kB)
Collecting banks<3,>=2.0.0 (from llama-index-core<0.13,>=0.12.0->llama-index-embeddings-huggingface)
  Downloading banks-2.1.2-py3-none-any.whl.metadata (12 kB)
Collecting llama-index-workflows<2,>=1.0.1 (from llama-index-core<0.13,>=0.12.0->llama-index-embeddings-huggingface)
  Downloading llama_index_workflows-1.0.1-py3-none-any.whl.metadata (5.5 kB)
Collec

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llama-index 0.11.6 requires llama-index-core<0.12.0,>=0.11.6, but you have llama-index-core 0.12.44 which is incompatible.
llama-index-agent-openai 0.3.4 requires llama-index-core<0.12.0,>=0.11.0, but you have llama-index-core 0.12.44 which is incompatible.
llama-index-cli 0.3.1 requires llama-index-core<0.12.0,>=0.11.0, but you have llama-index-core 0.12.44 which is incompatible.
llama-index-embeddings-openai 0.2.5 requires llama-index-core<0.12.0,>=0.11.0, but you have llama-index-core 0.12.44 which is incompatible.
llama-index-indices-managed-llama-cloud 0.6.0 requires llama-index-core<0.12.0,>=0.11.13.post1, but you have llama-index-core 0.12.44 which is incompatible.
llama-index-llms-openai 0.2.16 requires llama-index-core<0.12.0,>=0.11.7, but you have llama-index-core 0.12.44 which is incompatible.
llama-ind

In [13]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en")

if not index_loaded:
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs, show_progress=True, embed_model=embed_model)
    uber_index = VectorStoreIndex.from_documents(uber_docs, show_progress=True, embed_model=embed_model)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Parsing nodes: 100%|██████████| 238/238 [00:00<00:00, 893.32it/s]
Parsing nodes: 100%|██████████| 238/238 [00:00<00:00, 893.32it/s]
Generating embeddings: 100%|██████████| 343/343 [00:47<00:00,  7.16it/s]
Parsing nodes:   0%|          | 0/307 [00:00<?, ?it/s]
Parsing nodes: 

你的报错信息如下：


#%% vscode.cell [id=4451fb12] [language=]
ValueError: 
******
Could not load OpenAI embedding model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

Consider using embed_model='local'.
Visit our documentation for more embedding options: https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings.html#modules
******



**原因分析：**
- LlamaIndex 默认使用 OpenAI 的 embedding（嵌入）模型（如 text-embedding-ada-002），但你没有设置 `OPENAI_API_KEY`，而是设置了 `OPENROUTER_API_KEY`。
- OpenRouter 的 key 不能直接用于 OpenAI embedding，除非 OpenRouter 也代理了 embedding 模型（目前大多数只代理 chat/completions）。
- 所以，LlamaIndex 在创建向量索引时找不到 OpenAI 的 API key，导致报错。

**解决方法：**

1. **推荐：使用本地或其他 embedding 模型**  
   在创建 `VectorStoreIndex` 时，指定 `embed_model` 参数为 `'local'` 或其他你已安装的 embedding 模型。例如：
   ````python
   # filepath: d:\GitHub\ai_agents_cookbooks\llamaindex\openai_rag_agent_w_evals_deepseekR1.ipynb
   from llama_index.embeddings.huggingface import HuggingFaceEmbedding

   embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en")  # 你可以换成其他 huggingface embedding
   lyft_index = VectorStoreIndex.from_documents(lyft_docs, show_progress=True, embed_model=embed_model)
   uber_index = VectorStoreIndex.from_documents(uber_docs, show_progress=True, embed_model=embed_model)
   ````

2. **或者：设置 OPENAI_API_KEY 环境变量**  
   如果你确实想用 OpenAI 的 embedding，需要在 .env 文件中添加你的 OpenAI API key：
   ```
   OPENAI_API_KEY=sk-xxxxxx
   ```
   并确保 `pip install llama-index-embeddings-openai` 已安装。

3. **或者：用 OpenRouter 支持的 embedding（如果有）**  
   你可以查查 OpenRouter 是否支持 embedding API，如果支持，需在 LlamaIndex 里手动指定 embedding 的 base_url 和 api_key。

---

**总结**：  
你现在只有 OpenRouter 的 key，建议直接用 HuggingFace 的 embedding（如上代码），这样无需 OpenAI 的 key。

Similar code found with 1 license type

**Embedding模型是什么？**

Embedding模型是一种将文本（如单词、句子、段落）转换为固定长度、高维向量（数值数组）的模型。这个向量能捕捉文本的语义信息，使得计算机可以“理解”文本之间的相似性和关系。

**作用：**
- **文本相似度计算**：可以用向量之间的距离（如余弦相似度）来衡量两个文本的语义相似度。
- **向量检索/知识库检索**：将大量文档转为向量后，可以高效地查找与查询最相关的内容（如RAG、知识问答、搜索等）。
- **聚类、分类等NLP任务**：向量可以作为机器学习模型的输入。

**与GPT-4、Deepseek R1等大模型的区别：**

| 类型         | 主要用途                | 输入/输出         | 典型代表                |
|--------------|------------------------|-------------------|-------------------------|
| Embedding模型 | 语义向量化、检索、相似度 | 文本 → 向量       | text-embedding-ada-002、BAAI/bge-small-en |
| 大语言模型（LLM） | 生成、理解、推理、对话    | 文本 → 文本        | GPT-4、deepseek R1      |

- **Embedding模型**只负责把文本变成“语义向量”，本身不生成新文本。
- **GPT-4、deepseek R1等LLM**是生成式模型，可以理解上下文、生成回答、推理等，输入文本输出文本（如对话、写作、代码等）。
- 在RAG（检索增强生成）等应用中，通常**embedding模型用于检索相关内容，LLM用于生成最终答案**。

**总结：**
- Embedding模型=“文本转向量”，用于检索和相似度。
- GPT-4/Deepseek R1=“文本转文本”，用于理解和生成。
- 两者常常配合使用，分别负责“找资料”和“写答案”。

**Embedding模型是什么？**

Embedding模型是一种将文本（如单词、句子、段落）转换为固定长度、高维向量（数值数组）的模型。这个向量能捕捉文本的语义信息，使得计算机可以“理解”文本之间的相似性和关系。

**作用：**
- **文本相似度计算**：可以用向量之间的距离（如余弦相似度）来衡量两个文本的语义相似度。
- **向量检索/知识库检索**：将大量文档转为向量后，可以高效地查找与查询最相关的内容（如RAG、知识问答、搜索等）。
- **聚类、分类等NLP任务**：向量可以作为机器学习模型的输入。

**与GPT-4、Deepseek R1等大模型的区别：**

| 类型         | 主要用途                | 输入/输出         | 典型代表                |
|--------------|------------------------|-------------------|-------------------------|
| Embedding模型 | 语义向量化、检索、相似度 | 文本 → 向量       | text-embedding-ada-002、BAAI/bge-small-en |
| 大语言模型（LLM） | 生成、理解、推理、对话    | 文本 → 文本        | GPT-4、deepseek R1      |

- **Embedding模型**只负责把文本变成“语义向量”，本身不生成新文本。
- **GPT-4、deepseek R1等LLM**是生成式模型，可以理解上下文、生成回答、推理等，输入文本输出文本（如对话、写作、代码等）。
- 在RAG（检索增强生成）等应用中，通常**embedding模型用于检索相关内容，LLM用于生成最终答案**。

**总结：**
- Embedding模型=“文本转向量”，用于检索和相似度。
- GPT-4/Deepseek R1=“文本转文本”，用于理解和生成。
- 两者常常配合使用，分别负责“找资料”和“写答案”。

Now create the query engines.

In [14]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3, llm=llm)
uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=llm)

ValueError: Unknown model 'deepseek/deepseek-r1-0528:free'. Please provide a valid OpenAI model name in: o1-preview, o1-preview-2024-09-12, o1-mini, o1-mini-2024-09-12, gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4-1106-vision-preview, gpt-4-turbo-2024-04-09, gpt-4-turbo, gpt-4o, gpt-4o-2024-05-13, gpt-4o-2024-08-06, chatgpt-4o-latest, gpt-4o-mini, gpt-4o-mini-2024-07-18, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-0125, gpt-35-turbo-1106, gpt-35-turbo-0613, gpt-35-turbo-16k-0613

In [22]:
pip install --upgrade llama-index


Note: you may need to restart the kernel to use updated packages.


In [28]:
!pip install llama-index-llms-generic


ERROR: Could not find a version that satisfies the requirement llama-index-llms-generic (from versions: none)
ERROR: No matching distribution found for llama-index-llms-generic


In [None]:
import llama_index
print(llama_index.__version__)


We can now define the query engines as tools that will be used by the agent.

As there is a query engine per document we need to also define one tool for each of them.

In [None]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

## Creating the Agent
Now we have all the elements to create a LlamaIndex ReactAgent

In [None]:
agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    max_turns=10,
)

Now we can interact with the agent and ask a question.

In [None]:
response = agent.chat("Which do you like better? Lyft or Uber?")
print(str(response))

> Running step f096df50-a6f3-4889-b875-529dbed43324. Step input: Which do you like better? Lyft or Uber?
[1;3;38;5;200mThought: As an artificial intelligence, I don't have personal experiences or preferences. I can provide information and analysis based on data, but I don't have personal opinions or feelings.
Answer: As an artificial intelligence, I don't have personal preferences or feelings, so I don't have a preference between Lyft and Uber. I'm here to provide objective information and help answer your questions to the best of my ability.
[0mAs an artificial intelligence, I don't have personal preferences or feelings, so I don't have a preference between Lyft and Uber. I'm here to provide objective information and help answer your questions to the best of my ability.
