# 第二章 高级的RAG管道

接下来，导入该课程需要的工具包utils，然后设置openai的API密钥。
有三种方式设置API密钥：
1. 在环境变量中设置`OPENAI_API_KEY`，然后使用utils直接获取；
2. 显式设置api_key，直接赋值给openai.api_key；
3. 如果没有openai的密钥的话，也可以选择使用第三方服务，修改openai.api_base即可；

In [1]:
import utils
# 导入自定义的工具包

import os
import openai
# openai.api_key = utils.get_openai_api_key()
# 设置OpenAI的API密钥，从环境变量中获取

# openai.api_key = "" 
# 或者这里填入你的OpenAI API密钥

# openai.api_key = "sk- "  
# openai.api_base = " "
# 或者自定义API密钥和API基础地址，可适用第三方API服务


✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


载入文本数据

In [2]:
from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["data/人工智能.pdf"]
).load_data()

In [3]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

<class 'list'> 

7 

<class 'llama_index.schema.Document'>
Doc ID: c449baf2-03a3-4f83-a713-f888f7afd7a7
Text: 2/2/24, 2:43 PM ⼈⼯智能  - 维基百科，⾃由的百科全书
https://zh.wikipedia.org/wiki/ ⼈⼯智能 2/13“⼈⼯智能”的各地常⽤名称 中国⼤陆⼈⼯智能 台湾⼈⼯智慧
港澳⼈⼯智能 新⻢⼈⼯智能、⼈⼯智慧 ⽇韩⼈⼯知能 越南智慧⼈造 [展开] [展开] [展开] [展开] [展开] [展开]⼈⼯智能系列内容
主要⽬标 实现⽅式 ⼈⼯智能哲学 历史 技术 术语⼈⼯智能（英语：artiﬁcial intelligence ，缩写为
AI）亦称机器智能，指由⼈制造出来的机器所表现出来的智能。通常⼈⼯
智能是指⽤普通计算机程序来呈现⼈类智能的技术。该词也指出研究这样的智能系统是否能够实现，以及如何实现。同 时，通过 医学 、神经科学
、机器⼈学 及...


In [4]:
from llama_index import SimpleDirectoryReader

documents_en = SimpleDirectoryReader(
    input_files=["data/eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

In [5]:
print(type(documents_en), "\n")
print(len(documents_en), "\n")
print(type(documents_en[0]))
print(documents_en[0])

<class 'list'> 

41 

<class 'llama_index.schema.Document'>
Doc ID: cd516cc9-b646-4394-ab4e-0465c9090af8
Text: PAGE 1Founder, DeepLearning.AICollected Insights from Andrew Ng
How to  Build Your Career in AIA Simple Guide


## 一、基础RAG通道

这里通过将 documents 中各个文档的文本连接成一个字符串，然后创建了一个 Document 实例，该实例代表了整个文档集合。

In [6]:
from llama_index import Document

# 将documents中的内容合并成一个大文档，而不是每一页都是一个文档
document = Document(text="\n\n".join([doc.text for doc in documents]))
document_en = Document(text="\n\n".join([doc.text for doc in documents_en]))

In [7]:
# 将中文标点符号替换成英文标点符号，方便后续处理
# 如果是英文文档，可以跳过这一步
# 不处理的话，会导致无法正确切分中文句子，会影响后续sentence_window的大小，导致输入长度大于gpt-3.5-turbo的最大限制
document.text=document.text.replace('。','. ')
document.text=document.text.replace('！','! ')
document.text=document.text.replace('？','? ')

llm-使用 OpenAI 类创建了一个 GPT-3.5-turbo 模型的实例，并设置了温度参数为 0.1。  
service_context-使用 ServiceContext 类创建了一个服务上下文实例，包含了前面创建的 GPT-3.5-turbo 模型以及指定的嵌入模型。  
index-使用 VectorStoreIndex.from_documents 方法，基于之前创建的文档和服务上下文，创建了一个向量存储索引。

In [8]:
from llama_index import VectorStoreIndex
from llama_index import ServiceContext
from llama_index.llms import OpenAI

# 设置使用的大模型
# "gpt-3.5-turbo"是模型的名称
# temperature是温度，用来控制文本生成过程中的多样性
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

# 设置embedding模型
# 这里是在本地使用BAAI/bge-small-zh-v1.5
# document的所有的内容会索引到sentence index对象中
# 国内使用可以切换huggingface镜像站
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model="local:BAAI/bge-small-zh-v1.5"
)
index = VectorStoreIndex.from_documents([document],
                                        service_context=service_context)

In [9]:
# llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
service_context_en = ServiceContext.from_defaults(
    llm=llm, embed_model="local:BAAI/bge-small-en-v1.5"
)
index_en = VectorStoreIndex.from_documents([document_en],
                                        service_context=service_context_en)

将之前创建的向量存储索引转换为查询引擎，以便后续进行查询操作。

In [10]:
query_engine = index.as_query_engine()
query_engine_en = index_en.as_query_engine()

使用查询引擎执行了一个查询操作，查询给定的问题。

In [11]:
response = query_engine.query(
    "在寻找项目以积累经验时应采取哪些步骤?"
)
print(str(response))

在寻找项目以积累经验时，应该首先明确制定目标并确保能够实现这些目标。然后需要建立一个可预测的世界模型，将整个世界状态用数学模型表现出来，并能够预测它们的行为将如何改变这个世界。在多Agent中，可以通过合作和竞争的方式去完成一定的目标，利用演化算法和群体智能来达成一个整体的突现行为目标。最后，需要通过智能推理来得到新的知识，结合先验知识和特定的推理规则，以便积累更多经验。


In [12]:
response_en = query_engine_en.query(
    "What are steps to take when finding projects to build your experience?"
)
print(str(response_en))

Develop a side hustle, ensure the project will help you grow technically, collaborate with good teammates, and consider if the project can serve as a stepping stone to larger projects.


## 二、使用TruLens进行评测

In [13]:
eval_questions = []
with open('data/eval_questions.txt', 'r') as file:
    for line in file:
        # Remove newline character and convert to integer
        item = line.strip()
        print(item)
        eval_questions.append(item)

人工智能中的先验知识是如何被存储的？
人工智能的自我更新和自我提升是否可能导致其脱离人类的控制？
管理者如何管理AI？
强人工智能是什么？
人工智能被滥用带来的危害？


In [14]:
eval_questions_en = []
with open('data/eval_questions_en.txt', 'r') as file:
    for line in file:
        # Remove newline character and convert to integer
        item = line.strip()
        print(item)
        eval_questions_en.append(item)

What are the keys to building a career in AI?
How can teamwork contribute to success in AI?
What is the importance of networking in AI?
What are some good habits to develop for a successful career?
How can altruism be beneficial in building a career?
What is imposter syndrome and how does it relate to AI?
Who are some accomplished individuals who have experienced imposter syndrome?
What is the first step to becoming good at AI?
What are some common challenges in AI?
Is it normal to find parts of AI challenging?


加上自定义问题。

In [15]:
# You can try your own question:
new_question = "什么是适合我的人工智能工作?"
eval_questions.append(new_question)

In [16]:
eval_questions

['人工智能中的先验知识是如何被存储的？',
 '人工智能的自我更新和自我提升是否可能导致其脱离人类的控制？',
 '管理者如何管理AI？',
 '强人工智能是什么？',
 '人工智能被滥用带来的危害？',
 '什么是适合我的人工智能工作?']

In [17]:
# You can try your own question:
new_question_en = "What is the right AI job for me?"
eval_questions_en.append(new_question_en)
eval_questions_en

['What are the keys to building a career in AI?',
 'How can teamwork contribute to success in AI?',
 'What is the importance of networking in AI?',
 'What are some good habits to develop for a successful career?',
 'How can altruism be beneficial in building a career?',
 'What is imposter syndrome and how does it relate to AI?',
 'Who are some accomplished individuals who have experienced imposter syndrome?',
 'What is the first step to becoming good at AI?',
 'What are some common challenges in AI?',
 'Is it normal to find parts of AI challenging?',
 'What is the right AI job for me?']

通过调用 reset_database() 方法重置 Trulens 数据库。清空之前的记录和反馈数据。

首先需要安装本课程中需要的评估框架，如果已经安装就可以跳过这一步骤。

In [None]:
# requirements
# pip install trulens_eval

In [18]:
# 导入Tru类
from trulens_eval import Tru


# 实例化Tru类
tru = Tru()

# 重置数据库
# 数据库之后会用来存储问题、中间召回结果、答案以及评估结果
tru.reset_database()


🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `Tru` to prevent this.


使用 get_prebuilt_trulens_recorder 函数创建一个 Trulens 记录器 (tru_recorder)，该记录器与给定的查询引擎 (query_engine) 相关联。同时，指定了应用程序的标识为 "Direct Query Engine"。

In [19]:
from utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(query_engine,
                                             app_id="Direct Query Engine")
tru_recorder_en = get_prebuilt_trulens_recorder(query_engine_en,
                                             app_id="Direct Query Engine_en")

使用 tru_recorder 记录器开始记录过程，遍历 eval_questions 列表，对每个问题进行查询，并将查询引擎的响应记录下来。

In [20]:
with tru_recorder as recording:
    for question in eval_questions:
        response = query_engine.query(question)

In [21]:
with tru_recorder_en as recording_en:
    for question in eval_questions_en:
        response_en = query_engine_en.query(question)

获取 Trulens 记录和反馈数据。用于后续分析和评估。

In [22]:
records, feedback = tru.get_records_and_feedback(app_ids=[])

In [23]:
records.head()

Unnamed: 0,app_id,app_json,type,record_id,input,output,tags,record_json,cost_json,perf_json,ts,Answer Relevance,Context Relevance,Groundedness,Answer Relevance_calls,Context Relevance_calls,Groundedness_calls,latency,total_tokens,total_cost
0,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_8631870172f29b7facfc339a7c52c465,"""\u4eba\u5de5\u667a\u80fd\u4e2d\u7684\u5148\u9...","""\u5148\u9a8c\u77e5\u8bc6\u5728\u4eba\u5de5\u6...",-,"{""record_id"": ""record_hash_8631870172f29b7facf...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-03-12T13:22:02.775333"", ""...",2024-03-12T13:22:06.003752,1.0,0.6,1.0,"[{'args': {'prompt': '人工智能中的先验知识是如何被存储的？', 're...","[{'args': {'prompt': '人工智能中的先验知识是如何被存储的？', 're...",[{'args': {'source': '[10] 早期的⼈⼯智能研究⼈员直接模仿⼈类进⾏...,3,0,0.0
1,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_c7d94d33ddf370a8172a8ce8da3b70ec,"""\u4eba\u5de5\u667a\u80fd\u7684\u81ea\u6211\u6...","""\u4eba\u5de5\u667a\u80fd\u7684\u81ea\u6211\u6...",-,"{""record_id"": ""record_hash_c7d94d33ddf370a8172...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-03-12T13:22:06.220520"", ""...",2024-03-12T13:22:08.290158,1.0,0.8,0.733333,[{'args': {'prompt': '人工智能的自我更新和自我提升是否可能导致其脱离人...,[{'args': {'prompt': '人工智能的自我更新和自我提升是否可能导致其脱离人...,[{'args': {'source': '他认为各国应该强制订定规定AI机器只能⽤于⼈类不...,2,0,0.0
2,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_80e60e443d0b66da0c5879fde60f2163,"""\u7ba1\u7406\u8005\u5982\u4f55\u7ba1\u7406AI\...","""Management of AI by managers involves treatin...",-,"{""record_id"": ""record_hash_80e60e443d0b66da0c5...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-03-12T13:22:08.469294"", ""...",2024-03-12T13:22:11.210570,0.8,0.6,0.0,"[{'args': {'prompt': '管理者如何管理AI？', 'response':...","[{'args': {'prompt': '管理者如何管理AI？', 'response':...",[{'args': {'source': '创造⼒ 伦理管理 经济冲击 AI对⼈类的威胁 悲...,2,0,0.0
3,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_1b323b65311a8cdfa6512765d88c3abb,"""\u5f3a\u4eba\u5de5\u667a\u80fd\u662f\u4ec0\u4...","""\u5f3a\u4eba\u5de5\u667a\u80fd\u662f\u4e00\u7...",-,"{""record_id"": ""record_hash_1b323b65311a8cdfa65...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-03-12T13:22:11.565023"", ""...",2024-03-12T13:22:14.306188,1.0,0.9,0.5,"[{'args': {'prompt': '强人工智能是什么？', 'response': ...","[{'args': {'prompt': '强人工智能是什么？', 'response': ...",[{'args': {'source': '⾮⼈类的⼈⼯智能，即机器产⽣了和⼈完全不⼀样的知...,2,0,0.0
4,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_54ac38298eed725ca9879c20b65b6ad0,"""\u4eba\u5de5\u667a\u80fd\u88ab\u6ee5\u7528\u5...","""The misuse of artificial intelligence technol...",-,"{""record_id"": ""record_hash_54ac38298eed725ca98...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-03-12T13:22:14.764807"", ""...",2024-03-12T13:22:17.776310,0.9,0.9,0.666667,"[{'args': {'prompt': '人工智能被滥用带来的危害？', 'respons...","[{'args': {'prompt': '人工智能被滥用带来的危害？', 'respons...",[{'args': {'source': '截⾄2020年12⽉，Sensity检测到的相关...,3,0,0.0


运行 Trulens 仪表板以可视化评估结果。

In [None]:
# launches on http://localhost:8501/
tru.run_dashboard()

## 三、高级的RAG通道

### 3.1 滑窗句子检索

创建 OpenAI 的 GPT-3.5-turbo 语言模型实例：

In [26]:
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

使用辅助函数 build_sentence_window_index 创建基于窗口的句子索引：

In [27]:
from utils import build_sentence_window_index

sentence_index = build_sentence_window_index(
    document,
    llm,
    embed_model="local:BAAI/bge-small-zh-v1.5",
    save_dir="sentence_index"
)

In [28]:
sentence_index_en = build_sentence_window_index(
    document_en,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="sentence_index_en"
)

使用辅助函数 get_sentence_window_query_engine 获取基于句子窗口的查询引擎：

In [29]:
from utils import get_sentence_window_query_engine

# 根据sentence_index对象创建一个搜索引擎
# 之后会被用于在RAG应用中进行召回
sentence_window_engine = get_sentence_window_query_engine(sentence_index)

In [30]:
sentence_window_engine_en = get_sentence_window_query_engine(sentence_index_en)

对一个特定的问题进行查询并打印结果：

In [31]:
window_response = sentence_window_engine.query(
    "如何开始人工智能个人项目?"
)
str(window_response)

'通过模仿人类思考模式，尝试逐步推理是一个开始人工智能个人项目的方法。另外，利用概率和经济学概念处理不确定或不完整的信息也是一个成功的方法。在解决困难问题时，寻找更有效的算法是优先考虑的。最终，强调感知运动的重要性也是一个可以考虑的方向。'

In [32]:
window_response_en = sentence_window_engine_en.query(
    "how do I get started on a personal project in AI?"
)
str(window_response_en)

"To get started on a personal project in AI, it is important to first identify a project that aligns with your career goals and interests. Once you have chosen a project, you can begin by scoping it out, defining the objectives, and outlining the steps needed to achieve them. It's essential to ensure that the project is responsible, ethical, and beneficial to people. As you work on the project, aim to grow in terms of scope, complexity, and impact over time. Building a portfolio of projects that demonstrate skill progression can also be valuable for your career development in AI."

重置 Trulens 数据库，  
使用 Trulens 记录器对基于窗口的句子索引进行评估，记录查询结果：

In [33]:
tru.reset_database()

tru_recorder_sentence_window = get_prebuilt_trulens_recorder(
    sentence_window_engine,
    app_id = "Sentence Window Query Engine"
)


In [34]:
tru_recorder_sentence_window_en = get_prebuilt_trulens_recorder(
    sentence_window_engine_en,
    app_id = "Sentence Window Query Engine_en"
)

In [35]:
for question in eval_questions:
    with tru_recorder_sentence_window as recording:
        response = sentence_window_engine.query(question)
        print(question)
        print(str(response))

人工智能中的先验知识是如何被存储的？
人工智能中的先验知识是通过某种方式告知机器的知识，可以描述目标、特征、种类及对象之间的关系，也可以描述事件、时间、状态、原因和结果，以及任何需要机器存储的知识。
人工智能的自我更新和自我提升是否可能导致其脱离人类的控制？
人工智能的自我更新和自我提升可能导致其脱离人类的控制。
管理者如何管理AI？
Managers can manage AI by considering the following suggestions:
1. Delegate administrative tasks.
2. Focus on enhancing their comprehensive judgment and creativity in the field of analysis and prediction.
3. Treat AI as a colleague and form a collaborative team.
4. Recognize that all technologies, including AI, have limitations and may face bottlenecks.
强人工智能是什么？
强人工智能是一种观点，认为计算机本身具有思维，而不仅仅是用来模拟人类思维的工具。根据这个观点，只要计算机运行适当的程序，它们就具有自己的思维能力。
人工智能被滥用带来的危害？
The misuse of artificial intelligence could potentially lead to violations of copyright laws and other legal regulations. There have been cases where artificial intelligence technology has been used to remove mosaic from explicit videos, alter the appearance of individuals in videos, and other related incidents. Additionally, experts have raised concerns about the potential

In [36]:
for question in eval_questions_en:
    with tru_recorder_sentence_window_en as recording:
        response = sentence_window_engine.query(question)
        print(question)
        print(str(response))

A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function BaseQueryEngine.query at 0x000001E7720134C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.indices.vector_store.retrievers.retriever.VectorIndexRetriever'> at 0x1e716c1fa90 is calling an instrumented method <function BaseRetriever.retrieve at 0x000001E77483AD40>. The path of this call may be incorrect.
Guessing path of new object is app.retriever based on ot

What are the keys to building a career in AI?
Understanding the characteristics of intelligent systems, studying introductory materials on artificial intelligence, and gaining expertise in problem solving, puzzle solving, game playing, and deduction are key elements to building a career in AI.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


How can teamwork contribute to success in AI?
Teamwork can contribute to success in AI by treating AI as a colleague and forming a collaborative team. This approach fosters synergy and cooperation, allowing for a more effective utilization of AI technologies within various processes and creative endeavors.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


What is the importance of networking in AI?
Networking in AI is crucial as it allows for the exchange of information and collaboration among researchers, experts, and professionals in the field. Through networking, individuals can share insights, best practices, and advancements in artificial intelligence, fostering innovation and progress in the development of AI technologies. Additionally, networking provides opportunities for partnerships, joint projects, and knowledge sharing, which are essential for the growth and evolution of AI applications across various industries and domains.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


What are some good habits to develop for a successful career?
Developing habits such as relinquishing administrative tasks and focusing on enhancing comprehensive judgment skills can be beneficial for a successful career.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


How can altruism be beneficial in building a career?
Altruism can be beneficial in building a career by fostering positive relationships, creating a supportive network, and enhancing one's reputation within a professional community. It can also lead to opportunities for collaboration, mentorship, and personal growth, ultimately contributing to long-term career success.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


What is imposter syndrome and how does it relate to AI?
Imposter syndrome is a psychological pattern where individuals doubt their accomplishments and have a persistent fear of being exposed as a fraud. In the context of AI, imposter syndrome can manifest among researchers or professionals who may feel inadequate or fraudulent in their work within the highly technical and specialized field of artificial intelligence. This feeling may arise due to the complexity and vast scope of AI research, leading individuals to question their own abilities and knowledge in comparison to the breadth of the field.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


Who are some accomplished individuals who have experienced imposter syndrome?
Stephen Hawking and Elon Musk are some accomplished individuals who have experienced imposter syndrome.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


What is the first step to becoming good at AI?
Studying and understanding the complex mathematical tools developed in artificial intelligence research to solve specific branch problems is the first step to becoming proficient in AI.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


What are some common challenges in AI?
Some common challenges in AI include the fragmentation of AI into subfields that may not communicate effectively with each other, difficulties in areas such as vision, natural language processing, decision theory, genetic algorithms, and robotics, and the potential ethical concerns surrounding the development and deployment of AI technologies.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


Is it normal to find parts of AI challenging?
It is common for individuals to find certain aspects of Artificial Intelligence challenging.


A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x1e71800dbd0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x000001E779B307C0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x1e7167b0190) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x1e717f6fd90 is calling an instrumented method <function Refine.get_response at 0x000001E774838860>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x1e7167b01d0) using this function.


What is the right AI job for me?
A suitable AI job for you would involve working in the field of data science or artificial intelligence. These areas are in high demand and offer various opportunities for individuals with the right skills and expertise. Consider roles such as data scientist, AI specialist, machine learning engineer, or AI researcher, depending on your interests and qualifications.


获取性能评估的排行榜：

In [37]:
tru.get_leaderboard(app_ids=[])

Unnamed: 0_level_0,Groundedness,Answer Relevance,Context Relevance,latency,total_cost
app_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Sentence Window Query Engine,0.811111,0.916667,0.741667,4.0,0.0
Sentence Window Query Engine_en,0.3,0.92,0.07,3.727273,0.0


In [None]:
# launches on http://localhost:8501/
tru.run_dashboard()

### 3.2 自动合并检索

In [39]:
from utils import build_automerging_index

automerging_index = build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-zh-v1.5",
    save_dir="merging_index"
)

In [40]:
automerging_index_en = build_automerging_index(
    documents_en,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_en"
)

In [41]:
from utils import get_automerging_query_engine

automerging_query_engine = get_automerging_query_engine(
    automerging_index,
)


In [42]:
automerging_query_engine_en = get_automerging_query_engine(
    automerging_index_en,
)

In [43]:
auto_merging_response = automerging_query_engine.query(
    "如何开始人工智能个人项目?"
)

print(str(auto_merging_response))


Begin a personal artificial intelligence project by first identifying a specific problem or application you want to work on. Then, explore various tools and technologies that can help you achieve your project goals. Design and create your project with a focus on utilizing different tools to implement the desired application effectively.


In [44]:
auto_merging_response_en = automerging_query_engine_en.query(
    "how do I get started on a personal project in AI?"
)
print(str(auto_merging_response_en))

To get started on a personal project in AI, communicating the value of what you hope to build can help bring colleagues, mentors, and managers onboard. This will also help them point out any flaws in your reasoning. After finishing the project, being able to clearly explain what you accomplished will help convince others to open the door to larger projects. Additionally, starting with small projects in your spare time and achieving initial successes, no matter how small, can help build your skills and increase your ability to come up with better ideas. Joining existing projects or finding someone with an idea to collaborate with can also be a good way to kickstart your personal project in AI.


In [45]:
tru.reset_database()

tru_recorder_automerging = get_prebuilt_trulens_recorder(automerging_query_engine,
                                                         app_id="Automerging Query Engine")


In [46]:


tru_recorder_automerging_en = get_prebuilt_trulens_recorder(automerging_query_engine_en,
                                                         app_id="Automerging Query Engine_en")

In [47]:
for question in eval_questions:
    with tru_recorder_automerging as recording:
        response = automerging_query_engine.query(question)
        print(question)
        print(response)


> Merging 4 nodes into parent node.
> Parent node id: 610d624d-381b-4f98-a6df-b6d88b59cca8.
> Parent node text: [16]
⼈类解决问题的模式通常是⽤最快捷、直观的判断，⽽不是有意识的、⼀步⼀步的推导，早期⼈⼯智能研究通常使⽤逐步推导的⽅式。[17]⼈⼯智能研究已经
于这种“次表征性的”解决问题⽅法获取进展...

人工智能中的先验知识是如何被存储的？
In artificial intelligence, prior knowledge is stored in a way that allows machines to retain descriptions of objectives, features, categories, and relationships between objects. It can also encompass descriptions of events, time, states, causes, results, or any knowledge that one wishes the machine to store. This stored knowledge can include both explicitly provided information and knowledge derived through intelligent reasoning processes.
人工智能的自我更新和自我提升是否可能导致其脱离人类的控制？
Yes, the self-updating and self-improving capabilities of artificial intelligence could potentially lead to it surpassing human control.
> Merging 2 nodes into parent node.
> Parent node id: 0e4eeea0-999b-4905-9fce-9f7fa045325d.
> Parent node text: 依⽬前的研究⽅向，电脑⽆法突变、苏醒、产⽣⾃我意志，AI也不可能具有创意与智能、同情⼼

In [49]:
for question in eval_questions_en:
    with tru_recorder_automerging_en as recording:
        response = automerging_query_engine_en.query(question)
        print(question)
        print(response)

> Merging 2 nodes into parent node.
> Parent node id: 4daa536d-f162-4f96-9724-825d766babe2.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

> Merging 1 nodes into parent node.
> Parent node id: 9a1060af-8768-4591-9737-e6295af6ad39.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

What are the keys to building a career in AI?
The keys to building a career in AI are learning foundational technical skills, working on projects to deepen skills and create impact, and finding a job in the field. Being part of a community also supports these steps.
How can teamwork contribute to success in AI?
Teamwork can contribute to success in AI by enabling individuals to collaborate effectively, influence others, and be influenced by team members. Working in teams on large projects in AI can enhance the overall success as it allows for diverse perspectives, s

In [None]:
tru.get_leaderboard(app_ids=[])

Unnamed: 0_level_0,Context Relevance,Groundedness,Answer Relevance,latency,total_cost
app_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
Automerging Query Engine,0.8,0.908333,0.916667,3.666667,0.0
Automerging Query Engine_en,0.0375,0.333333,0.84,3.545455,0.0


In [None]:
# launches on http://localhost:8501/
tru.run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.
Dashboard already running at path:   Network URL: http://198.18.0.1:8501



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>