# Chapter 6: Generating Answers

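In this chapter, we add an LLM generation step as the final stage of the retrieval pipeline. That way we get an answer to the question rather than just a list of search results. In some applications, for example, users can chat with a document, a book, or, as in this course, a set of articles. Large language models are good at many things, but in some cases they need a little extra help.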

In [None]:
question = "Are side projects important when you are starting to learn about AI?"

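Let's look at an example. Suppose you have this question: are side projects important when you are starting to learn about AI? You could put the question straight to a large language model, and some models may give interesting answers. It is more interesting, though, to ask an expert or consult an expert's writing. For instance, you could ask Andrew Ng, or look up what he has written on the topic. Fortunately, some of his writing is available. DeepLearning.AI publishes a newsletter called ["The Batch"](https://www.deeplearning.ai/the-batch/), which includes a series of letters titled ["How to Build a Career in AI"](https://www.deeplearning.ai/the-batch/tag/letters/).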

![How to Build a Career in AI.](./images/6-1.png)

In [2]:
text = """
The rapid rise of AI has led to a rapid rise in AI jobs, and many people are building exciting careers in this field. A career is a decades-long journey, and the path is not always straightforward. Over many years, I’ve been privileged to see thousands of students as well as engineers in companies large and small navigate careers in AI. In this and the next few letters, I’d like to share a few thoughts that might be useful in charting your own course.

Three key steps of career growth are learning (to gain technical and other skills), working on projects (to deepen skills, build a portfolio, and create impact) and searching for a job. These steps stack on top of each other:

Initially, you focus on gaining foundational technical skills.
After having gained foundational skills, you lean into project work. During this period, you’ll probably keep learning.
Later, you might occasionally carry out a job search. Throughout this process, you’ll probably continue to learn and work on meaningful projects.
These phases apply in a wide range of professions, but AI involves unique elements. For example:

AI is nascent, and many technologies are still evolving. While the foundations of machine learning and deep learning are maturing — and coursework is an efficient way to master them — beyond these foundations, keeping up-to-date with changing technology is more important in AI than fields that are more mature.
Project work often means working with stakeholders who lack expertise in AI. This can make it challenging to find a suitable project, estimate the project’s timeline and return on investment, and set expectations. In addition, the highly iterative nature of AI projects leads to special challenges in project management: How can you come up with a plan for building a system when you don’t know in advance how long it will take to achieve the target accuracy? Even after the system has hit the target, further iteration may be necessary to address post-deployment drift.
While searching for a job in AI can be similar to searching for a job in other sectors, there are some differences. Many companies are still trying to figure out which AI skills they need and how to hire people who have them. Things you’ve worked on may be significantly different than anything your interviewer has seen, and you’re more likely to have to educate potential employers about some elements of your work.
Throughout these steps, a supportive community is a big help. Having a group of friends and allies who can help you — and whom you strive to help — makes the path easier. This is true whether you’re taking your first steps or you’ve been on the journey for years.

I’m excited to work with all of you to grow the global AI community, and that includes helping everyone in our community develop their careers. I’ll dive more deeply into these topics in the next few weeks.

Last week, I wrote about key steps for building a career in AI: learning technical skills, doing project work, and searching for a job, all of which is supported by being part of a community. In this letter, I’d like to dive more deeply into the first step.

More papers have been published on AI than any person can read in a lifetime. So, in your efforts to learn, it’s critical to prioritize topic selection. I believe the most important topics for a technical career in machine learning are:

Foundational machine learning skills. For example, it’s important to understand models such as linear regression, logistic regression, neural networks, decision trees, clustering, and anomaly detection. Beyond specific models, it’s even more important to understand the core concepts behind how and why machine learning works, such as bias/variance, cost functions, regularization, optimization algorithms, and error analysis.
Deep learning. This has become such a large fraction of machine learning that it’s hard to excel in the field without some understanding of it! It’s valuable to know the basics of neural networks, practical skills for making them work (such as hyperparameter tuning), convolutional networks, sequence models, and transformers.
Math relevant to machine learning. Key areas include linear algebra (vectors, matrices, and various manipulations of them) as well as probability and statistics (including discrete and continuous probability, standard probability distributions, basic rules such as independence and Bayes rule, and hypothesis testing). In addition, exploratory data analysis (EDA) — using visualizations and other methods to systematically explore a dataset — is an underrated skill. I’ve found EDA particularly useful in data-centric AI development, where analyzing errors and gaining insights can really help drive progress! Finally, a basic intuitive understanding of calculus will also help. In a previous letter, I described how the math needed to do machine learning well has been changing. For instance, although some tasks require calculus, improved automatic differentiation software makes it possible to invent and implement new neural network architectures without doing any calculus. This was almost impossible a decade ago.
Software development. While you can get a job and make huge contributions with only machine learning modeling skills, your job opportunities will increase if you can also write good software to implement complex AI systems. These skills include programming fundamentals, data structures (especially those that relate to machine learning, such as data frames), algorithms (including those related to databases and data manipulation), software design, familiarity with Python, and familiarity with key libraries such as TensorFlow or PyTorch, and scikit-learn.
This is a lot to learn! Even after you master everything in this list, I hope you’ll keep learning and continue to deepen your technical knowledge. I’ve known many machine learning engineers who benefitted from deeper skills in an application area such as natural language processing or computer vision, or in a technology area such as probabilistic graphical models or building scalable software systems.

How do you gain these skills? There’s a lot of good content on the internet, and in theory reading dozens of web pages could work. But when the goal is deep understanding, reading disjointed web pages is inefficient because they tend to repeat each other, use inconsistent terminology (which slows you down), vary in quality, and leave gaps. That’s why a good course — in which a body of material has been organized into a coherent and logical form — is often the most time-efficient way to master a meaningful body of knowledge. When you’ve absorbed the knowledge available in courses, you can switch over to research papers and other resources.

Finally, keep in mind that no one can cram everything they need to know over a weekend or even a month. Everyone I know who’s great at machine learning is a lifelong learner. In fact, given how quickly our field is changing, there’s little choice but to keep learning if you want to keep up. How can you maintain a steady pace of learning for years? I’ve written about the value of habits. If you cultivate the habit of learning a little bit every week, you can make significant progress with what feels like less effort.

In the last two letters, I wrote about developing a career in AI and shared tips for gaining technical skills. This time, I’d like to discuss an important step in building a career: project work.

It goes without saying that we should only work on projects that are responsible and ethical, and that benefit people. But those limits leave a large variety to choose from. I wrote previously about how to identify and scope AI projects. This and next week’s letter have a different emphasis: picking and executing projects with an eye toward career development.

A fruitful career will include many projects, hopefully growing in scope, complexity, and impact over time. Thus, it is fine to start small. Use early projects to learn and gradually step up to bigger projects as your skills grow.

When you’re starting out, don’t expect others to hand great ideas or resources to you on a platter. Many people start by working on small projects in their spare time. With initial successes — even small ones — under your belt, your growing skills increase your ability to come up with better ideas, and it becomes easier to persuade others to help you step up to bigger projects.

What if you don’t have any project ideas? Here are a few ways to generate them:

Join existing projects. If you find someone else with an idea, ask to join their project.
Keep reading and talking to people. I come up with new ideas whenever I spend a lot of time reading, taking courses, or talking with domain experts. I’m confident that you will, too.
Focus on an application area. Many researchers are trying to advance basic AI technology — say, by inventing the next generation of transformers or further scaling up language models — so, while this is an exciting direction, it is hard. But the variety of applications to which machine learning has not yet been applied is vast! I’m fortunate to have been able to apply neural networks to everything from autonomous helicopter flight to online advertising, partly because I jumped in when relatively few people were working on those applications. If your company or school cares about a particular application, explore the possibilities for machine learning. That can give you a first look at a potentially creative application — one where you can do unique work — that no one else has done yet.
Develop a side hustle. Even if you have a full-time job, a fun project that may or may not develop into something bigger can stir the creative juices and strengthen bonds with collaborators. When I was a full-time professor, working on online education wasn’t part of my “job” (which was doing research and teaching classes). It was a fun hobby that I often worked on out of passion for education. My early experiences recording videos at home helped me later in working on online education in a more substantive way. Silicon Valley abounds with stories of startups that started as side projects. So long as it doesn’t create a conflict with your employer, these projects can be a stepping stone to something significant.
Given a few project ideas, which one should you jump into? Here’s a quick checklist of factors to consider:

Will the project help you grow technically? Ideally, it should be challenging enough to stretch your skills but not so hard that you have little chance of success. This will put you on a path toward mastering ever-greater technical complexity.
Do you have good teammates to work with? If not, are there people you can discuss things with? We learn a lot from the people around us, and good collaborators will have a huge impact on your growth.
Can it be a stepping stone? If the project is successful, will its technical complexity and/or business impact make it a meaningful stepping stone to larger projects? (If the project is bigger than those you’ve worked on before, there’s a good chance it could be such a stepping stone.)
Finally, avoid analysis paralysis. It doesn’t make sense to spend a month deciding whether to work on a project that would take a week to complete. You'll work on multiple projects over the course of your career, so you’ll have ample opportunity to refine your thinking on what’s worthwhile. Given the huge number of possible AI projects, rather than the conventional “ready, aim, fire” approach, you can accelerate your progress with “ready, fire, aim.”

"""

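We will use what we have learned in this course to search this article, then use a large language model to generate an answer.

Let's visualize how this works.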

![Search can help LLMs in multiple ways](./images/6-2.png)

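When we ask a large language model a question, it can produce many different answers.

Sometimes, though, we want it to answer from a specific file or document. In that case, you can improve the generated output by adding a search component before the generation step.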

![Where is the information stored?](./images/6-3.png)
When you rely only on the model's direct answer, you are relying on the world knowledge stored inside the model.

![Search can add some context](./images/6-4.png)
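You can instead supply context up front, for example in the prompt. This improves generation quality when you want to anchor the model to a specific domain, article, or document, and it also improves factuality.

So in many cases where you want facts from the model, grounding it with context like this makes the generated answer more likely to be factually correct.

The difference between the two setups is that we first send the question to a search system, like the ones built earlier in this course. We then retrieve some of the results, put them in the prompt together with the question, and have the generation model produce a response that is informed by that context.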
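The search-then-generate flow described above can be sketched without any API calls. This is only an illustration: `toy_search` is a hypothetical word-overlap ranker standing in for the semantic search used in this course, and `build_grounded_prompt` is an assumed helper name, not part of any library.

```python
def toy_search(query, documents, top_k=1):
    """Rank documents by shared words with the query (a stand-in for semantic search)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(question, documents, top_k=1):
    """Retrieve context first, then place it in the prompt ahead of the question."""
    context = "\n".join(toy_search(question, documents, top_k))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


docs = [
    "Side projects can be a stepping stone to something significant.",
    "Linear algebra covers vectors and matrices.",
]
prompt = build_grounded_prompt("Are side projects important?", docs)
print(prompt)
```

The resulting prompt is what would be sent to the generation model; only the retrieved paragraph, not the whole archive, travels with the question.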

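Next, let's see how to implement this in code. We already have our question. To build the text archive for this use case, we simply open the articles and copy their text. Here we copy three of them, so the `text` variable holds the text of three articles. You could include more: the series is well worth reading and has perhaps seven or eight parts, but three are enough for this example.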

## Environment Setup

Load the required Python libraries and the API key.

We import Cohere here because we will embed this text next.



In [3]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [None]:
import cohere

import numpy as np
import warnings
warnings.filterwarnings('ignore')
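Since the later cells read `COHERE_API_KEY` from the environment, it can help to fail early with a readable message when the key is missing. The helper below is a minimal sketch; `require_env` is a hypothetical name introduced here, not part of `dotenv` or the Cohere SDK.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in your shell."
        )
    return value


# Usage (uncomment once the key is configured):
# api_key = require_env("COHERE_API_KEY")
```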

## Chunking

1. Chunk the text
2. Generate embeddings
3. Build the semantic index



In [4]:
# Split the text into a list of paragraphs
texts = text.split('\n\n')

# Strip leading and trailing spaces and newlines
texts = np.array([t.strip(' \n') for t in texts if t])

Now let's look at the first three chunks.

In [6]:
texts[:3]

array(['The rapid rise of AI has led to a rapid rise in AI jobs, and many people are building exciting careers in this field. A career is a decades-long journey, and the path is not always straightforward. Over many years, I’ve been privileged to see thousands of students as well as engineers in companies large and small navigate careers in AI. In this and the next few letters, I’d like to share a few thoughts that might be useful in charting your own course.',
       'Three key steps of career growth are learning (to gain technical and other skills), working on projects (to deepen skills, build a portfolio, and create impact) and searching for a job. These steps stack on top of each other:',
       'Initially, you focus on gaining foundational technical skills.\nAfter having gained foundational skills, you lean into project work. During this period, you’ll probably keep learning.\nLater, you might occasionally carry out a job search. Throughout this process, you’ll probably continue t

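The first chunk begins "The rapid rise of AI has led to a rapid rise in AI jobs", the second begins "Three key steps of career growth", and the third begins "Initially".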
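To see how splitting on blank lines behaves, here is a small sketch on a made-up string (the sample text is invented for illustration). Lines that are not separated by a blank line stay together in one chunk, which is why the third chunk above contains embedded `\n` characters. Unlike the cell above, this variant filters after stripping, so pieces that contain only whitespace are dropped as well.

```python
sample = (
    "Intro line one.\nIntro line two.\n\n"
    "Second paragraph.\n\n"
    "   \n\n"
    "Third paragraph.\n"
)

# Split on blank lines, trim each piece, then drop pieces that end up empty.
chunks = [piece.strip() for piece in sample.split("\n\n")]
chunks = [piece for piece in chunks if piece]

for chunk in chunks:
    print(repr(chunk))
```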

## Generating Embeddings

Next, we use Cohere to generate embeddings for the texts.

In [7]:
co = cohere.Client(os.environ['COHERE_API_KEY'])

# Get the embeddings
response = co.embed(
    texts=texts.tolist(),
).embeddings


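## Building the Search Index

Import the required libraries: Annoy, the vector search library, along with NumPy and pandas. We won't need regular expressions here, although they are often handy when working with text.
We turn the embeddings into a NumPy array; these are the vectors we got back. We then create a new vector index, add the vectors to it, build it, and save it to a file.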

In [8]:
from annoy import AnnoyIndex
import numpy as np
import pandas as pd

In [None]:
# Convert the returned embeddings to a NumPy array; every embedding must have the same dimension
embeds = np.array(response)

# Create the index, sized to the embedding dimension
search_index = AnnoyIndex(embeds.shape[1], 'angular')
# Add every embedding to the search index
for i in range(len(embeds)):
    search_index.add_item(i, embeds[i])

search_index.build(10) # 10 trees
search_index.save('test.ann')

True
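For intuition about what the index returns, the sketch below does the same kind of lookup by brute force with NumPy. It assumes Annoy's documented definition of the `'angular'` metric, the Euclidean distance between normalized vectors, i.e. sqrt(2(1 - cos)). The function name `exact_nearest` and the toy vectors are invented for illustration; Annoy itself returns approximate neighbors and is far faster on large collections.

```python
import numpy as np


def exact_nearest(query_vec, embeds, k=3):
    """Return indices and angular distances of the k rows of embeds closest to query_vec."""
    embeds = np.asarray(embeds, dtype=float)
    query_vec = np.asarray(query_vec, dtype=float)
    # Normalize so that the dot product equals cosine similarity.
    unit_rows = embeds / np.linalg.norm(embeds, axis=1, keepdims=True)
    unit_query = query_vec / np.linalg.norm(query_vec)
    cosine = unit_rows @ unit_query
    # Angular distance as defined by Annoy: sqrt(2 * (1 - cosine)).
    distances = np.sqrt(np.clip(2.0 * (1.0 - cosine), 0.0, None))
    order = np.argsort(distances)[:k]
    return order, distances[order]


toy_embeds = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ids, dists = exact_nearest([1.0, 0.1], toy_embeds, k=2)
print(ids, dists)
```

With a handful of paragraphs, as in this chapter, a brute-force scan like this would also be fast enough; an index pays off as the archive grows.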

## Searching the Article

With the index built and saved to a file, we now have vector search. Next, let's define a function called `search_andrews_article`.

In [10]:
def search_andrews_article(query):
    """
    Search for the paragraphs most similar to the given query.

    Args:
    query (str): The user's query string.

    Returns:
    search_results (numpy.ndarray): The paragraphs most similar to the query.

    """

    # Embed the query
    query_embed = co.embed(texts=[query]).embeddings

    # Retrieve the nearest neighbors
    similar_item_ids = search_index.get_nns_by_vector(query_embed[0],
                                                      10,
                                                      include_distances=True)

    # Look up the matching paragraphs by index
    search_results = texts[similar_item_ids[0]]

    return search_results

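We give it a query and it searches this dataset. The steps are the same ones we have seen before: embed the query, run a vector search over the archive to compare the query against the embedding of each paragraph, and return the results. We can now ask this search system a question such as "Are side projects a good idea when trying to build a career in AI?" I'm curious what Andrew would say about this.

Here we print the first result. It is a long passage, and it is the closest match to the question. Inside it you will find the advice to develop a side hustle: even if you have a full-time job, a fun project that may or may not develop into something bigger can stir the creative juices. So the answer sits in the middle of this large block of text. This is a good case for using a large language model: we can hand it the question and let it extract the relevant information for us. That is what we do next.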

In [11]:
results = search_andrews_article(
    "Are side projects a good idea when trying to build a career in AI?"
)

print(results[0])

Join existing projects. If you find someone else with an idea, ask to join their project.
Keep reading and talking to people. I come up with new ideas whenever I spend a lot of time reading, taking courses, or talking with domain experts. I’m confident that you will, too.
Focus on an application area. Many researchers are trying to advance basic AI technology — say, by inventing the next generation of transformers or further scaling up language models — so, while this is an exciting direction, it is hard. But the variety of applications to which machine learning has not yet been applied is vast! I’m fortunate to have been able to apply neural networks to everything from autonomous helicopter flight to online advertising, partly because I jumped in when relatively few people were working on those applications. If your company or school cares about a particular application, explore the possibilities for machine learning. That can give you a first look at a potentially creative applicatio

## Generating Answers

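Instead of only searching, we now define a new function called `ask_andrews_article`, which takes a question and a `num_generations` parameter that defaults to 1. Before doing anything else, the function runs a search to get the relevant context from the article. We take only the top result. This is a design choice: you could inject the top two or three results into the prompt, but we use one because it is the simplest option. The prompt can look like this: "Excerpt from the article titled 'How to Build a Career in AI' by Andrew Ng". In general, the more context we give the model, the better it can do the task.

We then inject the context we retrieved, which is the paragraph from the article, and follow it with the question. We give the model an instruction: extract the answer from the text provided and, if the answer is not there, say that it is not available. We then send the prompt to the model to get a prediction by calling `co.generate` with `prompt=prompt` and `max_tokens` set to, say, 70, since some answers tend to run long. We use a model called `command-nightly`, which is Cohere's most frequently updated generative model. With `command-nightly` you get the latest models available on the platform; these tend to be experimental, but they are the newest and usually the best. The `num_generations` parameter is not needed yet, but we will use it later. Finally, the function returns `prediction.generations`. That is the whole function.

Now let's ask the same question. This time it is not a search exercise but a question-answering exercise that combines search with a language model. Running it gives an answer along these lines: yes, side projects are a good idea when trying to build a career in AI. They can help you develop your skills and knowledge, and they can be a good way to network with other people. You should be careful, however, not to create a conflict with your employer, and make sure you are not violating any terms. At that point the output is cut off because we ran out of tokens, so increase `max_tokens` if you want a longer answer. This is a quick demonstration of how the approach works. Try it out and ask a few questions; some may need to be rephrased slightly, but this gives a high-level overview of the application.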

In [43]:
def ask_andrews_article(question, num_generations=1):
    """
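    Generate an answer to the given question from Andrew Ng's articles.

    Args:
    question (str): The user's question.
    num_generations (int): Number of answers to generate. Defaults to 1.

    Returns:
    prediction.generations (list): The generated answers.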

    """
        
    # Search the article
    results = search_andrews_article(question)

    # Take the top result
    context = results[0]

    # Prepare the prompt
    prompt = f"""
    Excerpt from the article titled "How to Build a Career in AI" 
    by Andrew Ng: 
    {context}
    Question: {question}
    
    Extract the answer of the question from the text provided. 
    If the text doesn't contain the answer, 
    reply that the answer is not available."""
    
    # Generate the prediction
    prediction = co.generate(
        prompt=prompt,
        max_tokens=70,
        model="command-nightly",
        temperature=0.5,
        num_generations=num_generations
    )

    return prediction.generations
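The function above injects only the top search result. The design choice mentioned earlier, passing two or three results instead, could look roughly like the sketch below. `build_prompt` is a hypothetical helper and the sample paragraphs are invented; the prompt wording follows the one used in `ask_andrews_article`.

```python
def build_prompt(question, contexts):
    """Join one or more retrieved paragraphs into a single grounded prompt."""
    excerpt = "\n\n".join(contexts)
    return (
        'Excerpt from the article titled "How to Build a Career in AI" '
        "by Andrew Ng:\n"
        f"{excerpt}\n"
        f"Question: {question}\n\n"
        "Extract the answer of the question from the text provided. "
        "If the text doesn't contain the answer, "
        "reply that the answer is not available."
    )


# Invented stand-ins for the paragraphs a search would return.
retrieved = [
    "Paragraph about side projects.",
    "Paragraph about teammates.",
    "Paragraph about courses.",
]
prompt = build_prompt("Are side projects a good idea?", retrieved[:2])
print(prompt)
```

More context gives the model more to work with, but it also lengthens the prompt, so the number of injected results is worth tuning for each dataset.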

In [22]:
results = ask_andrews_article(
    "Are side projects a good idea when trying to build a career in AI?",

)

print(results[0])

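Many people are building interesting things this way. "Ask the Lex Fridman podcast anything", for example, follows exactly this pipeline: semantic search over the transcripts of the entire podcast. Someone has done the same for Andrew Huberman's podcast, and you can find similar projects built on transcripts of YouTube videos and on books. Applications like these are usually built as search followed by generation, and you can add reranking to improve the search component. Feel free to pause here and try it yourself: run the code up to this point, then change the question you pass to the model or bring in another dataset that interests you. You don't always have to copy the text in by hand; that was just a very quick example.

For larger-scale work, you can use tools such as LlamaIndex and LangChain to import text from PDFs. Also keep the `num_generations` parameter in mind. It is a handy trick during development when you want to see how the model behaves on the same prompt several times in a single API call. It is a parameter we pass to `co.generate` as `num_generations=num_generations`. When asking the question, we can set `num_generations=3`, and rather than printing a single result we loop over all of them. The question is sent to the language model, which is asked to produce three different generations at once instead of one, as a batch.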

In [25]:
results = ask_andrews_article(
    "Are side projects a good idea when trying to build a career in AI?",
    num_generations=3
)

for gen in results:
    print(gen)
    print('--')

 Yes, side projects are a good idea when trying to build a career in AI. They can help you develop your skills and knowledge, and can also be a stepping stone to a more substantive project. However, it is important to ensure that your side project does not create a conflict with your employer.
--
 Yes. Side projects are a good idea when trying to build a career in AI. They can help you to develop new ideas and to strengthen bonds with collaborators. They can also be a stepping stone to something more significant. However, it is important to note that you should not create a conflict with your employer.
--
 Yes. A side hustle can stir the creative juices and strengthen bonds with collaborators.
--


In [44]:
results = ask_andrews_article(
    "What is the most viewed televised event?",
    num_generations=5
)


In [45]:
for gen in results:
    print(gen)
    print('--')

 The most viewed televised event is the Super Bowl.
--
 The most viewed televised event is the Super Bowl.
--
 The most viewed televised event is the Super Bowl.
--
 The most viewed televised event is the Super Bowl.
--
 The most viewed televised event is the Super Bowl.
--
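When comparing several generations during development, a quick agreement summary can be useful. The sketch below assumes the generations have already been turned into plain strings; `summarize_generations` is a hypothetical helper and the sample answers are invented.

```python
from collections import Counter


def summarize_generations(generations):
    """Return the most common answer and the share of generations that match it."""
    counts = Counter(text.strip() for text in generations)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(generations)


sample_generations = [" Yes.", " Yes.", " No."]
answer, share = summarize_generations(sample_generations)
print(answer, share)
```

Exact string matching is a crude measure, since differently worded answers can agree in substance, so treat it as a first check rather than an evaluation.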


## Congratulations on finishing the course!

To start building with the Cohere LLMs, get your API key by registering [here](https://dashboard.cohere.ai/welcome/register?utm_source=partner&utm_medium=website&utm_campaign=DeeplearningAI). 

Learn more about LLMs at Cohere’s [LLM.University](https://LLM.University).