## AI Agent Intelligent Applications: Custom Development from 0 to 1
******
- This is the companion code for the online course "AI Agent Intelligent Applications: Custom Development from 0 to 1". It is best used alongside the course videos.
- Note that because of the course's development cycle, the langchain version spans three major releases, so some of the code differs from the video demos.
- Course address: https://coding.imooc.com/class/822.html

### Read the key from environment variables
- Note: Store your OpenAI key in something like a .env file rather than exposing it in plaintext in the code. This is a basic security measure.
******

In [1]:

import os
from dotenv import load_dotenv
# Load environment variables from the asset/openai.env file
load_dotenv("asset/openai.env")

# Read the OpenAI key and base URL from the environment
api_key = os.getenv("OPENAI_API_KEY")
api_base = os.getenv("OPENAI_API_BASE")
# os.environ only accepts strings, so assign each variable only if it was found
if api_key:
    os.environ["OPENAI_API_KEY"] = api_key
if api_base:
    os.environ["OPENAI_API_BASE"] = api_base
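
A missing or misspelled variable in the .env file is easier to diagnose if the lookup fails right away instead of surfacing later as an authentication error. A minimal sketch, assuming a hypothetical helper `require_env` and a placeholder variable `DEMO_API_KEY` (neither is part of the course code):

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set; check your .env file")
    return value


# DEMO_API_KEY is a throwaway placeholder used only for this illustration
os.environ["DEMO_API_KEY"] = "sk-demo"
print(require_env("DEMO_API_KEY"))
```

In the notebook, the same helper could be called with `"OPENAI_API_KEY"` after `load_dotenv` has run.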

In [None]:
! pip uninstall -y openai
! pip install openai==1.14.1

### embed_documents
****

In [4]:
from langchain_openai import OpenAIEmbeddings

e_model = OpenAIEmbeddings()
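embeddings = e_model.embed_documents(
    [
        "你好",  # "Hello"
        "你好啊",  # "Hello there"
        "你叫什么名字?",  # "What is your name?"
        "我叫王大锤",  # "My name is Wang Dachui"
        "很高兴认识你大锤",  # "Nice to meet you, Dachui"
    ]
)
embeddings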

[[0.0003033637165591021,
  -0.006206678987224718,
  -0.0024142365290074326,
  -0.029224303348719356,
  -0.043773087841599015,
  0.013877108201673597,
  -0.022444163941953928,
  -0.008351611048538182,
  -0.015258482024296211,
  -0.01942795129607854,
  0.035510189915810376,
  0.0014867198991881882,
  0.004673227188678734,
  -0.0024807706725777346,
  -0.008332601825416655,
  -0.016652530214537307,
  0.0341668354707533,
  -0.015055711700172544,
  0.01821132748869158,
  -0.020010916327180012,
  -1.675728172901144e-05,
  0.0013425627045345109,
  0.009207049372068864,
  0.0008910814317907126,
  -0.007743299610505942,
  -0.007946070400290849,
  0.010043477541155537,
  -0.018350733425302666,
  0.00728072943024907,
  -0.008510025276948314,
  0.013180084106553048,
  0.010727828199980083,
  -0.029857960262359883,
  -0.0028387873316987723,
  0.012020490762498106,
  -0.01345889411713026,
  -0.012299299841752839,
  -0.0018930529864499516,
  0.020821999486319635,
  0.004920353608515935,
  0.0170073789

### embed_query
<hr>

In [5]:
# Query text: "What name is mentioned in this conversation?"
embedded_query = e_model.embed_query("这段对话中提到了什么名字?")
embedded_query[:5]

[0.0040477336089548566,
 0.0009470974041103835,
 0.02967681273328674,
 -0.0063358683082484464,
 -0.02480508870743324]
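
`embed_documents` returns one vector per input text and `embed_query` returns a single vector, so a query can be compared against the documents by vector similarity. The sketch below shows one common choice, cosine similarity, on made-up 3-dimensional vectors rather than real embeddings. The helper names `cosine_similarity` and `rank_by_similarity` are illustrative and not part of langchain.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real embeddings (values are made up)
toy_docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
toy_query = [0.9, 0.1, 0.0]
print(rank_by_similarity(toy_query, toy_docs))
```

With real data, the document vectors would come from `embed_documents` and the query vector from `embed_query`. A vector store such as FAISS performs this kind of nearest-neighbor search at scale.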

### Embedding vector cache
<hr>

In [6]:
# Install faiss-cpu
! pip install faiss-cpu==1.8.0.post1






In [7]:
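from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings

# Underlying embedding model, called only on a cache miss
u_embeddings = OpenAIEmbeddings()
# Store embedding vectors as files under ./cache/
fs = LocalFileStore("./cache/")
# Namespace the keys by model name so vectors from different models do not collide
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(
    u_embeddings,
    fs,
    namespace=u_embeddings.model,
)
# List the keys already present in the cache
list(fs.yield_keys())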

['text-embedding-ada-0024250f053-4b1e-5c34-927d-a7857749217f',
 'text-embedding-ada-0029286d74c-b3fc-56ff-8b08-9071a193f724',
 'text-embedding-ada-002b0c54c27-a009-50b4-9ccc-661d5478b195',
 'text-embedding-ada-002c63ea318-3b5d-533b-960b-46434f8b3c22',
 'text-embedding-ada-002e94acbbe-7d17-5331-8310-4e37bdc56d31',
 'text-embedding-ada-002f05b40fb-a095-546e-9c5d-49e069720828']

In [8]:
# Load the document and split it into chunks; the chunks are then embedded and stored in the cache
raw_documents = TextLoader("asset/letter.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=600, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)

Created a chunk of size 610, which is longer than the specified 600
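
The warning above appears because `CharacterTextSplitter` splits only on its separator (a blank line, `"\n\n"`, by default) and then merges the pieces up to `chunk_size`. A single piece that is already longer than `chunk_size` has no separator inside it, so it is kept whole. The following is a simplified pure-Python sketch of that behavior, not langchain's actual implementation. `split_on_separator` is a hypothetical name.

```python
def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is emitted as-is, because it cannot be split further.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        # The candidate is too long: close the current chunk and start a new one
        if current:
            chunks.append(current)
        current = piece
        if len(piece) > chunk_size:
            print(f"Piece of size {len(piece)} exceeds chunk_size={chunk_size}")
    if current:
        chunks.append(current)
    return chunks


# Three paragraphs of 8, 25, and 5 characters; the middle one exceeds chunk_size
sample = "a" * 8 + "\n\n" + "b" * 25 + "\n\n" + "c" * 5
chunks = split_on_separator(sample, chunk_size=20)
print([len(c) for c in chunks])
```

If hard limits matter, a splitter that falls back to finer separators, such as langchain's `RecursiveCharacterTextSplitter`, is one option.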


In [9]:
from langchain.vectorstores import FAISS
# Time one run of building the FAISS index; embeddings are read from the cache when present
%timeit -r 1 -n 1 db = FAISS.from_documents(documents, cached_embeddings)

56.3 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [10]:
# View the keys in the cache
list(fs.yield_keys())

['text-embedding-ada-0024250f053-4b1e-5c34-927d-a7857749217f',
 'text-embedding-ada-0029286d74c-b3fc-56ff-8b08-9071a193f724',
 'text-embedding-ada-002b0c54c27-a009-50b4-9ccc-661d5478b195',
 'text-embedding-ada-002c63ea318-3b5d-533b-960b-46434f8b3c22',
 'text-embedding-ada-002e94acbbe-7d17-5331-8310-4e37bdc56d31',
 'text-embedding-ada-002f05b40fb-a095-546e-9c5d-49e069720828']
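
Each key listed above is the namespace (the model name) followed by what appears to be a UUID derived from the text. The idea behind such a cache can be sketched as a wrapper that hashes each text into a key and calls the underlying model only for texts it has not seen. The code below is an illustration under stated assumptions, not langchain's implementation: `ToyCachedEmbedder` and `fake_embed` are hypothetical names, the store is an in-memory dict rather than files, and the exact hashing scheme is a guess.

```python
import hashlib
import uuid


class ToyCachedEmbedder:
    """Illustrative cache wrapper: embeds only texts whose key is not already stored."""

    def __init__(self, embed_fn, namespace: str):
        self.embed_fn = embed_fn    # callable: list[str] -> list[list[float]]
        self.namespace = namespace  # prefix that keeps different models' vectors apart
        self.store: dict[str, list[float]] = {}

    def _key(self, text: str) -> str:
        # Hash the text so the key is deterministic and has a fixed length
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        return self.namespace + str(uuid.uuid5(uuid.NAMESPACE_DNS, digest))

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        missing = [t for t in texts if self._key(t) not in self.store]
        unique_missing = list(dict.fromkeys(missing))  # drop duplicates, keep order
        if unique_missing:
            for text, vector in zip(unique_missing, self.embed_fn(unique_missing)):
                self.store[self._key(text)] = vector
        return [self.store[self._key(t)] for t in texts]


calls = []


def fake_embed(texts):
    """Stand-in for a real embedding API; records each call and returns made-up vectors."""
    calls.append(list(texts))
    return [[float(len(t)), 0.0] for t in texts]


embedder = ToyCachedEmbedder(fake_embed, namespace="toy-model-")
embedder.embed_documents(["hello", "world!"])
embedder.embed_documents(["hello", "new text"])  # "hello" is served from the cache
print(calls)
```

The second call sends only the unseen text to `fake_embed`, which mirrors why rebuilding the index against a warm cache is fast and why the key list is unchanged after the rerun.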