References:
- [Azure OpenAI Getting Started Tutorial - LangChain Series: Hands-on - Building an Internal Enterprise Knowledge Base Q&A Bot - Vector Search](https://zhuanlan.zhihu.com/p/646808742)
- [Retrieval Question/Answering](https://langchain-fanyi.readthedocs.io/en/latest/modules/chains/index_examples/vector_db_qa.html)
- [Retrieval Question Answering with Sources](https://langchain-fanyi.readthedocs.io/en/latest/modules/chains/index_examples/vector_db_qa_with_sources.html)

In [None]:
! pip install chromadb

In [1]:
import dotenv
from langchain.chat_models import ChatOpenAI

# Load the OpenAI / Azure OpenAI settings (API key, endpoint, ...) from a local .env file.
dotenv.load_dotenv()
llm = ChatOpenAI()

In [2]:
from langchain.document_loaders import DirectoryLoader

# Recursively load every .txt file under ./data as a Document.
loader = DirectoryLoader("data", glob="**/*.txt")
documents = loader.load()

In [3]:
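from langchain.text_splitter import CharacterTextSplitter

# chunk_size is measured in characters; the splitter only cuts at its separator
# (a blank line by default), so a long paragraph can exceed the limit.
text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
split_docs = text_splitter.split_documents(documents)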

Created a chunk of size 107, which is longer than the specified 100
Created a chunk of size 144, which is longer than the specified 100
Created a chunk of size 104, which is longer than the specified 100
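
The warnings above appear because `CharacterTextSplitter` splits only on its separator (`"\n\n"` by default) and then merges the pieces up to `chunk_size`; a single piece that is already longer than `chunk_size` is kept whole. The sketch below is a simplified, dependency-free model of that behavior, not LangChain's actual implementation. `split_on_separator` and the sample text are hypothetical.

```python
def split_on_separator(text, chunk_size, separator="\n\n"):
    """Split on `separator`, then greedily merge pieces up to `chunk_size` characters."""
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A piece longer than chunk_size is never cut further.
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Hypothetical sample: two short paragraphs around one 120-character paragraph.
text = "short paragraph" + "\n\n" + "x" * 120 + "\n\n" + "another short one"
chunks = split_on_separator(text, chunk_size=100)
oversized = [len(c) for c in chunks if len(c) > 100]
print([len(c) for c in chunks], oversized)
```

If a hard upper bound on chunk length matters, `RecursiveCharacterTextSplitter` from the same module may be a better fit, since it falls back to finer separators when a piece is too long.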


In [4]:
from langchain.embeddings.openai import OpenAIEmbeddings

# `deployment` is the name of the embedding model deployment (here, on Azure OpenAI).
embeddings = OpenAIEmbeddings(deployment="lqembedding")

In [5]:
from langchain.vectorstores import Chroma

# Embed every chunk and index the vectors in an in-memory Chroma collection.
docsearch = Chroma.from_documents(split_docs, embeddings)
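
To make the retrieval step less of a black box, here is a minimal sketch of what a vector search does conceptually: embed the query, score it against each stored vector, and return the best-scoring chunks. The three-dimensional vectors, the chunk texts, and the helper names are invented for illustration, and cosine similarity is an assumption; Chroma uses its own index and distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the texts of the k vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical toy index: (chunk text, embedding) pairs.
toy_index = [
    ("chunk about engineer titles", [0.9, 0.1, 0.0]),
    ("chunk about holiday policy", [0.0, 0.2, 0.9]),
    ("chunk about promotion rules", [0.7, 0.6, 0.1]),
]
hits = top_k([1.0, 0.2, 0.0], toy_index, k=2)
print(hits)
```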

In [19]:
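from langchain.chains import RetrievalQA

# chain_type="stuff" puts all retrieved chunks into a single prompt for the LLM.
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever())

# Query (Chinese): "Evaluation criteria for the professor-level senior engineer title in Hangzhou"
result = qa({"query": "杭州正高级工程师评定条件"})
print(result)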

{'query': '杭州正高级工程师评定条件', 'result': '评定条件为：取得高级工程师资格后，实际聘任高级工程师职务5年以上。'}
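
The `"stuff"` chain type can be pictured as plain string assembly: the retrieved chunks are concatenated and placed in one prompt together with the question. The sketch below illustrates that idea only; the template wording, `build_stuff_prompt`, and the sample chunks are assumptions, not LangChain's exact prompt.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate all retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following pieces of context to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Helpful Answer:"
    )


# Hypothetical retrieved chunks, for illustration only.
retrieved = ["First retrieved chunk.", "Second retrieved chunk."]
prompt = build_stuff_prompt("What are the requirements?", retrieved)
print(prompt)
```

Because every chunk lands in the same prompt, this approach only works while the combined text fits in the model's context window.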


In [20]:
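from langchain.chains import RetrievalQAWithSourcesChain

# Same retrieval setup, but the chain also reports which source files the answer came from.
qa = RetrievalQAWithSourcesChain.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.as_retriever())

# This chain expects the key "question" rather than "query".
result = qa({"question": "杭州正高级工程师评定条件"}, return_only_outputs=True)
print(result)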

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..


{'answer': 'The criteria for the evaluation of the title of Senior Engineer in Hangzhou is as follows: \n1. After obtaining the qualification of Senior Engineer, the individual must have served in the position of Senior Engineer for more than 5 years.\n', 'sources': 'data\\long-text.txt'}
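
The result separates the answer text from a `sources` field. One plausible way to get there, sketched below, is to ask the model to end its reply with a `SOURCES:` line and then split the raw output on that marker. This is an assumption about the general technique, not a copy of LangChain's parsing code; `split_answer_and_sources` and the raw string are hypothetical. The English answer to a Chinese question is probably a side effect of the chain's default prompt being written in English.

```python
import re


def split_answer_and_sources(raw_output):
    """Split raw model output into answer text and a sources string."""
    parts = re.split(r"SOURCES:\s*", raw_output, maxsplit=1)
    answer = parts[0]
    # If the marker is missing, report no sources instead of failing.
    sources = parts[1].strip() if len(parts) > 1 else ""
    return {"answer": answer, "sources": sources}


# Hypothetical raw model output, for illustration only.
raw = "Served as Senior Engineer for more than 5 years.\nSOURCES: data\\long-text.txt"
parsed = split_answer_and_sources(raw)
print(parsed)
```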
