## Code Understanding and Writing

*[LangChain Code Understanding Docs](https://python.langchain.com/docs/use_cases/code_understanding)*

One of the most exciting abilities of LLMs is code understanding. People around the world are improving both the speed and the quality of their output with AI help. A big part of this is having an LLM that can understand code and help you with a particular task.

* **Deep Dive** - Coming Soon
* **Examples** - TBD
* **Use Cases:** Copilot-style functionality that answers questions about a specific library and helps you generate new code


In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

In [15]:
import os
# Vector store
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
# Model
from langchain.chat_models import ChatOpenAI
# Text processing
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader
# Question-answering chain
from langchain.chains import RetrievalQA

chat = ChatOpenAI()

In [16]:
embeddings = OpenAIEmbeddings(disallowed_special=())

In [17]:
root_dir = 'data/thefuzz/thefuzz' # git clone https://github.com/seatgeek/thefuzz
docs = []

# Walk every folder
for dirpath, dirnames, filenames in os.walk(root_dir):

    # Visit every file
    for file in filenames:
        try:
            # Load the file as a document and split it into chunks
            loader = TextLoader(os.path.join(dirpath, file), encoding='utf-8')
            docs.extend(loader.load_and_split())
        except Exception:
            # Skip files that cannot be loaded as UTF-8 text
            pass
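
The loop above attempts to load every file under `root_dir` and silently skips anything that fails. As a minimal sketch of a more selective alternative, assuming only `.py` sources are of interest, the candidate paths could be collected first. The helper name `collect_source_files` and the suffix filter are illustrative assumptions, not part of the original notebook.

```python
from pathlib import Path


def collect_source_files(root_dir, suffixes=(".py",)):
    """Return sorted paths under root_dir whose suffix is in `suffixes`.

    Hypothetical helper for illustration; the notebook itself tries every file.
    """
    root = Path(root_dir)
    return sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.suffix in suffixes
    )
```

Each returned path could then be passed to `TextLoader(str(path), encoding='utf-8')` in the same way as in the loop above.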

In [18]:
print(f"There are {len(docs)} documents\n")
print("------ Start of the first document ------")
print(docs[0].page_content[:300])

There are 11 documents

------ Start of the first document ------
from collections.abc import Mapping
import typing
from typing import Any, Callable, Union, Tuple, Generator, TypeVar, Sequence


ChoicesT = Union[Mapping[str, str], Sequence[str]]
T = TypeVar('T')
ProcessorT = Union[Callable[[str, bool], str], Callable[[Any], Any]]
ScorerT = Callable[[str, str, bool


Embed the documents and store them in a FAISS vector store. This makes an API call to OpenAI.

In [19]:
docsearch = FAISS.from_documents(docs, embeddings)
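
To make the idea behind `FAISS.from_documents` more concrete, here is a toy sketch of what a vector store does conceptually: turn each text into a vector, keep the vectors, and return the texts closest to a query. It assumes a bag-of-words count in place of OpenAI embeddings and a brute-force scan in place of FAISS indexing; `toy_embed` and `ToyVectorStore` are hypothetical names used only for illustration.

```python
import math
from collections import Counter


def toy_embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Stores (vector, text) pairs and returns the texts closest to a query."""

    def __init__(self, texts):
        self.entries = [(toy_embed(t), t) for t in texts]

    def similarity_search(self, query, k=1):
        query_vec = toy_embed(query)
        # Brute-force ranking; a real index avoids scanning every entry
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vec, entry[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


store = ToyVectorStore([
    "def extractOne(query, choices): return the single best match",
    "def ratio(s1, s2): return a similarity score between two strings",
])
top_match = store.similarity_search("find the best match in choices", k=1)
```

A real embedding model captures meaning rather than exact word overlap, so this sketch only illustrates the store-and-search mechanics.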

In [21]:
# Get our retriever ready
qa = RetrievalQA.from_chain_type(llm=chat, chain_type="stuff", retriever=docsearch.as_retriever())
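
The `chain_type="stuff"` setting means the retrieved chunks are all placed ("stuffed") into a single prompt together with the question. A rough sketch of that idea follows; the function name `build_stuff_prompt` and the template wording are assumptions for illustration and do not reproduce LangChain's actual prompt.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Join all retrieved chunks into one context block and append the question.

    Hypothetical template for illustration; LangChain's real prompt differs.
    """
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "Which function returns the single best match?",
    [
        "def extractOne(query, choices): ...",
        "def extract(query, choices, limit=5): ...",
    ],
)
```

Because every chunk goes into one prompt, this approach is simple but only works while the combined text fits in the model's context window.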

In [25]:
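query = "If I want to find the best-scoring match in a list of choices, which function should I use?"
output = qa.run(query)
print(output)

You can use the `extractOne` function to find the highest-scoring match in a list of choices. The function returns a tuple containing the match and its score.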


In [26]:
query = "Write code that uses the process.extractOne() function. Reply with code only, no other text or explanation."
output = qa.run(query)
print(output)

from fuzzywuzzy import process

choices = ["apple", "banana", "orange"]
query = "appel"
result = process.extractOne(query, choices)
print(result)
