# LangChain: Evaluation

## Outline:

* Example generation
* Manual evaluation (and debugging)
* LLM-assisted evaluation

In [1]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "http://localhost:1984"
os.environ["LANGCHAIN_PROJECT"] = "WebSquare API"

## Create the Q&A application

In [2]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI, PromptLayerChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch

In [3]:
file = 'api_ko.csv'
loader = CSVLoader(file_path=file)
data = loader.load()

In [4]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised APIError: OpenAI API returned an empty embedding.
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised APIError: OpenAI API returned an empty embedding.
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised APIError: OpenAI API returned an empty embedding.
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 8.0 seconds as it raised APIError: OpenAI API returned an empty embedding.


In [5]:
# llm = ChatOpenAI(temperature = 0.0)
llm = PromptLayerChatOpenAI(pl_tags=["api_qa", "2023-07-08"], temperature=0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)

### Coming up with test data points

In [6]:
data[10]

Document(page_content=': 10\n유형: method\ncomponent: $p\nname: deleteSubmission\ndescription: submission을 삭제합니다.\nparameter: submissionID\tString\tY\t삭제하고자 하는 submission의 ID\nreturn: \nexception: \nsample: <xmp  class=\'js sample\'>$p.deleteSubmission( "submission1" );\n//"submission1"에 해당하는 submssion이 삭제됩니다. 이후 $p.executeSubmission("submission1");을 호출하면 아무 동작을 하지 않게 됩니다.</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105', metadata={'source': 'api_ko.csv', 'row': 10})

In [7]:
data[11]

Document(page_content=': 11\n유형: method\ncomponent: $p\nname: download\ndescription: download 모듈이 구현된 서버의 URL을 호출하여 다운로드 가능한 인터페이스를 화면에서 제공합니다.\nparameter: actionUrl\tString\tY\t파일 다운로드가 구현되어있는 url.\nXML\tString\tN\t문자열은 xmlValue라는 이름으로 서버로 올라간다. 값을 지정하지 않은 경우(undefined인 경우) xmlValue라는 값은 제외하고 서버로 전송한다.\nsendMethod\tString\tN\tget, post와 같은 전송 방식, 기본값은 post이다.\nisXHR\tString\tY\txhr 통신 유무 (기본값은 false)\nreturn: \nexception: \nsample: <xmp  class=\'js sample\'>var url = "/download.do"        //파일 다운로드가 구현 되어있는 서버 url. ( 웹스퀘어의 기본 모듈에는 제공되지 않는다)\n$p.download( url );</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105', metadata={'source': 'api_ko.csv', 'row': 11})

### Hard-coded examples

In [8]:
examples = [
]
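The list above is left empty in this notebook. As a minimal sketch of what a hand-written entry could look like, assuming the `query` and `answer` keys that the later cells read from each example, the following builds one example from the `deleteSubmission` row shown in `data[10]`. The name `hardcoded_examples` and the wording of the question and answer are illustrative and not taken from the original notebook.

```python
# Hypothetical hand-written example based on the deleteSubmission row (data[10]).
# The "query" and "answer" keys match what the later cells read from each example.
hardcoded_examples = [
    {
        "query": "What does $p.deleteSubmission do?",
        "answer": "It deletes the submission with the given submissionID.",
    },
]

# Sanity check: every example carries exactly the two expected keys.
for example in hardcoded_examples:
    assert set(example) == {"query", "answer"}
```

Entries of this shape could be placed directly in `examples` before the generated ones are appended.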

### LLM-generated examples

In [9]:
from langchain.evaluation.qa import QAGenerateChain

In [10]:
# example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())
example_gen_chain = QAGenerateChain.from_llm(PromptLayerChatOpenAI(pl_tags=["api_qa", "2023-07-08"]))

In [11]:
len(data)

6375

In [12]:
[{"doc": t.page_content} for t in data[:5]]

[{'doc': ': 0\n유형: method\ncomponent: $p\nname: $\ndescription: jQuery selector를 인자로 받아 jQuery 객체를 반환한다. <br />id selector를 인자로 받은 경우 해당 id가 함수를 호출한 페이지에 있는 웹스퀘어 객체인 경우 웹스퀘어 객체의 실제 id로 변환한 다음 함수를 실행한다.\nparameter: \nreturn: Object\tjQuery 객체\nexception: \nsample: $p.$("#group1").wq("invoke", "setDisabled", "true"); // 스크립트가 실행된 페이지의 group1 객체를 찾아 group1.invoke("setDisabled", "true"); 를 실행\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105'},
 {'doc': ': 1\n유형: method\ncomponent: $p\nname: URLEncoder\ndescription: 주어진 문자열을 `application/x-www-form-urlencoded` MIME 형식의 문자열로 변환합니다.\nparameter: str\tString\tY\t문자열\nreturn: String\t변환된 application/x-www-form-urlencoded MIME Format문자열을 반환합니다\nexception: \nsample: <xmp  class=\'js sample\'>var encodeStr = $p.URLEncoder( "문자열" );\n//return 예시 ) "%b9%ae%c0%da%bf%ad"</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105'},
 {'doc': ': 2\n유형: method\ncomponent: $p\nname: ajax\ndesc

In [None]:
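# Pass only the text of each document, as in the preview cell above,
# so the prompt does not include the Document repr and its metadata.
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t.page_content} for t in data[100:200]]
)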



In [None]:
# new_examples = example_gen_chain.apply_and_parse(
#     [{"doc": t.page_content} for t in data[:5]]
# )

In [None]:
new_examples[0]

In [None]:
new_examples[1]

In [None]:
new_examples[2]

In [None]:
data[1]

In [None]:
new_examples

### Combine examples

In [None]:
examples += new_examples

In [None]:
qa.run(examples[3]["query"])

In [None]:
examples[3]["answer"]

In [None]:
examples[3]["query"]

## Manual Evaluation
Run each query with `qa.run`, then compare the result with the previously generated answer.
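The comparison described here can be sketched as a small helper that puts the reference answer and the prediction side by side. This is only an illustration: `side_by_side` and `fake_run` are hypothetical names, and `fake_run` stands in for `qa.run` so the snippet runs without an API key.

```python
from typing import Callable, Dict, List


def side_by_side(
    run_fn: Callable[[str], str], examples: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Run each query and pair the prediction with the reference answer."""
    rows = []
    for example in examples:
        rows.append(
            {
                "query": example["query"],
                "expected": example["answer"],
                "predicted": run_fn(example["query"]),
            }
        )
    return rows


# Hypothetical stand-in for qa.run; replace with qa.run in the notebook.
def fake_run(query: str) -> str:
    return "stub answer for: " + query


rows = side_by_side(fake_run, [{"query": "q1", "answer": "a1"}])
for row in rows:
    print(row["query"], "|", row["expected"], "|", row["predicted"])
```

In the notebook, `side_by_side(qa.run, examples[:3])` would issue one chain call per example, so a small slice keeps the manual check cheap.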

In [None]:
import langchain
langchain.debug = True

In [None]:
qa.run(examples[0]["query"])

In [None]:
qa.run(examples[2]["query"])

In [None]:
examples[2]

In [None]:
# Turn off the debug mode
langchain.debug = False

## LLM-assisted evaluation

In [None]:
predictions = qa.apply(examples)

In [None]:
from langchain.evaluation.qa import QAEvalChain

In [None]:
# llm = ChatOpenAI(temperature=0)
llm = PromptLayerChatOpenAI(pl_tags=["api_qa", "2023-07-08"], temperature=0.0)
eval_chain = QAEvalChain.from_llm(llm)

In [None]:
graded_outputs = eval_chain.evaluate(examples, predictions)

In [None]:
examples

In [None]:
predictions

In [None]:
graded_outputs

In [None]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()
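Beyond printing each example, the grades can be tallied to get a rough accuracy figure. The sketch below assumes each graded output is a dict with a `text` field, as used in the loop above, and that the grader replies with `CORRECT` or `INCORRECT`; the sample data and the name `summarize_grades` are hypothetical.

```python
from collections import Counter

# Hypothetical grader output in the same shape as graded_outputs above.
sample_graded_outputs = [
    {"text": "CORRECT"},
    {"text": "INCORRECT"},
    {"text": " CORRECT\n"},
]


def summarize_grades(graded):
    """Count grades after normalizing surrounding whitespace and case."""
    return Counter(item["text"].strip().upper() for item in graded)


summary = summarize_grades(sample_graded_outputs)
# Fraction of examples the grader marked as CORRECT.
accuracy = summary["CORRECT"] / len(sample_graded_outputs)
print(summary, f"accuracy={accuracy:.2f}")
```

If the grader returns free-form text rather than a bare label, the normalization step would need to be adapted before counting.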

In [None]:
import os

from langchain.chat_models import ChatOpenAI
from langchain.client import run_on_dataset

llm = ChatOpenAI(temperature=0)

chain_results = run_on_dataset(
    dataset_name="ds-granular-windscreen-29",
    llm_or_chain_factory=llm,
    project_name="pt-spotless-bondsman-92",
)