# LangChain: Evaluation

## Overview:

* Example generation
* Manual evaluation (and debugging)
* LLM-assisted evaluation

In [1]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "http://localhost:1984"
os.environ["LANGCHAIN_PROJECT"] = "WebSquare API"

## Create the Q&A application

In [2]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI, PromptLayerChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch

In [3]:
file = 'api_ko.csv'
loader = CSVLoader(file_path=file)
data = loader.load()

In [4]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [5]:
# llm = ChatOpenAI(temperature = 0.0)
llm = PromptLayerChatOpenAI(pl_tags=["api_qa", "2023-07-08"], temperature=0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)
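The `"stuff"` chain type places every retrieved document into a single prompt, and the `document_separator` configured above is the string inserted between them. The following is a minimal pure-Python sketch of that joining step, not LangChain's own implementation; the helper name `stuff_documents` and the sample snippets are hypothetical.

```python
# Hypothetical helper: mimics how a "stuff"-style chain builds one context string.
def stuff_documents(page_contents, separator="<<<<>>>>>"):
    """Join retrieved document texts into a single context string."""
    return separator.join(page_contents)


# Made-up snippets standing in for retrieved documents.
retrieved = ["name: download", "name: deleteSubmission"]
context = stuff_documents(retrieved)
print(context)
```

A distinctive separator makes document boundaries easy to spot when the prompt is inspected in debug output.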

### Preparing test data points

In [6]:
data[10]

Document(page_content=': 10\n유형: method\ncomponent: $p\nname: deleteSubmission\ndescription: submission을 삭제합니다.\nparameter: submissionID\tString\tY\t삭제하고자 하는 submission의 ID\nreturn: \nexception: \nsample: <xmp  class=\'js sample\'>$p.deleteSubmission( "submission1" );\n//"submission1"에 해당하는 submssion이 삭제됩니다. 이후 $p.executeSubmission("submission1");을 호출하면 아무 동작을 하지 않게 됩니다.</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105', metadata={'source': 'api_ko.csv', 'row': 10})

In [7]:
data[11]

Document(page_content=': 11\n유형: method\ncomponent: $p\nname: download\ndescription: download 모듈이 구현된 서버의 URL을 호출하여 다운로드 가능한 인터페이스를 화면에서 제공합니다.\nparameter: actionUrl\tString\tY\t파일 다운로드가 구현되어있는 url.\nXML\tString\tN\t문자열은 xmlValue라는 이름으로 서버로 올라간다. 값을 지정하지 않은 경우(undefined인 경우) xmlValue라는 값은 제외하고 서버로 전송한다.\nsendMethod\tString\tN\tget, post와 같은 전송 방식, 기본값은 post이다.\nisXHR\tString\tY\txhr 통신 유무 (기본값은 false)\nreturn: \nexception: \nsample: <xmp  class=\'js sample\'>var url = "/download.do"        //파일 다운로드가 구현 되어있는 서버 url. ( 웹스퀘어의 기본 모듈에는 제공되지 않는다)\n$p.download( url );</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105', metadata={'source': 'api_ko.csv', 'row': 11})
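Each `page_content` above is a sequence of `column: value` lines, and some values (such as `parameter`) span several lines. As a rough sketch, assuming the column names seen in the printed documents, the text can be split back into fields as shown below. The helper `parse_fields` is hypothetical, and it would mis-split a value that happens to contain a line starting with one of the column names.

```python
import re

# Column names as they appear in the printed documents above.
FIELDS = ["유형", "component", "name", "description", "parameter",
          "return", "exception", "sample", "built since", "built last"]


def parse_fields(page_content):
    """Split CSVLoader-style text into a dict, keeping multi-line values intact."""
    # A field starts at the beginning of a line with one of the known column names.
    pattern = re.compile(r"^(%s): ?" % "|".join(map(re.escape, FIELDS)), re.MULTILINE)
    matches = list(pattern.finditer(page_content))
    fields = {}
    for i, m in enumerate(matches):
        # The value runs until the next recognized column name (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(page_content)
        fields[m.group(1)] = page_content[m.end():end].strip()
    return fields


# Shortened copy of the deleteSubmission document shown above.
sample = (": 10\n유형: method\ncomponent: $p\nname: deleteSubmission\n"
          "description: submission을 삭제합니다.\nreturn: \nexception: ")
doc_fields = parse_fields(sample)
print(doc_fields["component"], doc_fields["name"])
```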

### Hard-coded examples

In [8]:
examples = [
    # {
    #     "query": "Do the Cozy Comfort Pullover Set\
    #     have side pockets?",
    #     "answer": "Yes"
    # },
    # {
    #     "query": "What collection is the Ultra-Lofty \
    #     850 Stretch Down Hooded Jacket from?",
    #     "answer": "The DownTek collection"
    # }
]
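Every example, whether hard-coded or generated, is a dict with a `"query"` and an `"answer"` string. A small sanity check along these lines can catch malformed entries before they reach the evaluation step. This is only a sketch: `invalid_examples` is a hypothetical helper, and the two sample entries are illustrative rather than taken from the dataset.

```python
REQUIRED_KEYS = {"query", "answer"}


def invalid_examples(examples):
    """Return indices of examples that lack a non-empty string for a required key."""
    bad = []
    for i, ex in enumerate(examples):
        ok = all(isinstance(ex.get(k), str) and ex[k].strip() for k in REQUIRED_KEYS)
        if not ok:
            bad.append(i)
    return bad


# Illustrative entries only; the second one has an empty answer.
candidates = [
    {"query": "What is the purpose of the URLEncoder method?",
     "answer": "It converts a string to application/x-www-form-urlencoded format."},
    {"query": "What does deleteSubmission do?", "answer": ""},
]
print(invalid_examples(candidates))
```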

### LLM-generated examples

In [9]:
from langchain.evaluation.qa import QAGenerateChain

In [10]:
# example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())
example_gen_chain = QAGenerateChain.from_llm(PromptLayerChatOpenAI(pl_tags=["api_qa", "2023-07-08"]))

In [11]:
len(data)

6375

In [12]:
[{"doc": t.page_content} for t in data[:5]]

[{'doc': ': 0\n유형: method\ncomponent: $p\nname: $\ndescription: jQuery selector를 인자로 받아 jQuery 객체를 반환한다. <br />id selector를 인자로 받은 경우 해당 id가 함수를 호출한 페이지에 있는 웹스퀘어 객체인 경우 웹스퀘어 객체의 실제 id로 변환한 다음 함수를 실행한다.\nparameter: \nreturn: Object\tjQuery 객체\nexception: \nsample: $p.$("#group1").wq("invoke", "setDisabled", "true"); // 스크립트가 실행된 페이지의 group1 객체를 찾아 group1.invoke("setDisabled", "true"); 를 실행\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105'},
 {'doc': ': 1\n유형: method\ncomponent: $p\nname: URLEncoder\ndescription: 주어진 문자열을 `application/x-www-form-urlencoded` MIME 형식의 문자열로 변환합니다.\nparameter: str\tString\tY\t문자열\nreturn: String\t변환된 application/x-www-form-urlencoded MIME Format문자열을 반환합니다\nexception: \nsample: <xmp  class=\'js sample\'>var encodeStr = $p.URLEncoder( "문자열" );\n//return 예시 ) "%b9%ae%c0%da%bf%ad"</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105'},
 {'doc': ': 2\n유형: method\ncomponent: $p\nname: ajax\ndesc

In [None]:
new_examples = example_gen_chain.apply_and_parse(
    # Pass the text of each document, not the Document object itself
    [{"doc": t.page_content} for t in data[100:]]
)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..
Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised APIError: Bad gateway. {"error":{"code":502,"message":"Bad gateway.","param":null,"type":"cf_bad_gateway"}} 502 {'error': {'code': 502, 'message': 'Bad gateway.', 'param': None, 'type': 'cf_bad_gateway'}} {'Date': 'Sat, 08 Jul 2023 00:29:45 GMT', 'Content-Type': 'application/json', 'Content-Length': '84', 'Connection': 'keep-alive', 'X-Frame-Options': 'SAMEORIGIN', 'Referrer-Policy': 'same-origin', 'Cache-Control': 'private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0', 'Expires': 'Thu, 01 Jan 1970 00:00:01 GMT', 'Server': 'cloudflare', 'CF-RAY': '7e342f5bba08eda5-ICN', 'alt-svc': 'h3=":443"; ma=86400'}.
Retrying langchain.chat_mod
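The retry messages above show that transient API errors are common when thousands of documents are sent in one call. One way to limit the damage, sketched here under the assumption that the generation step can be applied slice by slice, is to process the inputs in batches and record which batches failed. `run_in_batches` and `fake_generate` are hypothetical names; `fake_generate` stands in for the real LLM call.

```python
def run_in_batches(items, process_batch, batch_size=50):
    """Apply process_batch to consecutive slices and collect the results.

    Returns (results, failed_batches) so that a failure in one batch
    does not discard the results already collected.
    """
    results, failed = [], []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        try:
            results.extend(process_batch(batch))
        except Exception as exc:  # broad on purpose: record the failure and continue
            failed.append((start, repr(exc)))
    return results, failed


# Stand-in for the LLM call; it fails on one batch to show the bookkeeping.
def fake_generate(batch):
    if batch[0]["doc"] == "doc-4":
        raise RuntimeError("server overloaded")
    return [{"query": "Q about " + d["doc"], "answer": "A"} for d in batch]


inputs = [{"doc": "doc-%d" % i} for i in range(6)]
generated, failures = run_in_batches(inputs, fake_generate, batch_size=2)
print(len(generated), failures)
```

The start offsets in `failures` identify which slices to resubmit later.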

In [14]:
# new_examples = example_gen_chain.apply_and_parse(
#     [{"doc": t.page_content} for t in data[:5]]
# )

In [15]:
new_examples[0]

{'query': 'What does the method described in the document do?',
 'answer': 'The method described in the document takes a jQuery selector as an argument and returns a jQuery object. If the argument is an id selector and it corresponds to a web square object on the page where the function is called, it converts the id to the actual id of the web square object and then executes the function.'}

In [16]:
new_examples[1]

{'query': 'What is the purpose of the URLEncoder method?',
 'answer': 'The URLEncoder method is used to convert a given string into a string in the `application/x-www-form-urlencoded` MIME format.'}

In [17]:
new_examples[2]

{'query': 'What are the parameters that can be passed to the "options" object when using the "ajax" method?',
 'answer': 'The parameters that can be passed to the "options" object when using the "ajax" method are as follows: '}

In [18]:
data[1]

Document(page_content=': 1\n유형: method\ncomponent: $p\nname: URLEncoder\ndescription: 주어진 문자열을 `application/x-www-form-urlencoded` MIME 형식의 문자열로 변환합니다.\nparameter: str\tString\tY\t문자열\nreturn: String\t변환된 application/x-www-form-urlencoded MIME Format문자열을 반환합니다\nexception: \nsample: <xmp  class=\'js sample\'>var encodeStr = $p.URLEncoder( "문자열" );\n//return 예시 ) "%b9%ae%c0%da%bf%ad"</xmp>\nbuilt since: 5.0_3.3377A.20181128.161740\nbuilt last: 5.0_5.4811B.20230203.095105', metadata={'source': 'api_ko.csv', 'row': 1})

In [19]:
new_examples

[{'query': 'What does the method described in the document do?',
  'answer': 'The method described in the document takes a jQuery selector as an argument and returns a jQuery object. If the argument is an id selector and it corresponds to a web square object on the page where the function is called, it converts the id to the actual id of the web square object and then executes the function.'},
 {'query': 'What is the purpose of the URLEncoder method?',
  'answer': 'The URLEncoder method is used to convert a given string into a string in the `application/x-www-form-urlencoded` MIME format.'},
 {'query': 'What are the parameters that can be passed to the "options" object when using the "ajax" method?',
  'answer': 'The parameters that can be passed to the "options" object when using the "ajax" method are as follows: '},
 {'query': 'What is the purpose of the clearInterval method?',
  'answer': 'The clearInterval method is used to release the Interval object registered with the setInterva

### Combine examples

In [20]:
examples += new_examples

In [21]:
qa.run(examples[3]["query"])

> Entering new  chain...

> Finished chain.


'The clearInterval method is used to stop or clear a timer that was set using the setInterval method.'

In [22]:
examples[3]["answer"]

'The clearInterval method is used to release the Interval object registered with the setInterval method.'

In [23]:
examples[3]["query"]

'What is the purpose of the clearInterval method?'
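The predicted and reference answers for this query agree in meaning but not in wording, which is why a plain string comparison is a poor grader. The sketch below uses the two answers printed above and Python's standard `difflib` to illustrate the gap; the similarity measure is only an illustration and is not part of the notebook's evaluation.

```python
from difflib import SequenceMatcher

# The two answers shown above for the clearInterval question.
predicted = ("The clearInterval method is used to stop or clear a timer "
             "that was set using the setInterval method.")
expected = ("The clearInterval method is used to release the Interval object "
            "registered with the setInterval method.")

# Exact comparison fails even though the answers are semantically equivalent.
exact_match = predicted == expected
# Character-level similarity is a crude proxy and ignores meaning.
similarity = SequenceMatcher(None, predicted, expected).ratio()

print("exact match:", exact_match)
print("character similarity: %.2f" % similarity)
```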

## Manual Evaluation
Run each query with `qa.run`, then compare the result against the previously generated answer.

In [24]:
import langchain
langchain.debug = True

In [25]:
qa.run(examples[0]["query"])

[chain/start] [1:chain:RetrievalQA] Entering Chain run with input:
{
  "query": "What does the method described in the document do?"
}
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[inputs]
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
{
  "question": "What does the method described in the document do?",
  "context": "var doc = WebSquare.ModelUtil.findInstanceNode( \"bookstore/book\" );\nWebSquare.xml.setValue(doc, \"title\" , \"Harry Potter and the Philosopher's Stone\");\nWebSquare.xml.setValue(doc, \"price\" , \"value\" , \"USD\");<<<<>>>>></w2:data>\n</w2:dataList><<<<>>>>>WebSquare.xml.setAttribute(doc, \"formats\" , \"Hardcover\");\nWebSquare.xml.setAttribute(doc, \"price\" , \"value\" , \"USD\");<<<<>>>>>built last: 5.0_5.4811B.20230203.095105"
}
[llm/start] [1:chai

'The method described in the document is used to set the value of the "title" and "price" elements in an XML document. It also sets the attribute "formats" to "Hardcover" and the attribute "price" to "USD".'
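The retrieved context above has nothing to do with the `$` method the question was generated from. A plausible cause is that the generated question refers to "the document" instead of naming the method, so the retriever has nothing specific to match. As a rough sketch, such questions could be flagged before evaluation; the marker phrases and the helper `is_self_contained` are assumptions for illustration, not part of LangChain.

```python
# Assumed phrases that suggest a question depends on an unseen source document.
GENERIC_MARKERS = ("the document", "this document", "described in")


def is_self_contained(query):
    """Return False when the query leans on a document the retriever cannot see."""
    lowered = query.lower()
    return not any(marker in lowered for marker in GENERIC_MARKERS)


queries = [
    "What does the method described in the document do?",
    "What is the purpose of the URLEncoder method?",
]
print([is_self_contained(q) for q in queries])
```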

In [26]:
qa.run(examples[2]["query"])

[chain/start] [1:chain:RetrievalQA] Entering Chain run with input:
{
  "query": "What are the parameters that can be passed to the \"options\" object when using the \"ajax\" method?"
}
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain] Entering Chain run with input:
[inputs]
[chain/start] [1:chain:RetrievalQA > 3:chain:StuffDocumentsChain > 4:chain:LLMChain] Entering Chain run with input:
{
  "question": "What are the parameters that can be passed to the \"options\" object when using the \"ajax\" method?",
  "context": ": 151\n유형: method\ncomponent: WebSquare.net\nname: ajax\ndescription: Ajax 통신을 위한 함수\nparameter: options\tObject\tY\tJSON형태 객체\n<br />\n<xmp class='js description'>options.action : ajax 요청 주소\noptions.mode : asynchronous(default)/synchronous\noptions.mediatype : application/x-www-form-urlencoded, application/json, application/xml, text/xml\noptions.method : get/post/put/delete\nopt

'The parameters that can be passed to the "options" object when using the "ajax" method are:\n\n- action: The URL for the ajax request.\n- mode: The mode of the request, either asynchronous or synchronous.\n- mediatype: The media type of the request, such as application/x-www-form-urlencoded, application/json, etc.\n- method: The HTTP method of the request, such as get, post, put, delete.\n- requestData: The request body.\n- requestHeader: Additional content to be added to the request header.\n- timeout: The timeout duration for the ajax request.\n- type: The type of the response, either xml or json.\n- beforeAjax: A function that is executed before the request is made.\n- success: A callback function that is executed when the request is successful.\n- error: A callback function that is executed when the request fails.'

In [27]:
examples[2]

{'query': 'What are the parameters that can be passed to the "options" object when using the "ajax" method?',
 'answer': 'The parameters that can be passed to the "options" object when using the "ajax" method are as follows: '}

In [28]:
# Turn off the debug mode
langchain.debug = False

## LLM assisted evaluation

In [29]:
predictions = qa.apply(examples)

> Entering new  chain...

> Finished chain.

In [30]:
from langchain.evaluation.qa import QAEvalChain

In [31]:
# llm = ChatOpenAI(temperature=0)
llm = PromptLayerChatOpenAI(pl_tags=["api_qa", "2023-07-08"], temperature=0.0)
eval_chain = QAEvalChain.from_llm(llm)

In [32]:
graded_outputs = eval_chain.evaluate(examples, predictions)

Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 1.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..


In [33]:
examples

[{'query': 'What does the method described in the document do?',
  'answer': 'The method described in the document takes a jQuery selector as an argument and returns a jQuery object. If the argument is an id selector and it corresponds to a web square object on the page where the function is called, it converts the id to the actual id of the web square object and then executes the function.'},
 {'query': 'What is the purpose of the URLEncoder method?',
  'answer': 'The URLEncoder method is used to convert a given string into a string in the `application/x-www-form-urlencoded` MIME format.'},
 {'query': 'What are the parameters that can be passed to the "options" object when using the "ajax" method?',
  'answer': 'The parameters that can be passed to the "options" object when using the "ajax" method are as follows: '},
 {'query': 'What is the purpose of the clearInterval method?',
  'answer': 'The clearInterval method is used to release the Interval object registered with the setInterva

In [34]:
predictions

[{'query': 'What does the method described in the document do?',
  'answer': 'The method described in the document takes a jQuery selector as an argument and returns a jQuery object. If the argument is an id selector and it corresponds to a web square object on the page where the function is called, it converts the id to the actual id of the web square object and then executes the function.',
  'result': 'The method described in the document is used to set the value of the "title" and "price" elements in an XML document. It also sets the attribute "formats" to "Hardcover" and the attribute "price" to "USD".'},
 {'query': 'What is the purpose of the URLEncoder method?',
  'answer': 'The URLEncoder method is used to convert a given string into a string in the `application/x-www-form-urlencoded` MIME format.',
  'result': 'The purpose of the URLEncoder method is to convert a given string into an `application/x-www-form-urlencoded` MIME format string.'},
 {'query': 'What are the parameters

In [35]:
graded_outputs

[{'text': 'INCORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text': 'CORRECT'},
 {'text'

In [36]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()

Example 0:
Question: What does the method described in the document do?
Real Answer: The method described in the document takes a jQuery selector as an argument and returns a jQuery object. If the argument is an id selector and it corresponds to a web square object on the page where the function is called, it converts the id to the actual id of the web square object and then executes the function.
Predicted Answer: The method described in the document is used to set the value of the "title" and "price" elements in an XML document. It also sets the attribute "formats" to "Hardcover" and the attribute "price" to "USD".
Predicted Grade: INCORRECT

Example 1:
Question: What is the purpose of the URLEncoder method?
Real Answer: The URLEncoder method is used to convert a given string into a string in the `application/x-www-form-urlencoded` MIME format.
Predicted Answer: The purpose of the URLEncoder method is to convert a given string into an `application/x-www-form-urlencoded` MIME format s
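With this many examples, reading the printed results one by one is slow. A short summary of the grades, sketched below, gives the overall accuracy and the indices worth inspecting by hand. It assumes each graded output is a dict with a `"text"` key, as printed above; `summarize_grades` is a hypothetical helper and the sample grades are made up.

```python
from collections import Counter


def summarize_grades(graded_outputs):
    """Count grades and list the indices of examples not graded CORRECT."""
    grades = [g["text"].strip().upper() for g in graded_outputs]
    counts = Counter(grades)
    failing = [i for i, grade in enumerate(grades) if grade != "CORRECT"]
    accuracy = counts["CORRECT"] / len(grades) if grades else 0.0
    return counts, failing, accuracy


# Small made-up sample in the same shape as graded_outputs above.
sample_grades = [{"text": "INCORRECT"}, {"text": "CORRECT"},
                 {"text": "CORRECT"}, {"text": "CORRECT"}]
counts, failing, accuracy = summarize_grades(sample_grades)
print(counts["CORRECT"], failing, accuracy)
```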

In [None]:
import os

from langchain.chat_models import ChatOpenAI
from langchain.client import run_on_dataset

llm = ChatOpenAI(temperature=0)

chain_results = run_on_dataset(
    dataset_name="ds-granular-windscreen-29",
    llm_or_chain_factory=llm,
    project_name="pt-spotless-bondsman-92",
)