# LangChain: Evaluation

## Outline:

* Example generation
* Manual evaluation (and debugging)
* LLM-assisted evaluation

In [1]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "http://localhost:1984"
os.environ["LANGCHAIN_PROJECT"] = "DEEPLEARNING.AI"

## Create our Q&A application

In [2]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI, PromptLayerChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch

In [3]:
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)
data = loader.load()

In [4]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [5]:
# llm = ChatOpenAI(temperature = 0.0)
llm = PromptLayerChatOpenAI(pl_tags=["langchain_5_230707"], temperature=0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)
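
The `"stuff"` chain type places every retrieved document into a single prompt, and `document_separator` sets the string placed between them. The sketch below illustrates that joining step in plain Python. The helper name `stuff_documents` and the shortened document strings are made up for illustration, and this is not LangChain's actual implementation.

```python
# Illustrative only: mimics how a "stuff"-style chain could assemble its context.
# The helper name and the shortened document strings are hypothetical.

def stuff_documents(doc_texts, separator="<<<<>>>>>"):
    """Join every retrieved document into one context string."""
    return separator.join(doc_texts)


retrieved = [
    "name: Cozy Comfort Pullover Set, Stripe",
    "name: Ultra-Lofty 850 Stretch Down Hooded Jacket",
]
context = stuff_documents(retrieved)
print(context)
```

A distinctive separator such as `<<<<>>>>>` makes the document boundaries easy to spot when reading the prompt in debug output.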

### Coming up with test datapoints

In [6]:
data[10]

Document(page_content=": 10\nname: Cozy Comfort Pullover Set, Stripe\ndescription: Perfect for lounging, this striped knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out.\n\nSize & Fit\n- Pants are Favorite Fit: Sits lower on the waist.\n- Relaxed Fit: Our most generous fit sits farthest from the body.\n\nFabric & Care\n- In the softest blend of 63% polyester, 35% rayon and 2% spandex.\n\nAdditional Features\n- Relaxed fit top with raglan sleeves and rounded hem.\n- Pull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg.\n\nImported.", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 10})

In [7]:
data[11]

Document(page_content=': 11\nname: Ultra-Lofty 850 Stretch Down Hooded Jacket\ndescription: This technical stretch down jacket from our DownTek collection is sure to keep you warm and comfortable with its full-stretch construction providing exceptional range of motion. With a slightly fitted style that falls at the hip and best with a midweight layer, this jacket is suitable for light activity up to 20° and moderate activity up to -30°. The soft and durable 100% polyester shell offers complete windproof protection and is insulated with warm, lofty goose down. Other features include welded baffles for a no-stitch construction and excellent stretch, an adjustable hood, an interior media port and mesh stash pocket and a hem drawcord. Machine wash and dry. Imported.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 11})

### Hard-coded examples

In [8]:
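examples = [
    {
        # Adjacent string literals are concatenated, so the
        # indentation of the second line stays out of the query.
        "query": (
            "Do the Cozy Comfort Pullover Set "
            "have side pockets?"
        ),
        "answer": "Yes"
    },
    {
        "query": (
            "What collection is the Ultra-Lofty "
            "850 Stretch Down Hooded Jacket from?"
        ),
        "answer": "The DownTek collection"
    }
]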
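
When a string literal is continued onto the next source line with a backslash, the indentation of that next line becomes part of the string. The sketch below uses a hypothetical helper, `normalize_query`, to collapse such runs of whitespace before a query is sent to the chain.

```python
def normalize_query(query: str) -> str:
    """Collapse any run of whitespace into a single space and trim the ends."""
    return " ".join(query.split())


# The backslash joins the two source lines, and the leading spaces
# of the second line end up inside the string.
raw = "Do the Cozy Comfort Pullover Set\
    have side pockets?"

print(repr(raw))
print(repr(normalize_query(raw)))
```

Applying the helper to every `query` keeps hand-written and generated examples consistent, though whether the extra spaces affect retrieval in practice depends on the embedding model.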

### LLM-generated examples

In [9]:
from langchain.evaluation.qa import QAGenerateChain

In [10]:
# example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())
example_gen_chain = QAGenerateChain.from_llm(PromptLayerChatOpenAI(pl_tags=["langchain_5_230707"]))

In [11]:
[{"doc": t.page_content} for t in data[:1]]

[{'doc': ": 0\nname: Women's Campside Oxfords\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \n\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \n\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \n\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \n\nQuestions? Please contact us for any inquiries."}]

In [12]:
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]]
)



In [13]:
# new_examples = example_gen_chain.apply_and_parse(
#     [{"doc": t.page_content} for t in data[:5]]
# )
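
The two cells above differ in what they pass as `doc`: the whole loaded object or only its `page_content`. The sketch below shows why that can matter once the value is formatted into a prompt string. It uses a hypothetical stand-in class, `FakeDocument`, which is not a LangChain type, and a made-up prompt template.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a loaded document; not LangChain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


doc = FakeDocument(
    page_content="name: Women's Campside Oxfords",
    metadata={"source": "OutdoorClothingCatalog_1000.csv", "row": 0},
)

# Made-up template, used only to show how the value gets rendered.
template = "Write a question and answer about this text:\n{doc}"

# Formatting the whole object embeds its string form, metadata included.
prompt_from_object = template.format(doc=doc)
# Formatting only the text keeps the prompt limited to the catalog entry.
prompt_from_text = template.format(doc=doc.page_content)

print(prompt_from_object)
print(prompt_from_text)
```

Passing only the text keeps file names and row numbers out of the generated questions, at the cost of losing that provenance in the prompt.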

In [14]:
new_examples[0]

{'query': "What is the weight of one pair of Women's Campside Oxfords?",
 'answer': "The approximate weight of one pair of Women's Campside Oxfords is 1 lb. 1 oz."}

In [15]:
new_examples[1]

{'query': 'What are the dimensions of the Small and Medium sizes of the Recycled Waterhog Dog Mat, Chevron Weave?',
 'answer': 'The dimensions of the Small size are 18" x 28" and the dimensions of the Medium size are 22.5" x 34.5".'}

In [16]:
new_examples[2]

{'query': "What are some features of the Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece?",
 'answer': "Some features of the Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece include bright colors, ruffles, exclusive whimsical prints, four-way-stretch and chlorine-resistant fabric, UPF 50+ rated fabric for sun protection, crossover no-slip straps, fully lined bottom for a secure fit and maximum coverage."}

In [17]:
data[1]

Document(page_content=': 1\nname: Recycled Waterhog Dog Mat, Chevron Weave\ndescription: Protect your floors from spills and splashing with our ultradurable recycled Waterhog dog mat made right here in the USA. \n\nSpecs\nSmall - Dimensions: 18" x 28". \nMedium - Dimensions: 22.5" x 34.5".\n\nWhy We Love It\nMother nature, wet shoes and muddy paws have met their match with our Recycled Waterhog mats. Ruggedly constructed from recycled plastic materials, these ultratough mats help keep dirt and water off your floors and plastic out of landfills, trails and oceans. Now, that\'s a win-win for everyone.\n\nFabric & Care\nVacuum or hose clean.\n\nConstruction\n24 oz. polyester fabric made from 94% recycled materials.\nRubber backing.\n\nAdditional Features\nFeatures an -exclusive design.\nFeatures thick and thin fibers for scraping dirt and absorbing water.\nDries quickly and resists fading, rotting, mildew and shedding.\nUse indoors or out.\nMade in the USA.\n\nHave questions? Reach out to

In [18]:
new_examples

[{'query': "What is the weight of one pair of Women's Campside Oxfords?",
  'answer': "The approximate weight of one pair of Women's Campside Oxfords is 1 lb. 1 oz."},
 {'query': 'What are the dimensions of the Small and Medium sizes of the Recycled Waterhog Dog Mat, Chevron Weave?',
  'answer': 'The dimensions of the Small size are 18" x 28" and the dimensions of the Medium size are 22.5" x 34.5".'},
 {'query': "What are some features of the Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece?",
  'answer': "Some features of the Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece include bright colors, ruffles, exclusive whimsical prints, four-way-stretch and chlorine-resistant fabric, UPF 50+ rated fabric for sun protection, crossover no-slip straps, fully lined bottom for a secure fit and maximum coverage."},
 {'query': 'What is the composition of the body and lining of the Refresh Swimwear, V-Neck Tankini Contrasts?',
  'answer': 'The body of the Refresh Swimwear, V

### Combine examples

In [None]:
examples += new_examples

In [None]:
qa.run(examples[0]["query"])

In [None]:
examples[0]["query"]

## Manual Evaluation

In [None]:
import langchain
langchain.debug = True

In [None]:
qa.run(examples[0]["query"])

In [None]:
qa.run(examples[2]["query"])

In [None]:
examples[2]

In [None]:
# Turn off the debug mode
langchain.debug = False

## LLM assisted evaluation

In [None]:
predictions = qa.apply(examples)

In [None]:
from langchain.evaluation.qa import QAEvalChain

In [None]:
llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)

In [None]:
graded_outputs = eval_chain.evaluate(examples, predictions)

In [None]:
examples

In [None]:
predictions

In [None]:
graded_outputs

In [None]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()
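
Once the grades are collected, it can help to reduce them to a single accuracy figure. The sketch below assumes each graded output is a dict with a `text` key, as in the loop above, and that its value is `CORRECT` or `INCORRECT`. The sample grades are made up for illustration and are not results from this notebook.

```python
from collections import Counter

# Made-up sample in the same shape as graded_outputs; not real results.
sample_graded_outputs = [
    {"text": "CORRECT"},
    {"text": "CORRECT"},
    {"text": "INCORRECT"},
    {"text": " CORRECT"},
]


def summarize_grades(graded):
    """Count grades after trimming whitespace and normalizing case."""
    counts = Counter(g["text"].strip().upper() for g in graded)
    total = sum(counts.values())
    accuracy = counts["CORRECT"] / total if total else 0.0
    return counts, accuracy


counts, accuracy = summarize_grades(sample_graded_outputs)
print(counts)
print(f"Accuracy: {accuracy:.2f}")
```

Trimming and upper-casing the grade text guards against small formatting differences in the grader's output; any label other than `CORRECT` counts against accuracy in this sketch.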