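### RAG Quality Evaluation
- faithfulness : Is the answer factually grounded in the retrieved context?
- answer_relevancy : Does the answer actually address the question?
- context_precision : How much of the retrieved context is actually needed for the answer?
- context_recall : How completely does the retrieved context cover what the reference answer needs?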
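
As a rough intuition for how these scores behave, the sketch below computes simplified versions from hand-labelled judgments. It assumes the binary labels (is each retrieved chunk relevant, is each claim supported) already exist; in Ragas those judgments come from an LLM, and `answer_relevancy` is computed differently, so it is left out. The helper names `context_precision` and `supported_fraction` are illustrative and are not the library's API.

```python
def context_precision(relevance):
    """Mean of precision@k over the ranks k that hold a relevant chunk.

    `relevance` lists the retrieved chunks in rank order; truthy means relevant.
    """
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / k  # precision@k, counted only at relevant ranks
    return total / hits if hits else 0.0


def supported_fraction(supported):
    """Share of claims judged as supported by the retrieved context.

    With claims taken from the answer this approximates faithfulness;
    with claims taken from the reference it approximates context_recall.
    """
    supported = list(supported)
    if not supported:
        return 0.0
    return sum(bool(s) for s in supported) / len(supported)


# Hypothetical labels: chunks 1 and 3 are relevant, 3 of 4 claims are supported.
print(context_precision([1, 0, 1]))
print(supported_fraction([True, True, False, True]))  # 0.75
```

A relevant chunk that is ranked lower pulls `context_precision` down, which is why the order of retrieved chunks matters and not just their presence.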

### Quality Evaluation Steps
1. Build a test dataset
2. Build the RAG pipeline
3. Evaluate
4. Improve and repeat

In [1]:
# 1. Load the document
# Modules for loading and splitting documents
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Modules for the language model and the embedding model
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

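### Concepts for Building the Test Dataset
1. Personas
    1. Personas that fit the dataset
    2. Personas I want to add myself
2. Scenarios
    1. Whether each answer is built from a single chunk (docs)
    2. Whether each answer is built from several chunks (docs)
3. Weighting of the scenario types
    1. 5 : 2.5 : 2.5 (default)
    2. 4 : 3 : 3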
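
A ratio such as 4 : 3 : 3 is later passed to the generator as fractions (0.4, 0.3, 0.3). The sketch below shows one way to do that conversion and to estimate how many samples each scenario type would receive. `ratio_to_weights` and `expected_counts` are hypothetical helpers, and the counts are only an estimate, since the generator decides the actual split.

```python
from fractions import Fraction


def ratio_to_weights(ratio):
    """Normalize a ratio such as (4, 3, 3) into weights that sum to 1."""
    parts = [Fraction(str(part)) for part in ratio]  # exact arithmetic, also for 2.5
    total = sum(parts)
    if total <= 0 or any(part < 0 for part in parts):
        raise ValueError("ratio parts must be non-negative and not all zero")
    return [float(part / total) for part in parts]


def expected_counts(testset_size, weights):
    """Approximate number of samples per scenario type (rounded)."""
    return [round(testset_size * weight) for weight in weights]


weights = ratio_to_weights([4, 3, 3])
print(weights)                        # [0.4, 0.3, 0.3]
print(expected_counts(100, weights))  # [40, 30, 30]
```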

In [2]:
pdf_path = "../00_data/Sustainability_report_2024_kr.pdf"
loader = PyPDFLoader(pdf_path)
docs = loader.load()
print(len(docs))

83


In [4]:
# 2. Chunk the document
splitter = RecursiveCharacterTextSplitter(
    chunk_size = 1000,
    chunk_overlap = 100,
)
chunks = splitter.split_documents(docs[:20])
print(len(chunks))

48
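
To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified fixed-window chunker. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks, so real chunks are usually shorter than `chunk_size`); `sliding_chunks` is a hypothetical helper that only illustrates what the two parameters control.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


sample = "abcdefghijklmnopqrstuvwxy"  # 25 characters
print(sliding_chunks(sample, chunk_size=10, chunk_overlap=2))
# ['abcdefghij', 'ijklmnopqr', 'qrstuvwxy']
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk when it is shorter than the overlap.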


In [5]:
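# 3. Set up scenarios and generate personas
from ragas.llms import LangchainLLMWrapper  # deprecated; scheduled for removal
from ragas.embeddings import OpenAIEmbeddings  # note: shadows the langchain_openai import of the same name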
import openai

gen_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4.1-mini"))
openai_client = openai.OpenAI()
gen_embeddings = OpenAIEmbeddings(client = openai_client)



In [7]:
from ragas.testset import TestsetGenerator

generator = TestsetGenerator(
    llm = gen_llm,
    embedding_model = gen_embeddings
)

In [8]:
generator.persona_list

### Auto-Generated Personas + Custom Personas
1. First, one test set has to be generated
2. That run produces the auto-generated personas
3. Then add the custom personas

In [9]:
dataset_test = generator.generate_with_langchain_docs(
    documents = chunks,
    testset_size = 1
)

Applying HeadlinesExtractor:   0%|          | 0/39 [00:00<?, ?it/s]

Applying HeadlineSplitter:   0%|          | 0/48 [00:00<?, ?it/s]

Applying SummaryExtractor:   0%|          | 0/46 [00:00<?, ?it/s]

Property 'summary' already exists in node 'a63a48'. Skipping!
Property 'summary' already exists in node 'e233e6'. Skipping!
Property 'summary' already exists in node '07f19b'. Skipping!
Property 'summary' already exists in node '82a5c6'. Skipping!
Property 'summary' already exists in node '254b0c'. Skipping!
Property 'summary' already exists in node 'a4cabb'. Skipping!
Property 'summary' already exists in node '0943b0'. Skipping!


Applying CustomNodeFilter:   0%|          | 0/72 [00:00<?, ?it/s]

Applying EmbeddingExtractor:   0%|          | 0/46 [00:00<?, ?it/s]

Property 'summary_embedding' already exists in node '0943b0'. Skipping!
Property 'summary_embedding' already exists in node 'a4cabb'. Skipping!
Property 'summary_embedding' already exists in node '254b0c'. Skipping!
Property 'summary_embedding' already exists in node 'a63a48'. Skipping!
Property 'summary_embedding' already exists in node 'e233e6'. Skipping!
Property 'summary_embedding' already exists in node '82a5c6'. Skipping!
Property 'summary_embedding' already exists in node '07f19b'. Skipping!


Applying ThemesExtractor:   0%|          | 0/62 [00:00<?, ?it/s]

Applying NERExtractor:   0%|          | 0/62 [00:00<?, ?it/s]

Applying CosineSimilarityBuilder:   0%|          | 0/1 [00:00<?, ?it/s]

Applying OverlapScoreBuilder:   0%|          | 0/1 [00:00<?, ?it/s]

Generating personas:   0%|          | 0/3 [00:00<?, ?it/s]

Generating Scenarios:   0%|          | 0/3 [00:00<?, ?it/s]

Generating Samples:   0%|          | 0/3 [00:00<?, ?it/s]

In [10]:
generator.persona_list

[Persona(name='Corporate Sustainability Manager', role_description='Oversees ESG reporting and compliance with evolving global sustainability regulations while aligning business strategy with long-term technological and environmental goals.'),
 Persona(name='Sustainability and Environmental Compliance Manager', role_description='Oversees and implements sustainability initiatives, ensures compliance with environmental standards, and drives efforts to reduce carbon footprint and waste across manufacturing and supply chain operations.'),
 Persona(name='Sustainability Program Manager', role_description='Leads initiatives to assess and improve environmental and social impacts, engages diverse stakeholders, and aligns corporate strategies with ESG standards and global sustainability goals.')]

In [None]:
test_df = dataset_test.to_pandas()
test_df 

|   | user_input | reference_contexts | reference | synthesizer_name |
|---|---|---|---|---|
| 0 | 삼성전자는 공급망 실사지침에 어떻게 대응하고 있나요? | [Facts & Figures PrinciplePlanet People\nCEO 메... | 삼성전자는 독일에서 2023년 발효된 공급망실사법과 2024년 5월 확정되는 EU ... | single_hop_specific_query_synthesizer |
| 1 | How does Samsung Electronics analyze the envir... | [<1-hop>\n\n삼성전자 지속가능경영보고서 2024\n09\nOur Compa... | Samsung Electronics conducts a comprehensive a... | multi_hop_abstract_query_synthesizer |
| 2 | How DX부문 manage 온실가스 risk and plan carbon zero... | [<1-hop>\n\n리스크 관리\nDX부문은 기후변화와 관련된 리스 크의 실질적인... | DX부문 manage 온실가스 risk by identifying financial... | multi_hop_specific_query_synthesizer |


In [12]:
# Create custom personas
from ragas.testset.persona import Persona
custom_personas = [
    Persona(name='Stock Investor', role_description='Analyzes financial performance, market trends, and corporate strategy to make informed decisions about buying, selling, or holding company shares.'),
    Persona(name='Job Seeker', role_description='Researches company culture, values, and career opportunities to prepare for the application and interview process, aiming to secure employment.'),
    Persona(name='Business Partner', role_description='Represents a partner or supplier company, managing the collaborative relationship, overseeing joint projects, and ensuring contractual obligations are met.')
]

In [13]:
auto_persona = generator.persona_list
auto_persona

[Persona(name='Corporate Sustainability Manager', role_description='Oversees ESG reporting and compliance with evolving global sustainability regulations while aligning business strategy with long-term technological and environmental goals.'),
 Persona(name='Sustainability and Environmental Compliance Manager', role_description='Oversees and implements sustainability initiatives, ensures compliance with environmental standards, and drives efforts to reduce carbon footprint and waste across manufacturing and supply chain operations.'),
 Persona(name='Sustainability Program Manager', role_description='Leads initiatives to assess and improve environmental and social impacts, engages diverse stakeholders, and aligns corporate strategies with ESG standards and global sustainability goals.')]

In [14]:
generator.persona_list = auto_persona + custom_personas
generator.persona_list

[Persona(name='Corporate Sustainability Manager', role_description='Oversees ESG reporting and compliance with evolving global sustainability regulations while aligning business strategy with long-term technological and environmental goals.'),
 Persona(name='Sustainability and Environmental Compliance Manager', role_description='Oversees and implements sustainability initiatives, ensures compliance with environmental standards, and drives efforts to reduce carbon footprint and waste across manufacturing and supply chain operations.'),
 Persona(name='Sustainability Program Manager', role_description='Leads initiatives to assess and improve environmental and social impacts, engages diverse stakeholders, and aligns corporate strategies with ESG standards and global sustainability goals.'),
 Persona(name='Stock Investor', role_description='Analyzes financial performance, market trends, and corporate strategy to make informed decisions about buying, selling, or holding company shares.'),
 P
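
One thing to watch for: if the cell that reads `generator.persona_list` is re-run after the custom personas were already appended, the custom entries end up in the list twice. A minimal sketch of a name-based merge that avoids this follows. `merge_personas` is a hypothetical helper, and the `Persona` namedtuple with shortened role descriptions is only a stand-in so the example runs without Ragas installed.

```python
from collections import namedtuple

# Stand-in for ragas.testset.persona.Persona (assumption: only .name is needed here).
Persona = namedtuple("Persona", ["name", "role_description"])


def merge_personas(*persona_lists):
    """Concatenate persona lists, keeping the first persona seen for each name."""
    merged = {}
    for personas in persona_lists:
        for persona in personas:
            merged.setdefault(persona.name, persona)
    return list(merged.values())


# Placeholder role descriptions, shortened for the example.
auto = [Persona("Sustainability Program Manager", "Leads ESG initiatives.")]
custom = [
    Persona("Stock Investor", "Analyzes financial performance."),
    Persona("Job Seeker", "Researches company culture."),
]

merged_once = merge_personas(auto, custom)
merged_twice = merge_personas(merged_once, custom)  # simulates re-running the cell
print([persona.name for persona in merged_twice])
# ['Sustainability Program Manager', 'Stock Investor', 'Job Seeker']
```

In the notebook this would correspond to something like `generator.persona_list = merge_personas(auto_persona, custom_personas)`.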

In [15]:
# Adjust the query-type ratios
from ragas.testset.synthesizers.multi_hop import (
    MultiHopAbstractQuerySynthesizer,
    MultiHopSpecificQuerySynthesizer,
)
from ragas.testset.synthesizers.single_hop.specific import (
    SingleHopSpecificQuerySynthesizer,
)
from ragas.llms.base import llm_factory

ragas_llm = llm_factory(model = "gpt-4.1-mini")

scenarios = [
    (SingleHopSpecificQuerySynthesizer(llm=ragas_llm), 0.4),
    (MultiHopAbstractQuerySynthesizer(llm=ragas_llm), 0.3),
    (MultiHopSpecificQuerySynthesizer(llm=ragas_llm), 0.3)
]

In [16]:
dataset = generator.generate(
    testset_size = 100,
    query_distribution = scenarios
)

Generating Scenarios:   0%|          | 0/3 [00:00<?, ?it/s]

Generating Samples:   0%|          | 0/100 [00:00<?, ?it/s]

In [17]:
dataset_df = dataset.to_pandas()
dataset_df.head()

|   | user_input | reference_contexts | reference | synthesizer_name |
|---|---|---|---|---|
| 0 | 독일에서 2023년에 발효된 공급망 관련 법은 무엇인가요? | [Facts & Figures PrinciplePlanet People\nCEO 메... | 독일에서는 공급망의 인권과 근로환경 관리를 의무화하는 공급망실사법이 2023년에 발... | single_hop_specific_query_synthesizer |
| 1 | Could you please explain the role and signific... | [접수된 고충의 처리 원칙에 대한 기준을 수립하였고, 공급망 관리에 \n있어서는 비... | 희망별숲 is a subsidiary company established in Ma... | single_hop_specific_query_synthesizer |
| 2 | Could you explain the role of the 글로벌 행동규범 in ... | [삼성전자 지속가능경영보고서 2024\n05\nOur Company Appendix... | Samsung Electronics has established the 글로벌 행동... | single_hop_specific_query_synthesizer |
| 3 | What is the role of 메모리 반도체 in Samsung Electro... | [Our Company AppendixMateriality Assessment Fa... | 메모리 반도체 사업은 삼성전자의 DS(Device Solutions) 부문에 속하며... | single_hop_specific_query_synthesizer |
| 4 | What DS Device Solutions mean in context of 20... | [Device eXperienceDX\n DS Device Solutions\n메모... | DS Device Solutions refers to a segment mentio... | single_hop_specific_query_synthesizer |
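
Since `head()` only shows single-hop rows, it can be worth checking whether the generated set roughly follows the requested 0.4 / 0.3 / 0.3 split. The sketch below uses only the standard library; `synthesizer_share` is a hypothetical helper and `sample_names` is made-up data, not the notebook's actual result.

```python
from collections import Counter


def synthesizer_share(names):
    """Return the fraction of samples produced by each synthesizer."""
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Made-up example data: 4 single-hop, 3 multi-hop abstract, 3 multi-hop specific.
sample_names = (
    ["single_hop_specific_query_synthesizer"] * 4
    + ["multi_hop_abstract_query_synthesizer"] * 3
    + ["multi_hop_specific_query_synthesizer"] * 3
)
for name, share in synthesizer_share(sample_names).items():
    print(f"{name}: {share:.0%}")
```

On the real data the call would be `synthesizer_share(dataset_df["synthesizer_name"])`; pandas users can get the same numbers with `dataset_df["synthesizer_name"].value_counts(normalize=True)`.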


In [18]:
dataset_df.to_excel("report_2024_test.xlsx", index = False)
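
When the file is read back for the evaluation step, the `reference_contexts` column will most likely no longer hold Python lists: the sketch assumes each list was written to the sheet as its string representation. Under that assumption, `ast.literal_eval` can restore it. `parse_contexts` is a hypothetical helper, and the sample string is made up.

```python
import ast


def parse_contexts(cell):
    """Turn a stringified list such as "['a', 'b']" back into a list."""
    if isinstance(cell, list):
        return cell  # already a list, e.g. when the DataFrame never left memory
    return ast.literal_eval(cell)


restored = parse_contexts("['first chunk', 'second chunk']")
print(restored)       # ['first chunk', 'second chunk']
print(len(restored))  # 2
```

With pandas this could be applied as `pd.read_excel("report_2024_test.xlsx")["reference_contexts"].apply(parse_contexts)`.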