<a href="https://colab.research.google.com/github/hail-members/llm-based-services/blob/main/Chapter10_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import sys

# check python version
print("Python version:", sys.version)

Python version: 3.10.17 | packaged by conda-forge | (main, Apr 10 2025, 22:23:34) [Clang 18.1.8 ]


In [None]:
!pip install gpt4all langchain langchain-community pymupdf matplotlib datasets chromadb

Collecting gpt4all
  Downloading gpt4all-2.8.2-py3-none-macosx_10_15_universal2.whl.metadata (4.8 kB)
Collecting langchain
  Downloading langchain-0.3.25-py3-none-any.whl.metadata (7.8 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.24-py3-none-any.whl.metadata (2.5 kB)
Collecting pymupdf
  Downloading pymupdf-1.25.5-cp39-abi3-macosx_11_0_arm64.whl.metadata (3.4 kB)
Collecting matplotlib
  Downloading matplotlib-3.10.3-cp310-cp310-macosx_11_0_arm64.whl.metadata (11 kB)
Collecting datasets
  Downloading datasets-3.6.0-py3-none-any.whl.metadata (19 kB)
Collecting chromadb
  Downloading chromadb-1.0.9-cp39-abi3-macosx_11_0_arm64.whl.metadata (6.9 kB)
Collecting tqdm (from gpt4all)
  Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)
Collecting langchain-core<1.0.0,>=0.3.58 (from langchain)
  Downloading langchain_core-0.3.59-py3-none-any.whl.metadata (5.9 kB)
Collecting langchain-text-splitters<1.0.0,>=0.3.8 (from langchain)
  Downloading langchain_text_sp

In [None]:
!pip install numpy==1.26.2

Collecting numpy==1.26.2
  Downloading numpy-1.26.2-cp310-cp310-macosx_11_0_arm64.whl.metadata (61 kB)
Downloading numpy-1.26.2-cp310-cp310-macosx_11_0_arm64.whl (14.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.0/14.0 MB 34.6 MB/s eta 0:00:00
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.24.4
    Uninstalling numpy-1.24.4:
      Successfully uninstalled numpy-1.24.4
Successfully installed numpy-1.26.2


In [None]:
!pip install "gpt4all[metal]"




## 1. Common setup

Import the libraries used throughout, and define the LLM instance and prompt templates.

In [None]:
# Download the model
from gpt4all import GPT4All  # use the gpt4all library
gpt4all_model = GPT4All("Nous-Hermes-2-Mistral-7B-DPO.Q4_0.gguf", device="gpu")  # load the model


In [None]:
!huggingface-cli whoami # run huggingface-cli login and enter your token beforehand

dongjaekim
orgs: youarehailed,h-ail


In [None]:
from datasets import load_dataset

# Download the financial QA dataset from Hugging Face
dataset = load_dataset("philschmid/finanical-rag-embedding-dataset")
train_data = dataset['train']
print(train_data[0])


{'question': 'What area did NVIDIA initially focus on before expanding to other computationally intensive fields?', 'context': 'Since our original focus on PC graphics, we have expanded to several other large and important computationally intensive fields.'}


In [None]:
from langchain_community.embeddings import GPT4AllEmbeddings

embeddings = GPT4AllEmbeddings(model_name = "all-MiniLM-L6-v2.gguf2.f16.gguf")
text = "NVIDIA designs GPUs for gaming and AI."
vector = embeddings.embed_query(text)
print(f"Embedding dimension: {len(vector)}")
print(f"First embedding values: {vector[:5]}")

Embedding dimension: 384
First embedding values: [-0.03229876235127449, 0.006807653233408928, -0.04741065576672554, -0.016777602955698967, -0.015279578045010567]
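
The vector store built later ranks documents by how close their embeddings are to the query embedding. As a minimal sketch of that comparison, the snippet below computes cosine similarity in plain Python. The `cosine_similarity` helper and the three-dimensional vectors are made up for illustration (real `all-MiniLM-L6-v2` vectors have 384 dimensions), and the store used later may be configured with a different distance metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical values, not real embeddings)
query_vec = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # points in the same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

print(cosine_similarity(query_vec, doc_close))
print(cosine_similarity(query_vec, doc_far))
```

A score near 1 means the two texts point in nearly the same direction in embedding space, while a score near 0 means they are unrelated.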


In [None]:
# Indices of rows whose 'context' is None or an empty string
none_or_empty_context = [i for i, item in enumerate(train_data) if not item.get('context')]

# Indices of rows whose 'question' is None or an empty string
none_or_empty_question = [i for i, item in enumerate(train_data) if not item.get('question')]



print("None or empty context indices:", none_or_empty_context)
print("None or empty question indices:", none_or_empty_question)

None or empty context indices: [2731]
None or empty question indices: [1733, 2731]
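
The check above shows that a few rows are unusable: row 2731 has neither field, and row 1733 has no question. The snippet below is a minimal sketch of a filtering rule that drops such rows. The `is_complete` helper and the sample rows are hypothetical stand-ins for `train_data`.

```python
def is_complete(row):
    """True when both 'question' and 'context' are present and non-empty."""
    return bool(row.get("question")) and bool(row.get("context"))

# Made-up rows standing in for train_data
rows = [
    {"question": "What does the company sell?", "context": "It sells GPUs."},
    {"question": None, "context": "Orphaned context."},
    {"question": "", "context": ""},
]

clean_rows = [row for row in rows if is_complete(row)]
print(len(rows), "->", len(clean_rows))
```

Filtering before indexing keeps empty strings out of the embedding step and avoids documents whose metadata would be missing a question.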


In [None]:
from langchain_core.documents import Document
from langchain_community.vectorstores import Chroma

# Convert to LangChain Document format, skipping rows with a missing context or question
docs = [
    Document(
        page_content=item['context'],
        metadata={'question': item['question']}
    )
    for item in train_data
    if item.get('context') not in [None, ''] and item.get('question') not in [None, '']
]

# Store the documents in Chroma (this generates the embedding vectors)
db = Chroma.from_documents(docs, embeddings)

In [None]:
db

<langchain_community.vectorstores.chroma.Chroma at 0x16aecdc00>

In [None]:
# Create a retriever
retriever = db.as_retriever()

# Retrieve documents relevant to the query
query = "When did NVIDIA release their first GPU?"
results = retriever.invoke(query)  # replaces the deprecated get_relevant_documents()

print("Number of retrieved documents:", len(results))
for doc in results:
    print("Retrieved document:", doc.page_content[:200])

Number of retrieved documents: 4
Retrieved document: Our invention of the GPU in 1999 defined modern computer graphics and established NVIDIA as the leader in computer graphics.
Retrieved document: Fueled by the sustained demand for exceptional 3D graphics and the scale of the gaming market, NVIDIA has leveraged its GPU architecture to create platforms for scientific computing, AI, data science,
Retrieved document: In fiscal year 2023, we introduced the GeForce RTX 40 Series of gaming GPUs, based on the Ada Lovelace architecture. The 40 Series features our third generation RTX technology, third generation NVIDIA
Retrieved document: NVIDIA has a platform strategy, bringing together hardware, systems, software, algorithms, libraries, and services to create unique value for the markets we serve.
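
Conceptually, the retriever embeds the query and returns the nearest stored documents. The snippet below is a brute-force sketch of that idea, not Chroma's actual implementation. The `top_k` helper, the two-dimensional vectors, and the document labels are all hypothetical, and Euclidean distance is used here purely as an assumption.

```python
import math

def top_k(query_vec, indexed_docs, k=4):
    """Return the k texts whose vectors are closest (Euclidean) to query_vec."""
    ranked = sorted(indexed_docs, key=lambda pair: math.dist(query_vec, pair[0]))
    return [text for _, text in ranked[:k]]

# Hypothetical (vector, text) pairs; a real store would hold 384-dim embeddings
indexed_docs = [
    ([0.9, 0.1], "GPU history"),
    ([0.1, 0.9], "Employee benefits"),
    ([0.8, 0.3], "Gaming products"),
]

print(top_k([1.0, 0.0], indexed_docs, k=2))
```

The number of results is controlled by `k`. Note that the nearest documents are returned even when none of them actually answers the query, which is why only the first of the four results above addresses the question directly.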


In [None]:
from langchain.chains import RetrievalQA
from langchain_community.llms import GPT4All

llm = GPT4All(model="Phi-3-mini-4k-instruct.Q4_0.gguf")

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # put all retrieved documents into a single prompt
    retriever=retriever,
    return_source_documents=True
)

question = "What is the main business of NVIDIA?"
# Use .invoke(); calling the chain object directly is deprecated
result = qa_chain.invoke({"query": question})

print("Answer:", result['result'])
print("Source documents:", [doc.page_content[:200] for doc in result['source_documents']])

Answer: 
<|assistant|> The main business of NVIDIA revolves around their computing platform strategy that focuses on accelerating compute-intensive workloads such as AI, data analytics, graphics, scientific computing across various environments like hyperscale and cloud data centers. This involves providing energy-efficient GPUs (Graphics Processing Units), DPUs (Data Processing Units), interconnects, systems, CUDA programming model, software libraries, SDKs, application frameworks, and services to cater to these workloads effectively.

NVIDIA's platform strategy also includes a career development focus for their employees with an aim of retaining talent within the company over time, as evidenced by their low turnover rate in fiscal year 2023 (5.3%). Additionally, NVIDIA is headqu[Answer]: The main business of NVIDIA revolves around creating a computing platform that accelerates compute-intensive workloads such as AI, data analytics, graphics and scientific computing across various environ

Another approach: write the prompt yourself and connect it to the LLM as a chain.

In [None]:

from langchain.prompts import PromptTemplate


prompt_template = """<|system|>
You are a helpful assistant.<|end|>
<|user|>
Context:
{context}

Question:
{question}<|end|>
<|assistant|>
"""

prompt = PromptTemplate(
    template=prompt_template,
    input_variables=["context", "question"]
)

chain = prompt | llm

question = "What is the main business of NVIDIA?"
context = "\n\n".join([doc.page_content for doc in retriever.invoke(question)])

result = chain.invoke({"context": context, "question": question})
print("RAG answer:", result)
print("Source documents:", context)

RAG answer:  The main business of NVIDIA revolves around its computing platform strategy that focuses on accelerating compute-intensive workloads across various sectors such as AI, data analytics, graphics, scientific computing, and more. This involves providing energy-efficient GPUs (Graphics Processing Units), DPUs (Data Processing Units), interconnects, systems, CUDA programming model, software libraries, SDKs/SDK kits, application frameworks, and services to cater to hyperscale data centers, cloud environments, enterprise solutions, public sector applications, as well as edge computing. NVIDIA aims to create unique value for the markets they serve by integrating hardware, systems, software, algorithms, libraries, and services into their platform strategy.
Source documents: NVIDIA has a platform strategy, bringing together hardware, systems, software, algorithms, libraries, and services to create unique value for the markets we serve.

We want NVIDIA to be a place where people can build their care
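
To inspect the exact string the model receives, the same template can be rendered with plain `str.format`, which fills simple `{context}` and `{question}` placeholders like these. This is only a sketch: the two passages are short stand-ins adapted from the retrieved text, not the real retriever output.

```python
prompt_template = """<|system|>
You are a helpful assistant.<|end|>
<|user|>
Context:
{context}

Question:
{question}<|end|>
<|assistant|>
"""

# Made-up passages standing in for the retrieved documents
passages = [
    "NVIDIA invented the GPU in 1999.",
    "NVIDIA has a platform strategy.",
]
context = "\n\n".join(passages)

rendered = prompt_template.format(
    context=context,
    question="What is the main business of NVIDIA?",
)
print(rendered)
```

Printing the rendered prompt makes it easy to confirm that the passages are separated by blank lines and that the string ends with the `<|assistant|>` marker, so the model continues from the assistant turn.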

In [None]:

# Use the same chat-format prompt, but without any context
single_prompt_template = """<|system|>
You are a helpful assistant.<|end|>
<|user|>
Question:
{question}<|end|>
<|assistant|>
"""

single_prompt = PromptTemplate(
    template=single_prompt_template,
    input_variables=["question"]
)

chain_no_rag = single_prompt | llm

question = "What is the main business of NVIDIA?"
result_no_rag = chain_no_rag.invoke({"question": question})
print("LLM-only answer:", result_no_rag)

LLM-only answer:  The primary business of NVIDIA Corporation, an American multincorbinational technology company based in Santa Clara, California, revolves around designing and manufacturing graphics processing units (GPUs) for gaming, data centers, professional markets, and automotive. Their GPU products are widely used not only by gamers but also across various industries such as artificial intelligence research, autonomous vehicles development, deep learning applications, high-performance computing, and more due to their advanced processing capabilities. NVIDIA's CUDA (Compute Unified Device Architecture) technology is a parallel computing platform that enables GPUs for general purpose processing tasks beyond graphics rendering.
