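# 02. Using the LangChain Framework

## 🎯 Learning Objectives
1. Understand LangChain's core concepts
2. Implement a variety of chain patterns
3. Learn how to use prompt templates
4. Learn memory-management techniques

## 📚 What Is LangChain?

## 1. LangChain Core Concepts

### LangChain Components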

```
┌─────────────────────────────────────────────┐
│             LangChain Framework             │
├─────────────────────────────────────────────┤
│  Models     │  Prompts     │  Memory        │
│  (LLM)      │  Templates   │  Storage       │
├─────────────────────────────────────────────┤
│  Chains     │  Agents      │  Tools         │
│  (linking)  │  (autonomy)  │  (tools)       │
├─────────────────────────────────────────────┤
│  Indexes    │  Callbacks   │  Evaluators    │
│  (data)     │  (events)    │  (evaluation)  │
└─────────────────────────────────────────────┘
```

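### Why Use LangChain?

1. **Standardization**: provides standard patterns for building LLM applications
2. **Modularity**: reusable components
3. **Integration**: connects a wide range of LLMs and tools with little effort
4. **Extensibility**: supports composing complex workflows

## 2. Environment Setup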

In [None]:
# Install the required packages
!pip install langchain langchain-community -q

In [1]:
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate, ChatPromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.memory import ConversationBufferMemory
from langchain.schema import BaseOutputParser
import json

# Initialize the Ollama LLM (assumes a local Ollama server with the qwen3:8b model available)
llm = Ollama(model="qwen3:8b", temperature=0.7)
print("✅ LangChain + Ollama LLM initialized")

✅ LangChain + Ollama LLM initialized


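## 3. Prompt Templates

### What Is a Prompt Template?
A prompt template defines a reusable prompt structure with named placeholders that are filled in at run time.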

In [3]:
# Basic prompt template
simple_template = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in Korean in three lines:"
)

# Build the chain
simple_chain = LLMChain(llm=llm, prompt=simple_template)

# Run it
result = simple_chain.run(topic="machine learning")
print(result)

<think>
Okay, the user wants a three-line explanation of machine learning in Korean. Let me start by recalling what machine learning is. It's a subset of artificial intelligence where systems learn from data. I need to make sure each line is concise. First line: define it as a method where machines learn patterns from data. Second line: mention the process of improving performance through experience, maybe using algorithms. Third line: applications like image recognition or recommendation systems. Check for clarity and conciseness. Avoid technical jargon. Make sure each line flows naturally. Let me put it all together now.
</think>

머신러닝은 데이터에서 패턴을 학습하여 예측이나 판단을 수행하는 기술입니다.  
데이터를 통해 스스로 규칙을 발견하고, 반복적인 학습으로 성능을 개선합니다.  
이미지 인식, 추천 시스템, 자연어 처리 등 다양한 분야에 적용됩니다.
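
The recorded output above begins with the model's `<think>…</think>` reasoning block, which appears before the actual answer. A minimal sketch for stripping it from a raw string, assuming the tags appear exactly as shown in the output; `strip_think` is a hypothetical helper name and is not part of LangChain:

```python
import re

# Matches a complete <think>...</think> block, or an unclosed <think>
# that runs to the end of the text (as in truncated outputs).
_THINK_BLOCK = re.compile(r"<think>.*?(?:</think>|\Z)", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove reasoning blocks and surrounding whitespace from raw model output."""
    return _THINK_BLOCK.sub("", text).strip()


raw = "<think>\nThe user wants three lines.\n</think>\n\nLine one.\nLine two."
print(strip_think(raw))
```

In this notebook the helper could wrap any chain result before it is printed or passed on, for example `strip_think(result)`.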


In [4]:
# A more complex prompt template
complex_template = PromptTemplate(
    input_variables=["role", "task", "style"],
    template="""
You are {role}.

Task: {task}

Answer in the following style: {style}

Answer:
"""
)

chain = LLMChain(llm=llm, prompt=complex_template)

result = chain.run(
    role="an experienced data scientist",
    task="Explain the difference between an API and a database to a junior developer",
    style="Friendly and easy to understand, using an analogy"
)
print(result)

<think>
Okay, the user wants me to explain the difference between API and database to a new developer in a friendly and easy-to-understand way using a metaphor. Let me think about how to approach this.

First, I need to make sure I understand the concepts correctly. An API is like a set of rules or tools that allow different software applications to communicate with each other. It's more about the interface and the methods for accessing data. A database, on the other hand, is a structured collection of data, stored in a way that allows efficient retrieval and management.

So, the challenge is to find a metaphor that clearly distinguishes these two. Maybe using a restaurant analogy? Let's see. The database could be like the kitchen, where all the ingredients (data) are stored. The API would then be the menu that customers (other programs) use to order food (data) from the kitchen. Wait, but the menu is more like the API, and the kitchen is the database. That makes sense. The API is the 

## 4. Few-Shot Learning

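Few-shot prompting places a handful of input/output examples inside the prompt so the model can infer the pattern. No model weights are updated.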

In [5]:
from langchain.prompts import FewShotPromptTemplate

# Example data
examples = [
    {"input": "happy", "output": "😊"},
    {"input": "sad", "output": "😢"},
    {"input": "angry", "output": "😠"},
    {"input": "love", "output": "❤️"},
]

# Template for a single example
example_template = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}"
)

# Few-shot prompt
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_template,
    prefix="Convert the emotion to an emoji:",
    suffix="Input: {emotion}\nOutput:",
    input_variables=["emotion"]
)

# Build and run the chain
chain = LLMChain(llm=llm, prompt=few_shot_prompt)

# Test
emotions = ["excited", "confused", "sleepy", "surprised"]
for emotion in emotions:
    result = chain.run(emotion=emotion)
    print(f"{emotion}: {result.strip()}")

excited: <think>
Okay, the user wants to convert emotions to emojis. Let me see the examples they provided. Happy is 😊, sad is 😢, angry is 😠, love is ❤️, and excited is... they didn't provide the output yet. So I need to figure out the right emoji for excited.

First, I should recall common emojis for emotions. Excited is usually associated with something energetic or enthusiastic. The typical emojis for excitement might be 😄, which is a smiley, but maybe there's a more specific one. Wait, there's also 😆 which is a grinning face with eyes, but that might be more for laughing. Alternatively, maybe 🎉 which is a party popper, but that's more for celebrating. Or maybe 😍 which is a face with heart eyes, but that's more for love or adoration. Wait, the user already used ❤️ for love, so maybe excited is different. Let me think again. 

Another possibility is 😄 for excitement, but maybe the user expects something more dynamic. Alternatively, the emoji for excited could be 😆 because it shows mo
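
To make the structure of the few-shot prompt concrete, here is a plain-Python sketch of how the prefix, the rendered examples, and the suffix might be joined into one string. `build_few_shot_prompt` is a hypothetical function, and the blank-line separator is an assumption of this sketch rather than a statement about how `FewShotPromptTemplate` is implemented:

```python
def build_few_shot_prompt(prefix, examples, example_template, suffix,
                          separator="\n\n", **variables):
    """Join the prefix, each rendered example, and the rendered suffix."""
    rendered = [example_template.format(**example) for example in examples]
    return separator.join([prefix, *rendered, suffix.format(**variables)])


sample_examples = [
    {"input": "happy", "output": "😊"},
    {"input": "sad", "output": "😢"},
]

prompt_text = build_few_shot_prompt(
    prefix="Convert the emotion to an emoji:",
    examples=sample_examples,
    example_template="Input: {input}\nOutput: {output}",
    suffix="Input: {emotion}\nOutput:",
    emotion="excited",
)
print(prompt_text)
```

Printing the assembled text is a cheap way to check what the model actually receives before spending time on an LLM call.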

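## 5. Combining Chains

Multiple chains can be linked so that each chain's output feeds the next, which makes it possible to build more complex workflows.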

In [7]:
def simple_sequential_chain():
    print("Using SimpleSequentialChain (recommended):")
    
    # Define each chain
    idea_chain = LLMChain(
        llm=llm,
        prompt=PromptTemplate(
            input_variables=["topic"],
            template="Come up with a business idea related to {topic}."
        )
    )
    
    business_chain = LLMChain(
        llm=llm,
        prompt=PromptTemplate(
            input_variables=["text"],  # receives the previous chain's output
            template="Design a business model based on the following idea: {text}"
        )
    )
    
    challenge_chain = LLMChain(
        llm=llm,
        prompt=PromptTemplate(
            input_variables=["text"],  # receives the previous chain's output
            template="What are three key challenges for the following business model?: {text}"
        )
    )
    
    # Link the chains with SimpleSequentialChain
    overall_chain = SimpleSequentialChain(
        chains=[idea_chain, business_chain, challenge_chain],
        verbose=True
    )
    
    # Run
    result = overall_chain.run("AI education")
    print(f"Result: {result}")

simple_sequential_chain()

Using SimpleSequentialChain (recommended):


> Entering new SimpleSequentialChain chain...
<think>
Okay, the user is asking for business ideas related to AI education. Let me start by brainstorming different angles. First, maybe online courses or platforms that teach AI concepts. But wait, there are already many existing platforms like Coursera and Udacity. So maybe a niche approach? Like focusing on specific areas such as AI for kids, or AI in specific industries like healthcare or finance.

Another thought: AI tools that help educators. Maybe an AI-powered tutoring system that adapts to each student's learning pace. That could be useful in schools or for self-learners. Or maybe tools that automate grading or provide real-time feedback. But how to differentiate from existing solutions?

What about AI-driven career guidance? Like helping students choose AI-related careers based on their skills and interests. That could be a service paired with educational platforms. Or maybe a
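
Conceptually, a sequential chain feeds each step's output into the next step as its only input. A plain-Python sketch of that flow, using hypothetical stand-in functions in place of real LLM calls (none of these names come from LangChain):

```python
from typing import Callable, List

Step = Callable[[str], str]


def run_sequential(steps: List[Step], first_input: str) -> str:
    """Pass the text through each step in order; each output becomes the next input."""
    text = first_input
    for step in steps:
        text = step(text)
    return text


# Hypothetical stand-ins for the three LLM-backed chains.
def idea_step(topic: str) -> str:
    return f"idea({topic})"


def model_step(idea: str) -> str:
    return f"model({idea})"


def challenge_step(model: str) -> str:
    return f"challenges({model})"


final = run_sequential([idea_step, model_step, challenge_step], "AI education")
print(final)  # challenges(model(idea(AI education)))
```

One consequence of this design is that anything a step emits, including a reasoning block like the `<think>` text in the recorded output, is passed verbatim to the next step unless it is filtered out in between.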

## 6. Output Parsers

An output parser converts the LLM's raw text output into a structured format.

In [8]:
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field
from typing import List

# Define the output schema
class ProductAnalysis(BaseModel):
    product_name: str = Field(description="Product name")
    category: str = Field(description="Product category")
    pros: List[str] = Field(description="List of strengths")
    cons: List[str] = Field(description="List of weaknesses")
    rating: float = Field(description="Rating from 1 to 5")
    recommendation: str = Field(description="Whether to recommend it, and why")

# Create the parser
parser = PydanticOutputParser(pydantic_object=ProductAnalysis)

# Prompt template
prompt = PromptTemplate(
    template="Analyze {product}.\n{format_instructions}\n",
    input_variables=["product"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

# Build the chain
chain = LLMChain(llm=llm, prompt=prompt)

# Run and parse
output = None  # stays None if the chain call itself fails, so the except block can still print it
try:
    output = chain.run(product="ChatGPT")
    parsed = parser.parse(output)
    print("Parsed result:")
    print(f"Product: {parsed.product_name}")
    print(f"Category: {parsed.category}")
    print(f"Pros: {parsed.pros}")
    print(f"Cons: {parsed.cons}")
    print(f"Rating: {parsed.rating}")
    print(f"Recommendation: {parsed.recommendation}")
except Exception as e:
    print(f"Parsing failed: {e}")
    print(f"Raw output:\n{output}")

Parsing failed: Invalid json output: <think>
Okay, I need to analyze ChatGPT and present the information in the specified JSON format. Let me start by recalling what I know about ChatGPT. It's a large language model developed by OpenAI, right? So the product name should be "ChatGPT" and the category would be something like "AI Language Model" or "AI Chatbot".

Now, for the pros. ChatGPT is known for its ability to generate human-like text, handle various tasks like writing, coding, answering questions. It's also versatile, can be used in different domains. Maybe it's user-friendly and has a wide range of applications. Oh, and it's part of the GPT series, which is a big plus.

Cons could include things like potential biases in responses, the need for internet access since it's a cloud service, and maybe the cost associated with using it. Also, there might be limitations in understanding complex or nuanced queries. Oh, and privacy concerns since data is processed through OpenAI's servers.

Rating
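
The failure above shows the parser receiving text that starts with a `<think>` block instead of JSON. One possible workaround, sketched with only the standard library, is to skip past the reasoning block and decode the first JSON object that follows. `extract_json_object` is a hypothetical helper, and the sample string is made up for illustration; it is not the model's actual output:

```python
import json
from typing import Any


def extract_json_object(raw: str) -> Any:
    """Return the first JSON object found after any </think> tag in raw."""
    # Keep only the text after the last closing tag; if no tag exists, keep everything.
    _, _, tail = raw.rpartition("</think>")
    decoder = json.JSONDecoder()
    start = tail.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(tail, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one.
            start = tail.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


sample_raw = (
    "<think>\nPlan: wrap the answer in {braces}.\n</think>\n"
    '{"product_name": "ChatGPT", "rating": 4.5}'
)
data = extract_json_object(sample_raw)
print(data)
```

The resulting dictionary could then be validated against the schema, for example with `ProductAnalysis(**data)`. If the output is cut off before any JSON appears, the helper raises `ValueError` rather than returning a partial result.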

## 7. Memory Management

Memory keeps the conversation history so that context carries over between turns.

In [9]:
from langchain.memory import ConversationBufferMemory, ConversationSummaryMemory
from langchain.chains import ConversationChain

# 1. Buffer Memory (stores the entire conversation)
buffer_memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory=buffer_memory,
    verbose=True
)

print("=== Buffer Memory Demo ===")
# The inputs stay in Korean so they match the recorded transcript.
conversation.predict(input="안녕하세요, 저는 파이썬을 배우고 있습니다.")  # "Hello, I am learning Python."
conversation.predict(input="초보자가 먼저 배워야 할 것은 무엇인가요?")  # "What should a beginner learn first?"
conversation.predict(input="그렇다면 제가 처음에 말한 언어는 무엇이었나요?")  # "Then which language did I mention at the start?"

# Inspect the memory contents
print("\nMemory contents:")
print(buffer_memory.buffer)


=== Buffer Memory Demo ===


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: 안녕하세요, 저는 파이썬을 배우고 있습니다.
AI:

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: 안녕하세요, 저는 파이썬을 배우고 있습니다.
AI: <think>
Okay, the user greeted me in Korean and mentioned they're learning Python. I should respond in Korean since they used that language. Let me make sure I'm using the right ter
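
The verbose transcript above shows the idea behind buffer memory: every earlier exchange is replayed under "Current conversation:" before the new input. A plain-Python sketch of that behavior follows. `BufferMemorySketch` and `build_prompt` are hypothetical names, and the prompt layout only imitates the transcript; it is not LangChain's implementation:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BufferMemorySketch:
    """Stores every (human, ai) exchange verbatim."""
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


def build_prompt(memory: BufferMemorySketch, new_input: str) -> str:
    """Prepend the full history to the new input."""
    return f"Current conversation:\n{memory.render()}\nHuman: {new_input}\nAI:"


memory = BufferMemorySketch()
memory.save("I am learning Python.", "Great choice!")
prompt_with_history = build_prompt(memory, "Which language did I mention?")
print(prompt_with_history)
```

Because the whole history is resent on every turn, the prompt grows with the conversation, which is the main cost of this approach.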

In [10]:
# 2. Summary Memory (stores a running summary of the conversation)
summary_memory = ConversationSummaryMemory(llm=llm)
summary_conversation = ConversationChain(
    llm=llm,
    memory=summary_memory,
    verbose=True
)

print("=== Summary Memory Demo ===")
# The inputs stay in Korean so they match the recorded transcript.
summary_conversation.predict(input="머신러닝 프로젝트를 시작하려고 합니다.")  # "I'm about to start a machine learning project."
summary_conversation.predict(input="데이터 전처리는 어떻게 해야 하나요?")  # "How should I preprocess the data?"
summary_conversation.predict(input="모델 선택 기준은 무엇인가요?")  # "What are the criteria for choosing a model?"

# Inspect the summarized memory
print("\nSummarized memory:")
print(summary_memory.buffer)


=== Summary Memory Demo ===


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: 머신러닝 프로젝트를 시작하려고 합니다.
AI:

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
<think>
Okay, let's break this down. The user wants to start a machine learning project, so the AI needs to guide them through the initial steps. The previous summary was empty, so I need to create the first summary bas
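
In the transcript above, the text under "Current conversation:" is a generated summary rather than the raw exchange, and in this run it starts with `<think>`, which suggests the summarizer's reasoning block was stored along with the summary. The mechanism can be sketched in plain Python as a running summary that is rewritten after every turn. `SummaryMemorySketch` and `fake_summarize` are hypothetical; a real setup would call an LLM where the stand-in is used:

```python
from typing import Callable


class SummaryMemorySketch:
    """Keeps one running summary instead of the full history."""

    def __init__(self, summarize: Callable[[str, str], str]):
        self.summarize = summarize
        self.summary = ""

    def save(self, human: str, ai: str) -> None:
        new_lines = f"Human: {human}\nAI: {ai}"
        # The summarizer sees the previous summary plus the newest exchange.
        self.summary = self.summarize(self.summary, new_lines)


def fake_summarize(previous: str, new_lines: str) -> str:
    """Stand-in for an LLM call: keeps only the human line of each exchange."""
    first_line = new_lines.splitlines()[0]
    return (previous + " " + first_line).strip()


summary_sketch = SummaryMemorySketch(fake_summarize)
summary_sketch.save("Starting an ML project.", "Here is a long answer...")
summary_sketch.save("How do I preprocess data?", "Another long answer...")
print(summary_sketch.summary)
```

The trade-off against buffer memory is an extra summarization call per turn in exchange for a prompt that stays short.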

## 8. Building a Custom Chain

Let's build a custom chain tailored to a specific purpose.

In [11]:
class CodeReviewChain:
    """Custom chain for code review"""
    
    def __init__(self, llm):
        self.llm = llm
        
        # Bug-detection chain
        self.bug_prompt = PromptTemplate(
            input_variables=["code"],
            template="""
Find potential bugs in the following code:

```python
{code}
```

Bug analysis:
"""
        )
        self.bug_chain = LLMChain(llm=llm, prompt=self.bug_prompt)
        
        # Performance-improvement chain
        self.performance_prompt = PromptTemplate(
            input_variables=["code"],
            template="""
Suggest ways to improve the performance of the following code:

```python
{code}
```

Performance suggestions:
"""
        )
        self.performance_chain = LLMChain(llm=llm, prompt=self.performance_prompt)
        
        # Refactoring chain
        self.refactor_prompt = PromptTemplate(
            input_variables=["code"],
            template="""
Refactor the following code to make it cleaner:

```python
{code}
```

Refactored code:
"""
        )
        self.refactor_chain = LLMChain(llm=llm, prompt=self.refactor_prompt)
    
    def review(self, code):
        """Run a comprehensive code review"""
        results = {
            "bugs": self.bug_chain.run(code=code),
            "performance": self.performance_chain.run(code=code),
            "refactored": self.refactor_chain.run(code=code)
        }
        return results

# Usage example
reviewer = CodeReviewChain(llm)

sample_code = """
def find_duplicates(lst):
    duplicates = []
    for i in range(len(lst)):
        for j in range(i+1, len(lst)):
            if lst[i] == lst[j] and lst[i] not in duplicates:
                duplicates.append(lst[i])
    return duplicates
"""

review_results = reviewer.review(sample_code)

print("=== Code Review Results ===")
print("\n[Bug Analysis]")
print(review_results["bugs"][:300])
print("\n[Performance Improvements]")
print(review_results["performance"][:300])
print("\n[Refactoring]")
print(review_results["refactored"][:500])

=== Code Review Results ===

[Bug Analysis]
<think>
Okay, let's look at this Python code. The function is supposed to find duplicates in a list. The code uses two nested loops: for each element, it compares it with every subsequent element. If they are equal and the element isn't already in the duplicates list, it adds it to duplicates. Hmm, 

[Performance Improvements]
<think>
Okay, let's see. The user has this Python function called find_duplicates that's supposed to find duplicates in a list. But the code is using two nested loops, which probably makes it really slow for large lists. Let me think about how to improve this.

First, the current approach checks eve

[Refactoring]
<think>
Okay, let's see. The user wants me to refactor the given Python function to make it cleaner. The original code is called find_duplicates and takes a list. It uses nested loops to check for duplicates and appends them to a list. But I need to think about how to make this more efficient and readable.

First, the original code has two nested loops,

## 9. Hands-On Project: Document Summarization Pipeline

In [12]:
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document

class DocumentSummarizer:
    """Pipeline for summarizing long documents"""
    
    def __init__(self, llm, chunk_size=500):
        self.llm = llm
        self.text_splitter = CharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=50
        )
    
    def summarize(self, text, summary_type="map_reduce"):
        """Summarize a document.
        
        Args:
            text: the text to summarize
            summary_type: one of 'stuff', 'map_reduce', or 'refine'
        """
        # Split the text into chunks
        chunks = self.text_splitter.split_text(text)
        docs = [Document(page_content=chunk) for chunk in chunks]
        
        # Build the summarization chain
        if summary_type == "stuff":
            # Process all of the text in a single call (for short documents)
            chain = load_summarize_chain(self.llm, chain_type="stuff")
        elif summary_type == "map_reduce":
            # Summarize each chunk, then combine the summaries (for long documents)
            chain = load_summarize_chain(self.llm, chain_type="map_reduce")
        else:
            # Refine the summary incrementally; any other value also falls back to this branch
            chain = load_summarize_chain(self.llm, chain_type="refine")
        
        return chain.run(docs)
    
    def extract_key_points(self, text, num_points=5):
        """Extract the key points"""
        prompt = PromptTemplate(
            input_variables=["text", "num"],
            template="""
Extract the {num} most important key points from the following text:

{text}

Key points:
1.
"""
        )
        chain = LLMChain(llm=self.llm, prompt=prompt)
        return chain.run(text=text, num=num_points)

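# Usage example. The sample text stays in Korean to match the recorded run;
# it covers the definition and history of AI, recent progress, and ethical concerns.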
long_text = """
인공지능(AI)은 인간의 학습능력, 추론능력, 지각능력을 인공적으로 구현하려는 
컴퓨터 과학의 한 분야입니다. 1950년대에 앨런 튜링이 제안한 튜링 테스트를 
시작으로, AI는 꾸준히 발전해왔습니다.

현재 AI는 머신러닝, 특히 딥러닝의 발전으로 놀라운 성과를 보여주고 있습니다.
이미지 인식, 자연어 처리, 음성 인식 등 다양한 분야에서 인간 수준을 넘어서는
성능을 보여주고 있으며, 의료, 금융, 제조업 등 산업 전반에 혁신을 가져오고 있습니다.

하지만 AI의 발전은 일자리 감소, 프라이버시 침해, 편향성 문제 등 여러 우려도
함께 가져왔습니다. 따라서 AI의 윤리적 사용과 규제에 대한 논의가 활발히
진행되고 있습니다.
"""

summarizer = DocumentSummarizer(llm)

print("=== Document Summary ===")
summary = summarizer.summarize(long_text, "stuff")
print(summary)

print("\n=== Key Points ===")
key_points = summarizer.extract_key_points(long_text, 3)
print(key_points)

=== Document Summary ===
<think>
Okay, I need to write a concise summary of the given Korean text about AI. Let me read through the original text first to understand the main points.

The text starts by defining AI as a computer science field aiming to replicate human learning, reasoning, and perception capabilities. It mentions Alan Turing's Turing Test from the 1950s as the starting point of AI development. Then it talks about current advancements, especially with machine learning and deep learning, leading to impressive results in areas like image recognition, natural language processing, and speech recognition. These technologies are now impacting various industries such as healthcare, finance, and manufacturing. However, there are concerns about job loss, privacy issues, and bias, leading to ongoing discussions on ethical use and regulation.

Now, I need to condense this into a concise summary. Let me identify the key elements: definition of AI, historical context (Turing Test), current tech
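
The `map_reduce` strategy mentioned in the class above can be illustrated without any LLM: split the text into overlapping chunks, summarize each chunk, then summarize the combined partial summaries. This is a simplified sketch. The fixed-width slicing is an assumption made for brevity (`CharacterTextSplitter` splits on a separator instead), and `fake_summarize` is a hypothetical stand-in for a model call:

```python
from typing import Callable, List


def split_with_overlap(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Slice text into fixed-width chunks that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def map_reduce_summarize(text: str, summarize: Callable[[str], str],
                         chunk_size: int = 500, overlap: int = 50) -> str:
    """Map: summarize each chunk. Reduce: summarize the joined partial summaries."""
    partials = [summarize(chunk) for chunk in split_with_overlap(text, chunk_size, overlap)]
    return summarize("\n".join(partials))


calls = []


def fake_summarize(chunk: str) -> str:
    """Stand-in for an LLM call: records the call and keeps the first 20 characters."""
    calls.append(chunk)
    return chunk[:20]


sample_text = "".join(str(i % 10) for i in range(1200))
sample_chunks = split_with_overlap(sample_text)
final_summary = map_reduce_summarize(sample_text, fake_summarize)
print(len(sample_chunks), len(calls))
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk; the cost is one model call per chunk plus one for the final reduction.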

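## 10. Performance Optimization Tips

### Chain Optimization
1. **Caching**: cache the results of repeated prompts
2. **Batching**: process multiple inputs in a single call
3. **Asynchronous execution**: run independent chains in parallel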
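
As a rough illustration of the first and third tips, the sketch below memoizes results per prompt with `functools.lru_cache` and runs independent prompts concurrently with a thread pool (threads are used here rather than `asyncio`, purely to keep the example short). `cached_llm` is a hypothetical stand-in for a slow model call, not a LangChain API:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
from typing import List

CALLS = []  # records which prompts actually reached the "model"


@lru_cache(maxsize=128)
def cached_llm(prompt: str) -> str:
    """Stand-in for a slow LLM call; identical prompts are served from the cache."""
    CALLS.append(prompt)
    time.sleep(0.05)  # simulate latency
    return f"answer to: {prompt}"


def run_parallel(prompts: List[str]) -> List[str]:
    """Run independent prompts concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(cached_llm, prompts))


review_prompts = ["find bugs", "suggest optimizations", "refactor"]
first_run = run_parallel(review_prompts)
second_run = run_parallel(review_prompts)  # every prompt is now a cache hit
print(first_run)
print(len(CALLS))
```

Caching only helps when prompts repeat exactly, and with a non-zero temperature it also freezes one sampled answer per prompt, so it fits deterministic or repeated lookups better than creative generation.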

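### Memory Management
- `ConversationBufferWindowMemory`: keeps only the most recent N exchanges
- `ConversationSummaryBufferMemory`: summarizes older exchanges while keeping recent ones
- `VectorStoreRetrieverMemory`: stores exchanges in a vector database and retrieves them from it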
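
The windowed option in the list above can be sketched in plain Python with a bounded `deque`, which discards the oldest exchange once the limit is reached. `WindowMemorySketch` is a hypothetical class for illustration and is not how `ConversationBufferWindowMemory` is implemented:

```python
from collections import deque


class WindowMemorySketch:
    """Keeps only the most recent k (human, ai) exchanges."""

    def __init__(self, k: int = 2):
        self.turns = deque(maxlen=k)  # older exchanges fall off automatically

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = WindowMemorySketch(k=2)
window.save("first question", "first answer")
window.save("second question", "second answer")
window.save("third question", "third answer")
print(window.render())
```

The prompt size stays bounded, but anything older than the window is forgotten entirely, which is the gap the summary-based and vector-store-based options aim to cover.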

### Prompt Optimization
- Give clear instructions
- Specify the output format
- Provide few-shot examples

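## 🎯 Exercises

1. **Basic**: 
   - Create three prompt templates of your own
   - Build a three-step workflow with a sequential chain

2. **Advanced**:
   - Implement a custom OutputParser
   - Add memory to a conversational chatbot

3. **Project**:
   - Build a pipeline that analyzes and summarizes news articles
   - Build an automation tool that documents code

## 📚 Further Reading

- [LangChain official documentation](https://python.langchain.com/)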
- [LangChain Cookbook](https://github.com/langchain-ai/langchain/tree/master/cookbook)
- [LangChain Hub](https://smith.langchain.com/hub)

## Next Steps

In the next notebook, we will use **embeddings and vector search** to build a semantic search system.