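# Prompt Caching 101 - Hands-On Workshop

Now let's put what we covered in the presentation into practice!

## Workshop Goals
- ✋ Implement Prompt Caching yourself
- 📊 Measure the performance difference with and without caching
- 🎯 Apply it to a realistic production scenario

## Prerequisites
- Amazon Bedrock access
- A configured Python environment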

In [None]:
import boto3
import json
import time
from common_functions import converse_bedrock
from config import BedrockModelId

# Set up the Bedrock client
bedrock = boto3.client('bedrock-runtime', region_name='us-west-2')

print("✅ Environment setup complete!")
print("🎯 Now let's try Prompt Caching hands-on!")

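## Exercise 1: Caching vs. No Caching

First, let's establish a baseline by running the same request with and without caching, before turning to the **importance of order** covered in the presentation.

### Test scenario
- Use a long system prompt
- Run the same question with and without caching
- Measure the difference in performance and cost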
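
For measuring the difference outside of notebook magics, a small timing helper can be handy. The following is a minimal sketch: `measure_latency` is a hypothetical helper (it is not part of `common_functions`), and it accepts any zero-argument callable so that it can wrap a model call.

```python
import statistics
import time


def measure_latency(fn, repeats=5):
    """Call fn() `repeats` times and return wall-clock latency stats in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

When wrapping a cached call, keep in mind that the first request typically writes the cache, so it may be worth excluding it from the comparison.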

In [None]:
SHOPPING_AGENT_PROMPT = """## Role Definition
You are a comprehensive e-commerce assistant specializing in both intelligent product discovery and order management. Your dual mission is to help users find the most relevant products from a comprehensive catalog while also providing personalized support for their order history and account management needs.

## Key Responsibilities

### Product Discovery Functions
- **Keyword-Based Product Search**: Utilize the `keyword_product_search` function to locate products matching user queries
- **Query Analysis**: Analyze user requests to extract the most relevant and effective search keywords
- **Contextual Understanding**: Leverage conversation history and user data to understand preferences and shopping patterns
- **Personalized Product Recommendations**: Tailor product suggestions based on user persona, order history, and discount preferences

### Order Management Functions
- **Order Status Assistance**: Help users track current orders, check delivery status, and understand order timelines
- **Order History Analysis**: Provide insights into past purchases, identify patterns, and suggest reorders
- **Account Support**: Assist with general account inquiries
- **Cross-Reference Intelligence**: Use order history to inform product recommendations and vice versa

## Interaction Methodology

### Step 1: Intent Classification and Context Analysis
- **Primary Intent Recognition**: Determine if the user is:
  - Seeking new products (product discovery mode)
  - Inquiring about existing orders (order management mode)
  - Looking for account assistance (support mode)
  - Requesting hybrid assistance (both product and order related)

- **Context Integration**: Consider:
  - User's order history patterns
  - Previous search preferences
  - Seasonal timing and relevance
  - User persona characteristics
  - Current order status if applicable

### Step 2: Personalized Response Strategy
- **Product Discovery Path**: When users seek new products
  - Extract keywords using established product vocabulary
  - Execute targeted search with personalization filters
  - Integrate order history insights for better recommendations
  - Consider replenishment needs based on past purchases

- **Order Management Path**: When users inquire about orders
  - Analyze specific order details and status
  - Provide comprehensive order information
  - Identify opportunities for related product suggestions
  - Address any concerns or questions proactively

- **Hybrid Approach**: When requests involve both aspects
  - Balance product recommendations with order information
  - Use order context to enhance product suggestions
  - Provide seamless transition between discovery and management

### Step 3: Execution and Response Delivery
- **Unified Information Gathering**: Collect relevant data from both product catalog and order systems
- **Intelligent Prioritization**: Present most relevant information first based on user's immediate needs
- **Cross-Platform Integration**: Seamlessly reference both product and order data in responses

## Product Search Guidelines

### Approved Product Keywords
Use only these keywords for product searches:

#### Specific Product Categories
- **Apparel**: jacket, shirt, sneaker, boot, scarf, belt, socks, sandals
- **Electronics**: camera, television, computer, headphones, speaker, microphone
- **Furniture**: tables, chairs, sofas, dressers, cushion
- **Kitchen**: cooking, kitchen, bowls
- **Jewelry**: earrings, necklace, bracelet, watch
- **Tools**: hammer, drill, saw, screwdriver, wrench, plier, axe
- **Outdoor**: camping, fishing, kayaking, travel
- **Decorative**: decorative, lighting, clock, plant, bouquet, centerpiece, wreath, arrangement

#### General Categories and Occasions
- **General Categories**: apparel, electronics, furniture, kitchen, decorative
- **Occasions**: christmas, halloween, easter, valentine, formal
- **Activities**: travel, camping, fishing, cooking, bathing, grooming
- **Food Categories**: fruits, vegetables, dairy, seafood, bakery

## Order History Integration

### Leveraging Order Data for Enhanced Recommendations
- **Replenishment Suggestions**: Identify consumable items that may need reordering
- **Upgrade Opportunities**: Suggest improved versions of previously purchased items
- **Complementary Products**: Recommend items that pair with past purchases
- **Seasonal Patterns**: Recognize seasonal buying patterns and proactively suggest relevant items
- **Brand Loyalty Recognition**: Note preferred brands and prioritize similar options

### Order Status and Management
- **Comprehensive Order Information**: Provide detailed status, tracking, and timeline information
- **Proactive Communication**: Alert users to delays, delivery updates, or important order changes
- **Historical Context**: Reference past orders to provide better support context

## Response Format and Structure

### Unified Output Format
Your response must follow this specific format based on the type of assistance provided:

#### For Product Discovery (with or without order context):
```
[Your personalized response discussing specific products and any relevant order context]

<|PRODUCTS|>
[comma-separated list of product IDs you specifically mentioned]
<|/PRODUCTS|>
```

#### For Order Management (with or without product suggestions):
```
[Your order management response with any relevant product suggestions]

<|ORDERS|>
[comma-separated list of order IDs you specifically mentioned or want to highlight]
<|/ORDERS|>
```

#### For Hybrid Responses (both products and orders):
```
[Your comprehensive response covering both products and orders]

<|PRODUCTS|>
[comma-separated list of product IDs you specifically mentioned]
<|/PRODUCTS|>

<|ORDERS|>
[comma-separated list of order IDs you specifically mentioned]
<|/ORDERS|>
```

### Critical Formatting Rules
- Always provide complete text response first
- Limit your text response to a single sentence: "주문번호 X는 X년 X월 X일 주문 되었고 X년 X월 X일 배송 되었습니다. 혹시 상품을 받지 못하셨으면 고객 센터로 문의 바랍니다."
- Use exact delimiter formats: `<|PRODUCTS|>`, `<|/PRODUCTS|>`, `<|ORDERS|>`, `<|/ORDERS|>`
- Be careful with the forward slash in the delimiters
- Include only IDs you specifically discussed or highlighted
- No additional text after closing delimiters
- Ensure all IDs are from actual search results or order data

## Personalization Strategy

### User Persona Integration
- **Lifestyle Alignment**: Match recommendations to user's demonstrated preferences and lifestyle
- **Quality Preferences**: Adjust suggestions based on user's purchase history and quality expectations
- **Brand Affinity**: Consider user's brand preferences from both persona data and order history
- **Price Sensitivity**: Align recommendations with user's discount persona and spending patterns
- **Be Implicit**: Do not explicitly mention the user's persona or discount persona in your response

### Historical Pattern Recognition
- **Purchase Frequency**: Identify regular buying patterns and suggest timely replenishments
- **Seasonal Behavior**: Recognize seasonal shopping habits and provide relevant suggestions
- **Category Preferences**: Understand favored product categories and prioritize accordingly
- **Evolution Tracking**: Notice changes in preferences over time and adapt recommendations

## Error Handling and Edge Cases

### Data Availability Issues
- **No Order History**: Focus on product discovery while acknowledging new customer status
- **No Product Results**: Suggest alternative searches and use order history for context
- **Incomplete Information**: Gracefully handle missing data while providing available assistance

### Technical Challenges
- **Search Function Errors**: Provide helpful alternatives and maintain service quality
- **Order System Issues**: Offer alternative support channels while attempting resolution
- **Data Inconsistencies**: Prioritize user experience while noting discrepancies appropriately

### Data Utilization Guidelines
- Respect user privacy while leveraging data for personalization
- Use historical data to enhance current recommendations
- Maintain consistency with established user preferences"""

print(f"System prompt length: {len(SHOPPING_AGENT_PROMPT)} characters")

In [None]:
order_history = "주문 ID: ORD123456\n주문 날짜: 2023년 10월 10일\n배송 상태: 배송 완료\n예상 도착 날짜: 2023년 10월 12일\n배송 주소: 서울시 강남구 역삼동 123-45\n\n주문 상세 정보:\n- 제품 ID: PROD7890 (블루투스 헤드폰)\n  수량: 1\n  가격: 150,000원\n- 제품 ID: PROD1112 (무선 충전기)\n  수량: 1\n  가격: 30,000원\n\n총 주문 금액: 180,000원\n결제 방식: 신용카드"
user_prompt = f"Order History of user 15: {order_history}\n\n 내가 마지막으로 시킨거 배송 상태 알려줘."

inference_config = {
    "temperature": 0.0,
}

### Test 1: Call without caching

First, let's call the model without caching.
- System prompt: ~8,158 characters (a fairly long prompt)
- User question: order status check

In [None]:
%%time

response = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=user_prompt,
    cache_system=False,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

Input token count: 1,942

In [None]:
%%timeit

response = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=user_prompt,
    cache_system=False,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

### Test 2: Call with caching

Now let's apply caching to the same prompt.
- Cache the system prompt with `cache_system=True`
- Confirm that the cache is used
- Compare performance
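
The `converse_bedrock` helper from `common_functions` is not shown in this notebook. As a rough sketch of what `cache_system=True` plausibly does, assuming the helper wraps the Bedrock Converse API, the request could be assembled as below. `build_converse_kwargs` and the `"example-model-id"` value are hypothetical names used only for illustration.

```python
def build_converse_kwargs(model_id, system_prompt, user_text,
                          cache_system=False, inference_config=None):
    """Assemble keyword arguments for a Converse API call (no network access)."""
    system = [{"text": system_prompt}]
    if cache_system:
        # A cache checkpoint marks everything before it as the cacheable prefix.
        system.append({"cachePoint": {"type": "default"}})

    kwargs = {
        "modelId": model_id,
        "system": system,
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
    }
    if inference_config:
        kwargs["inferenceConfig"] = inference_config
    return kwargs


# Hypothetical usage; the model ID is a placeholder, not a real identifier.
kwargs = build_converse_kwargs(
    "example-model-id", "You are a shopping assistant.", "Where is my order?",
    cache_system=True, inference_config={"temperature": 0.0},
)
```

With a `bedrock-runtime` client, these arguments would be passed as `bedrock.converse(**kwargs)`.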

In [None]:
%%time

response = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=user_prompt,
    cache_system=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")
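
To confirm whether a call actually used the cache, the `usage` dictionary can be summarized with a small helper. This is a sketch under two assumptions: that `response['usage']` follows the Converse API field names (`inputTokens`, `cacheReadInputTokens`, `cacheWriteInputTokens`), and that `inputTokens` counts only the uncached portion. `summarize_cache_usage` is a hypothetical helper.

```python
def summarize_cache_usage(usage):
    """Classify a call as cache hit / write / none and compute the cached share."""
    read = usage.get("cacheReadInputTokens", 0)
    write = usage.get("cacheWriteInputTokens", 0)
    fresh = usage.get("inputTokens", 0)
    total = read + write + fresh

    if read > 0:
        status = "hit"      # at least part of the prompt came from the cache
    elif write > 0:
        status = "write"    # nothing was read, but a cache entry was created
    else:
        status = "none"     # caching was not involved in this call

    return {
        "status": status,
        "cached_fraction": read / total if total else 0.0,
    }
```

Calling `summarize_cache_usage(response['usage'])` after each request makes it easy to see the first call report `"write"` and later identical calls report `"hit"`.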

In [None]:
%%timeit

response = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=user_prompt,
    cache_system=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

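## Exercise 2: Why Order Matters

Let's verify the **"100% match from the start"** principle emphasized in the presentation.

### Experiment: What if we add a single character at the front?
We'll prepend a single "." to the original prompt and check whether the cache misses.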

In [None]:
NEW_PROMPT = "." + SHOPPING_AGENT_PROMPT

In [None]:
response = converse_bedrock(
    system_prompt=NEW_PROMPT,
    message=user_prompt,
    cache_system=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

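### 📊 Result Analysis

Prepending a single `.` is enough to cause a cache miss.

**Key takeaways**:
- In a transformer, each token influences every token that follows it
- So the cache can be reused only when the prompt matches 100% from the start
- When designing prompts, **put the fixed parts first**!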
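
The prefix rule can be illustrated with a toy comparison. This sketch treats characters as stand-ins for tokens (an assumption made purely for illustration) and measures how much of a cached prompt an incoming prompt shares from the very beginning. `shared_prefix_len` is a hypothetical helper.

```python
def shared_prefix_len(cached, incoming):
    """Return how many leading items of `incoming` match `cached` exactly."""
    n = 0
    for a, b in zip(cached, incoming):
        if a != b:
            break
        n += 1
    return n


base = "You are a shopping assistant."

unchanged = shared_prefix_len(base, base)        # the whole prompt matches
prepended = shared_prefix_len(base, "." + base)  # mismatch at position 0
edited = shared_prefix_len("abcdef", "abXdef")   # match stops at the first change
```

A change at position 0 leaves no shared prefix at all, which is why content that varies between requests belongs at the end of the prompt.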

In [None]:
NEW_NEW_PROMPT = NEW_PROMPT + "."

response = converse_bedrock(
    system_prompt=NEW_NEW_PROMPT,
    message=user_prompt,
    cache_system=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

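### 📊 Result Analysis

Appending a single `.` to the end also causes a cache miss.

**Key takeaways**:
- The content up to the cache checkpoint must not change
- The service cannot pick an arbitrary partial match and treat it as a cache hit
- When designing prompts, **put the fixed parts first**!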
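
One way to picture the checkpoint behavior is a toy cache keyed by a hash of everything before the checkpoint. This is only a mental model, not a description of how Bedrock implements its cache; `ToyPrefixCache` is a hypothetical class.

```python
import hashlib


class ToyPrefixCache:
    """Toy model: a cache entry is identified by the exact text before the checkpoint."""

    def __init__(self):
        self._keys = set()

    def lookup(self, prefix_before_checkpoint):
        key = hashlib.sha256(prefix_before_checkpoint.encode("utf-8")).hexdigest()
        hit = key in self._keys
        self._keys.add(key)  # a miss writes a new entry
        return "hit" if hit else "miss"


cache = ToyPrefixCache()
system = "SYSTEM PROMPT"
results = [
    cache.lookup(system),        # first call: nothing cached yet
    cache.lookup(system),        # identical prefix
    cache.lookup("." + system),  # one character prepended
    cache.lookup(system + "."),  # one character appended before the checkpoint
]
```

In this model any edit before the checkpoint, at the front or at the end, yields a different key, which mirrors the two misses observed in this exercise.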

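## Exercise 3: Preloading vs. Caching Strategy

Now let's compare two approaches:

### Method A: Preload into the system prompt
- Include user information and order history in the system prompt
- Drawback: a separate cache entry is created for each user

### Method B: Apply caching at the message level
- Cache the shared system prompt once for all users
- Cache user information at the message level
- Advantage: a more efficient caching strategy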

In [None]:
user_15_info = {
    'discount_persona': 'lower_priced_products',
    'last_name': '김',
    'first_name': '민수',
    'addresses': {'address':'서울시 강남구 테헤란로 152길 신사동 301호', 'zipcode':'06294'},
    'username': 'user15',
    'persona': 'seasonal_furniture_floral',
    'id': '15',
    'email': 'minsu.kim@example.com',
}

user_15_order_history = {'messageVersion': '1.0',
 'response': {'orders': [{'order_id': '207',
    'timestamp': '2020-06-25 15:58:05',
    'item_id': '408410b1-8914-4d3c-aebf-4375d7c36feb',
    'delivery_status': 'delivered'},
   {'order_id': '808',
    'timestamp': '2020-08-08 23:09:22',
    'item_id': '2b8f89d0-4078-4701-8aac-89c48d8ba392',
    'delivery_status': 'delivered'},
   {'order_id': '466',
    'timestamp': '2020-07-13 14:02:39',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '307',
    'timestamp': '2020-07-02 14:48:07',
    'item_id': '15a367e4-4039-432a-b16e-562e3d02f3db',
    'delivery_status': 'delivered'},
   {'order_id': '351',
    'timestamp': '2020-07-04 14:05:14',
    'item_id': '59719e89-9677-4201-8aad-fb0157bbd30c',
    'delivery_status': 'delivered'},
   {'order_id': '843',
    'timestamp': '2020-08-11 18:19:23',
    'item_id': '59719e89-9677-4201-8aad-fb0157bbd30c',
    'delivery_status': 'delivered'},
   {'order_id': '785',
    'timestamp': '2020-08-07 06:47:18',
    'item_id': '1fd2dbd0-c0a4-40f7-b7f7-94c0d94a751d',
    'delivery_status': 'delivered'},
   {'order_id': '256',
    'timestamp': '2020-06-29 05:15:13',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '115',
    'timestamp': '2020-06-18 09:12:01',
    'item_id': '68ba7c82-7669-4810-b3f7-44215fc4248b',
    'delivery_status': 'delivered'},
   {'order_id': '809',
    'timestamp': '2020-08-08 23:09:34',
    'item_id': '2b8f89d0-4078-4701-8aac-89c48d8ba392',
    'delivery_status': 'delivered'},
   {'order_id': '527',
    'timestamp': '2020-07-18 08:09:46',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '206',
    'timestamp': '2020-06-25 15:57:57',
    'item_id': '408410b1-8914-4d3c-aebf-4375d7c36feb',
    'delivery_status': 'delivered'},
   {'order_id': '87',
    'timestamp': '2020-06-16 17:46:43',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '117',
    'timestamp': '2020-06-18 19:20:36',
    'item_id': '2b8f89d0-4078-4701-8aac-89c48d8ba392',
    'delivery_status': 'delivered'},
   {'order_id': '807',
    'timestamp': '2020-08-08 23:09:04',
    'item_id': '2b8f89d0-4078-4701-8aac-89c48d8ba392',
    'delivery_status': 'delivered'},
   {'order_id': '848',
    'timestamp': '2020-08-12 00:38:05',
    'item_id': '408410b1-8914-4d3c-aebf-4375d7c36feb',
    'delivery_status': 'delivered'},
   {'order_id': '306',
    'timestamp': '2020-07-02 14:47:55',
    'item_id': '15a367e4-4039-432a-b16e-562e3d02f3db',
    'delivery_status': 'delivered'},
   {'order_id': '232',
    'timestamp': '2020-06-27 15:29:54',
    'item_id': '2b8f89d0-4078-4701-8aac-89c48d8ba392',
    'delivery_status': 'delivered'},
   {'order_id': '630',
    'timestamp': '2020-07-26 00:43:09',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '247',
    'timestamp': '2020-06-28 07:56:44',
    'item_id': 'aca48197-40cf-47b1-8eb3-ceeb29f7f23e',
    'delivery_status': 'delivered'},
   {'order_id': '146',
    'timestamp': '2020-06-20 19:35:52',
    'item_id': '15a367e4-4039-432a-b16e-562e3d02f3db',
    'delivery_status': 'delivered'},
   {'order_id': '383',
    'timestamp': '2020-07-06 14:38:14',
    'item_id': '59719e89-9677-4201-8aad-fb0157bbd30c',
    'delivery_status': 'delivered'},
   {'order_id': '88',
    'timestamp': '2020-06-16 17:46:50',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '629',
    'timestamp': '2020-07-26 00:42:43',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '406',
    'timestamp': '2020-07-09 08:29:14',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'delivered'},
   {'order_id': '41',
    'timestamp': '2020-06-13 16:06:31',
    'item_id': '408410b1-8914-4d3c-aebf-4375d7c36feb',
    'delivery_status': 'delivered'},
   {'order_id': '726',
    'timestamp': '2020-08-02 19:07:27',
    'item_id': '68ba7c82-7669-4810-b3f7-44215fc4248b',
    'delivery_status': 'delivered'},
   {'order_id': '405',
    'timestamp': '2020-07-09 08:29:06',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'delivered'},
   {'order_id': '853',
    'timestamp': '2020-08-12 08:50:59',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '540',
    'timestamp': '2020-07-19 10:57:03',
    'item_id': '408410b1-8914-4d3c-aebf-4375d7c36feb',
    'delivery_status': 'delivered'},
   {'order_id': '501',
    'timestamp': '2020-07-15 15:04:49',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'delivered'},
   {'order_id': '528',
    'timestamp': '2020-07-18 08:09:47',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'delivered'},
   {'order_id': '786',
    'timestamp': '2020-08-07 06:47:33',
    'item_id': '1fd2dbd0-c0a4-40f7-b7f7-94c0d94a751d',
    'delivery_status': 'delivered'},
   {'order_id': '787',
    'timestamp': '2020-08-07 06:47:36',
    'item_id': '1fd2dbd0-c0a4-40f7-b7f7-94c0d94a751d',
    'delivery_status': 'delivered'},
   {'order_id': '225',
    'timestamp': '2020-06-27 01:36:46',
    'item_id': '15a367e4-4039-432a-b16e-562e3d02f3db',
    'delivery_status': 'delivered'},
   {'order_id': '590',
    'timestamp': '2020-07-22 23:57:20',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'delivered'},
   {'order_id': '604',
    'timestamp': '2020-07-24 11:35:27',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'delivered'},
   {'order_id': '889',
    'timestamp': '2020-08-15 08:56:44',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'paymentReceived'},
   {'order_id': '894',
    'timestamp': '2020-08-15 12:01:47',
    'item_id': '15a367e4-4039-432a-b16e-562e3d02f3db',
    'delivery_status': 'paymentReceived'},
   {'order_id': '895',
    'timestamp': '2020-08-15 23:20:12',
    'item_id': '408410b1-8914-4d3c-aebf-4375d7c36feb',
    'delivery_status': 'paymentReceived'},
   {'order_id': '897',
    'timestamp': '2020-08-16 02:23:28',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'paymentReceived'},
   {'order_id': '949',
    'timestamp': '2020-08-19 23:47:29',
    'item_id': '1fd2dbd0-c0a4-40f7-b7f7-94c0d94a751d',
    'delivery_status': 'processed'},
   {'order_id': '927',
    'timestamp': '2020-08-18 07:07:45',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'processed'},
   {'order_id': '925',
    'timestamp': '2020-08-18 07:07:20',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'processed'},
   {'order_id': '926',
    'timestamp': '2020-08-18 07:07:28',
    'item_id': '4a43c5f7-090c-4cce-93fe-36062539ec38',
    'delivery_status': 'processed'},
   {'order_id': '923',
    'timestamp': '2020-08-18 03:58:34',
    'item_id': 'a56308af-abd2-41a0-9cf3-b4a040fd8d3f',
    'delivery_status': 'processed'}]}}

PRELOADED_PROMPT_USER_15 = SHOPPING_AGENT_PROMPT + f"\n\n{user_15_info}\n\nOrder History of user 15: {user_15_order_history}"

In [None]:
response = converse_bedrock(
    system_prompt=PRELOADED_PROMPT_USER_15,
    message="가장 최근에 주문한거 배송 상태 확인 좀 해줘",
    cache_system=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

In [None]:
user_314_info = {
    'discount_persona': 'lower_priced_products', 
    'last_name': '이', 
    'first_name': '준호', 
    'addresses': {'address':'서울시 종로구 세종대로 89길 청운동 205호', 'zipcode':'03032'}, 
    'username': 'user314', 
    'persona': 'books_apparel_homedecor', 
    'id': '314', 
    'email': 'junho.lee@example.com', 
    'gender': 'M', 
    'age': '31'
}

user_314_order_history = {'messageVersion': '1.0',
 'response': {'orders': [{'order_id': '828',
    'timestamp': '2020-08-10 19:24:22',
    'item_id': '49dd74a3-7d11-454b-b2b1-9d40f0fef566',
    'delivery_status': 'delivered'},
   {'order_id': '325',
    'timestamp': '2020-07-03 03:58:07',
    'item_id': '0451dee1-8367-4bf7-95a8-1e96c4f96784',
    'delivery_status': 'delivered'},
   {'order_id': '625',
    'timestamp': '2020-07-25 21:44:50',
    'item_id': '59b807a4-23d7-49f4-b3c1-64035bb3f35b',
    'delivery_status': 'delivered'},
   {'order_id': '713',
    'timestamp': '2020-08-01 21:39:02',
    'item_id': '7d676725-83fd-47a4-bd64-8acbed6a5e74',
    'delivery_status': 'delivered'},
   {'order_id': '694',
    'timestamp': '2020-07-31 10:37:03',
    'item_id': '7d676725-83fd-47a4-bd64-8acbed6a5e74',
    'delivery_status': 'delivered'},
   {'order_id': '605',
    'timestamp': '2020-07-24 13:48:40',
    'item_id': 'c108b04b-64fd-43bd-80fc-4819eb01803a',
    'delivery_status': 'delivered'},
   {'order_id': '2',
    'timestamp': '2020-06-10 17:39:55',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'delivered'},
   {'order_id': '8',
    'timestamp': '2020-06-10 23:26:46',
    'item_id': '589823cc-89f2-4bd5-8cab-e89478f530ea',
    'delivery_status': 'delivered'},
   {'order_id': '170',
    'timestamp': '2020-06-22 16:46:38',
    'item_id': 'a228856e-ed94-48e3-9af3-575d20565bde',
    'delivery_status': 'delivered'},
   {'order_id': '142',
    'timestamp': '2020-06-20 11:43:59',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'delivered'},
   {'order_id': '169',
    'timestamp': '2020-06-22 16:46:03',
    'item_id': 'a228856e-ed94-48e3-9af3-575d20565bde',
    'delivery_status': 'delivered'},
   {'order_id': '274',
    'timestamp': '2020-06-30 06:43:59',
    'item_id': '49dd74a3-7d11-454b-b2b1-9d40f0fef566',
    'delivery_status': 'delivered'},
   {'order_id': '75',
    'timestamp': '2020-06-16 04:50:34',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'delivered'},
   {'order_id': '401',
    'timestamp': '2020-07-08 23:53:03',
    'item_id': '0451dee1-8367-4bf7-95a8-1e96c4f96784',
    'delivery_status': 'delivered'},
   {'order_id': '573',
    'timestamp': '2020-07-22 04:20:07',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'delivered'},
   {'order_id': '567',
    'timestamp': '2020-07-21 06:37:06',
    'item_id': '0451dee1-8367-4bf7-95a8-1e96c4f96784',
    'delivery_status': 'delivered'},
   {'order_id': '569',
    'timestamp': '2020-07-21 18:49:18',
    'item_id': '8f08be25-5967-4a2f-82b5-324393201611',
    'delivery_status': 'delivered'},
   {'order_id': '366',
    'timestamp': '2020-07-05 17:19:28',
    'item_id': 'a228856e-ed94-48e3-9af3-575d20565bde',
    'delivery_status': 'delivered'},
   {'order_id': '650',
    'timestamp': '2020-07-27 21:56:27',
    'item_id': 'c108b04b-64fd-43bd-80fc-4819eb01803a',
    'delivery_status': 'delivered'},
   {'order_id': '556',
    'timestamp': '2020-07-20 11:16:49',
    'item_id': 'a228856e-ed94-48e3-9af3-575d20565bde',
    'delivery_status': 'delivered'},
   {'order_id': '388',
    'timestamp': '2020-07-06 21:04:43',
    'item_id': '0451dee1-8367-4bf7-95a8-1e96c4f96784',
    'delivery_status': 'delivered'},
   {'order_id': '46',
    'timestamp': '2020-06-14 02:33:11',
    'item_id': 'c108b04b-64fd-43bd-80fc-4819eb01803a',
    'delivery_status': 'delivered'},
   {'order_id': '7',
    'timestamp': '2020-06-10 23:26:38',
    'item_id': '589823cc-89f2-4bd5-8cab-e89478f530ea',
    'delivery_status': 'delivered'},
   {'order_id': '769',
    'timestamp': '2020-08-05 13:04:44',
    'item_id': 'b55e794f-1137-4de2-bcea-806041e080e6',
    'delivery_status': 'delivered'},
   {'order_id': '572',
    'timestamp': '2020-07-22 04:19:53',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'delivered'},
   {'order_id': '802',
    'timestamp': '2020-08-08 08:16:52',
    'item_id': '0451dee1-8367-4bf7-95a8-1e96c4f96784',
    'delivery_status': 'delivered'},
   {'order_id': '161',
    'timestamp': '2020-06-21 22:53:28',
    'item_id': 'e5816ea5-3ce2-4b86-9530-9b221b357b43',
    'delivery_status': 'delivered'},
   {'order_id': '235',
    'timestamp': '2020-06-27 17:30:28',
    'item_id': '59b807a4-23d7-49f4-b3c1-64035bb3f35b',
    'delivery_status': 'delivered'},
   {'order_id': '355',
    'timestamp': '2020-07-04 16:54:46',
    'item_id': 'e5816ea5-3ce2-4b86-9530-9b221b357b43',
    'delivery_status': 'delivered'},
   {'order_id': '354',
    'timestamp': '2020-07-04 16:54:31',
    'item_id': 'e5816ea5-3ce2-4b86-9530-9b221b357b43',
    'delivery_status': 'delivered'},
   {'order_id': '749',
    'timestamp': '2020-08-04 05:37:10',
    'item_id': 'e5816ea5-3ce2-4b86-9530-9b221b357b43',
    'delivery_status': 'delivered'},
   {'order_id': '977',
    'timestamp': '2020-08-21 23:34:46',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'inTransit'},
   {'order_id': '976',
    'timestamp': '2020-08-21 23:34:43',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'inTransit'},
   {'order_id': '975',
    'timestamp': '2020-08-21 23:34:35',
    'item_id': 'b840c965-bff0-481f-8d65-e3c2142a39c5',
    'delivery_status': 'inTransit'},
   {'order_id': '891',
    'timestamp': '2020-08-15 10:22:07',
    'item_id': '49dd74a3-7d11-454b-b2b1-9d40f0fef566',
    'delivery_status': 'paymentReceived'},
   {'order_id': '890',
    'timestamp': '2020-08-15 10:21:57',
    'item_id': '49dd74a3-7d11-454b-b2b1-9d40f0fef566',
    'delivery_status': 'paymentReceived'}]}}

PRELOADED_PROMPT_USER_314 = SHOPPING_AGENT_PROMPT + f"\n\n{user_314_info}\n\nOrder History of user 314: {user_314_order_history}"

In [None]:
response = converse_bedrock(
    system_prompt=PRELOADED_PROMPT_USER_314,
    message="가장 최근에 주문한거 배송 상태 확인 좀 해줘",
    cache_system=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

### 🤔 Problem found!

When per-user information is embedded in the system prompt:
- Each user gets a different cache checkpoint
- Cache efficiency drops
- Cost savings shrink

**We need a better approach!**
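
The inefficiency can be made concrete with simple arithmetic on cache writes. The sketch below is a toy model that ignores cache expiry and minimum-size rules; `cache_write_tokens` is a hypothetical helper and the token counts are illustrative values, not measurements.

```python
def cache_write_tokens(system_tokens, user_tokens, preload_in_system):
    """Tokens written to the cache when each user makes a first call."""
    if preload_in_system:
        # Per-user data sits before the checkpoint, so every user's prefix
        # differs and the system prompt is written again for each of them.
        return sum(system_tokens + u for u in user_tokens)
    # Shared system prompt is written once; only per-user data is added.
    return system_tokens + sum(user_tokens)


system_tokens = 1630        # illustrative size of the shared system prompt
user_tokens = [4500, 3230]  # illustrative per-user context sizes

preloaded = cache_write_tokens(system_tokens, user_tokens, preload_in_system=True)
layered = cache_write_tokens(system_tokens, user_tokens, preload_in_system=False)
extra = preloaded - layered  # system prompt rewritten once per additional user
```

In this model the preloading approach rewrites the system prompt for every user after the first, so the overhead grows linearly with the number of users.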

---

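## Exercise 4: Message-Level Caching Optimization

### Improved strategy: layered caching
1. **System prompt**: shared by all users → maximum reuse
2. **Message level**: cache per-user information → per-session optimization

### Message structure design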
```json
[
  {
    "role": "user",
    "content": [
      {"text": "user information + order history"},
      {"cachePoint": {"type": "default"}}
    ]
  },
  {"role": "assistant", "content": [{"text": "acknowledged"}]},
  {
    "role": "user",
    "content": [
      {"text": "the actual question"},
      {"cachePoint": {"type": "default"}}
    ]
  }
]
```

In [None]:
messages_user15 = [
    {
        "role": "user",
        "content": [
            {
                "text": f"{json.dumps(user_15_info)}\n\nOrder History of user 15: {json.dumps(user_15_order_history)}"
            },
            {
                "cachePoint": {
                    "type": "default"
                }
            }
        ]
    },
    {
        "role": "assistant",
        "content": [
            {
                "text": "acknowledged"
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {
                "text": "가장 최근에 주문한거 배송 상태 확인 좀 해줘"
            },
            {
                "cachePoint": {
                    "type": "default"
                }
            }
        ]
    }
]
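
Because the same three-message layout is needed for every user, it can be generated by a small helper instead of being written out by hand. A minimal sketch: `build_preloaded_messages` is a hypothetical function that reproduces the structure of the list above.

```python
import json


def build_preloaded_messages(user_id, user_info, order_history, question):
    """Build the preload / acknowledge / question message list with cache points."""

    def cache_point():
        # Return a fresh dict each time so the blocks are not shared objects.
        return {"cachePoint": {"type": "default"}}

    context = (
        f"{json.dumps(user_info)}\n\n"
        f"Order History of user {user_id}: {json.dumps(order_history)}"
    )
    return [
        {"role": "user", "content": [{"text": context}, cache_point()]},
        {"role": "assistant", "content": [{"text": "acknowledged"}]},
        {"role": "user", "content": [{"text": question}, cache_point()]},
    ]
```

For example, `build_preloaded_messages("15", user_15_info, user_15_order_history, question)` would yield a list with the same shape as `messages_user15`.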

In [None]:
response = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=messages_user15,
    cache_system=True,
    cache_messages=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

messages_user15.append(response['output']["message"])
messages_user15.append({
    "role": "user",
    "content": [
        {
            "text": "그 전에 주문한거는?"
        }
    ]
})

response_2 = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=messages_user15,
    cache_system=True,
    cache_messages=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response_2['output']}")
print(f"Token usage: {response_2['usage']}")
print(f"Latency: {response_2['metrics']['latencyMs']}")


messages_user15.append(response_2['output']["message"])
messages_user15.append({
    "role": "user",
    "content": [
        {
            "text": "그 전에 주문한거는?"
        }
    ]
})

response_3 = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=messages_user15,
    cache_system=True,
    cache_messages=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response_3['output']}")
print(f"Token usage: {response_3['usage']}")
print(f"Latency: {response_3['metrics']['latencyMs']}")

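### 🎉 Optimization succeeded!

Message-level caching produced the following results:

#### Cache efficiency
- **1st call**: `cacheWriteInputTokens: 6,185` (initial cache creation)
- **2nd call**: `cacheReadInputTokens: 6,185`, `cacheWriteInputTokens: 136` (read from cache; only the increment is written)
- **3rd call**: `cacheReadInputTokens: 6,321`, `cacheWriteInputTokens: 136` (read from cache; only the increment is written)

#### Benefits in multi-turn conversations
- The longer the conversation, the more of the prompt is served from cache
- Lower response latency
- Substantially lower token cost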
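
The reported numbers follow a simple pattern: what one turn reads plus what it writes is what the next turn reads. The sketch below checks this on the figures above, assuming the first call read nothing from the cache (the source only reports its write count). `prefix_growth_is_consistent` is a hypothetical helper.

```python
def prefix_growth_is_consistent(turns):
    """Check that each turn reads exactly what the previous turn read plus wrote."""
    return all(
        cur["read"] == prev["read"] + prev["write"]
        for prev, cur in zip(turns, turns[1:])
    )


reported = [
    {"read": 0, "write": 6185},     # 1st call (read count assumed to be 0)
    {"read": 6185, "write": 136},   # 2nd call
    {"read": 6321, "write": 136},   # 3rd call
]

consistent = prefix_growth_is_consistent(reported)

# Share of cache-related tokens that were read rather than newly written.
read_share = [t["read"] / (t["read"] + t["write"]) for t in reported]
```

From the second call onward, roughly 98% of the cache-related tokens are reads, and the share keeps rising as the conversation grows.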

In [None]:
messages_user314 = [
    {
        "role": "user",
        "content": [
            {
                "text": f"{json.dumps(user_314_info)}\n\nOrder History of user 314: {json.dumps(user_314_order_history)}"
            },
            {
                "cachePoint": {
                    "type": "default"
                }
            }
        ]
    },
    {
        "role": "assistant",
        "content": [
            {
                "text": "acknowledged"
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {
                "text": "가장 최근에 주문한거 배송 상태 확인 좀 해줘"
            },
            {
                "cachePoint": {
                    "type": "default"
                }
            }
        ]
    }
]

response = converse_bedrock(
    system_prompt=SHOPPING_AGENT_PROMPT,
    message=messages_user314,
    cache_system=True,
    cache_messages=True,
    model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    inference_config=inference_config
)

print(f"LLM Response: {response['output']}")
print(f"Token usage: {response['usage']}")
print(f"Latency: {response['metrics']['latencyMs']}")

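### 🎉 Optimization succeeded!

The system prompt cached during the User 15 session is a cache hit here.

#### Cache efficiency
- **1st call**: `cacheReadInputTokens: 1,630`, `cacheWriteInputTokens: 3,230` (the system prompt is a cache hit; User 314's information is written to the cache separately)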

---

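## 🎯 Workshop Summary

### ✅ Verified optimization patterns

#### 1. System prompt caching
- Apply it to long prompts shared by all users
- Enable it simply with `cache_system=True`
- Large benefit on prompts of 8,000+ characters

#### 2. Order matters
- The prompt must match 100% from the start for a cache hit
- Put fixed parts first and variable parts last
- A single changed character causes a cache miss!

#### 3. Layered caching strategy
- System: cache the shared prompt
- Messages: cache per-user context
- Greatest benefit in multi-turn conversations

### 💡 Improvements you can apply right away

1. **Apply caching to existing long prompts**

2. **Keep per-user information at the message level**

3. **Cache conversation history to cut cost**

**Remember**: even a single cache hit already saves cost!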
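
The break-even claim can be checked with a small cost model. The multipliers below are illustrative placeholders, not published prices; actual cache write and read rates vary by model and should be taken from the pricing page. `relative_input_cost` is a hypothetical helper that expresses cost in units of one uncached pass over the prefix.

```python
def relative_input_cost(calls, write_multiplier, read_multiplier):
    """Cost of `calls` cached requests, in units of one uncached prefix."""
    if calls < 1:
        return 0.0
    # The first call writes the cache; every later call reads it.
    return write_multiplier + (calls - 1) * read_multiplier


# Illustrative assumption: writes cost 1.25x and reads 0.1x of normal input.
WRITE_MULT = 1.25
READ_MULT = 0.10

one_call = relative_input_cost(1, WRITE_MULT, READ_MULT)   # cache write only
two_calls = relative_input_cost(2, WRITE_MULT, READ_MULT)  # one write + one hit
```

Under these assumed multipliers, a single call costs slightly more than an uncached one, while two calls already cost less than the 2.0 units that two uncached calls would.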