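# LLM Response Time Optimization Workshop

Welcome to this hands-on workshop on practical techniques for optimizing LLM response time in production.

## Workshop Goals
- 🔍 Analyze the performance characteristics of different LLMs
- ⚡ Measure how input and output tokens affect response time
- 🛠️ Implement strategies that improve efficiency by reducing tool usage
- 📊 Apply optimization techniques to real-world scenarios

## Prerequisites

Install the libraries required for this workshop:
- AWS Bedrock SDK
- Strands agent framework
- Other utility libraries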

In [None]:
%pip install -r requirements.txt

In [None]:
from common_functions import converse_bedrock
from config import BedrockModelId
import json
from typing import List
import boto3
from boto3.dynamodb.conditions import Key
from strands import Agent, tool
from strands.models import BedrockModel

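# Part 1: Baseline Performance Test

## Comparing the Performance of Different LLMs

We measure the response time of several models under identical conditions to establish a baseline.

### Models under test:
- **Claude 4 Sonnet**
- **Claude 3.7 Sonnet**
- **Claude 3.5 Sonnet**
- **Claude 3.5 Haiku**
- **Amazon Nova Pro**
- **Amazon Nova Lite**

### Test method:
- Run each model 3 times to gauge its average latency
- Use the same system prompt and user prompt for every model
- Keep the response very short (a single character) to isolate raw model latency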

In [None]:
# Models under test, keyed by the label used in the results dict.
MODELS = {
    "claude_4_sonnet": BedrockModelId.CLAUDE_4_SONNET_1_0,
    "claude_3_7_sonnet": BedrockModelId.CLAUDE_3_7_SONNET_1_0,
    "claude_3_5_sonnet": BedrockModelId.CLAUDE_3_5_SONNET_1_0,
    "claude_3_5_haiku": BedrockModelId.CLAUDE_3_5_HAIKU_1_0,
    "amazon_nova_pro": BedrockModelId.AMAZON_NOVA_PRO,
    "amazon_nova_lite": BedrockModelId.AMAZON_NOVA_LITE,
}


def test_llm_response_time(system_prompt, user_prompt, runs=3):
    """Call every model `runs` times and collect responses, token usage, and latency."""
    # Sets deduplicate identical responses/usage across runs;
    # the latency list keeps every individual sample.
    results = {
        name: {"response": set(), "usage": set(), "latency": []}
        for name in MODELS
    }

    for _ in range(runs):
        for name, model in MODELS.items():
            response = converse_bedrock(system_prompt, user_prompt, model_id=model.value)
            results[name]["response"].add(json.dumps(response["output"]))
            results[name]["usage"].add(json.dumps(response["usage"]))
            results[name]["latency"].append(response["metrics"].get("latencyMs"))

    for model, model_results in results.items():
        print(f"{model}:")
        print(f"Response: {model_results['response']}")
        print(f"Usage: {model_results['usage']}")
        print(f"Latency: {model_results['latency']}")
        print("-" * 50)

    return results

In [None]:
system_prompt="Whatever the user asks, respond with 1 and 1 only"
user_prompt="Test"

results = test_llm_response_time(system_prompt, user_prompt)

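### 📊 Result Analysis

Because each response is only a single character long, the latency measured above approximates each model's **Time To First Token (TTFT)**.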
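To compare models at a glance, the raw latency lists can be reduced to a few summary statistics. The sketch below assumes the `results` shape returned by `test_llm_response_time`; `summarize_latency` and the sample numbers are hypothetical and only illustrate the idea.

```python
from statistics import mean, median


def summarize_latency(results):
    """Return per-model mean/median/min/max latency (ms) from collected samples."""
    summary = {}
    for model, data in results.items():
        # latencyMs may be missing (None) if a response carried no metrics.
        samples = [ms for ms in data["latency"] if ms is not None]
        if not samples:
            continue  # nothing usable for this model
        summary[model] = {
            "mean_ms": mean(samples),
            "median_ms": median(samples),
            "min_ms": min(samples),
            "max_ms": max(samples),
        }
    return summary


# Hypothetical sample data in the same shape as the results dict.
sample_results = {
    "model_a": {"latency": [400, 500, 600]},
    "model_b": {"latency": [900, None, 1100]},
}

summary = summarize_latency(sample_results)
# Print the fastest model first.
for model, stats in sorted(summary.items(), key=lambda kv: kv[1]["mean_ms"]):
    print(f"{model}: mean={stats['mean_ms']:.0f} ms, median={stats['median_ms']:.0f} ms")
```

With only three samples per model, the median is a useful check against a single slow outlier such as a cold start.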

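# How Input Tokens Affect Response Time

## Experiment: The Effect of Long Inputs on Processing Time

Let's find out how the number of input tokens affects response time.

### Hypotheses
- **H0**: The number of input tokens has no significant effect on response time
- **H1**: More input tokens lead to longer processing time

### Experimental Design
- **Short input**: ~23 tokens (a simple prompt)
- **Long input**: ~965 tokens (1000-word random text)
- **Fixed output**: the same short response in both cases, to remove the effect of output length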

In [None]:
random_prompt = """# A 1000-Word Random Sentence
The peculiar purple elephant who lived beneath the sprawling oak tree that grew beside the ancient stone bridge crossing the meandering river filled with shimmering silver fish swimming in elaborate patterns around submerged logs covered in emerald moss dotted with tiny yellow flowers blooming despite the shadowy underwater environment where curious otters played hide-and-seek among the swaying reeds while overhead magnificent eagles soared through cotton-candy clouds drifting lazily across the azure sky painted with streaks of golden sunlight filtering through branches heavy with crimson leaves dancing in the gentle autumn breeze that carried the sweet scent of honeysuckle mixed with the earthy aroma of freshly turned soil where industrious ants marched in perfect formation carrying crumbs from the picnic basket forgotten by the young couple who had been feeding breadcrumbs to the chattering squirrels gathering acorns for winter while their faithful golden retriever chased butterflies fluttering between clusters of wildflowers blooming in vibrant shades of purple, pink, and orange across the meadow where children often flew colorful kites shaped like dragons and eagles on windy afternoons when the weather was perfect for outdoor adventures and exploring the mysterious caves hidden behind the waterfall cascading down the rocky cliff face worn smooth by centuries of rushing water creating rainbow mists that sparkled like diamonds in the afternoon sun while hikers paused to photograph the breathtaking scenery and listen to the symphony of birdsong echoing through the forest where ancient trees stood like silent sentinels guarding secrets whispered by the wind through their branches heavy with Spanish moss swaying hypnotically above the forest floor carpeted with fallen leaves crunching softly underfoot as woodland creatures scurried about their daily business of foraging for nuts and berries growing wild among the ferns and mushrooms sprouting from rotting logs that 
provided homes for countless insects buzzing and crawling in the perpetual cycle of life and death that governs all living things in this enchanted wilderness where time seems to move differently and worries fade away like morning mist rising from the pond where lily pads float serenely on the still surface reflecting the images of clouds and trees and occasionally disturbed by the gentle splash of a frog jumping from pad to pad or a fish rising to catch an unsuspecting fly hovering just above the water where dragonflies dart and dance in their eternal ballet of survival and reproduction while turtles sunbathe on half-submerged logs and great blue herons stand motionless as statues waiting patiently for the perfect moment to strike at their prey with lightning-fast precision that would make even the most skilled archer envious of such natural talent honed by millions of years of evolution that has shaped every creature in this delicate ecosystem where predator and prey exist in careful balance maintained by the invisible hand of nature orchestrating the complex web of relationships between all living things from the tiniest microorganisms in the soil to the majestic deer grazing peacefully in sun-dappled clearings where wildflowers bloom in profusion attracting bees and butterflies whose pollination ensures the continuation of the endless cycle of growth and renewal that defines the natural world in all its magnificent complexity and beauty that inspires artists and poets and philosophers to contemplate the deeper meanings of existence while scientists study the intricate mechanisms that govern biological processes and environmental interactions in their quest to understand the fundamental principles underlying the miraculous phenomenon we call life on this remarkable planet spinning through space in its eternal orbit around the sun that provides the energy powering all the complex chemical reactions that sustain the incredible diversity of organisms inhabiting 
every conceivable niche from the deepest ocean trenches to the highest mountain peaks where hardy plants and animals have adapted to survive in the most extreme conditions imaginable demonstrating the remarkable resilience and creativity of evolution in finding solutions to seemingly impossible challenges through the gradual accumulation of beneficial mutations over countless generations resulting in the stunning array of life forms that populate our world today including the purple elephant who started this meandering journey through the interconnected web of natural wonders that surround us everywhere if we only take the time to notice and appreciate the extraordinary beauty and complexity hidden in plain sight all around us every single day of our lives had finally decided after much deliberation and careful consideration of all the available options to venture forth from his comfortable home beneath the oak tree to explore the wider world beyond the familiar boundaries of his peaceful forest sanctuary."""

long_input_results = test_llm_response_time(system_prompt, random_prompt)

In [None]:
for key in results.keys():
    print(key)
    print(f"Short input latency: {results[key]['latency']}")
    print(f"Long input latency: {long_input_results[key]['latency']}")
    print("-"*50)

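### 📊 Analysis of Input Token Impact

Although the number of input tokens grew roughly 40-fold, the response time did not grow by a factor of 40.

**Conclusion**: Thanks to the transformer's parallel processing of input tokens, input length has only a limited effect on TTFT.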
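One way to make this comparison concrete is to put token growth and latency growth side by side. In the sketch below the token counts come from the experimental design (~23 and ~965 tokens), while the two latency values are hypothetical placeholders; substitute the averages you measured.

```python
def growth_factors(short_ms, long_ms, short_tokens, long_tokens):
    """Return (token growth, latency growth) between the short and long runs."""
    token_growth = long_tokens / short_tokens
    latency_growth = long_ms / short_ms
    return token_growth, latency_growth


# Token counts from the experiment design; latencies are hypothetical.
SHORT_TOKENS, LONG_TOKENS = 23, 965
short_latency_ms, long_latency_ms = 500.0, 650.0

token_growth, latency_growth = growth_factors(
    short_latency_ms, long_latency_ms, SHORT_TOKENS, LONG_TOKENS
)
print(f"Input tokens grew {token_growth:.1f}x")
print(f"Latency grew {latency_growth:.2f}x")
```

If latency growth stays far below token growth, the data supports the conclusion that input length has a limited effect on TTFT at this scale.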

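# How Output Tokens Affect Response Time

## Reality: Output Length Drives Total Response Time

Output tokens are generated sequentially, so they directly affect total response time.

### What This Means in Production

**✅ Streaming scenarios** (user-facing)
- TTFT (time to first token) is what matters most
- Users start seeing the response right away
- Output length matters less

**❌ Non-streaming scenarios** (agent reasoning)
- The full response must complete before anything else can proceed
- Long reasoning delays the entire pipeline
- **Optimizing output length is essential here!**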
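A first-order way to reason about the two scenarios is to split latency into TTFT plus sequential decoding time. The sketch below is a simplified model, not a measurement; `estimate_total_latency` and the TTFT and tokens-per-second values are hypothetical.

```python
def estimate_total_latency(ttft_s, output_tokens, tokens_per_second):
    """First-order estimate: time to first token plus sequential decoding time."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_s + output_tokens / tokens_per_second


# Hypothetical model characteristics.
TTFT_S = 0.5   # seconds until the first token arrives
TPS = 50.0     # output tokens generated per second

short_answer = estimate_total_latency(TTFT_S, 20, TPS)    # 0.5 + 20 / 50
long_answer = estimate_total_latency(TTFT_S, 500, TPS)    # 0.5 + 500 / 50

print(f"Streaming, perceived wait:       {TTFT_S:.1f} s")
print(f"Non-streaming, 20-token output:  {short_answer:.1f} s")
print(f"Non-streaming, 500-token output: {long_answer:.1f} s")
```

Under this model a streaming user waits only for TTFT, while a non-streaming caller such as an agent step pays for every output token.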

In [None]:
system_prompt_fixed_output="""This is a response time test for measuring TPS. Whatever the user asks, respond with this sentence and this sentence only: 
The mischievous cat wearing tiny purple socks who lived in the cluttered attic above the old bakery where warm cinnamon rolls baked every morning while jazz music played softly from the vintage radio sitting precariously on the windowsill overlooking the bustling street filled with colorful umbrellas dancing in the gentle rain that created puddles reflecting neon signs from the corner deli where the friendly owner always saved day-old sandwiches for stray dogs wandering through the neighborhood searching for scraps and kindness from strangers who might pause long enough to offer a gentle pat or encouraging word decided to chase butterflies."""

In [None]:
results = test_llm_response_time(system_prompt_fixed_output, user_prompt)

So far, we have looked at TTFT and at how input and output tokens affect response time.

Next, we turn these findings into practical strategies.

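# Part 2: Applying What We Know About LLM Latency to Reduce Response Time

## How Tool Calls Affect Response Time

### The Real Cost of a Tool Call
Each tool call goes through the following sequential steps:
1. 🤖 **LLM inference pauses** → the model decides to use a tool
2. 🔧 **Tool request** → external API/DB call
3. 📊 **Waiting for the tool result** → network I/O time
4. 🤖 **LLM is called again** → new inference that includes the result

**The core problem**: every tool call stops and restarts the entire pipeline!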
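The four steps above can be sketched as a simple additive model: a turn with N tool calls needs N + 1 LLM calls, and everything runs back to back. The function name and all timings below are hypothetical and only illustrate where the time goes.

```python
def agent_turn_latency(llm_call_s, tool_call_s):
    """Total latency of one agent turn where LLM calls and tool calls alternate.

    llm_call_s:  latency of each LLM call, in order (one more than tool calls)
    tool_call_s: latency of each tool execution, in order
    """
    if len(llm_call_s) != len(tool_call_s) + 1:
        raise ValueError("expected exactly one more LLM call than tool calls")
    # Nothing overlaps: each step waits for the previous one to finish.
    return sum(llm_call_s) + sum(tool_call_s)


# Hypothetical timings: one tool call sandwiched between two LLM calls.
with_tool = agent_turn_latency(llm_call_s=[3.0, 4.0], tool_call_s=[0.05])
without_tool = agent_turn_latency(llm_call_s=[3.5], tool_call_s=[])

tool_share = 0.05 / with_tool
print(f"With one tool call: {with_tool:.2f} s (tool itself: {tool_share:.1%})")
print(f"Without tool calls: {without_tool:.2f} s")
```

In this model the tool execution itself is a small fraction of the turn; most of the added time comes from the extra LLM call it forces.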

---

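## Strategy #1: Eliminate Tool Calls Entirely with Preloading

### Key Idea: "Prepare in advance whatever we can already know"

#### Shopping Assistant Scenario
The moment a user lands on the shopping site, we already know a lot:

**🔍 Information available immediately:**
- User ID, name, address
- Recent order history (past 6 months)
- Wishlist items
- Available coupons
- Shipping address book

**❓ Key question**: Do we really need to query the DB every time the user asks "Show me my recent orders"?

### Conventional Approach vs. Optimized Approach

#### ❌ Conventional Approach (Using Tools)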

In [None]:
@tool
def get_orders_with_user_id(user_id: str) -> dict:
    """
    Get the orders for a user.
    """
    table_name = "OrdersTable"
    dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
    table = dynamodb.Table(table_name)
    # Query the table for orders with the given user_id
    try:
        response = table.query(
            IndexName='UserStatusIndex',
            KeyConditionExpression=Key('user_id').eq(int(user_id))
        )
        
        orders = response.get('Items', [])
        
        # Format orders for response
        formatted_orders = []
        for order in orders:
            formatted_order = {
                "order_id": order.get('order_id'),
                "timestamp": order.get('timestamp'),
                "item_id": order.get('item_id'),
                "delivery_status": order.get('delivery_status')
            }
            formatted_orders.append(formatted_order)
        
        return {
            "messageVersion": "1.0",
            "response": {
                "orders": formatted_orders
            }
        }
    except Exception as e:
        error_msg = f"Error getting orders: {str(e)}"
        return {
            "messageVersion": "1.0",
            "response": {
                "error": error_msg
            }
        }

@tool
def get_user_info(user_id: str) -> str:
    """
    Get the stored information for a user (such as name and address).
    """

    table_name = "UsersTable"
    dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
    table = dynamodb.Table(table_name)
    if not user_id:
        return "Error: user_id is required"
    
    try:
        response = table.get_item(
            Key={
                'id': user_id
            }
        )

        user = response.get('Item')

        if not user:
            return "Error: User not found"
        
        return "User info: " + str(user)

    except Exception as e:
        error_msg = f"Error getting user address: {str(e)}"
        print(error_msg)
        return "Error: " + error_msg

get_orders_with_user_id("314")

We have defined one tool each for reading user information and order history from DynamoDB.

Let's check how long each tool takes to run.

In [None]:
%%timeit
user_info = get_user_info("15")

In [None]:
%%timeit
user_orders = get_orders_with_user_id("15")

Reading from the DB takes only milliseconds, so it is not the slow part.

Now let's build an agent that uses these tools.

In [None]:
agent_prompt = """You are a customer service agent that manages user accounts and order data through DynamoDB tables.

CORE BEHAVIOR:
- Always use the appropriate function based on the user's request

Available actions:
- get_orders_with_user_id: get the orders for a user
- get_user_info: get the user's information

ERROR HANDLING:
- If user data not found, explain clearly and offer alternatives
- For failed updates, provide guidance on correct format or requirements
- If orders are empty, inform user appropriately

RESPONSE FORMAT:
For orders:
Order Number: 123
Order Date: 2025-01-01
Order Status: Delivered

For user info:
User ID: 123
User Name: John Doe
User Email: john.doe@example.com
User Address: 123 Main St, Anytown, USA, 12345

Do NOT include any other text or XML tags like <thinking>."""

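#### Flow of an Agent That Looks Up Information with Tools

User: "Show me my recent orders"

↓

LLM: "I'll look up the order history"

↓

Tool call: get_orders_with_user_id(user_id)

↓

DB query

↓

LLM called again: formats the order history and responds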

In [None]:
customer_agent = Agent(
    system_prompt=agent_prompt,
    tools=[get_orders_with_user_id, get_user_info],
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    )
)

# Prompt (Korean): "Show me this user's 10 most recent orders. UserID: 15"
response = customer_agent("이 사용자의 최근 주문 내역 10개 알려줘 UserID: 15")
print(response.metrics.accumulated_usage)
print(response.metrics.accumulated_metrics)

### 📊 Results with Tool Usage

- **Response time**: ~7.9 s
- **Total tokens**: 6,078
- **Tool calls**: 1 (DynamoDB query)

**Analysis**: The tool call and LLM inference run sequentially, which lengthens the total time

In [None]:
%%timeit

customer_agent = Agent(
    system_prompt=agent_prompt,
    tools=[get_orders_with_user_id, get_user_info],
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    ),
    callback_handler=None
)

# Prompt (Korean): "Show me this user's 10 most recent orders. UserID: 15"
response = customer_agent("이 사용자의 최근 주문 내역 10개 알려줘 UserID: 15")

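Results vary by environment, but in our test environment the average was about 7.9 seconds.

Next, we measure the preloading approach, which skips tools and instead reads the information ahead of time and hands it to the agent.

#### ✅ Optimized Approach (Preloading)

**Key idea**: Once the user is logged in, we can already know their basic information!

### Implementation
1. Fetch order and user information ahead of time, when the user connects
2. Include this information in the system prompt
3. The LLM responds immediately, without tools

### Benefits
- Tool calls are eliminated entirely
- Immediate responses

### Flow

When the user connects: load user information in the background

↓

Include it in the system prompt:
"User info: {...}
Recent orders: {...}"

↓

User: "Show me my recent orders"

↓

LLM: responds immediately (no tool call)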
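The flow above can be sketched as a small helper that assembles the system prompt from data fetched at login. Everything here is illustrative: `build_preloaded_prompt`, the base prompt, and the sample records are hypothetical, and the sketch assumes order timestamps are ISO 8601 strings so that they sort chronologically.

```python
import json

BASE_PROMPT = "You are a customer service agent."  # placeholder base prompt


def build_preloaded_prompt(base_prompt, user_info, orders, max_orders=10):
    """Embed preloaded user info and the most recent orders in the system prompt."""
    # Keep only the newest orders so the prompt (and input token count) stays bounded.
    recent = sorted(orders, key=lambda o: o["timestamp"], reverse=True)[:max_orders]
    # default=str covers values json cannot serialize natively
    # (for example Decimal numbers returned by DynamoDB).
    user_json = json.dumps(user_info, ensure_ascii=False, default=str)
    orders_json = json.dumps(recent, ensure_ascii=False, default=str)
    return f"{base_prompt}\n\nUser Info:\n{user_json}\n\nUser Orders:\n{orders_json}"


# Hypothetical records standing in for the data loaded at login.
user_info = {"id": "15", "name": "Jane Doe"}
orders = [
    {"order_id": "A1", "timestamp": "2025-01-03T10:00:00", "delivery_status": "Delivered"},
    {"order_id": "A2", "timestamp": "2025-03-15T09:30:00", "delivery_status": "Shipped"},
    {"order_id": "A3", "timestamp": "2025-02-01T12:00:00", "delivery_status": "Delivered"},
]

prompt = build_preloaded_prompt(BASE_PROMPT, user_info, orders, max_orders=2)
print(prompt)
```

Capping the number of embedded orders is a design choice: preloading trades a tool call for extra input tokens, so the preloaded data should stay small enough that the trade remains favorable.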

In [None]:
frontloaded_agent_prompt = f"""You are a customer service agent that manages user accounts and order data through DynamoDB tables.

ERROR HANDLING:
- If user data not found, explain clearly and offer alternatives
- For failed updates, provide guidance on correct format or requirements
- If orders are empty, inform user appropriately

RESPONSE FORMAT:
For orders:
Order Number: 123
Order Date: 2025-01-01
Order Status: Delivered

For user info:
User ID: 123
User Name: John Doe
User Email: john.doe@example.com
User Address: 123 Main St, Anytown, USA, 12345

Do NOT include any other text.

User Info: 
{get_user_info("15")}

User Orders:
{get_orders_with_user_id("15")}"""

In [None]:
frontloaded_agent = Agent(
    system_prompt=frontloaded_agent_prompt,
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    )
)

# Prompt (Korean): "Show me this user's 10 most recent orders"
frontloaded_response = frontloaded_agent("이 사용자의 가장 최근 주문 내역 10개 알려줘")
print(frontloaded_response.metrics.accumulated_usage)
print(frontloaded_response.metrics.accumulated_metrics)

In [None]:
%%timeit

frontloaded_agent = Agent(
    system_prompt=frontloaded_agent_prompt,
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    ),
    callback_handler=None
)

# Prompt (Korean): "Show me this user's 10 most recent orders"
frontloaded_response = frontloaded_agent("이 사용자의 가장 최근 주문 내역 10개 알려줘")

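### 📊 Comparison: Tool Usage vs. Preloading

| Approach | Response time | Token usage | Tool calls |
|----------|---------------|-------------|------------|
| Tool usage | ~7.9 s | 6,078 | 1 |
| Preloading | ~3.7 s | 4,863 | 0 |

**Improvements**:
- ⚡ **53% shorter response time**
- 💰 **20% lower token cost**
- 🔧 **Tool calls eliminated entirely**

**Key Learning**: For predictable data, preloading is far more efficient!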
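The improvement percentages follow directly from the numbers in the table. A minimal sketch of the arithmetic, using a hypothetical `percent_reduction` helper:

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Values taken from the comparison table.
latency_saving = percent_reduction(7.9, 3.7)     # seconds
token_saving = percent_reduction(6078, 4863)     # total tokens

print(f"Response time reduced by {latency_saving:.0f}%")
print(f"Token usage reduced by {token_saving:.0f}%")
```

Running the same calculation on your own measurements gives a like-for-like comparison, since absolute latencies vary by environment.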

---

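## Strategy #2: Fetch as Much Information as Possible in a Single Tool Call

### Key Idea: "If we have to make a call anyway, fetch everything at once"

#### Product Search Scenario
When a user asks "Recommend T-shirts with good reviews", we need the following:

**🔍 Required data:**
- T-shirt product list (name, price, stock)
- Review score and summary for each product
- Product image URLs
- Discount information
- Shipping availability

**❓ Key question**: Do we have to look up each piece of information separately?

### Conventional Approach vs. Optimized Approach

#### ❌ Conventional Approach (Separate Tool Calls)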
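One possible answer is to join the data server-side and return it from a single call. The sketch below is an assumption-laden illustration, not the workshop's implementation: `merge_products_with_reviews` is a hypothetical helper, and the input shapes (search hits with a `_source` document, reviews keyed by `product_id`) are assumed. In an agent it would sit behind one tool instead of two.

```python
def merge_products_with_reviews(search_hits, reviews, min_rating=0.0):
    """Join search hits with their reviews and return the best-rated products first."""
    reviews_by_id = {review["product_id"]: review for review in reviews}
    merged = []
    for hit in search_hits:
        product = hit["_source"]
        review = reviews_by_id.get(product["id"])
        # Skip products with no review or with a rating below the threshold.
        if review is None or review["avg_rating"] < min_rating:
            continue
        merged.append({
            "id": product["id"],
            "name": product["name"],
            "price": product["price"],
            "avg_rating": review["avg_rating"],
            "review_summary": review["review_summary"],
        })
    return sorted(merged, key=lambda p: p["avg_rating"], reverse=True)


# Hypothetical sample data in the assumed shapes.
sample_hits = [
    {"_id": "p1", "_source": {"id": "p1", "name": "Basic Tee", "price": 20.0}},
    {"_id": "p2", "_source": {"id": "p2", "name": "Bamboo Tee", "price": 32.5}},
    {"_id": "p3", "_source": {"id": "p3", "name": "Graphic Tee", "price": 25.0}},
]
sample_reviews = [
    {"product_id": "p1", "avg_rating": 4.1, "review_summary": "Soft but shrinks."},
    {"product_id": "p2", "avg_rating": 4.6, "review_summary": "Very soft fabric."},
]

top_products = merge_products_with_reviews(sample_hits, sample_reviews, min_rating=4.0)
for product in top_products:
    print(product["name"], product["avg_rating"])
```

Returning only the fields the model needs also keeps the tool result small, which limits the input tokens of the follow-up LLM call.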

In [None]:
@tool
def keyword_product_search(query_keywords: str) -> list:
    """
    Search for products by keywords.
    Args:
        query_keywords: str
    Returns:
        List[dict]
    """
    return [
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_001",
        "_score": 9.123456,
        "_source": {
          "id": "tshirt_001",
          "image_url": "https://cdn.example.com/images/organic-cotton-basic-tee-white.jpg",
          "name": "Patagonia Organic Cotton Basic T-Shirt",
          "description": "Sustainably made organic cotton t-shirt in classic fit, perfect for casual wear and everyday comfort",
          "price": 35.00,
          "gender_affinity": "unisex",
          "current_stock": 78
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_002",
        "_score": 8.789123,
        "_source": {
          "id": "tshirt_002",
          "image_url": "https://cdn.example.com/images/vintage-band-tee-rolling-stones.jpg",
          "name": "Vintage Rolling Stones Concert T-Shirt",
          "description": "Authentic vintage-style band t-shirt with distressed print and soft cotton blend fabric",
          "price": 28.99,
          "gender_affinity": "unisex",
          "current_stock": 42
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_003",
        "_score": 8.456789,
        "_source": {
          "id": "tshirt_003",
          "image_url": "https://cdn.example.com/images/moisture-wicking-performance-tee-navy.jpg",
          "name": "Nike Dri-FIT Performance T-Shirt",
          "description": "Moisture-wicking athletic t-shirt designed for workouts and active lifestyle with breathable fabric",
          "price": 25.00,
          "gender_affinity": "male",
          "current_stock": 156
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_004",
        "_score": 8.124567,
        "_source": {
          "id": "tshirt_004",
          "image_url": "https://cdn.example.com/images/bamboo-fiber-tee-soft-gray.jpg",
          "name": "Boody Bamboo Fiber T-Shirt",
          "description": "Ultra-soft bamboo fiber t-shirt with natural antibacterial properties and temperature regulation",
          "price": 32.50,
          "gender_affinity": "female",
          "current_stock": 89
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_005",
        "_score": 7.891234,
        "_source": {
          "id": "tshirt_005",
          "image_url": "https://cdn.example.com/images/graphic-tee-mountain-design-black.jpg",
          "name": "REI Mountain Adventure Graphic T-Shirt",
          "description": "Outdoor-inspired graphic t-shirt with mountain print, made from recycled materials",
          "price": 24.95,
          "gender_affinity": "unisex",
          "current_stock": 67
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_006",
        "_score": 7.567890,
        "_source": {
          "id": "tshirt_006",
          "image_url": "https://cdn.example.com/images/premium-cotton-henley-burgundy.jpg",
          "name": "J.Crew Premium Cotton Henley T-Shirt",
          "description": "Classic henley-style t-shirt made from premium ring-spun cotton with vintage wash finish",
          "price": 39.50,
          "gender_affinity": "male",
          "current_stock": 34
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_007",
        "_score": 7.234567,
        "_source": {
          "id": "tshirt_007",
          "image_url": "https://cdn.example.com/images/tie-dye-festival-tee-rainbow.jpg",
          "name": "Urban Outfitters Tie-Dye Festival T-Shirt",
          "description": "Colorful tie-dye t-shirt with psychedelic patterns, perfect for festivals and casual summer wear",
          "price": 22.00,
          "gender_affinity": "unisex",
          "current_stock": 123
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_008",
        "_score": 6.901234,
        "_source": {
          "id": "tshirt_008",
          "image_url": "https://cdn.example.com/images/minimalist-pocket-tee-olive.jpg",
          "name": "Everlane Organic Cotton Pocket T-Shirt",
          "description": "Minimalist pocket t-shirt crafted from organic cotton with ethical manufacturing practices",
          "price": 28.00,
          "gender_affinity": "unisex",
          "current_stock": 91
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_009",
        "_score": 6.567890,
        "_source": {
          "id": "tshirt_009",
          "image_url": "https://cdn.example.com/images/fitted-crop-tee-blush-pink.jpg",
          "name": "Lululemon Fitted Crop T-Shirt",
          "description": "Fitted crop-length t-shirt designed for yoga and casual wear with four-way stretch fabric",
          "price": 42.00,
          "gender_affinity": "female",
          "current_stock": 56
        }
      },
      {
        "_index": "products",
        "_type": "_doc",
        "_id": "tshirt_010",
        "_score": 6.234567,
        "_source": {
          "id": "tshirt_010",
          "image_url": "https://cdn.example.com/images/heavyweight-workwear-tee-charcoal.jpg",
          "name": "Carhartt Heavyweight Workwear T-Shirt",
          "description": "Durable heavyweight cotton t-shirt built for tough work environments with reinforced seams",
          "price": 19.99,
          "gender_affinity": "male",
          "current_stock": 178
        }
      }
    ]


@tool
def get_product_reviews(product_ids: List[str]) -> List[dict]:
    """
    Get product reviews by product IDs.
    Args:
        product_ids: List[str]
    Returns:
        List[dict]
    """
    return [
  {
    "product_id": "tshirt_001",
    "avg_rating": 4.3,
    "positive_keywords": ["soft fabric", "sustainable", "good fit", "breathable", "eco-friendly", "comfortable", "classic style", "well-made", "organic cotton"],
    "negative_keywords": ["shrinks after washing", "thin material", "fades quickly", "overpriced for basic tee", "wrinkles easily"],
    "review_summary": "Customers appreciate the soft, breathable organic cotton fabric and the sustainable manufacturing process. The classic fit works well for most body types and the shirt is comfortable for both casual wear and outdoor activities. However, several reviews mention significant shrinkage after the first wash and some color fading over time. A few customers feel the price is too high for what they consider a basic t-shirt, despite the organic materials."
  },
  {
    "product_id": "tshirt_002",
    "avg_rating": 4.1,
    "positive_keywords": ["authentic vintage look", "soft cotton", "cool design", "comfortable fit", "nostalgic", "unique style", "good quality print"],
    "negative_keywords": ["expensive for vintage style", "print cracks over time", "sizing runs large", "thin fabric", "fades after washing"],
    "review_summary": "Music fans love the authentic vintage aesthetic and nostalgic appeal of this Rolling Stones t-shirt. The soft cotton blend feels comfortable and the distressed print looks genuinely vintage. Many appreciate the unique style and conversation-starting design. However, customers report that the print begins to crack and fade after several washes. Some find the sizing runs larger than expected, and a few think the price is steep for what's essentially a reproduction vintage tee."
  },
  {
    "product_id": "tshirt_003",
    "avg_rating": 4.5,
    "positive_keywords": ["excellent moisture-wicking", "breathable", "lightweight", "perfect for workouts", "quick-dry", "comfortable", "durable", "great fit"],
    "negative_keywords": ["shows sweat stains", "expensive", "synthetic feel", "odor retention", "sizing inconsistent"],
    "review_summary": "Athletes and fitness enthusiasts consistently praise this t-shirt's moisture-wicking performance and breathability during workouts. The lightweight fabric dries quickly and maintains its shape well after multiple washes. Users appreciate the comfortable athletic fit and durability for regular gym use. However, some customers note that despite the moisture-wicking properties, sweat stains can still be visible on darker colors. A few mention the synthetic feel compared to cotton, and some report sizing inconsistencies between different colors."
  },
  {
    "product_id": "tshirt_004",
    "avg_rating": 4.6,
    "positive_keywords": ["incredibly soft", "breathable", "temperature regulating", "antibacterial", "eco-friendly", "comfortable", "natural fabric", "luxurious feel"],
    "negative_keywords": ["expensive", "shrinks significantly", "delicate care required", "limited color options", "wrinkles easily"],
    "review_summary": "Customers rave about the incredibly soft texture and luxurious feel of this bamboo fiber t-shirt. Many appreciate the natural temperature regulation and antibacterial properties, making it ideal for sensitive skin. The breathable fabric is praised for all-day comfort and the eco-friendly aspect appeals to environmentally conscious buyers. The main concerns are significant shrinkage after washing and the need for delicate care. Several customers wish there were more color options available, and the premium price point is frequently mentioned."
  },
  {
    "product_id": "tshirt_005",
    "avg_rating": 4.2,
    "positive_keywords": ["cool graphic design", "sustainable materials", "good fit", "outdoor vibe", "quality print", "comfortable", "unique style"],
    "negative_keywords": ["print quality varies", "expensive for graphic tee", "thin fabric", "fades over time", "sizing runs small"],
    "review_summary": "Outdoor enthusiasts love the mountain-inspired graphic design and appreciate that it's made from recycled materials. The print quality is generally good and the outdoor aesthetic resonates well with the target audience. Customers find it comfortable for casual wear and appreciate the environmental consciousness. However, some report inconsistent print quality between different shirts, and the graphic tends to fade after repeated washing. A few customers mention the fabric feels thinner than expected for the price point."
  },
  {
    "product_id": "tshirt_006",
    "avg_rating": 4.4,
    "positive_keywords": ["premium quality", "soft cotton", "classic style", "well-constructed", "vintage wash", "comfortable fit", "durable", "sophisticated look"],
    "negative_keywords": ["expensive", "shrinks after washing", "limited size availability", "wrinkles easily", "thick fabric for summer"],
    "review_summary": "Customers appreciate the premium ring-spun cotton quality and classic henley styling that works well for both casual and slightly dressed-up occasions. The vintage wash finish and construction quality receive consistent praise. Many find the fit comfortable and appreciate the sophisticated look compared to basic t-shirts. However, the high price point is a common complaint, and several customers report shrinkage after the first wash. Some find the fabric too thick for hot summer weather."
  },
  {
    "product_id": "tshirt_007",
    "avg_rating": 3.9,
    "positive_keywords": ["fun design", "vibrant colors", "festival vibe", "unique patterns", "soft fabric", "good price", "eye-catching"],
    "negative_keywords": ["colors bleed in wash", "fades quickly", "thin material", "sizing inconsistent", "print quality varies", "cheap feel"],
    "review_summary": "Young customers love the vibrant tie-dye patterns and festival aesthetic of this t-shirt. The colorful designs are eye-catching and perfect for casual summer wear or music events. The affordable price point makes it accessible for younger buyers. However, many reviews mention significant color bleeding during the first few washes and rapid fading of the tie-dye patterns. The fabric quality is described as thin and somewhat cheap feeling. Sizing consistency is also an issue, with some customers receiving shirts that fit differently than expected."
  },
  {
    "product_id": "tshirt_008",
    "avg_rating": 4.5,
    "positive_keywords": ["minimalist design", "ethical manufacturing", "soft organic cotton", "perfect fit", "quality construction", "versatile", "sustainable"],
    "negative_keywords": ["expensive for basic design", "limited color options", "shrinks slightly", "pocket placement odd", "plain styling"],
    "review_summary": "Customers who value ethical fashion appreciate the transparent manufacturing practices and high-quality organic cotton. The minimalist design is praised for its versatility and the pocket detail adds a nice touch. The fit is consistently described as perfect and the construction quality is excellent. However, some customers feel the price is high for what they see as a fairly basic design. The limited color palette and simple styling don't appeal to everyone, and a few mention the pocket placement feels slightly off."
  },
  {
    "product_id": "tshirt_009",
    "avg_rating": 4.3,
    "positive_keywords": ["perfect for yoga", "flattering fit", "four-way stretch", "comfortable", "quality fabric", "stylish crop length", "breathable"],
    "negative_keywords": ["very expensive", "shows everything underneath", "limited color options", "sizing runs small", "too fitted for some"],
    "review_summary": "Yoga practitioners and fitness enthusiasts love the four-way stretch fabric and flattering fitted design. The crop length is perfect for high-waisted leggings and the fabric moves well during exercise. Many appreciate the quality construction and breathable material. However, the high price point is frequently criticized, especially for what's essentially a basic fitted tee. Some customers find it too fitted or mention that it shows undergarments easily. The limited color selection is also mentioned as a drawback."
  },
  {
    "product_id": "tshirt_010",
    "avg_rating": 4.7,
    "positive_keywords": ["extremely durable", "heavyweight fabric", "great value", "perfect for work", "reinforced seams", "comfortable", "long-lasting", "tough"],
    "negative_keywords": ["heavy fabric", "boxy fit", "limited style options", "thick for summer", "basic design", "not fashionable"],
    "review_summary": "Workers and those needing durable clothing consistently praise this t-shirt's exceptional build quality and longevity. The heavyweight cotton and reinforced seams hold up excellently to tough work conditions and frequent washing. Customers appreciate the great value for money and find it very comfortable for long work days. The main criticisms center around the boxy, utilitarian fit that's not particularly fashionable. Some find the fabric too heavy for hot weather or casual wear, and the basic design doesn't appeal to style-conscious customers."
  }
]

For this example, we created tools that return hard-coded results.

Now let's measure how long each tool takes to run.

In [None]:
%%timeit
keyword_product_search("products")

In [None]:
%%timeit
get_product_reviews(["123", "456", "789"])

Calling each tool takes only microseconds, which is negligible.

* In production, these functions would need to call a real search API and a database.

---

Next, we implement a product search Agent that uses the two Tools above.

Expected scenario

User: "Recommend 5 t-shirts with good reviews"

↓

Tool call: keyword_product_search("t-shirt") → 10 product IDs

↓

Tool call: get_product_reviews(product IDs)

In [None]:
search_agent_prompt = """You are a product catalog search agent that finds relevant products using keyword-based search.

CORE BEHAVIOR:
- Use keyword_product_search for text-based product searches
- Use get_product_reviews to get product reviews
- Extract the most relevant keywords from user queries for keyword searches

FUNCTION SELECTION:
- Use keyword_product_search for general product searches based on categories or descriptions
- Use get_product_reviews to get product reviews

KEYWORD EXTRACTION:
- Match user queries to available product keywords
- Use specific product types when mentioned (jacket, sneaker, camera, etc.)
- For broad queries, use general categories (apparel, electronics, furniture, etc.)
- Combine related keywords when appropriate

AVAILABLE KEYWORDS:
t-shirt, jacket, sneakers

RESPONSE FORMAT:
- Product Name
- Product Description
- Product Price

Do not include any other text in your response."""

In [None]:
%%time

search_agent = Agent(
    system_prompt=search_agent_prompt,
    tools=[keyword_product_search, get_product_reviews],
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    )
)

search_agent("Find me 5 t-shirts with the highest reviews")

- **Total Tool calls**: 2
- **Total time**: ~17 seconds

In [None]:
%%timeit

search_agent = Agent(
    system_prompt=search_agent_prompt,
    tools=[keyword_product_search, get_product_reviews],
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    ),
    callback_handler=None
)

search_agent("Find me 5 t-shirts with the highest reviews")

### 📊 Results: Separate Tool Calls

- **Response time**: ~17.4 seconds
- **Tool calls**: 2 (1 search + 1 review lookup)
- **Total tokens**: 9,083

**Analysis**: 
- Two sequential calls are required
- Each step adds LLM inference time (TTFT + output tokens)
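
To make the analysis above concrete, here is a minimal sketch of a sequential latency model. It assumes each LLM invocation costs roughly TTFT plus output tokens divided by generation speed, and that an agent making N sequential tool calls needs N + 1 LLM invocations (one per tool request plus the final answer). The function name and all numbers are hypothetical and chosen only for illustration; they are not measurements from this workshop.

```python
def estimate_agent_latency(steps, tool_seconds=0.0):
    """Rough latency estimate for an agent whose LLM steps run sequentially.

    steps: list of (ttft_seconds, output_tokens, tokens_per_second) tuples,
           one per LLM invocation in the agent loop.
    tool_seconds: total time spent inside the tools themselves.
    """
    llm_seconds = sum(ttft + out_tokens / tps for ttft, out_tokens, tps in steps)
    return llm_seconds + tool_seconds


# Hypothetical step: 1 s TTFT, 200 output tokens generated at 50 tokens/s.
STEP = (1.0, 200, 50.0)

# Two sequential tool calls -> three LLM invocations.
two_tool_calls = estimate_agent_latency([STEP, STEP, STEP])
# One combined tool call -> two LLM invocations.
one_tool_call = estimate_agent_latency([STEP, STEP])

print(f"two tool calls: {two_tool_calls:.1f}s, one tool call: {one_tool_call:.1f}s")
```

Because the tools themselves return in microseconds here, `tool_seconds` is left at zero; under this model the savings come entirely from removing one LLM round trip.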

---

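#### ✅ Optimized Approach (Consolidated Tool Call)

Tools do not have to be decomposed the way you would normally split functions in application code. Bundling related information into a single Tool removes unnecessary Tool calls.

In the example above, the product search results already contain the product IDs, so the reviews can be looked up immediately. We use this to combine the two Tools into one.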

In [None]:
@tool
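def search_products_with_reviews(query: str) -> list:
    """Search products by keyword and return each hit joined with its review data."""
    search_results = keyword_product_search(query)
    # Look up reviews by product ID instead of passing the raw search hits
    product_ids = [hit["_source"]["id"] for hit in search_results]
    reviews_data = get_product_reviews(product_ids)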

    reviews_lookup = {review["product_id"]: review for review in reviews_data}
    
    # Join data
    joined_results = []
    for hit in search_results:
        # Start with product data
        joined_item = hit["_source"].copy()
        joined_item["search_score"] = hit["_score"]
        
        # Add review data if available
        product_id = hit["_source"]["id"]
        if product_id in reviews_lookup:
            review = reviews_lookup[product_id]
            joined_item.update({
                "avg_rating": review["avg_rating"],
                "positive_keywords": review["positive_keywords"],
                "negative_keywords": review["negative_keywords"],
                "review_summary": review["review_summary"]
            })
        else:
            # Handle missing reviews
            joined_item.update({
                "avg_rating": None,
                "positive_keywords": [],
                "negative_keywords": [],
                "review_summary": None
            })
        
        joined_results.append(joined_item)
    
    return joined_results

search_products_with_reviews("t-shirt")

Next, we implement a product search Agent that uses the Tool above.

Expected scenario

User: "Recommend 5 t-shirts with good reviews"

↓

Tool call: search_products_with_reviews("t-shirt") → product and review information for 10 products


In [None]:
%%time

tool_aware_agent = Agent(
    system_prompt=search_agent_prompt,
    tools=[search_products_with_reviews],
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    )
)

tool_aware_agent("Find me 5 t-shirts with the highest reviews")

This run took 1 Tool call and about 13.4 seconds.

In [None]:
%%timeit

tool_aware_agent = Agent(
    system_prompt=search_agent_prompt,
    tools=[search_products_with_reviews],
    model=BedrockModel(
        model_id=BedrockModelId.AMAZON_NOVA_PRO.value,
    ),
    callback_handler=None
)

tool_aware_agent("Find me 5 t-shirts with the highest reviews")

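### 📊 Results: Consolidated Tool Comparison

| Approach | Response time | Tool calls | Token usage |
|----------|---------------|------------|-------------|
| Separate calls | ~17.4 s | 2 | 9,083 |
| Consolidated tool | ~13.4 s | 1 | 5,473 |

**Improvements**:
- ⚡ **23% shorter response time**
- 🔧 **50% fewer tool calls**
- 💰 **40% lower token cost**

**Key Learning**: Fetching related data in a single call is more efficient!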
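
The improvement figures follow directly from the table. As a quick check, the short sketch below recomputes them; the helper name `percent_reduction` is introduced here for illustration, and the inputs are the rounded values from the table.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# Values copied from the comparison table.
separate = {"seconds": 17.4, "tool_calls": 2, "tokens": 9083}
combined = {"seconds": 13.4, "tool_calls": 1, "tokens": 5473}

for metric in separate:
    reduction = percent_reduction(separate[metric], combined[metric])
    print(f"{metric}: {reduction:.1f}% lower")
```

Rounded to whole numbers, this gives about 23%, 50%, and 40%, matching the summary. Since the token count drops along with the tool calls, the cost saving comes on top of the latency gain.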

In [None]:
search_agent = Agent(
    system_prompt=search_agent_prompt,
    tools=[keyword_product_search, get_product_reviews],
    model=BedrockModel(
        model_id=BedrockModelId.CLAUDE_3_5_HAIKU_1_0.value,
    )
)

search_agent("Find me 5 t-shirts")
search_agent("Which t-shirt has the highest rating?")

In [None]:

tool_aware_agent = Agent(
    system_prompt=search_agent_prompt,
    tools=[search_products_with_reviews],
    model=BedrockModel(
        model_id=BedrockModelId.CLAUDE_3_5_HAIKU_1_0.value,
    )
)

tool_aware_agent("Find me 5 t-shirts")
tool_aware_agent("Which t-shirt has the highest rating?")

---

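# 🎯 Workshop Key Takeaways

## Validated Optimization Strategies

### 1️⃣ Optimize Model Selection

### 2️⃣ Understand Token Impact
- **Input tokens**: Even long inputs have little effect on TTFT → use the context you need freely
- **Output tokens**: Directly increase response time → keep responses concise when not streaming

### 3️⃣ Optimize Tool Usage
- **Preloading**: Put predictable data into the context (53% improvement)
- **Batching**: Fetch related data in a single call (23% improvement)
- **Caching strategy**: Avoid repeated calls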
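
The workshop does not implement the caching strategy, so the following is only a minimal sketch of one way to avoid repeated calls, using the standard library's `functools.lru_cache`. The function `cached_reviews` and the `CALL_COUNT` counter are hypothetical stand-ins for a real review lookup; the arguments are passed as a tuple because `lru_cache` requires hashable arguments.

```python
from functools import lru_cache

# Counts how often the pretend backend is actually hit.
CALL_COUNT = {"backend": 0}


@lru_cache(maxsize=128)
def cached_reviews(product_ids: tuple) -> tuple:
    """Hypothetical review lookup; stands in for a slow API or database call."""
    CALL_COUNT["backend"] += 1
    return tuple((product_id, "review placeholder") for product_id in product_ids)


first = cached_reviews(("tshirt_007", "tshirt_008"))
second = cached_reviews(("tshirt_007", "tshirt_008"))  # served from the cache

print(CALL_COUNT["backend"], cached_reviews.cache_info().hits)
```

`lru_cache` has no expiry, so this only suits data that can safely go stale for the lifetime of the process; anything that changes often would need a cache with a time-to-live instead.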

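## 💡 Action Items You Can Apply Right Away

1. **Analyze your current agent**: Measure tool call patterns and response times
2. **Identify preloading opportunities**: Find data that is predictable per user or per session
3. **Design consolidated APIs**: Merge data sources that are frequently used together
4. **Revisit model selection**: Match the best model to each task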
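
For the first action item, a small timing wrapper is often enough to see how many times each tool is called and how long it takes. The sketch below uses only the standard library; `measured`, `TOOL_STATS`, and `fake_search` are names made up for this example, and how such a wrapper composes with a framework's own tool decorator is an assumption you would need to verify.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-function call counts and cumulative wall-clock time.
TOOL_STATS = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def measured(func):
    """Record call count and elapsed time for every call to `func`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            stats = TOOL_STATS[func.__name__]
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper


@measured
def fake_search(query):
    """Stand-in for a real tool function."""
    return [f"{query}_001", f"{query}_002"]


fake_search("tshirt")
fake_search("jacket")
print(dict(TOOL_STATS))
```

Comparing these per-tool totals with the end-to-end response time shows whether latency is dominated by the tools or, as in this workshop, by the LLM round trips between them.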

---

*We analyzed LLM response times and learned, through practical examples, how to make responses faster.* 😊