# Augmented LLM Foundations with Spring AI

## About This Notebook

This notebook is an educational companion to the VoyagerMate implementation in `/src`.

**How to use this notebook:**
- Theory and Concepts: Each section explains Spring AI concepts
- Demo Code: Runnable examples demonstrate the concepts
- Reference to /src: Links to actual VoyagerMate implementation
- Learn by doing: Execute cells, modify parameters, experiment

**The actual implementation:**
- `/src/main/java/...` contains the working VoyagerMate application
- Shell commands in `VoyagerMateCommands.java` show production usage
- Services in `VoyagerMateService.java` implement real workflows
- This notebook helps you understand the why and how behind that code

Spring AI applies familiar Spring patterns to large language models (LLMs), making it practical to build production-ready AI services in Java. VoyagerMate, a travel concierge example, demonstrates how these concepts translate into a domain-specific product.

## Setup: Import Dependencies

First, let's set up our environment with Spring AI dependencies. In Kotlin Notebooks, we use `@file:DependsOn` to import Maven dependencies.

In [25]:
// Dependencies matching pom.xml
@file:DependsOn("org.springframework.ai:spring-ai-starter-model-azure-openai:1.0.3")
@file:DependsOn("org.springframework.boot:spring-boot-starter-web:3.5.6")
@file:DependsOn("org.springframework.retry:spring-retry:2.0.11")

// Spring AI imports
import org.springframework.ai.chat.client.ChatClient
import org.springframework.ai.chat.client.ChatClientResponse
import org.springframework.ai.chat.messages.AssistantMessage
import org.springframework.ai.chat.messages.SystemMessage
import org.springframework.ai.chat.messages.UserMessage
import org.springframework.ai.chat.model.ChatResponse
import org.springframework.ai.chat.prompt.Prompt
import org.springframework.ai.chat.prompt.PromptTemplate
import org.springframework.ai.chat.prompt.SystemPromptTemplate
import org.springframework.ai.azure.openai.AzureOpenAiChatModel
import org.springframework.ai.azure.openai.AzureOpenAiChatOptions
import org.springframework.ai.azure.openai.AzureOpenAiAudioTranscriptionModel
import org.springframework.ai.content.Media
import org.springframework.ai.converter.BeanOutputConverter
import org.springframework.ai.tool.annotation.Tool
import org.springframework.ai.tool.annotation.ToolParam
import org.springframework.util.MimeTypeUtils

println("✅ Dependencies loaded")

✅ Dependencies loaded


## 1. Understanding Large Language Models

### 1.1 What an LLM Is

An LLM is a pre-trained pattern-matching model that generates text, audio, or structured data by predicting the next token in a sequence. It synthesises personalised answers from patterns it absorbed during training rather than looking up exact facts.

**Key characteristics:**

- **Not a database:** no built-in lookups for real-time or private data
- **Probabilistic nature:** temperature and sampling parameters influence creativity versus determinism
- **Context-aware operation:** the model only "remembers" the tokens you send in the current request
- **Stateless interaction:** nothing persists across calls unless you resend it
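
To make the "probabilistic nature" point concrete, here is a minimal sketch of how temperature reshapes a next-token distribution. The logits are made-up values and `softmaxWithTemperature` is a hypothetical helper written for this notebook, not part of Spring AI; real providers apply this scaling internally.

```kotlin
import kotlin.math.exp

// Convert raw scores (logits) into probabilities, scaled by temperature.
// Lower temperature sharpens the distribution; higher temperature flattens it.
fun softmaxWithTemperature(logits: List<Double>, temperature: Double): List<Double> {
    require(temperature > 0.0) { "temperature must be positive" }
    val scaled = logits.map { it / temperature }
    val maxScore = scaled.maxOf { it }            // subtract the max for numerical stability
    val exps = scaled.map { exp(it - maxScore) }
    val total = exps.sum()
    return exps.map { it / total }
}

// Hypothetical scores for three candidate next tokens
val candidateLogits = listOf(2.0, 1.0, 0.1)
val focused = softmaxWithTemperature(candidateLogits, 0.2)      // almost all mass on the top token
val exploratory = softmaxWithTemperature(candidateLogits, 1.5)  // mass spread across candidates
```

With these inputs the top candidate takes nearly all of the probability at temperature 0.2 and only a little over half at 1.5, which is why low temperatures feel deterministic and high ones feel creative.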

### 1.2 Token-by-Token Generation

Every response is emitted token by token. A token averages roughly 4 characters of English text, and both prompt and response tokens count towards usage quotas.

Keep interactions efficient by:
- Chunking large documents instead of pasting them wholesale
- Preferring retrieval strategies (RAG) over monolithic prompts
- Monitoring prompt/response token counts for cost and latency control
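
The chunking advice can be sketched with the 4-characters-per-token rule of thumb. Both function names are hypothetical, and the estimate is only a heuristic: real tokenizers vary by model and language, so use the provider's reported usage for billing decisions.

```kotlin
// Rough heuristic: about 4 characters per token, rounded up.
fun estimateTokens(text: String): Int = (text.length + 3) / 4

// Split text into chunks that each stay within a token budget,
// breaking on whitespace so words are not cut in half.
// A single word longer than the budget still becomes its own (oversized) chunk.
fun chunkByTokenBudget(text: String, maxTokens: Int): List<String> {
    require(maxTokens > 0) { "maxTokens must be positive" }
    val chunks = mutableListOf<String>()
    val current = StringBuilder()
    for (word in text.split(Regex("\\s+")).filter { it.isNotEmpty() }) {
        val candidate = if (current.isEmpty()) word else "$current $word"
        if (current.isNotEmpty() && estimateTokens(candidate) > maxTokens) {
            chunks.add(current.toString())   // budget reached: close the current chunk
            current.clear()
        }
        if (current.isNotEmpty()) current.append(' ')
        current.append(word)
    }
    if (current.isNotEmpty()) chunks.add(current.toString())
    return chunks
}

val sampleChunks = chunkByTokenBudget("one two three four five six", maxTokens = 2)
```

Production pipelines usually split on sentence or section boundaries instead of raw whitespace so that each chunk stays coherent.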

### 1.3 Stateless Conversations

Because LLM APIs are stateless, you must resend prior turns to maintain context:

In [26]:
// Demonstrating stateless conversation with message history
val history = listOf(
    UserMessage("My name is Alex"),
    AssistantMessage("Nice to meet you, Alex!"),
    UserMessage("What's my name?")
)

// Sending this to a model would look like: chatModel.call(Prompt(history))
// A typical reply: "Your name is Alex."

println("Message history created:")
history.forEach { msg ->
    println("  ${msg.javaClass.simpleName}: ${msg.text}")
}

Message history created:
  UserMessage: My name is Alex
  AssistantMessage: Nice to meet you, Alex!
  UserMessage: What's my name?
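
Resending every turn eventually exceeds the context window, so applications usually trim the history. The sketch below keeps the most recent turns that fit a character budget. `Turn` and `trimHistory` are hypothetical names for illustration; the notebook itself uses Spring AI's message classes.

```kotlin
// A minimal turn type for illustration only.
data class Turn(val role: String, val text: String)

// Keep the most recent turns whose combined text length fits the budget.
// Older turns are dropped first.
fun trimHistory(turns: List<Turn>, maxChars: Int): List<Turn> {
    val kept = ArrayDeque<Turn>()
    var used = 0
    for (turn in turns.asReversed()) {
        if (used + turn.text.length > maxChars) break
        kept.addFirst(turn)
        used += turn.text.length
    }
    return kept.toList()
}

val sampleTurns = listOf(
    Turn("user", "My name is Alex"),
    Turn("assistant", "Nice to meet you, Alex!"),
    Turn("user", "What's my name?")
)
val trimmedTurns = trimHistory(sampleTurns, maxChars = 40)
```

With a 40-character budget the first turn is dropped. The name survives only because the assistant's reply repeats it, which shows the trade-off: aggressive trimming saves tokens but can discard facts the model still needs.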


---

## 2. Messages and Prompt Engineering

### 2.1 Message Roles

Spring AI encodes prompts as structured messages with explicit roles:

- **System role:** global guidance ("You are VoyagerMate, an accessible travel advisor.")
- **User role:** the traveller's request
- **Assistant role:** previous model replies that maintain conversational flow
- **Tool role:** results returned by external functions

### 2.2 Constructing a Prompt

Let's build a prompt using Spring AI's message types and templates:

In [27]:
// Creating a system prompt with template
val systemText = """
    You are VoyagerMate, a helpful travel assistant.
    Focus on logistics, budgets, and local insight.
    Reply in the style of {voice}.
""".trimIndent()

val systemTemplate = SystemPromptTemplate(systemText)
val systemMessage = systemTemplate.createMessage(mapOf("voice" to "enthusiastic local guide"))

val userMessage = UserMessage("Plan a 3-day spring trip to Rome with mobility support.")

// Combine messages into a prompt
val prompt = Prompt(listOf(systemMessage, userMessage))

println("System message: ${systemMessage.text}")
println()
println("User message: ${userMessage.text}")

// chatModel.call(prompt) would send this to the LLM

System message: You are VoyagerMate, a helpful travel assistant.
Focus on logistics, budgets, and local insight.
Reply in the style of enthusiastic local guide.

User message: Plan a 3-day spring trip to Rome with mobility support.


### 2.3 Four Building Blocks of Effective Prompts

1. **Instructions:** exactly what the model should produce
2. **External context:** traveller preferences, budgets, constraints
3. **User input:** the direct question or task
4. **Output cues:** required structure (tables, JSON schema, markdown sections)

**Practical tips:**

- **Be specific:** "List 5 family-friendly activities in Tokyo for April" beats "Tell me about Tokyo."
- **Communicate constraints:** budgets, accessibility needs, dietary choices
- **Provide structure:** define sections such as `## Morning`, `## Afternoon`, or share JSON scaffolding
- **Iterate:** capture edge cases, test tone, and refine temperature settings
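
The four building blocks can be sketched as a small assembly function. `PromptParts` and `assemblePrompt` are hypothetical helpers, and the section labels in the output are just one possible layout.

```kotlin
// Hypothetical container for the four building blocks.
data class PromptParts(
    val instructions: String,          // what the model should produce
    val context: Map<String, String>,  // external context: preferences, constraints
    val userInput: String,             // the direct question or task
    val outputCue: String              // required structure of the answer
)

fun assemblePrompt(parts: PromptParts): String = buildString {
    appendLine(parts.instructions)
    if (parts.context.isNotEmpty()) {
        appendLine()
        appendLine("Traveller context:")
        parts.context.forEach { (key, value) -> appendLine("- $key: $value") }
    }
    appendLine()
    appendLine("Request: ${parts.userInput}")
    appendLine()
    append(parts.outputCue)
}

val tokyoPrompt = assemblePrompt(
    PromptParts(
        instructions = "List 5 family-friendly activities in Tokyo for April.",
        context = mapOf("budget" to "moderate", "accessibility" to "stroller-friendly"),
        userInput = "We are travelling with two children under six.",
        outputCue = "Answer as a markdown list with one line per activity."
    )
)
```

Keeping the parts separate makes it easy to vary one block, for example the output cue, while testing the others unchanged.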

### 2.4 Context Engineering

When you add instructions, examples, tools, and retrieved documents, you are performing **context engineering**.

- **Right altitude:** avoid vague goals ("be helpful") or hyper-specific micromanagement
- **Minimal toolset:** only register the functions you expect to use
- **Few-shot examples:** offer 2–3 high-quality samples instead of exhaustive lists
- **Just-in-time data:** fetch snippets when needed instead of saturating the prompt window

---

## 3. Prompt Templates and Reuse

Spring AI's `PromptTemplate` decouples static prompt structure from dynamic values:

In [22]:
// Using PromptTemplate for reusable prompts
// In Java: var template = new PromptTemplate("Explain {season} travel in {destination}");
val template = PromptTemplate("Explain {season} travel in {destination}")

val autumnBarcelonaPrompt = template.create(
    mapOf(
        "season" to "autumn",
        "destination" to "Barcelona"
    )
)

println("Template: Explain {season} travel in {destination}")
println()
autumnBarcelonaPrompt.instructions.forEach { msg ->
    println("Resolved: ${msg.text}")
}

Template: Explain {season} travel in {destination}

Resolved: Explain autumn travel in Barcelona


Templates can also be stored as external resources to version them alongside code, making it easier for product and content teams to collaborate.
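
To illustrate what placeholder substitution involves, here is a minimal stand-alone renderer. It is a sketch with a hypothetical `renderTemplate` function, not how `PromptTemplate` is implemented; it handles only simple `{name}` placeholders and fails fast when a value is missing.

```kotlin
// Matches placeholders such as {season} or {destination}.
val placeholderPattern = Regex("\\{(\\w+)\\}")

// Replace every {name} with its value; reject templates with missing values
// so an incomplete prompt is never sent to the model.
fun renderTemplate(template: String, values: Map<String, Any>): String =
    placeholderPattern.replace(template) { match ->
        val name = match.groupValues[1]
        values[name]?.toString()
            ?: throw IllegalArgumentException("Missing value for placeholder '$name'")
    }

val renderedPrompt = renderTemplate(
    "Explain {season} travel in {destination}",
    mapOf("season" to "autumn", "destination" to "Barcelona")
)
```

Failing on a missing value is a deliberate choice: a prompt that still contains a literal `{destination}` tends to produce confusing answers that are hard to trace back to the template.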

---

## 4. Working with Spring AI ChatClient

### 4.1 Architectural Overview

`ChatClient` wraps a provider-specific `ChatModel` with fluent APIs, default options, advisors, and tool orchestration.

```
┌─────────────────┐
│ Your Service    │  (ChatClient)
└────────┬────────┘
         │
┌────────▼────────┐
│ Spring AI Layer │  Prompt assembly, retries,
│                 │  observability, advisors
└────────┬────────┘
         │
┌────────▼────────┐
│ Provider API    │  Azure OpenAI, OpenAI, Anthropic...
└─────────────────┘
```

### 4.2 Fluent Usage Patterns

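The cell below mirrors the shape of the common `ChatClient` call patterns with a small mock, so it runs without a configured model:

In [None]:
// ChatClient fluent API demonstration (mock)

// No ChatClient is configured in this notebook, so MockChatClient imitates the call chain.
// Simplifications: the real options(...) takes a ChatOptions object, and the real
// stream().content() returns a reactive Flux<String> rather than a Sequence.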
class MockChatClient {
    fun prompt() = PromptBuilder()
    
    class PromptBuilder {
        private var userText = ""
        private var temp: Double? = null
        
        fun user(text: String): PromptBuilder {
            userText = text
            return this
        }
        
        fun options(temperature: Double): PromptBuilder {
            temp = temperature
            return this
        }
        
        fun call(): CallResponse {
            return CallResponse(userText, temp)
        }
        
        fun stream(): StreamResponse {
            return StreamResponse(userText)
        }
    }
    
    class CallResponse(private val prompt: String, private val temp: Double?) {
        fun content(): String = "Response to: $prompt (temp: ${temp ?: "default"})"
        fun <T> entity(clazz: Class<T>): T? = null // Would convert to type
    }
    
    class StreamResponse(private val prompt: String) {
        fun content() = prompt.split(" ").asSequence()
    }
}

val chatClient = MockChatClient()

// Pattern 1: Simple call
val answer = chatClient.prompt()
    .user("Best time to visit Santorini?")
    .call()
    .content()
println("Simple call: $answer")

// Pattern 2: With temperature
val creative = chatClient.prompt()
    .user("Suggest quirky activities")
    .options(1.1)
    .call()
    .content()
println("With options: $creative")

// Pattern 3: Streaming
val stream = chatClient.prompt()
    .user("Tell me about Iceland")
    .stream()
    .content()
println("Streaming: ${stream.take(3).joinToString(" ")}...")

### 4.3 Metadata and Resilience

Extract metadata from responses for observability:

In [24]:
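// Extracting metadata (simulation)

// The data classes below are simplified stand-ins for the response structure.
// The local ChatResponse shares its name with the imported
// org.springframework.ai.chat.model.ChatResponse but is unrelated to it.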
data class TokenUsage(
    val promptTokens: Int,
    val completionTokens: Int,
    val totalTokens: Int
)

data class ResponseMetadata(
    val model: String,
    val usage: TokenUsage
)

data class ChatResponse(
    val content: String,
    val metadata: ResponseMetadata
)

// Simulate what you'd get from: chatClient.prompt().user(...).call().chatResponse()
val response = ChatResponse(
    content = "Paris is beautiful in spring with mild weather and blooming gardens.",
    metadata = ResponseMetadata(
        model = "gpt-4o",
        usage = TokenUsage(
            promptTokens = 25,
            completionTokens = 50,
            totalTokens = 75
        )
    )
)

// Extract and use the metadata
println("Response: ${response.content}")
println()
println("Metadata:")
println("  Model: ${response.metadata.model}")
println("  Prompt tokens: ${response.metadata.usage.promptTokens}")
println("  Completion tokens: ${response.metadata.usage.completionTokens}")
println("  Total tokens: ${response.metadata.usage.totalTokens}")
println()
// Illustrative flat rate of $0.0001 per token; format to 4 decimals to hide floating-point noise
println("Cost calculation: ${response.metadata.usage.totalTokens} tokens * \$0.0001 = $${"%.4f".format(java.util.Locale.US, response.metadata.usage.totalTokens * 0.0001)}")

Response: Paris is beautiful in spring with mild weather and blooming gardens.

Metadata:
  Model: gpt-4o
  Prompt tokens: 25
  Completion tokens: 50
  Total tokens: 75

Cost calculation: 75 tokens * $0.0001 = $0.0075
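
Providers often price prompt and completion tokens differently, and money is safer in `BigDecimal` than in `Double`. The sketch below assumes hypothetical per-1,000-token rates; `TokenRates` and `estimateCost` are illustrative names, and real prices depend on the provider and model.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode

// Hypothetical rates per 1,000 tokens.
data class TokenRates(val promptPer1k: BigDecimal, val completionPer1k: BigDecimal)

// Cost = promptTokens * promptRate / 1000 + completionTokens * completionRate / 1000,
// rounded to 6 decimal places.
fun estimateCost(promptTokens: Int, completionTokens: Int, rates: TokenRates): BigDecimal {
    val thousand = BigDecimal(1000)
    val promptCost = rates.promptPer1k.multiply(BigDecimal(promptTokens)).divide(thousand)
    val completionCost = rates.completionPer1k.multiply(BigDecimal(completionTokens)).divide(thousand)
    return promptCost.add(completionCost).setScale(6, RoundingMode.HALF_UP)
}

// Same flat rate as the cell above ($0.0001 per token), expressed per 1,000 tokens
val flatRates = TokenRates(BigDecimal("0.10"), BigDecimal("0.10"))
val exampleCost = estimateCost(promptTokens = 25, completionTokens = 50, rates = flatRates)
```

Using decimal arithmetic avoids the trailing digits that appear when 75 is multiplied by 0.0001 as a `Double`.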


### 4.4 Default Configuration

Configure ChatClient with defaults to keep tone, tooling, and memory consistent:

In [None]:
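// ChatClient builder pattern (simplified stand-in for ChatClient.Builder)
// Tools and advisors are represented by name only; the real builder sets
// temperature through defaultOptions(...) rather than a dedicated method.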

class ChatClientBuilder {
    private var systemPrompt: String = ""
    private val tools = mutableListOf<String>()
    private val advisors = mutableListOf<String>()
    private var temperature: Double = 0.7
    
    fun defaultSystem(prompt: String): ChatClientBuilder {
        systemPrompt = prompt
        return this
    }
    
    fun defaultTools(vararg toolNames: String): ChatClientBuilder {
        tools.addAll(toolNames)
        return this
    }
    
    fun defaultAdvisors(vararg advisorNames: String): ChatClientBuilder {
        advisors.addAll(advisorNames)
        return this
    }
    
    fun defaultTemperature(temp: Double): ChatClientBuilder {
        temperature = temp
        return this
    }
    
    fun build(): ConfiguredClient {
        return ConfiguredClient(systemPrompt, tools, advisors, temperature)
    }
}

data class ConfiguredClient(
    val systemPrompt: String,
    val tools: List<String>,
    val advisors: List<String>,
    val temperature: Double
) {
    fun showConfig() {
        println("ChatClient Configuration:")
        println("  System prompt: $systemPrompt")
        println("  Tools: ${tools.joinToString(", ")}")
        println("  Advisors: ${advisors.joinToString(", ")}")
        println("  Temperature: $temperature")
    }
}

// Execute: Build a ChatClient
println("=== Building ChatClient ===")
println()

val client = ChatClientBuilder()
    .defaultSystem("You are VoyagerMate, an expert travel assistant.")
    .defaultTools("findAttractions", "estimateBudget", "checkWeather")
    .defaultAdvisors("MemoryAdvisor", "RAGAdvisor", "LoggingAdvisor")
    .defaultTemperature(0.7)
    .build()

client.showConfig()

---

## 5. Multimodal Interactions

### 5.1 Image Analysis

GPT-4o-style models accept images alongside text, enabling visual travel insights:

In [28]:
// Image analysis with a Media-style builder (simulation)

// Simplified stand-in for Spring AI's Media class
data class MediaObject(val mimeType: String, val data: ByteArray, val size: Int) {
    companion object {
        fun builder() = MediaBuilder()
    }
    
    class MediaBuilder {
        private var mimeType: String = ""
        private var data: ByteArray = byteArrayOf()
        
        fun mimeType(type: String): MediaBuilder {
            mimeType = type
            return this
        }
        
        fun data(bytes: ByteArray): MediaBuilder {
            data = bytes
            return this
        }
        
        fun build() = MediaObject(mimeType, data, data.size)
    }
}

// UserMessage with media builder
data class MultimodalMessage(val text: String, val media: MediaObject?) {
    companion object {
        fun builder() = Builder()
    }
    
    class Builder {
        private var text: String = ""
        private var media: MediaObject? = null
        
        fun text(t: String): Builder {
            text = t
            return this
        }
        
        fun media(m: MediaObject): Builder {
            media = m
            return this
        }
        
        fun build() = MultimodalMessage(text, media)
    }
}

// Execute: Build a multimodal message
println("=== Multimodal Message Builder ===")
println()

// Simulate image data
val imageBytes = ByteArray(1024) { it.toByte() }  // 1KB fake image

// Build media
val media = MediaObject.builder()
    .mimeType("image/jpeg")
    .data(imageBytes)
    .build()

println("1. Created Media:")
println("   MIME type: ${media.mimeType}")
println("   Size: ${media.size} bytes")
println()

// Build message with media
val message = MultimodalMessage.builder()
    .text("What destination is shown in this photo?")
    .media(media)
    .build()

println("2. Created UserMessage:")
println("   Text: ${message.text}")
println("   Has media: ${message.media != null}")
println("   Media size: ${message.media?.size} bytes")

=== Multimodal Message Builder ===

1. Created Media:
   MIME type: image/jpeg
   Size: 1024 bytes

2. Created UserMessage:
   Text: What destination is shown in this photo?
   Has media: true
   Media size: 1024 bytes


### 5.2 Audio Input and Output

The simulation below transcribes a recorded voice note and hands the transcript to an LLM step:

In [29]:
// Audio transcription (simulation)

import org.springframework.core.io.ByteArrayResource

class AudioTranscriptionService {
    fun transcribe(audioResource: ByteArrayResource): String {
        // Simulate Whisper API transcription
        val size = audioResource.contentLength()
        return when {
            size < 1000 -> "Short audio: I want to visit Paris."
            size < 5000 -> "Medium audio: I want to plan a trip to Paris in spring for 5 days with a budget of 2000 euros."
            else -> "Long audio: I'm planning a European vacation and I'd like to visit Paris in the spring, probably for about 5 days. My budget is around 2000 euros. Can you help me plan the perfect itinerary?"
        }
    }
    
    fun processWithLLM(transcript: String, userPrompt: String): String {
        // Simulate LLM processing: extract trip details from the transcript.
        // userPrompt is unused here; a real call would send it together with the transcript.
        val duration = Regex("(\\d+)\\s*days?").find(transcript)?.groupValues?.get(1)
        val budget = Regex("(\\d+)\\s*(euros?|dollars?)").find(transcript)
        val mentionsSpring = transcript.contains("spring", ignoreCase = true)
        
        return buildString {
            append("I can help you plan a trip to Paris! Based on your note:\n")
            if (duration != null) append("- Duration: $duration days\n")
            if (budget != null) append("- Budget: ${budget.groupValues[1]} ${budget.groupValues[2]}\n")
            if (mentionsSpring) append("- Season: Spring (perfect for Paris!)\n")
            append("\nI'll create a detailed itinerary for you.")
        }
    }
}

// Execute: Transcribe and process audio
println("=== Audio Transcription Demo ===")
println()

val transcriptionService = AudioTranscriptionService()

// Test different audio sizes
val audioSizes = listOf(500, 2500, 8000)

audioSizes.forEach { size ->
    val audioBytes = ByteArray(size) { it.toByte() }
    val audioResource = ByteArrayResource(audioBytes)
    
    println("Audio ${size} bytes:")
    
    // Transcribe
    val transcript = transcriptionService.transcribe(audioResource)
    println("  Transcript: $transcript")
    
    // Process with LLM
    val response = transcriptionService.processWithLLM(transcript, "Help plan this trip")
    println("  LLM Response: ${response.lines().first()}")
    println()
}

=== Audio Transcription Demo ===

Audio 500 bytes:
  Transcript: Short audio: I want to visit Paris.
  LLM Response: I can help you plan a trip to Paris! Based on your note:

Audio 2500 bytes:
  Transcript: Medium audio: I want to plan a trip to Paris in spring for 5 days with a budget of 2000 euros.
  LLM Response: I can help you plan a trip to Paris! Based on your note:

Audio 8000 bytes:
  Transcript: Long audio: I'm planning a European vacation and I'd like to visit Paris in the spring, probably for about 5 days. My budget is around 2000 euros. Can you help me plan the perfect itinerary?
  LLM Response: I can help you plan a trip to Paris! Based on your note:



---

## 6. Structured Outputs

### 6.1 Why Structured Outputs Matter

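Prompting for JSON in free text is unreliable: models may wrap the JSON in prose or omit fields. Provider-side structured output modes, such as OpenAI's, constrain generation with a JSON Schema so that the reply conforms to your specification. Spring AI's `BeanOutputConverter` derives such a schema from a class and parses the reply back into it.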
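
When a provider does not enforce a schema, a common fallback is to retry the generate-then-parse step a bounded number of times. The sketch below is generic: `generateWithRetry` is a hypothetical helper, and the `generate` and `parse` lambdas stand in for a model call and an output converter.

```kotlin
// Retry a generate-then-parse step until parsing succeeds or attempts run out.
fun <T> generateWithRetry(
    maxAttempts: Int,
    generate: (attempt: Int) -> String,
    parse: (String) -> T
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var lastError: Exception? = null
    for (attempt in 1..maxAttempts) {
        val raw = generate(attempt)
        try {
            return parse(raw)
        } catch (e: Exception) {
            lastError = e   // remember the failure and try again
        }
    }
    throw IllegalStateException("No valid output after $maxAttempts attempts", lastError)
}

// Simulated model: the first reply wraps the value in prose, the second is clean.
val simulatedReplies = listOf("Sure! The budget is 750", "750.0")
val parsedBudget = generateWithRetry(
    maxAttempts = 3,
    generate = { attempt -> simulatedReplies[(attempt - 1).coerceAtMost(simulatedReplies.lastIndex)] },
    parse = { it.trim().toDouble() }
)
```

Each retry costs another model call, so keep `maxAttempts` small and prefer schema-constrained generation where the provider supports it.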

Let's create a typed itinerary structure:

In [30]:
// Typed models for structured outputs

data class ItineraryPlan(
    val destinationOverview: String,
    val highlights: List<String>,
    val dailySchedule: List<ItineraryDay>,
    val bookingReminders: List<String>,
    val estimatedBudget: Double
)

data class ItineraryDay(
    val day: String,
    val theme: String,
    val activities: List<String>,
    val diningRecommendation: String
)

println("Structured output models defined")
println()
println("ItineraryPlan fields:")
println("  - destinationOverview: String")
println("  - highlights: List<String>")
println("  - dailySchedule: List<ItineraryDay>")
println("  - bookingReminders: List<String>")
println("  - estimatedBudget: Double")

Structured output models defined

ItineraryPlan fields:
  - destinationOverview: String
  - highlights: List<String>
  - dailySchedule: List<ItineraryDay>
  - bookingReminders: List<String>
  - estimatedBudget: Double


In [31]:
// BeanOutputConverter: derive format instructions from a type, then parse a JSON reply

// Create a converter for our model type
val converter = BeanOutputConverter(ItineraryPlan::class.java)

// Get the format instructions (including the JSON Schema) that are appended to the prompt
val schema = converter.format

println("BeanOutputConverter Example:")
println()
println("1. Created converter for ItineraryPlan")
println("2. Generated JSON Schema:")
println(schema)
println()

// Simulate a JSON response from the model
val mockJsonResponse = """
{
  "destinationOverview": "Rome offers ancient history, world-class cuisine, and stunning architecture",
  "highlights": ["Colosseum", "Vatican Museums", "Trevi Fountain"],
  "dailySchedule": [
    {
      "day": "Day 1",
      "theme": "Ancient Rome",
      "activities": ["Visit Colosseum", "Walk Roman Forum", "Explore Palatine Hill"],
      "diningRecommendation": "Traditional trattoria in Trastevere"
    }
  ],
  "bookingReminders": ["Book Colosseum tickets in advance", "Reserve Vatican tour"],
  "estimatedBudget": 750.0
}
""".trimIndent()

// Convert JSON to typed object
val itinerary = converter.convert(mockJsonResponse)

println("3. Converted JSON response to typed object:")
println("   Destination: ${itinerary.destinationOverview}")
println("   Highlights: ${itinerary.highlights.joinToString(", ")}")
println("   Budget: $${itinerary.estimatedBudget}")
println("   Days: ${itinerary.dailySchedule.size}")

BeanOutputConverter Example:

1. Created converter for ItineraryPlan
2. Generated JSON Schema:
Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```{
  "$schema" : "https://json-schema.org/draft/2020-12/schema",
  "type" : "object",
  "properties" : {
    "bookingReminders" : {
      "type" : "array",
      "items" : {
        "type" : "string"
      }
    },
    "dailySchedule" : {
      "type" : "array",
      "items" : {
        "type" : "object",
        "properties" : {
          "activities" : {
            "type" : "array",
            "items" : {
              "type" : "string"
            }
          },
          "day" : {
            "type" : "string"
          },
          "diningRecommendation" : {
         

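java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot construct instance of `Line_35_jupyter$ItineraryPlan` (no Creators, like default constructor, exist): cannot deserialize from Object value (no delegate- or property-based Creator)
 at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 2, column: 3]

The format instructions are generated, but `converter.convert(...)` then fails. Jackson cannot instantiate `ItineraryPlan` because a Kotlin data class has no default constructor, and Jackson only learns to call the primary constructor once the Kotlin module (`com.fasterxml.jackson.module:jackson-module-kotlin`) is on the classpath and registered with the `ObjectMapper` the converter uses. Adding that dependency in the setup cell is the usual fix.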

---

## 7. Tool Calling with Spring AI

### 7.1 Why Tools Matter

LLMs cannot access live weather, reservations, or private customer data. **Tool calling** lets the model request application functions whenever it needs extra information or actions.

### 7.2 Defining Tools

Tools are methods annotated with `@Tool` that the LLM can invoke when it needs external data or actions:

In [32]:
// Tool calling simulation

class VoyagerToolsSimulation {
    private val attractionData = mapOf(
        "Rome" to listOf("Colosseum", "Vatican Museums", "Trevi Fountain", "Pantheon"),
        "Tokyo" to listOf("Senso-ji", "Tokyo Skytree", "Meiji Shrine", "Shibuya"),
        "Barcelona" to listOf("Sagrada Familia", "Park Güell", "La Rambla")
    )
    
    // Would have @Tool annotation in production
    fun findAttractions(city: String, limit: Int = 5): String {
        val attractions = attractionData[city] ?: return "No data for $city"
        return attractions.take(limit).joinToString(", ")
    }
    
    fun estimateBudget(city: String, days: Int): Double {
        val dailyRates = mapOf("Rome" to 150.0, "Tokyo" to 200.0, "Barcelona" to 130.0)
        return (dailyRates[city] ?: 100.0) * days
    }
}

val tools = VoyagerToolsSimulation()

// Test the tools
println("Tool: findAttractions")
println("  Rome (top 3): ${tools.findAttractions("Rome", 3)}")
println("  Tokyo (all): ${tools.findAttractions("Tokyo")}")
println()

println("Tool: estimateBudget")
println("  Rome, 5 days: $${tools.estimateBudget("Rome", 5)}")
println("  Tokyo, 7 days: $${tools.estimateBudget("Tokyo", 7)}")
println()

println("LLM calls tools automatically when needed")
println("See: VoyagerTools.java in /src")

Tool: findAttractions
  Rome (top 3): Colosseum, Vatican Museums, Trevi Fountain
  Tokyo (all): Senso-ji, Tokyo Skytree, Meiji Shrine, Shibuya

Tool: estimateBudget
  Rome, 5 days: $750.0
  Tokyo, 7 days: $1400.0

LLM calls tools automatically when needed
See: VoyagerTools.java in /src
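
Behind "the LLM calls tools automatically" sits a dispatch step: the model emits a tool name with arguments, and the application looks up and runs the matching function. The sketch below shows that step in isolation. `ToolRequest` and `ToolRegistry` are hypothetical types that mirror the idea only; they are not Spring AI classes.

```kotlin
// A tool request as a model might express it: a tool name plus named arguments.
data class ToolRequest(val name: String, val arguments: Map<String, String>)

class ToolRegistry {
    private val handlers = mutableMapOf<String, (Map<String, String>) -> String>()

    fun register(name: String, handler: (Map<String, String>) -> String) {
        handlers[name] = handler
    }

    // Run the requested tool, or return an error message the model can react to.
    fun dispatch(request: ToolRequest): String {
        val handler = handlers[request.name] ?: return "Error: unknown tool '${request.name}'"
        return try {
            handler(request.arguments)
        } catch (e: Exception) {
            "Error: ${e.message}"
        }
    }
}

val toolRegistry = ToolRegistry().apply {
    register("estimateBudget") { args ->
        val days = args.getValue("days").toInt()
        val dailyRate = mapOf("Rome" to 150.0, "Tokyo" to 200.0)[args.getValue("city")] ?: 100.0
        (dailyRate * days).toString()
    }
}
val toolResult = toolRegistry.dispatch(
    ToolRequest("estimateBudget", mapOf("city" to "Rome", "days" to "5"))
)
```

Returning errors as text instead of throwing lets the model see what went wrong and either correct its arguments or explain the problem to the user.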


### 7.3 Tool Context and Direct Returns

- Use `ToolContext` to pass hidden operational data (tenant IDs, request IDs) that should not live in the model prompt
- Set `returnDirect = true` when the tool output should bypass the model, for example returning a generated PDF itinerary or RAG search snippet directly to the caller

---

## 8. Bringing Data to the Model

Spring AI supports three complementary strategies:

### Fine-tuning
Retrains the model with domain-specific data. Expensive and rarely necessary for itinerary planning but helps with niche terminology.

### Retrieval-Augmented Generation (RAG)
Embeds documents, stores them in a vector database, and injects the top semantic matches into the prompt. This approach keeps prompts concise while grounding answers in current knowledge.

Process:
1. Extract content
2. Split it into coherent chunks
3. Embed
4. Store the vectors
5. At query time, retrieve via similarity
6. Compose the augmented prompt
7. Call the model
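
The retrieval steps can be sketched end to end with a toy embedding. Word counts stand in for a learned embedding model, and `TinyVectorStore` is a hypothetical in-memory store, so this shows the mechanics (embed, store, rank by cosine similarity, compose the prompt) without the quality of a real system.

```kotlin
import kotlin.math.sqrt

// Toy "embedding": lowercase word counts.
fun embed(text: String): Map<String, Double> =
    text.lowercase().split(Regex("\\W+")).filter { it.isNotBlank() }
        .groupingBy { it }.eachCount().mapValues { it.value.toDouble() }

// Cosine similarity between two sparse vectors; 0.0 when either is empty.
fun cosineSimilarity(a: Map<String, Double>, b: Map<String, Double>): Double {
    val dot = a.entries.sumOf { (term, weight) -> weight * (b[term] ?: 0.0) }
    val normA = sqrt(a.values.sumOf { it * it })
    val normB = sqrt(b.values.sumOf { it * it })
    return if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

class TinyVectorStore {
    private val entries = mutableListOf<Pair<String, Map<String, Double>>>()

    fun add(chunk: String) { entries.add(chunk to embed(chunk)) }

    // Return up to topK chunks with non-zero similarity, best match first.
    fun search(query: String, topK: Int): List<String> {
        val queryVector = embed(query)
        return entries
            .map { (chunk, vector) -> chunk to cosineSimilarity(queryVector, vector) }
            .filter { it.second > 0.0 }
            .sortedByDescending { it.second }
            .take(topK)
            .map { it.first }
    }
}

// Compose the augmented prompt from the retrieved chunks and the question.
fun buildAugmentedPrompt(question: String, store: TinyVectorStore, topK: Int = 2): String {
    val context = store.search(question, topK)
    return "Context:\n${context.joinToString("\n") { "- $it" }}\n\nQuestion: $question"
}

val guideStore = TinyVectorStore().apply {
    add("Paris metro system is efficient, buy a Navigo pass")
    add("Rome features Colosseum, Vatican City, and Trevi Fountain")
    add("Tokyo combines modern tech with traditional temples")
}
val metroPrompt = buildAugmentedPrompt("How do I use the Paris metro?", guideStore)
```

Only the Paris chunk shares words with the question, so it is the only context injected. A learned embedding would also match paraphrases such as "subway" that word counts miss.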

### Tool calling
Accesses external APIs (weather, flights, loyalty data) and takes actions such as bookings or notifications in real time.

**VoyagerMate relies on the latter two:** the model's built-in knowledge covers general travel questions, RAG supplies curated guides, and tools supply live data.

---

## 9. Advisors: Cross-Cutting Behaviours

Advisors are Spring AI interceptors that can modify the request before it reaches the provider and the response on its way back:

- `QuestionAnswerAdvisor` attaches RAG results from a `VectorStore`
- `MessageChatMemoryAdvisor` stores conversational history and replays it into each prompt
- `SimpleLoggerAdvisor` logs the requests and responses that pass through the chain

**Ordering matters:** add memory first, retrieval second, logging last.
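
Why ordering matters is easiest to see in a stripped-down chain. In this sketch each advisor receives the request plus a `next` function, and the advisor with the lowest order value runs first. `PromptAdvisor`, `PrefixAdvisor`, `RecordingAdvisor`, and `runChain` are hypothetical names; Spring AI's real advisor interfaces are richer than this.

```kotlin
// Hypothetical advisor contract: transform the request, then delegate to the next link.
interface PromptAdvisor {
    val order: Int
    fun advise(request: String, next: (String) -> String): String
}

// Tags the request, standing in for an advisor that adds memory or retrieved context.
class PrefixAdvisor(override val order: Int, private val label: String) : PromptAdvisor {
    override fun advise(request: String, next: (String) -> String): String =
        next("[$label] $request")
}

// Records what it sees, standing in for a logging advisor.
class RecordingAdvisor(override val order: Int) : PromptAdvisor {
    val seenRequests = mutableListOf<String>()
    override fun advise(request: String, next: (String) -> String): String {
        seenRequests.add(request)   // sees whatever earlier advisors produced
        return next(request)
    }
}

// Wrap the model call from the inside out so the lowest order ends up outermost.
fun runChain(advisors: List<PromptAdvisor>, request: String, model: (String) -> String): String {
    val chain = advisors.sortedByDescending { it.order }
        .fold(model) { next, advisor -> { req -> advisor.advise(req, next) } }
    return chain(request)
}

val auditLogger = RecordingAdvisor(order = 2)
val chainResult = runChain(
    advisors = listOf(auditLogger, PrefixAdvisor(1, "rag"), PrefixAdvisor(0, "memory")),
    request = "How do I get around?",
    model = { prompt -> "echo: $prompt" }
)
```

Because the logger runs last, it records the fully augmented request. Placing it first would log only the user's original text and hide what memory and retrieval added.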

In [33]:
// Advisors (simplified simulation of memory and retrieval advisors)

// Memory Advisor
class MemoryAdvisor {
    private val conversations = mutableMapOf<String, MutableList<String>>()
    
    fun addMessage(sessionId: String, message: String) {
        conversations.getOrPut(sessionId) { mutableListOf() }.add(message)
    }
    
    fun getHistory(sessionId: String) = conversations[sessionId] ?: emptyList()
    
    fun augmentPrompt(sessionId: String, newPrompt: String): String {
        val history = getHistory(sessionId)
        return if (history.isEmpty()) {
            newPrompt
        } else {
            "Previous conversation:\n${history.takeLast(3).joinToString("\n")}\n\nNew message: $newPrompt"
        }
    }
}

// RAG Advisor
class RAGAdvisor {
    private val documents = listOf(
        "Paris has the Eiffel Tower, Louvre Museum, and Arc de Triomphe",
        "Best time to visit Paris is April-June and September-October",
        "Paris metro system is efficient, buy a Navigo pass",
        "Rome features Colosseum, Vatican City, and Trevi Fountain",
        "Tokyo combines modern tech with traditional temples"
    )
    
    fun retrieveRelevant(query: String, topK: Int = 3): List<String> {
        return documents
            .map { doc -> doc to calculateRelevance(doc, query) }
            .sortedByDescending { it.second }
            .take(topK)
            .map { it.first }
    }
    
    private fun calculateRelevance(doc: String, query: String): Int {
        val queryWords = query.lowercase().split(" ")
        return queryWords.count { doc.lowercase().contains(it) }
    }
    
    fun augmentPrompt(query: String, userPrompt: String): String {
        val relevant = retrieveRelevant(query)
        return "Context:\n${relevant.joinToString("\n")}\n\nQuestion: $userPrompt"
    }
}

// Execute: Use advisors
println("=== Advisors Demo ===")
println()

val memoryAdvisor = MemoryAdvisor()
val ragAdvisor = RAGAdvisor()

// Simulate a conversation with memory
val sessionId = "user-456"

println("1. Memory Advisor:")
memoryAdvisor.addMessage(sessionId, "User: Tell me about Paris")
memoryAdvisor.addMessage(sessionId, "Assistant: Paris is a beautiful city...")

val promptWithMemory = memoryAdvisor.augmentPrompt(sessionId, "What about transportation?")
println(promptWithMemory)
println()

// Simulate RAG retrieval
println("2. RAG Advisor:")
val query = "Paris travel tips"
val relevantDocs = ragAdvisor.retrieveRelevant(query)
println("   Query: $query")
println("   Retrieved ${relevantDocs.size} documents:")
relevantDocs.forEach { println("   - $it") }
println()

val promptWithRAG = ragAdvisor.augmentPrompt(query, "How do I get around?")
println("   Augmented prompt:")
println(promptWithRAG.lines().take(4).joinToString("\n"))

=== Advisors Demo ===

1. Memory Advisor:
Previous conversation:
User: Tell me about Paris
Assistant: Paris is a beautiful city...

New message: What about transportation?

2. RAG Advisor:
   Query: Paris travel tips
   Retrieved 3 documents:
   - Paris has the Eiffel Tower, Louvre Museum, and Arc de Triomphe
   - Best time to visit Paris is April-June and September-October
   - Paris metro system is efficient, buy a Navigo pass

   Augmented prompt:
Context:
Paris has the Eiffel Tower, Louvre Museum, and Arc de Triomphe
Best time to visit Paris is April-June and September-October
Paris metro system is efficient, buy a Navigo pass


---

## 10. Putting the Patterns Together

### 10.1 Conversational Concierge

Building a session-aware chat endpoint:

In [None]:
// Conversational endpoint (simulation with keyword-based replies)

class ConversationalChatBot {
    private val sessionMemory = mutableMapOf<String, MutableList<Pair<String, String>>>()
    
    fun chat(userMessage: String, sessionId: String): String {
        val history = sessionMemory.getOrPut(sessionId) { mutableListOf() }
        history.add("user" to userMessage)
        
        // Simulate response based on context
        val response = when {
            userMessage.contains("weather", ignoreCase = true) -> 
                "Rome has beautiful spring weather, around 20°C (68°F) in May."
            userMessage.contains("hotel", ignoreCase = true) -> 
                "I recommend staying near Termini Station for easy access to attractions."
            else -> 
                "I can help with that! (Context: ${history.size - 1} previous messages)"
        }
        
        history.add("assistant" to response)
        return response
    }
    
    fun getConversationHistory(sessionId: String) = sessionMemory[sessionId] ?: emptyList()
}

// Create chatbot
val chatBot = ConversationalChatBot()

// Execute a conversation
println("=== Conversational Chat Demo ===")
println()

println("User: Tell me about Rome")
val r1 = chatBot.chat("Tell me about Rome", "session-123")
println("Bot: $r1")
println()

println("User: What about the weather?")
val r2 = chatBot.chat("What about the weather?", "session-123")
println("Bot: $r2")
println()

println("User: Recommend a hotel")
val r3 = chatBot.chat("Recommend a hotel", "session-123")
println("Bot: $r3")
println()

println("=== Full Conversation History ===")
chatBot.getConversationHistory("session-123").forEachIndexed { idx, (role, msg) ->
    println("${idx + 1}. ${role.uppercase()}: $msg")
}

### 10.2 Image-to-Itinerary Insight

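The cell below simulates photo analysis with a lookup table keyed by photo identifier, standing in for a multimodal model call that would return a `DestinationAnalysis`: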

In [None]:
// Image analysis - ACTUAL EXECUTION

data class DestinationAnalysis(
    val location: String,
    val landmarks: List<String>,
    val bestTimeToVisit: String,
    val estimatedCost: String,
    val similarDestinations: List<String>
)

class PhotoAnalyzer {
    private val knownLocations = mapOf(
        "eiffel" to DestinationAnalysis(
            location = "Paris, France",
            landmarks = listOf("Eiffel Tower", "Louvre Museum", "Arc de Triomphe"),
            bestTimeToVisit = "April-June, September-October",
            estimatedCost = "$150-200 per day",
            similarDestinations = listOf("London", "Rome", "Barcelona")
        ),
        "colosseum" to DestinationAnalysis(
            location = "Rome, Italy",
            landmarks = listOf("Colosseum", "Vatican", "Trevi Fountain"),
            bestTimeToVisit = "April-May, September-October",
            estimatedCost = "$120-180 per day",
            similarDestinations = listOf("Athens", "Barcelona", "Florence")
        ),
        "tokyo-tower" to DestinationAnalysis(
            location = "Tokyo, Japan",
            landmarks = listOf("Tokyo Tower", "Senso-ji Temple", "Meiji Shrine"),
            bestTimeToVisit = "March-May, September-November",
            estimatedCost = "$180-250 per day",
            similarDestinations = listOf("Osaka", "Kyoto", "Seoul")
        )
    )
    
    fun analyzePhoto(photoIdentifier: String): DestinationAnalysis {
        // Unknown identifiers fall back to the Paris entry so the demo always returns a result
        return knownLocations[photoIdentifier] ?: knownLocations.getValue("eiffel")
    }
}

val analyzer = PhotoAnalyzer()

// Execute photo analysis
println("=== Photo Analysis Demo ===")
println()

val photos = listOf("eiffel", "colosseum", "tokyo-tower")

photos.forEach { photo ->
    val analysis = analyzer.analyzePhoto(photo)
    
    println("Photo: $photo")
    println("   Location: ${analysis.location}")
    println("   Landmarks: ${analysis.landmarks.joinToString(", ")}")
    println("   Best time: ${analysis.bestTimeToVisit}")
    println("   Daily cost: ${analysis.estimatedCost}")
    println("   Similar: ${analysis.similarDestinations.take(2).joinToString(", ")}")
    println()
}
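
`analyzePhoto` above returns the Paris entry for any identifier it does not recognize, which can hide a miss. One alternative, sketched here on the assumption that callers prefer an explicit "not recognized" outcome, is a sealed result type. `PhotoResult`, `Recognized`, `Unrecognized`, `catalog`, and `lookup` are hypothetical names, and the catalog is trimmed to location strings to keep the sketch self-contained.

```kotlin
// Hypothetical result type: a lookup either recognizes the photo or reports a miss.
sealed interface PhotoResult {
    data class Recognized(val location: String) : PhotoResult
    data class Unrecognized(val photoIdentifier: String) : PhotoResult
}

// Trimmed stand-in for the knownLocations table used in the demo.
val catalog = mapOf(
    "eiffel" to "Paris, France",
    "colosseum" to "Rome, Italy",
    "tokyo-tower" to "Tokyo, Japan"
)

fun lookup(photoIdentifier: String): PhotoResult =
    catalog[photoIdentifier]
        ?.let { PhotoResult.Recognized(it) }
        ?: PhotoResult.Unrecognized(photoIdentifier)

listOf("colosseum", "big-ben").forEach { photo ->
    // `when` over a sealed type must be exhaustive, so the miss case has to be handled.
    val line = when (val result = lookup(photo)) {
        is PhotoResult.Recognized -> "Photo: $photo -> ${result.location}"
        is PhotoResult.Unrecognized -> "Photo: $photo -> not recognized"
    }
    println(line)
}
```

With this shape, the caller decides what a miss means (ask for another photo, show a default) instead of silently receiving Paris.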

### 10.3 Orchestrated Itinerary Generation

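The cell below combines simulated RAG (keyword lookup over guide snippets), tools (attraction and budget tables), and memory (per-session history) into one trip planner: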

In [None]:
// Complete system: RAG + Tools + Memory - ACTUAL EXECUTION

class TravelPlanningSystem {
    // Tools
    private val attractions = mapOf(
        "Rome" to listOf("Colosseum", "Vatican Museums", "Trevi Fountain", "Pantheon", "Spanish Steps"),
        "Tokyo" to listOf("Senso-ji Temple", "Tokyo Skytree", "Meiji Shrine", "Tsukiji Market"),
        "Barcelona" to listOf("Sagrada Familia", "Park Güell", "La Rambla", "Gothic Quarter")
    )
    
    private val budgets = mapOf(
        "Rome" to 150.0,
        "Tokyo" to 200.0,
        "Barcelona" to 130.0
    )
    
    // RAG documents
    private val travelGuides = listOf(
        "Rome: Ancient capital with 2,800 years of history. Famous for Colosseum and Vatican.",
        "Tokyo: Ultra-modern metropolis blending tradition with innovation. Known for temples and technology.",
        "Barcelona: Gaudi's architectural masterpiece. Mediterranean beaches meet Gothic Quarter."
    )
    
    // Memory
    private val conversationMemory = mutableMapOf<String, MutableList<String>>()
    
    fun planTrip(request: String, sessionId: String): Map<String, Any> {
        // 1. Update memory
        val history = conversationMemory.getOrPut(sessionId) { mutableListOf() }
        history.add("Request: $request")
        
        // 2. Parse city from request
        val city = when {
            request.contains("Rome", ignoreCase = true) -> "Rome"
            request.contains("Tokyo", ignoreCase = true) -> "Tokyo"
            request.contains("Barcelona", ignoreCase = true) -> "Barcelona"
            else -> "Rome"
        }
        
        // 3. RAG: Retrieve relevant context (substring match stands in for vector search)
        val context = travelGuides.firstOrNull { it.contains(city) }.orEmpty()
        
        // 4. Tools: Get data
        val topAttractions = attractions[city]?.take(3) ?: emptyList()
        val dailyBudget = budgets[city] ?: 100.0
        
        // 5. Parse duration: first number in the request, defaulting to 3 days
        val days = Regex("\\d+").find(request)?.value?.toIntOrNull() ?: 3
        val totalBudget = dailyBudget * days
        
        // 6. Generate itinerary
        val itinerary = buildMap<String, Any> {
            put("destination", city)
            put("duration", "$days days")
            put("overview", context)
            put("topAttractions", topAttractions)
            put("dailyBudget", dailyBudget)
            put("totalBudget", totalBudget)
            put("conversationTurns", history.size)
        }
        
        history.add("Response: Itinerary for $city")
        
        return itinerary
    }
}

// Execute the system
val planningSystem = TravelPlanningSystem()

println("=== Complete Travel Planning System ===")
println()

// Test 1: Rome trip
println("Request 1: Plan a 5-day trip to Rome")
val rome = planningSystem.planTrip("Plan a 5-day trip to Rome", "user-789")
println("  Destination: ${rome["destination"]}")
println("  Duration: ${rome["duration"]}")
println("  Overview: ${rome["overview"]}")
println("  Top attractions: ${rome["topAttractions"]}")
println("  Daily budget: $${rome["dailyBudget"]}")
println("  Total budget: $${rome["totalBudget"]}")
println("  Conversation: ${rome["conversationTurns"]} turns")
println()

// Test 2: Tokyo trip (same session)
println("Request 2: Actually, what about Tokyo for 7 days?")
val tokyo = planningSystem.planTrip("Actually, what about Tokyo for 7 days?", "user-789")
println("  Destination: ${tokyo["destination"]}")
println("  Duration: ${tokyo["duration"]}")
println("  Total budget: $${tokyo["totalBudget"]}")
println("  Conversation: ${tokyo["conversationTurns"]} turns (memory preserved)")
println()

// Test 3: Barcelona trip (new session)
println("Request 3: Barcelona 3 day trip")
val barcelona = planningSystem.planTrip("Barcelona 3 day trip", "user-999")
println("  Destination: ${barcelona["destination"]}")
println("  Total budget: $${barcelona["totalBudget"]}")
println("  Conversation: ${barcelona["conversationTurns"]} turns (new session)")
println()

println("System combines:")
println("  - RAG (travel guide context)")
println("  - Tools (attractions, budget calculation)")
println("  - Memory (conversation history)")
println("  - Structured output (map of named fields)")
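
`planTrip` returns a `Map<String, Any>`, so callers look fields up by string key and get untyped values back. A data class would give the result a compiler-checked shape instead. The sketch below is one assumption about how that could look; `Itinerary` and `buildItinerary` are hypothetical names, and the figures reuse the Rome values from the demo.

```kotlin
// Hypothetical typed counterpart to the Map<String, Any> returned by planTrip.
data class Itinerary(
    val destination: String,
    val days: Int,
    val topAttractions: List<String>,
    val dailyBudget: Double
) {
    // Derived rather than stored, so it cannot drift from its inputs.
    val totalBudget: Double get() = dailyBudget * days
}

fun buildItinerary(city: String, days: Int, attractions: List<String>, dailyBudget: Double): Itinerary {
    require(days > 0) { "days must be positive" }
    // Keep the top three attractions, as the demo does.
    return Itinerary(city, days, attractions.take(3), dailyBudget)
}

val romeTrip = buildItinerary(
    city = "Rome",
    days = 5,
    attractions = listOf("Colosseum", "Vatican Museums", "Trevi Fountain", "Pantheon", "Spanish Steps"),
    dailyBudget = 150.0
)

// Fields are accessed by name with their real types: no casts, no string keys.
println("${romeTrip.destination}: ${romeTrip.days} days, total budget ${romeTrip.totalBudget}")
```

A misspelled field name here fails at compile time, whereas a misspelled map key only surfaces as `null` at runtime.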

## Summary

You have learned the foundations of Spring AI:

- LLM Fundamentals: Stateless, token-based, probabilistic models
- Prompt Engineering: Message roles, templates, context engineering
- ChatClient: Fluent API, metadata, streaming, default configuration
- Multimodal: Image analysis, audio transcription, speech generation
- Structured Outputs: Type-safe JSON with BeanOutputConverter
- Tool Calling: Extending LLMs with external functions
- Data Strategies: Fine-tuning, RAG, tool integration
- Advisors: Cross-cutting concerns (memory, retrieval, logging)
- Production Patterns: Conversational APIs, image analysis, orchestration

## References

- [Spring AI ChatClient Reference](https://docs.spring.io/spring-ai/reference/api/chatclient.html)
- [Spring AI Prompt and Message Reference](https://docs.spring.io/spring-ai/reference/api/prompt.html)
- [Anthropic: Effective Context Engineering](https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents)
- [Spring AI OpenAI Chat (Images and Audio)](https://docs.spring.io/spring-ai/reference/api/chat/openai-chat.html)
- [OpenAI Structured Outputs Guide](https://platform.openai.com/docs/guides/structured-outputs)
- [Spring AI Tools API and Function Calling](https://docs.spring.io/spring-ai/reference/api/tools.html)

## Next Steps

Continue to the next notebook: [Agentic Patterns](./agentic-patterns.ipynb) to learn about workflows, agents, and advanced orchestration patterns.