# Introduction to Using Spring AI with Kotlin

This notebook provides an introductory tutorial on using Spring AI in Kotlin to interact with large language models through an OpenAI example. We'll walk through the process step by step, covering configuration, using prompts, handling streaming responses, obtaining structured data, and utilizing tools.

### Setting Up Your Project

Ensure that your project includes the necessary Spring AI dependencies:

In [1]:
USE {
    dependencies {
        val springAiVersion = "1.0.0-M6"
        implementation("org.springframework.ai:spring-ai-openai:$springAiVersion")
        implementation("org.springframework.ai:spring-ai-openai-spring-boot-starter:$springAiVersion")
        implementation("com.fasterxml.jackson.module:jackson-module-kotlin:2.18.2")
    }
}

Provide your OpenAI API key by setting the `OPENAI_API_KEY` environment variable. Alternatively, you can paste it here in place of the placeholder:

In [2]:
val apiKey = System.getenv("OPENAI_API_KEY") ?: "YOUR_OPENAI_API_KEY"
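If the variable is unset and the placeholder is left in place, the first API call fails with an authentication error. A minimal sketch of a fail-fast check follows; `requireApiKey` is a hypothetical helper written for this illustration, not part of Spring AI:

```kotlin
// Hypothetical helper: returns the trimmed key, or throws if it is
// missing, blank, or still the placeholder from the cell above.
fun requireApiKey(key: String?): String {
    val trimmed = key?.trim().orEmpty()
    require(trimmed.isNotEmpty() && trimmed != "YOUR_OPENAI_API_KEY") {
        "Set the OPENAI_API_KEY environment variable or paste your key into the notebook"
    }
    return trimmed
}
```

You could then write `val apiKey = requireApiKey(System.getenv("OPENAI_API_KEY"))` so that a missing key is reported before any request is sent.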

Set up the OpenAI chat model with your API key and configure the desired settings, such as temperature and model type:

In [3]:
import org.springframework.ai.openai.OpenAiChatModel
import org.springframework.ai.openai.OpenAiChatOptions
import org.springframework.ai.openai.api.OpenAiApi

val openAiApi = OpenAiApi(apiKey)
val openAiChatOptions = OpenAiChatOptions.builder()
    .model(OpenAiApi.ChatModel.GPT_4_O_MINI)
    .temperature(0.7)
    .build()
val chatModel = OpenAiChatModel(openAiApi, openAiChatOptions)

### Sending Prompts

Interact with the API by sending a prompt to the chat model and receiving a response:

In [4]:
chatModel.call("Generate a hokku about Kotlin")

Kotlin whispers soft,  
In the realm of code it flows,  
Concise, clear, and bright.

Use Spring AI's `ChatClient` to create more complex prompts, such as providing system instructions:

In [5]:
import org.springframework.ai.chat.client.ChatClient

val chatClient = ChatClient.builder(chatModel).defaultSystem(
    """
    You are a Lord of the Rings expert and a trusted advisor.
    Offer wise, concise guidance in the style of Middle-earth,
    drawing from its lore, characters, and philosophy.
    """.trimIndent()
).build()

Now you can send a user-defined prompt to the chat model and retrieve the response content as a `String`:

In [6]:
chatClient
    .prompt()
    .user("What awaits us?")
    .call()
    .content()


Ah, dear seeker of wisdom, what awaits you is but a tapestry woven from the threads of choice, fate, and the will of the One. In the realms of Middle-earth, the path is often shrouded in shadow and light alike.

Much like Frodo upon his journey to Mount Doom, you may find trials that test your resolve and courage. Yet remember, even the smallest person can change the course of the future. Seek the counsel of friends, for fellowship is a beacon in the darkest of times. 

Embrace the uncertainty, for therein lies the adventure. Heed the words of Gandalf, who spoke of the importance of the choices we make: "All we have to decide is what to do with the time that is given us." Thus, prepare your heart for the journey ahead, and walk it with honor and hope.

Try replacing the `content()` call with `chatResponse()` to gain deeper insight into the response. `ChatResponse` represents the AI model's reply and includes metadata on how it was generated, such as the number of tokens used.

### Handling Streaming Responses

Using the `stream()` method, you receive partial chunks of the response as soon as they're ready. This approach allows you to avoid waiting for the AI to generate the entire response and enables you to display real-time progress to users.

Include the coroutine dependency to work with the result as a Kotlin `Flow`:

In [7]:
%useLatestDescriptors
%use coroutines
@file:DependsOn("org.jetbrains.kotlinx:kotlinx-coroutines-reactive:1.10.1")
@file:DependsOn("org.jetbrains.kotlinx:kotlinx-coroutines-reactor:1.10.1")

In a reactive UI, you can show the incoming response in real time. To keep this example simple, we print each chunk as soon as it arrives, so the text builds up incrementally in the cell output:

In [8]:
import kotlinx.coroutines.reactive.asFlow

val streamingResponse: Flow<String> = chatModel
    .stream("Generate a hokku about Kotlin")
    .asFlow()

runBlocking {
    streamingResponse.collect {
        print(it)
    }
}


Code flows like water,  
Kotlin's breeze through the system,  
Clean, concise, and bright.

Since `collect` is a suspend function, we wrap it inside a `runBlocking` call to use it within a notebook.
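If you also need the complete text once streaming finishes (for example, to store it), you can accumulate the chunks while printing them. The sketch below assumes the coroutines libraries loaded earlier in the notebook. It uses `flowOf` with invented chunks as a stand-in for `chatModel.stream(...).asFlow()`, so it runs without an API call. The names `simulatedChunks` and `collectAndJoin` are hypothetical:

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flowOf
import kotlinx.coroutines.flow.fold
import kotlinx.coroutines.runBlocking

// Stand-in for chatModel.stream(...).asFlow(); these chunks are invented for illustration.
val simulatedChunks: Flow<String> = flowOf("Code flows ", "like ", "water")

// Prints each chunk as it arrives and returns the concatenated text at the end.
fun collectAndJoin(chunks: Flow<String>): String = runBlocking {
    chunks.fold("") { soFar, chunk ->
        print(chunk)
        soFar + chunk
    }
}

val fullText = collectAndJoin(simulatedChunks)
```

Passing the real streaming `Flow<String>` to `collectAndJoin` should behave the same way, with the chunks arriving as the model produces them.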

### Structured Output

Spring AI can automatically deserialize responses into Kotlin data classes, making it easy to handle structured outputs.

Let's ask the LLM about a movie and receive the answer in a format we define:

In [9]:
data class Movie(
    val title: String,
    val year: Int,
    val director: String,
    val genre: String
)

Specify the `ResponseFormat` as `JSON_OBJECT` to instruct the LLM to return the output strictly in JSON, enabling Spring AI to automatically convert it into a `data` class:

In [10]:
import org.springframework.ai.openai.api.ResponseFormat

val structuredOutputOptions = OpenAiChatOptions.builder()
    .model(OpenAiApi.ChatModel.GPT_4_O)
    .responseFormat(ResponseFormat.builder().type(ResponseFormat.Type.JSON_OBJECT).build())
    .build()
val chatModelWithStructuredOutput = OpenAiChatModel(openAiApi, structuredOutputOptions)

In the following example, OpenAI returns the requested JSON, which is automatically converted into a `Movie`:

In [11]:
ChatClient.create(chatModelWithStructuredOutput)
    .prompt()
    .user("Movie that won the Oscar for Best Picture in 1990")
    .call()
    .entity(Movie::class.java)

Movie(title=Driving Miss Daisy, year=1990, director=Bruce Beresford, genre=Comedy-drama)

AI models can hallucinate and aren't guaranteed to return correct answers. They may also fail to produce the structured output as requested and return something different, such as JSON with additional comments. Larger models tend to produce the expected output more consistently. In this example, selecting `GPT_4_O` rather than `GPT_4_O_MINI` yields both the correct movie choice ('Driving Miss Daisy') and properly formatted JSON. For real-life applications, consider implementing a validation mechanism to ensure the model's output matches the desired format.
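One possible shape for such a validation step is a function that inspects the deserialized object and reports problems. This is only a sketch: the `validateMovie` function and its specific rules (non-blank fields, a plausible year range) are assumptions chosen for illustration, and the `Movie` class is repeated so the block is self-contained:

```kotlin
// Same shape as the Movie class defined earlier, repeated so the sketch is self-contained.
data class Movie(
    val title: String,
    val year: Int,
    val director: String,
    val genre: String
)

// Hypothetical validator: returns a list of problems; an empty list means
// the object passed these (assumed) checks. The year bounds are arbitrary.
fun validateMovie(movie: Movie, latestYear: Int = 2100): List<String> = buildList {
    if (movie.title.isBlank()) add("title is blank")
    if (movie.year !in 1888..latestYear) add("year ${movie.year} is out of range")
    if (movie.director.isBlank()) add("director is blank")
    if (movie.genre.isBlank()) add("genre is blank")
}
```

A caller could retry the request or fall back to a default when the returned list is not empty. Checks like these catch malformed output only; they cannot tell whether the model picked the right movie.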

### Using Tools

Tools allow LLMs to access your custom services in a powerful and flexible way. Let's use tools to work with OpenAI's function-calling feature and implement a weather service query.

Without additional tools, the model won't provide information about the current weather, responding instead that it's unable to offer real-time weather updates:


In [12]:
chatModel.call("What's the weather like in Paris today?")

I'm unable to provide real-time weather updates. To find out the current weather in Paris, I recommend checking a reliable weather website or app.

Let's imagine we have a weather service providing weather information for different locations. By using tools, we can give OpenAI access to this service. In this tutorial, we'll use `mockWeatherService` to simulate such a service:

In [13]:
fun mockWeatherService(location: String): Double? = when {
    "Paris" in location -> 15.0
    "Tokyo" in location -> 10.0
    "San Francisco" in location -> 30.0
    else -> null
}

We need to grant the model access to the weather tool. First, we define a `FunctionTool` named `"getCurrentWeather"` with the description `"Get current temperature for a given location."` Its parameter schema declares one required property, `"location"`, of type `string`:

In [14]:
import org.springframework.ai.model.ModelOptionsUtils

val functionTool = OpenAiApi.FunctionTool(
    OpenAiApi.FunctionTool.Type.FUNCTION,
    OpenAiApi.FunctionTool.Function(
        "Get current temperature for a given location.",
        "getCurrentWeather", ModelOptionsUtils.jsonToMap(
            """
                {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City and country e.g. Bogotá, Colombia"
                        }
                    },
                    "required": ["location"],
                    "additionalProperties": false
                }
                """.trimIndent()
        ),
        true
    )
)

Now, we send the user's question along with the list of available tools.

In [15]:
import org.springframework.ai.openai.api.OpenAiApi.*
import org.springframework.ai.openai.api.OpenAiApi.ChatCompletionRequest.ToolChoiceBuilder

val initialUserMessage = ChatCompletionMessage(
    "What's the weather like in Paris today?",
    ChatCompletionMessage.Role.USER
)
val chatCompletionRequest = ChatCompletionRequest(
    listOf(initialUserMessage), "gpt-4o",
    listOf(functionTool), ToolChoiceBuilder.AUTO
)

Depending on the user's question, the model can now return a response containing information about the tools it chooses to use and the arguments required for those tools. If the user asks about the weather, the model selects our weather tool. If the user asks an unrelated question, the model behaves as usual. We can display the entire response to see which tools were chosen:

In [16]:
val chatCompletion = openAiApi.chatCompletionEntity(chatCompletionRequest)
val responseFromLLM = chatCompletion.body!!.choices().first().message()
responseFromLLM

ChatCompletionMessage[rawContent=null, role=ASSISTANT, name=null, toolCallId=null, toolCalls=[ToolCall[index=null, id=call_4FIXrL1H8Hv3fnNMChMIyf8A, type=function, function=ChatCompletionFunction[name=getCurrentWeather, arguments={"location":"Paris, France"}]]], refusal=null, audioOutput=null]

The response specifies the tool the LLM intends to call and its arguments:

```function=ChatCompletionFunction[name=getCurrentWeather, arguments={"location":"Paris, France"}]```

We invoke the tool and send the result back to the model so that it can generate the final response for the user—or possibly decide to call other tools based on the conversation.

In [17]:
lateinit var messageWithToolInvocation: ChatCompletionMessage
for (toolCall in responseFromLLM.toolCalls()) {
    when (val functionName = toolCall.function().name()) {
        "getCurrentWeather" -> {
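            // arguments() is a JSON string such as {"location":"Paris, France"}; extract the field
            val arguments = ModelOptionsUtils.jsonToMap(toolCall.function().arguments())
            val location = arguments["location"] as? String ?: ""
            val temperature = mockWeatherService(location)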
            messageWithToolInvocation = ChatCompletionMessage(
                if (temperature != null) "$temperature C" else "Unable to get the weather",
                ChatCompletionMessage.Role.TOOL,
                functionName, toolCall.id(), null, null, null
            )
        }
    }
}
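When several tools are available, the `when` branch above can grow unwieldy. One alternative is a small registry that maps each tool name to a handler. The sketch below is plain Kotlin and assumes the location has already been extracted from the tool call's arguments. `toolHandlers` and `runTool` are hypothetical names, and the mock service is repeated so the block is self-contained:

```kotlin
// Same mock as earlier, repeated to keep the sketch self-contained.
fun mockWeatherService(location: String): Double? = when {
    "Paris" in location -> 15.0
    "Tokyo" in location -> 10.0
    "San Francisco" in location -> 30.0
    else -> null
}

// Hypothetical registry mapping a tool name to a handler that turns
// the tool's location argument into the text sent back to the model.
val toolHandlers: Map<String, (String) -> String> = mapOf(
    "getCurrentWeather" to { location: String ->
        mockWeatherService(location)?.let { "$it C" } ?: "Unable to get the weather"
    }
)

// Looks up the handler by name; an unknown tool name yields an error message
// rather than an exception, so the model can still be told what went wrong.
fun runTool(name: String, location: String): String =
    toolHandlers[name]?.invoke(location) ?: "Unknown tool: $name"
```

The string returned by `runTool` would become the content of the `TOOL` message, in the same way as the temperature string in the loop above.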

Now, we send all the messages to the LLM to provide the full context: the initial message, the response with the tool choice, and the tool invocation result. With this information, the LLM can now answer the user's initial question about the current weather in Paris:

In [18]:
val messages = mutableListOf(initialUserMessage, responseFromLLM, messageWithToolInvocation)
val functionResponseRequest = ChatCompletionRequest(messages, "gpt-4o", 0.2)
val resultingCompletion = openAiApi.chatCompletionEntity(functionResponseRequest)
resultingCompletion.body!!.choices().first().message().content()

The current temperature in Paris, France is 15°C.

The LLM successfully used the provided tool to respond to the user. Enhancing LLMs with external tools can automate tasks such as data retrieval, customer support, and IoT control.

This notebook serves as an overview of how to integrate Spring AI into your Kotlin projects, enabling you to build powerful AI-driven applications. Experiment further with prompts and tailored implementations for your specific needs! 🚀