Make sure Python and Jupyter are set up locally first (and if necessary relaunch VSCode). There is a `shell.nix` that does this for you if you have Nix installed, but it won't work with some of the Spring AI library dependencies (the ones that have native code extensions), so it's better to use a virtual env created from the global Python.

```bash
$ python3 -m venv .venv
$ . .venv/bin/activate
$ pip install jupyter ipykernel notebook
```

Then select the kernel in the top right corner of the VSCode editor (the same place you would pick a Python interpreter); it should be `java (Rapaio/j!)`. If that is not available, install it with `jbang install-kernel@jupyter-java rapaio`. Make sure you have the latest version of `jbang`, and edit the `kernel.json` file under `~/.local/share/jupyter/kernels` to upgrade the kernel to version 2.0.0 and Java 22.

```json
{
  "argv" : [
    "/home/dsyer/.sdkman/candidates/jbang/current/bin/jbang",
    "--java",
    "22",
...
    "io.github.padreati:rapaio-jupyter-kernel:2.0.0@fatjar",
    "{connection_file}"
  ],
...
}
```

In [1]:
%classpath target/test-classes/
%classpath target/classes/

Add /home/dsyer/dev/scratch/enhance-llm/target/test-classes to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/classes to classpath

In [2]:
%%jars
target/lib

Found 108 jar files.
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/jackson-datatype-jdk8-2.17.2.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/typetools-0.6.2.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/spring-boot-starter-logging-3.3.2.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/jackson-core-2.17.2.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/logback-classic-1.5.6.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/spring-ai-core-1.0.0-SNAPSHOT.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/hamcrest-core-2.2.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/jackson-databind-2.17.2.jar to classpath
Add /home/dsyer/dev/scratch/enhance-llm/target/lib/jtokkit-1.1.0.jar to classpath
Add /home/dsyer/dev/s

In [3]:
import org.springframework.boot.logging.*;
// Make sure Spring Boot is using the Java Logging System not Logback
System.setProperty(LoggingSystem.SYSTEM_PROPERTY, "org.springframework.boot.logging.java.JavaLoggingSystem");

Example of how to prepare the models in ONNX format (not needed if you use the Ollama embeddings):

```bash
$ pip install optimum onnx onnxruntime
$ optimum-cli export onnx --model sentence-transformers/multi-qa-MiniLM-L6-cos-v1 onnx-output
```

In [10]:
%%bash
ls -l onnx-output

total 102052
-rw-r--r-- 1 dsyer dsyer      697 Aug  5 09:55 config.json
-rw-r--r-- 1 dsyer dsyer 13085339 Aug  5 09:59 databricks-dolly-15k.jsonl
-rw-r--r-- 1 dsyer dsyer 90447733 Aug  5 09:55 model.onnx
-rw-r--r-- 1 dsyer dsyer      695 Aug  5 09:55 special_tokens_map.json
-rw-r--r-- 1 dsyer dsyer   711661 Aug  5 09:55 tokenizer.json
-rw-r--r-- 1 dsyer dsyer     1433 Aug  5 09:55 tokenizer_config.json
-rw-r--r-- 1 dsyer dsyer   231508 Aug  5 09:55 vocab.txt


Testcontainers is in charge of making sure that the vector store and Ollama are running in the background, and that the model is available, before we start the app. If we weren't using Testcontainers, we would have to start the vector store and Ollama manually before running the app:

```bash
$ docker run -it --rm --name chroma -p 8000:8000 ghcr.io/chroma-core/chroma:0.4.15
$ ollama serve
$ ollama pull mistral
$ ollama pull albertogg/multi-qa-minilm-l6-cos-v1
```

In [4]:
import org.springframework.boot.SpringApplication;
import com.example.DemoApplication;
import com.example.DemoApplicationTests.TestcontainersConfiguration;
var app = SpringApplication.from(DemoApplication::main).with(TestcontainersConfiguration.class).run(new String[] {}).getApplicationContext();


  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /

 :: Spring Boot ::                (v3.3.2)



[Engine-thread-0] INFO com.example.DemoApplication - Starting DemoApplication using Java 22.0.2 with PID 2571820 (/home/dsyer/dev/scratch/enhance-llm/target/classes started by dsyer in /home/dsyer/dev/scratch/enhance-llm)
[Engine-thread-0] INFO com.example.DemoApplication - No active profile set, falling back to 1 default profile: "default"
[Engine-thread-0] INFO org.springframework.boot.devtools.env.DevToolsPropertyDefaultsPostProcessor - For additional web related logging consider setting the 'logging.level.web' property to 'DEBUG'
[Engine-thread-0] INFO org.springframework.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port 8080 (http)
[Engine-thread-0] INFO org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 981 ms
[Engine-thread-0] INFO org.testcontainers.images.PullPolicy - Image pull policy will be performed by: DefaultPullPolicy()
[Engine-thread-0] INFO org.testcontainers.u

Grab the embedding model (which calls out to Ollama) and show how it works:

In [5]:
import org.springframework.ai.embedding.EmbeddingModel;
var model = app.getBean(EmbeddingModel.class);
model.embed("Hello World");

[0.006631332, 0.046368703, 0.040787365, 0.0068707983, 5.2580936E-4, -0.06577722, 0.08494715, -0.05236606, -0.059159886, 0.015687117, 0.024459207, -0.09621295, 0.0137719605, 0.0038830542, 0.079698086, 0.019421214, 0.017994735, -0.056662295, -0.11073014, -0.035871916, -0.022705115, -0.0013302569, 0.06894718, 0.038753763, -0.039282657, 0.0033189429, 0.006297793, 0.04922261, 0.06391162, -0.0035785448, -0.05908602, -0.011911157, 0.12739654, -0.013358063, -0.016495287, 0.03261636, 0.011256272, -0.08177851, -0.046204865, 0.011682691, -6.965274E-4, -0.024080234, 0.02508672, 0.078631684, -0.01314921, -9.546165E-5, -0.03294358, 0.00206195, 0.030726533, 0.026222486, -0.009433887, -0.017831437, -0.009996171, -0.004444101, 0.0787742, -0.01740961, 0.013069235, -0.0259258, 0.063523404, -0.032665376, -0.096209645, -0.019935789, 0.010766199, 0.055098467, -0.0014918122, -0.059977274, -0.04793592, 0.030429062, -0.120088845, -0.062409397, -0.09021954, 0.012253389, 0.03912556, 0.05785284, 0.020256057, -0.0

In [8]:
model.embed("Hello London");

[0.06458347, -0.016361302, 0.10603686, -0.037863392, -0.058281023, -0.015379198, 0.09285434, -0.10891376, -0.09566109, -0.0056571118, 0.023809595, -0.048426945, -0.009564186, -0.030932844, 0.054733466, 0.024749363, 0.029845512, -0.077863194, -0.079561055, -0.056448657, 0.006628047, -0.009469016, 0.03259563, 0.06378993, -0.049305998, -0.0056157745, 0.021606663, 0.06963346, 0.060883693, -0.0072005694, -0.045955893, -0.06279615, 0.08838572, -0.0099392915, 0.042798094, 0.076203585, 0.046086285, -0.0962945, -0.04446693, -0.008219454, -0.028764637, -0.06651775, 0.030907819, 0.03594228, 0.025715964, 0.010163348, 0.013631136, -0.01376544, 0.034846872, -0.011912927, 0.06949194, -0.0053752363, 0.026269503, -0.072157405, 0.030910658, -0.023688449, -0.036038045, 0.002724903, 0.04858974, -0.01754085, -0.113722526, 0.061265346, -0.05241749, 0.04245531, -1.3573108E-4, -0.025376935, -0.058399178, 0.01386713, -0.08140539, -0.0076985164, -0.083254546, -0.027383072, 0.039830126, 0.012757815, -0.018389178
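
The two vectors above are only useful when compared with each other. A minimal sketch of the usual comparison, cosine similarity, follows. It uses toy `double[]` vectors rather than real embeddings, and the class and method names are made up for illustration, so the output of `model.embed(...)` would need converting before it could be passed in.

```java
class CosineSimilarity {

    // Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional vectors (hypothetical values, not real embeddings)
        double[] hello = {1.0, 0.0, 1.0};
        double[] similar = {1.0, 0.1, 0.9};
        double[] unrelated = {0.0, 1.0, 0.0};
        System.out.println(cosine(hello, similar));   // close to 1
        System.out.println(cosine(hello, unrelated)); // 0 for orthogonal vectors
    }
}
```

A value near 1 means the vectors point the same way, and a value near 0 means they are unrelated.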

Set up some JSON reading utilities, ready to load up the database:

In [6]:
import com.fasterxml.jackson.databind.ObjectMapper;
var mapper = app.getBean(ObjectMapper.class)

In [7]:
var reader = mapper.readerFor(Map.class)

Example of how that works on JSON Lines input (not part of the final code):

In [11]:
reader.readValues(new File("./data/lines.jsonl")).forEachRemaining(line -> System.out.println(((Map)line).get("instruction")));

How to make a cup of tea?
What is the capital of France?
What is the capital of Germany?
What is the capital of Italy?
What is the capital of Spain?
What is the capital of Portugal?
What is the capital of Greece?
What is the capital of Turkey?
What is the capital of Egypt?
What is the capital of South Africa?
What is the capital of Nigeria?
What is the capital of Kenya?
What is the capital of India?
What is the capital of China?
What is the capital of Japan?
What is the capital of Australia?
What is the capital of New Zealand?
What is the capital of Canada?
What is the capital of the United States?
What is the capital of Brazil?
What is the capital of Argentina?
What is the capital of Chile?
What is the capital of Peru?


Now grab the vector store client from the Spring application:

In [8]:
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;

var store = app.getBean(VectorStore.class);

The store is backed by a ChromaDB instance, and Spring can contact it directly via this API client:

In [21]:
import org.springframework.ai.chroma.ChromaApi;
ChromaApi chroma = app.getBean(ChromaApi.class);
chroma.listCollections()

[Collection[id=daefe768-9467-4fda-9483-ce8a15e33699, name=SpringAiCollection, metadata={hnsw:space=cosine}]]

In [31]:
chroma.countEmbeddings(chroma.getCollection("SpringAiCollection").id())

1773

Here's how the database works once it is loaded up (skip this on the first run):

In [9]:
String instruction = "What is the name of the major school of praxiology not developed by Ludwig von Mises";
String context = "In philosophy, praxeology or praxiology (/\u02ccpr\u00e6ksi\u02c8\u0252l\u0259d\u0292i/; from Ancient Greek \u03c0\u03c1\u1fb6\u03be\u03b9\u03c2 (praxis) 'deed, action', and -\u03bb\u03bf\u03b3\u03af\u03b1 (-logia) 'study of') is the theory of human action, based on the notion that humans engage in purposeful behavior, contrary to reflexive behavior and other unintentional behavior.\n\nFrench social philosopher Alfred Espinas gave the term its modern meaning, and praxeology was developed independently by two principal groups: the Austrian school, led by Ludwig von Mises, and the Polish school, led by Tadeusz Kotarbi\u0144ski.";
System.out.println("# %s #".formatted(instruction));
model.embed("Content: %s? %s".formatted(instruction, context));

# What is the name of the major school of praxiology not developed by Ludwig von Mises #


[-0.05053427, 0.04297461, -0.045063112, -0.06397166, 0.019651894, -0.04763994, -0.046440884, 6.5320585E-4, 0.0016956597, 0.052775312, 0.1109083, 0.02209367, 0.021964898, 0.07065174, 0.0126805175, -0.04802021, -0.16395299, 0.049849737, 0.008665036, 0.024124151, 0.019940736, 0.0015308138, 0.057818282, -0.067190275, -0.044270832, -0.052304085, 0.002736632, -0.12320983, 0.017474953, 0.078215435, 0.11136217, -1.5054201E-4, 0.05952729, -0.049480777, -0.019757204, 0.051097896, -0.030561175, -0.028664663, -0.0075910706, 0.10336379, -0.067194864, 0.06692436, -0.008113962, 0.022617739, -0.0039515854, 0.008915416, -0.0038416989, 0.054947305, -0.05115669, 0.010362023, -0.057132725, -0.050576743, -0.017043194, -0.033381253, -0.05952656, 0.03713194, -0.015368629, -0.017530194, -0.07446701, -0.06905668, 0.05249427, -0.0725816, 0.017660446, 0.0017593116, -0.08129157, 0.04012552, -0.016445478, 0.011256193, 0.04910064, -0.0013750067, 0.01151369, -0.0639617, 0.06437585, 0.009756044, 0.017726857, -0.08044

The dataset is stored on Hugging Face using `git-lfs`, so you need to install `git-lfs` and then run `git lfs install && git lfs checkout` in the repo to download the data. Alternatively, download it manually from the [Hugging Face website](https://huggingface.co/datasets/databricks/databricks-dolly-15k), or use the Hugging Face CLI:

```bash
$ huggingface-cli download --local-dir onnx-output --repo-type dataset databricks/databricks-dolly-15k databricks-dolly-15k.jsonl
```

Now we load up the database with the data from Databricks. This is a one-time operation, so we can skip it for subsequent chat interactions once the database is running.

In [13]:
import org.springframework.util.StringUtils;
List<Document> list = new ArrayList<>();
reader.readValues(new File("./onnx-output/databricks-dolly-15k.jsonl")).forEachRemaining(line -> {
	Map map = (Map) line;
	if ("closed_qa".equals(map.get("category"))) {
		String instruction = ((String) map.get("instruction")).trim();
		if (StringUtils.hasText(instruction) && !instruction.endsWith("?") && !instruction.endsWith(".") && !instruction.endsWith("!")) {
			instruction = instruction + "?";
		}
		String context = ((String) map.get("context")).trim();
		if (StringUtils.hasText(context)) {
			Document doc = new Document("Content: %s%s".formatted(StringUtils.hasText(instruction) ? instruction + " " : "", context));
			list.add(doc);
		}
	}
});
store.add(list);
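
The text-preparation rule inside that loop can be read more easily as a standalone function. The sketch below restates it in plain Java without Spring. `DocumentText` and `toContent` are hypothetical names, the null handling is an addition that the cell above does not have, and `isEmpty()` on a trimmed string stands in for `StringUtils.hasText`.

```java
class DocumentText {

    // Restates the rule from the loading cell: add a "?" to an instruction that
    // has no closing punctuation, then put it in front of the context.
    // Returns null when there is no usable context (such records are skipped).
    static String toContent(String instruction, String context) {
        String question = instruction == null ? "" : instruction.trim();
        String body = context == null ? "" : context.trim();
        if (body.isEmpty()) {
            return null;
        }
        if (!question.isEmpty() && !question.endsWith("?")
                && !question.endsWith(".") && !question.endsWith("!")) {
            question = question + "?";
        }
        return String.format("Content: %s%s", question.isEmpty() ? "" : question + " ", body);
    }

    public static void main(String[] args) {
        // Made-up record, not taken from the dataset
        System.out.println(toContent("What is the capital of France",
                "Paris is the capital of France."));
    }
}
```

Keeping the question next to the context means that a later query phrased like the original instruction should land close to the stored document.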

Here's an example of what it can do - locate the closest context to a given question:

In [14]:
store.similaritySearch("When was Tomoaki Komorida born?")

[Document{id='a86e5518-0a73-4029-96d1-63ce012eeb06', metadata={distance=0.2492783}, content='Content: When was Tomoaki Komorida born? Komorida was born in Kumamoto Prefecture on July 10, 1981. After graduating from high school, he joined the J1 League club Avispa Fukuoka in 2000. Although he debuted as a midfielder in 2001, he did not play much and the club was relegated to the J2 League at the end of the 2001 season. In 2002, he moved to the J2 club Oita Trinita. He became a regular player as a defensive midfielder and the club won the championship in 2002 and was promoted in 2003. He played many matches until 2005. In September 2005, he moved to the J2 club Montedio Yamagata. In 2006, he moved to the J2 club Vissel Kobe. Although he became a regular player as a defensive midfielder, his gradually was played less during the summer. In 2007, he moved to the Japan Football League club Rosso Kumamoto (later Roasso Kumamoto) based in his local region. He played as a regular player and the
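
Conceptually, the search embeds the query and returns the stored documents whose vectors are nearest to it. The `distance` in the metadata above is that score. The sketch below shows the idea as a brute-force scan over toy two-dimensional vectors. It is an illustration only: all names and values are hypothetical, and the collection metadata (`hnsw:space=cosine`) indicates that Chroma uses an HNSW index rather than a linear scan.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class BruteForceSearch {

    // A stored text together with its (toy) embedding vector.
    record Entry(String text, double[] vector) {}

    // A search hit: the text and its cosine distance from the query.
    record Hit(String text, double distance) {}

    // Cosine distance = 1 - cosine similarity; 0 means "same direction".
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Score every entry against the query and keep the topK smallest distances.
    static List<Hit> search(List<Entry> entries, double[] query, int topK) {
        List<Hit> hits = new ArrayList<>();
        for (Entry entry : entries) {
            hits.add(new Hit(entry.text(), cosineDistance(query, entry.vector())));
        }
        hits.sort(Comparator.comparingDouble(Hit::distance));
        return hits.subList(0, Math.min(topK, hits.size()));
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
                new Entry("football", new double[] {1.0, 0.0}),
                new Entry("philosophy", new double[] {0.0, 1.0}),
                new Entry("soccer", new double[] {0.9, 0.1}));
        // Prints the two nearest entries, closest first
        System.out.println(search(entries, new double[] {1.0, 0.05}, 2));
    }
}
```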

The core of the Spring AI programming model is the chat interaction. Here's how it works:

In [15]:
import org.springframework.ai.chat.model.ChatModel;
var chat = app.getBean(ChatModel.class);

The model doesn't know the answer on its own, so this is unsurprisingly not helpful.

In [16]:
chat.call("When was Tomoaki Komorida born?")

 Tomoaki Komoroda's exact date of birth is not publicly available. I found a professional tennis player named Tomoaki Komorikta, who was born on August 16, 1984, but it seems like he might be a different person. It would be best to double-check the information with reliable sources for accuracy.

When we add the augmentation, the model can find the answer in the context:

In [17]:
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor;
import org.springframework.ai.vectorstore.SearchRequest;
var builder = app.getBean(ChatClient.Builder.class);
var response = builder.build().prompt()
	.advisors(new QuestionAnswerAdvisor(store, SearchRequest.defaults().withTopK(1)))
	.user("When was Tomoaki Komorida born?")
	.call();

In [18]:
response.content()

 Tomoaki Komorida was born on July 10, 1981.

Calling `content()` again appears to send the prompt to the model again, so repeating the call shows how the wording of the answer varies between runs:

In [15]:
for (int i=0; i<10; i++) {
	System.out.println(response.content());
}

 Tomoaki Komorida was born on July 10, 1981, as mentioned in the provided context.
 Tomoaki Komorida was born on July 10, 1981.
 Based on the provided context, Tomoaki Komorida was born on July 10, 1981.
 Tomoaki Komorida was born on July 10, 1981.
 Tomoaki Komorida was born on July 10, 1981.
 Based on the provided context, Tomoaki Komorida was born on July 10, 1981.
 Tomoaki Komorida was born on July 10, 1981.
 Tomoaki Komorida was born on July 10, 1981.
 Tomoaki Komorida was born on July 10, 1981.
 Tomoaki Komorida was born on July 10, 1981, as mentioned in the context provided.
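
The augmentation amounts to placing the retrieved passages in the prompt next to the user's question. The sketch below shows that step on its own in plain Java. The template wording and the names `PromptAugmentation` and `augment` are assumptions made for illustration, not the text or API that `QuestionAnswerAdvisor` uses.

```java
import java.util.List;

class PromptAugmentation {

    // Build a single prompt from the user's question and the retrieved passages.
    // The wording of the template is hypothetical, not Spring AI's own.
    static String augment(String question, List<String> passages) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Answer the question using only the context below.\n");
        prompt.append("---------------------\n");
        for (String passage : passages) {
            prompt.append(passage).append("\n");
        }
        prompt.append("---------------------\n");
        prompt.append("Question: ").append(question);
        return prompt.toString();
    }

    public static void main(String[] args) {
        // A shortened version of the passage found by the similarity search
        String passage = "Content: When was Tomoaki Komorida born? "
                + "Komorida was born in Kumamoto Prefecture on July 10, 1981.";
        System.out.println(augment("When was Tomoaki Komorida born?", List.of(passage)));
    }
}
```

With `withTopK(1)` only the single nearest passage is included, which keeps the prompt short.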


The app is configured to truncate the prompts (in the database load and in the chat interaction). That works in the example above, but in general people tend to split the context into multiple parts instead. The need for either comes from the limited input size of the model, not from the app. If you need to split, Spring AI implements this feature as a text splitter:

In [30]:
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
TokenTextSplitter splitter = new TokenTextSplitter(200, 50, 5, 400, true);
splitter.split(new Document("Komorida was born in Kumamoto Prefecture on July 10, 1981. After graduating from high school, he joined the J1 League club Avispa Fukuoka in 2000. Although he debuted as a midfielder in 2001, he did not play much and the club was relegated to the J2 League at the end of the 2001 season. In 2002, he moved to the J2 club Oita Trinita. He became a regular player as a defensive midfielder and the club won the championship in 2002 and was promoted in 2003. He played many matches until 2005. In September 2005, he moved to the J2 club Montedio Yamagata. In 2006, he moved to the J2 club Vissel Kobe. Although he became a regular player as a defensive midfielder, his gradually was played less during the summer. In 2007, he moved to the Japan Football League club Rosso Kumamoto (later Roasso Kumamoto) based in his local region. He played as a regular player and the club was promoted to J2 in 2008. Although he did not play as much, he still played in many matches. In 2010, he moved to Indonesia and joined Persela Lamongan. In July 2010, he returned to Japan and joined the J2 club Giravanz Kitakyushu. He played often as a defensive midfielder and center back until 2012 when he retired."))

[Engine-thread-5] INFO org.springframework.ai.transformer.splitter.TextSplitter - Splitting up document into 2 chunks.


[Document{id='dc6b45a0-5014-4231-8b1a-654e1dd80999', metadata={}, content='Komorida was born in Kumamoto Prefecture on July 10, 1981. After graduating from high school, he joined the J1 League club Avispa Fukuoka in 2000. Although he debuted as a midfielder in 2001, he did not play much and the club was relegated to the J2 League at the end of the 2001 season. In 2002, he moved to the J2 club Oita Trinita. He became a regular player as a defensive midfielder and the club won the championship in 2002 and was promoted in 2003. He played many matches until 2005. In September 2005, he moved to the J2 club Montedio Yamagata. In 2006, he moved to the J2 club Vissel Kobe. Although he became a regular player as a defensive midfielder, his gradually was played less during the summer.', media=[]}, Document{id='38c4969b-5cb7-49f3-a360-3894caf38bc6', metadata={}, content='In 2007, he moved to the Japan Football League club Rosso Kumamoto (later Roasso Kumamoto) based in his local region. He played
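
To make the idea of splitting concrete, here is a much simplified sketch. It counts whitespace-separated words instead of model tokens and cuts at the last sentence end inside each window. It is not the algorithm used by `TokenTextSplitter`, and `WordChunker` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordChunker {

    // Split text into chunks of at most maxWords whitespace-separated words.
    // If a full window contains a sentence end (a word ending in '.', '?' or '!'),
    // cut after the last one so that sentences stay whole where possible.
    static List<String> split(String text, int maxWords) {
        if (maxWords < 1) {
            throw new IllegalArgumentException("maxWords must be at least 1");
        }
        List<String> chunks = new ArrayList<>();
        if (text == null || text.trim().isEmpty()) {
            return chunks;
        }
        String[] words = text.trim().split("\\s+");
        int start = 0;
        while (start < words.length) {
            int end = Math.min(start + maxWords, words.length);
            if (end < words.length) {
                // Look backwards for a sentence boundary inside the window
                for (int i = end; i > start; i--) {
                    String last = words[i - 1];
                    if (last.endsWith(".") || last.endsWith("?") || last.endsWith("!")) {
                        end = i;
                        break;
                    }
                }
            }
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Prints [A b c., D e f g., H i.]
        System.out.println(split("A b c. D e f g. H i.", 4));
    }
}
```

Each chunk would then be stored as its own document, so a search can return the relevant part of a long passage rather than a truncated whole.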