# PDF Q&A - Simple RAG (Retrieval-Augmented Generation) System

A simple RAG system that answers questions using the content of a PDF file.

## Prerequisites 

This example requires the following:
- [Chroma](https://docs.trychroma.com/) as the vector database. Install using `pip install chromadb`.
- Ollama to run local models.

This example uses the following models. Make sure they are pulled first, for example with `ollama pull bge-large`. You are free to switch to any other models.
- Embedding - [bge-large](https://ollama.com/library/bge-large)
- Chat generation - [Llama 3.2](https://ollama.com/library/llama3.2) (the query code in this example uses the `llama3.2:1b` tag)

In [14]:
@file:Repository("https://repo1.maven.org/maven2")
@file:Repository("https://repo.spring.io/milestone/")
@file:DependsOn("org.springframework.ai:spring-ai-ollama:1.0.0-M2")
@file:DependsOn("org.springframework.ai:spring-ai-tika-document-reader:1.0.0-M2")
@file:DependsOn("org.springframework.ai:spring-ai-pdf-document-reader:1.0.0-M2")
@file:DependsOn("org.springframework.ai:spring-ai-chroma-store:1.0.0-M2")

## Start Chroma server

Start the Chroma server. The default port is `8000`.

The cell below launches it as a child process from the notebook. You can also start Chroma directly from a terminal using `chroma run`.

In [36]:
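// Start the Chroma server as a child process via Java's ProcessBuilder

// Keep the database files in a fresh temporary directory
val dbPath = java.nio.file.Files.createTempDirectory("chroma").toFile()
// "--path db" is resolved relative to the working directory set below
val pb = ProcessBuilder("chroma", "run", "--path", "db")
pb.directory(dbPath)
// start() returns immediately; the server keeps running in the background
pb.start()
println("Chroma server started, db path: $dbPath")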

Chroma server started, db path: C:\Users\Alex\AppData\Local\Temp\chroma2197048964601029228
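
Because `pb.start()` returns as soon as the process is launched, the server may not be accepting connections yet when the next cell runs. A minimal sketch of a readiness check is shown here; `waitForPort` is a hypothetical helper written for this example (it is not part of Chroma or Spring AI), and it only verifies that the TCP port is open, not that Chroma is fully initialized.

```kotlin
import java.io.IOException
import java.net.InetSocketAddress
import java.net.Socket

// Hypothetical helper: polls a TCP port until it accepts a connection
// or the timeout elapses. Returns true if a connection succeeded.
fun waitForPort(host: String, port: Int, timeoutMs: Long = 30_000, intervalMs: Long = 500): Boolean {
    val deadline = System.currentTimeMillis() + timeoutMs
    while (System.currentTimeMillis() < deadline) {
        try {
            // Try to connect with a 1 second connect timeout, then close right away
            Socket().use { it.connect(InetSocketAddress(host, port), 1_000) }
            return true
        } catch (e: IOException) {
            // Port not open yet: wait a little and retry
            Thread.sleep(intervalMs)
        }
    }
    return false
}
```

Assuming the default port mentioned above, calling `waitForPort("localhost", 8000)` before creating the Chroma client would pause the notebook until the server is reachable or 30 seconds have passed.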


## Load PDF content into Chroma

Parse the PDF file, split it into chunks, then embed the chunks and save them to Chroma.

In [11]:
import org.springframework.ai.ollama.*
import org.springframework.ai.ollama.api.*

val embeddingModel = OllamaEmbeddingModel(OllamaApi(), OllamaOptions.create().withModel("bge-large"))

In [26]:
import java.nio.file.Path
import org.springframework.ai.reader.pdf.PagePdfDocumentReader
import org.springframework.ai.vectorstore.VectorStore
import org.springframework.core.io.FileSystemResource
import org.springframework.ai.transformer.splitter.TokenTextSplitter
import org.springframework.ai.vectorstore.ChromaVectorStore
import org.springframework.ai.chroma.ChromaApi

// Read the PDF page by page, then split the pages into token-based chunks
val reader = PagePdfDocumentReader(FileSystemResource(Path.of("../", "data", "Understanding_Climate_Change.pdf")))
val splitter = TokenTextSplitter()
val docs = splitter.split(reader.read())
println("${docs.size} docs to store")

// Connect to the Chroma server started earlier
val chromaUrl = "http://localhost:8000"
val chromaApi = ChromaApi(chromaUrl)
val collectionName = "pdf-qa"
// Make sure the collection exists before using it
try {
    chromaApi.getCollection(collectionName)
} catch (e: Exception) {
    println("Create collection $collectionName")
    chromaApi.createCollection(ChromaApi.CreateCollectionRequest(collectionName))
}
// Embed the chunks with the Ollama embedding model and store them in Chroma
val chromaVectorStore = ChromaVectorStore(embeddingModel, chromaApi, collectionName, true)
chromaVectorStore.afterPropertiesSet()
chromaVectorStore.add(docs)
println("Docs loaded to Chroma")

33 docs to store
Docs loaded to Chroma
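
To give a feel for what the splitting step does, here is a deliberately simplified sketch that cuts text into chunks of a fixed number of words. This is an illustration only: `TokenTextSplitter` counts model tokens rather than words, and `chunkByWords` and the sample sentence are made up for this example.

```kotlin
// Hypothetical helper: groups whitespace-separated words into chunks
// of at most maxWords words each. A stand-in for token-based splitting.
fun chunkByWords(text: String, maxWords: Int): List<String> {
    require(maxWords > 0) { "maxWords must be positive" }
    return text.trim()
        .split(Regex("\\s+"))
        .filter { it.isNotEmpty() }
        .chunked(maxWords)
        .map { it.joinToString(" ") }
}

// Sample sentence invented for the example
val sample = "Climate change refers to long-term shifts in temperatures and weather patterns"
val sampleParts = chunkByWords(sample, 4)
sampleParts.forEach { println(it) }
```

Each resulting chunk is what gets embedded and stored as a separate document, which is why one PDF ends up as many entries in the vector store.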


## Query

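Use `QuestionAnswerAdvisor` to implement simple RAG. For each question, the advisor retrieves the most similar chunks from the vector store (the top 3 here) and adds them to the prompt as context for the chat model.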
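
Conceptually, the retrieval step behind this kind of RAG can be pictured as "rank stored chunks by similarity to the question, keep the best few, and put them in front of the question". The sketch below is a rough illustration of that idea, not Spring AI's actual implementation: the `Chunk` type, the helper functions, the prompt layout, and the toy 2-dimensional embeddings are all assumptions made for this example, and cosine similarity is used as a stand-in for whatever distance metric the Chroma collection is configured with.

```kotlin
import kotlin.math.sqrt

// Illustrative type: a piece of text together with its embedding vector
data class Chunk(val text: String, val embedding: List<Double>)

// Cosine similarity between two vectors of equal length
fun cosine(a: List<Double>, b: List<Double>): Double {
    val dot = a.zip(b).sumOf { (x, y) -> x * y }
    val normA = sqrt(a.sumOf { it * it })
    val normB = sqrt(b.sumOf { it * it })
    return if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

// Step 1: rank stored chunks by similarity to the query embedding, keep the top k
fun topK(queryEmbedding: List<Double>, chunks: List<Chunk>, k: Int): List<Chunk> =
    chunks.sortedByDescending { cosine(queryEmbedding, it.embedding) }.take(k)

// Step 2: place the retrieved text in front of the question
fun buildPrompt(question: String, context: List<Chunk>): String =
    "Context:\n" + context.joinToString("\n") { "- ${it.text}" } + "\n\nQuestion: $question"

// Toy data, invented for the example
val sampleChunks = listOf(
    Chunk("Burning fossil fuels releases CO2.", listOf(0.9, 0.1)),
    Chunk("Glaciers are retreating.", listOf(0.2, 0.8)),
    Chunk("Coal plants emit greenhouse gases.", listOf(0.8, 0.3)),
)
val retrieved = topK(listOf(1.0, 0.0), sampleChunks, 2)
println(buildPrompt("What is the main cause of climate change?", retrieved))
```

In the real pipeline the embeddings come from the embedding model and the similarity search runs inside Chroma; only the overall retrieve-then-prompt shape carries over.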

In [35]:
import org.springframework.ai.chat.client.ChatClient
import org.springframework.ai.ollama.OllamaChatModel
import org.springframework.ai.ollama.api.OllamaOptions
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor
import org.springframework.ai.vectorstore.SearchRequest

// Chat client backed by the local Ollama server
val chatClient = ChatClient.builder(OllamaChatModel(OllamaApi())).build()
val options = OllamaOptions.builder().withModel("llama3.2:1b").build()
// Retrieve the 3 most similar chunks for each question
val qaAdvisor = QuestionAnswerAdvisor(chromaVectorStore, SearchRequest.defaults().withTopK(3))
val query = "What is the main cause of climate change?"
val output = chatClient.prompt()
    .user(query)
    .options(options)
    .advisors(qaAdvisor)
    .call()
    .content()
println(output)

The Greenhouse Effect and Climate Change: Understanding the Issues

You've been informed about the greenhouse effect and climate change, which are crucial topics for your understanding of our planet's situation.

**The Greenhouse Effect**

As mentioned in the text, the greenhouse effect is a natural process that occurs when certain gases in the Earth's atmosphere, such as carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O), trap heat from the sun. This trapping of heat leads to a "greenhouse effect," keeping the planet warm enough to support life.

**Human Activities**

The main cause of climate change is human activities that release large amounts of CO2 into the atmosphere, primarily through fossil fuel combustion for energy and transportation. The industrial revolution marked the beginning of a significant increase in fossil fuel consumption, which continues to rise today.

**Consequences of Climate Change**

The consequences of climate change are far-reaching and devastati