# RAG Example as Notebook

Activate the LangChain4J extension

In [0]:
use(JTExtension.LANG_CHAIN4J_EXTENSION);

## Set up some defaults

In [0]:
var OLLAMA_HOST = "http://localhost:11434";
var chatModelName = "granite3.3";
var embedModelName = "granite-embedding";

## Initialize Local Embedding Model (Ollama)

In [0]:
var embeddingModel = OllamaEmbeddingModel.builder()
 .baseUrl(OLLAMA_HOST)
 .modelName(embedModelName)
 .build()

## Initialize Local LLM (Ollama Chat Model)

In [0]:
var chatModel = OllamaChatModel.builder()
 .baseUrl(OLLAMA_HOST)
 .modelName(chatModelName)
 .temperature(0.7)
 .timeout(Duration.ofMinutes(2)) // Adjust timeout if models are slow to respond
 .build();

## Create an In-Memory Embedding Store

In [0]:
var embeddingStore = new InMemoryEmbeddingStore<TextSegment>();

## Load and Process Documents

Load some Markdown files describing the following JEPs:
- 502
- 507
- 526


In [0]:
// Adjust these file URLs to wherever your local copies of the JEP texts live
var jep502 = UrlDocumentLoader.load("file:///Users/sven/work/jtaccuino_sr_prod/jeps/502.md", new TextDocumentParser());
var jep507 = UrlDocumentLoader.load("file:///Users/sven/work/jtaccuino_sr_prod/jeps/507.md", new TextDocumentParser());
var jep526 = UrlDocumentLoader.load("file:///Users/sven/work/jtaccuino_sr_prod/jeps/526.md", new TextDocumentParser());
var documents = List.of(jep502, jep507, jep526);
documents.size()

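Split the documents into segments of at most 100 characters with up to 25 characters of overlap, embed each segment, and add the results to the embedding store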

In [0]:
var ingestionResult = EmbeddingStoreIngestor.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .documentSplitter(DocumentSplitters.recursive(100, 25))
            .build()
            .ingest(documents)
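
To make the two numbers passed to the splitter more concrete, here is a minimal sketch of overlapping chunking in plain Java. It is not LangChain4J's recursive splitter, which prefers to cut at natural boundaries such as paragraphs and sentences; the class `OverlapChunker` and its `split` method are hypothetical names used only for illustration, and the sketch cuts at fixed character offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LangChain4J: fixed-size character windows
// where each window repeats the last `overlap` characters of its predecessor.
class OverlapChunker {

    static List<String> split(String text, int maxChars, int overlap) {
        if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlap < maxChars");
        }
        List<String> segments = new ArrayList<>();
        int step = maxChars - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            segments.add(text.substring(start, end));
            if (end == text.length()) {
                break; // this window already reached the end of the text
            }
        }
        return segments;
    }
}
```

With `maxChars = 100` and `overlap = 25`, each window starts 75 characters after the previous one, so a sentence that straddles a cut still appears whole in at least one segment when it is short enough.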

## Create a Conversational Retrieval Chain

In [0]:
var contentRetriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(2)
                .minScore(0.7)
                .build()
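
The effect of `maxResults` and `minScore` can be sketched without any library. The following plain-Java example ranks stored vectors by cosine similarity to a query vector, drops everything below a threshold, and keeps the top hits. `RetrievalSketch` and its methods are hypothetical names, and the score here is raw cosine similarity; the library's own relevance score may be scaled differently, so the threshold values are not directly comparable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of threshold-plus-top-k retrieval, not LangChain4J code.
class RetrievalSketch {

    // Cosine similarity of two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts whose vectors score at least minScore, best first,
    // truncated to maxResults entries.
    static List<String> retrieve(double[] query, List<String> texts, List<double[]> vectors,
                                 int maxResults, double minScore) {
        double[] scores = new double[texts.size()];
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            scores[i] = cosine(query, vectors.get(i));
            if (scores[i] >= minScore) {
                hits.add(i);
            }
        }
        hits.sort((x, y) -> Double.compare(scores[y], scores[x])); // descending by score
        List<String> result = new ArrayList<>();
        for (int idx : hits.subList(0, Math.min(maxResults, hits.size()))) {
            result.add(texts.get(idx));
        }
        return result;
    }
}
```

A small `maxResults` keeps the prompt short, while `minScore` prevents weakly related segments from being injected when nothing in the store matches the question well.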

In [0]:
var queryTransformer = new CompressingQueryTransformer(chatModel);


In [0]:
var retrievalAugmentor = DefaultRetrievalAugmentor.builder()
                .queryTransformer(queryTransformer)
                .contentRetriever(contentRetriever)
                .build()
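
Conceptually, the augmentor rewrites the query, retrieves matching segments, and then injects them into the user message before it reaches the chat model. The last step can be sketched as simple string assembly. `PromptAugmentSketch` is a hypothetical name, and the instruction wording inside it is an assumption for illustration, not necessarily the template the library uses.

```java
import java.util.List;

// Hypothetical illustration of content injection, not LangChain4J code.
class PromptAugmentSketch {

    // Appends the retrieved segments to the user message; with no segments
    // the message passes through unchanged.
    static String augment(String userMessage, List<String> segments) {
        if (segments.isEmpty()) {
            return userMessage;
        }
        return userMessage
                + "\n\nAnswer using the following information:\n"
                + String.join("\n\n", segments);
    }
}
```

Because the segments travel inside the prompt, the chat model itself needs no knowledge of the embedding store.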

In [0]:
public interface Assistant {
    String answer(String query);
}

In [0]:
var assistant = AiServices.builder(Assistant.class)
                .chatModel(chatModel)
                .retrievalAugmentor(retrievalAugmentor)
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .build();
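
The chat memory above keeps only the most recent messages, which is what lets follow-up questions such as "Can you provide some example code" refer back to earlier turns. A minimal sketch of such a sliding window in plain Java follows. `WindowMemorySketch` is a hypothetical class that stores plain strings; the real memory works on typed chat messages and may treat system messages specially, which this sketch ignores.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of a message window, not LangChain4J code.
class WindowMemorySketch {

    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        if (maxMessages <= 0) {
            throw new IllegalArgumentException("maxMessages must be positive");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts the oldest ones once the window is full.
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // Returns the retained messages, oldest first.
    List<String> messages() {
        return List.copyOf(messages);
    }
}
```

With a window of 10, five question and answer pairs fit before the earliest exchange drops out of context.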


The RAG system is ready. Type your questions in the cells that follow.

In [0]:
var query = "Use only local data for answering questions. Pretend you are an expert Java Language Architect, but do not tell me. Format the output to have not more than 80 characters per line in a fluent style. Do not use any enumerative style."

In [0]:
assistant.answer(query)

In [0]:
assistant.answer("Tell me about pattern matching for primitive types in Java")

In [0]:
assistant.answer("Can you provide some example code")

In [0]:
assistant.answer("Are there some more details on conversions for numerical types with regards to pattern matching?")

In [0]:
assistant.answer("Are there implications due to the numerical matching between int and float, for example?")