RAG the Easy Way with Oracle AI Vector Search, LangChain4J, OracleEmbeddingStore, and Oracle Database 23ai
A minimal Java example demonstrating how to build an embedding store using LangChain4J’s Oracle extension, store embeddings in Oracle Database 23ai, and perform vector similarity retrieval for a basic RAG workflow.
Prerequisites:
- Java Development Kit (JDK) 17+
- Oracle Database Free Release 23ai (via container image)
- Oracle JDBC Driver 23ai (version 23.9.0.25.07)
- Preferred Java IDE (Eclipse, IntelliJ IDEA, or VS Code)
- Build tool: Maven or Gradle
Edit your pom.xml to include:
<properties>
    <java.version>17</java.version>
    <maven.compiler.source>17</maven.compiler.source>
    <maven.compiler.target>17</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <jdbc.version>23.9.0.25.07</jdbc.version>
    <langchain4j.version>1.3.0-beta9</langchain4j.version>
</properties>
<dependencies>
    <!-- Oracle Database 23ai JDBC / UCP -->
    <dependency>
        <groupId>com.oracle.database.jdbc</groupId>
        <artifactId>ojdbc17</artifactId>
        <version>${jdbc.version}</version>
    </dependency>
    <dependency>
        <groupId>com.oracle.database.jdbc</groupId>
        <artifactId>ucp17</artifactId>
        <version>${jdbc.version}</version>
    </dependency>
    <!-- LangChain4J Oracle Extension -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-oracle</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-embeddings-all-minilm-l6-v2-q</artifactId>
        <version>${langchain4j.version}</version>
    </dependency>
</dependencies>
Create a Java class (e.g., OracleEmbeddingStoreExample.java) to demonstrate embedding storage and retrieval:
package com.oracle.dev.jdbc.langchain4j;

import java.sql.SQLException;
import java.util.List;
import javax.sql.DataSource;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.onnx.allminilml6v2q.AllMiniLmL6V2QuantizedEmbeddingModel;
import dev.langchain4j.rag.content.Content;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.rag.query.Query;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.oracle.CreateOption;
import dev.langchain4j.store.embedding.oracle.OracleEmbeddingStore;

public class OracleEmbeddingStoreExample {

    public static void main(String[] args) throws SQLException {
        DataSource dataSource = OracleDBUtils.getPooledDataSource();

        // CREATE_OR_REPLACE recreates the table on every run, so earlier rows are discarded.
        EmbeddingStore<TextSegment> embeddingStore = OracleEmbeddingStore.builder()
                .dataSource(dataSource)
                .embeddingTable("test_content_retriever", CreateOption.CREATE_OR_REPLACE)
                .build();

        // In-process ONNX embedding model; no external service is called.
        EmbeddingModel embeddingModel = new AllMiniLmL6V2QuantizedEmbeddingModel();

        // Return at most 2 matches and drop any match scoring below 0.5.
        ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .maxResults(2)
                .minScore(0.5)
                .build();

        // Embed each segment and store the vector together with its text.
        TextSegment segment1 = TextSegment.from("I like soccer.");
        Embedding embedding1 = embeddingModel.embed(segment1).content();
        embeddingStore.add(embedding1, segment1);

        TextSegment segment2 = TextSegment.from("I love Stephen King.");
        Embedding embedding2 = embeddingModel.embed(segment2).content();
        embeddingStore.add(embedding2, segment2);

        // The retriever embeds the query and runs the similarity search.
        List<Content> matches = retriever.retrieve(Query.from("What is your favourite writer?"));
        if (matches.isEmpty()) {
            // Avoids an IndexOutOfBoundsException when nothing passes minScore.
            System.out.println("No segment reached the minimum score.");
        } else {
            System.out.println(matches.get(0).textSegment());
        }
    }
}
Provide a helper class to manage the Oracle JDBC connection:
package com.oracle.dev.jdbc.langchain4j;

import java.sql.SQLException;
import java.util.Properties;
import javax.sql.DataSource;
import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;

public class OracleDBUtils {

    private static final String URL = "jdbc:oracle:thin:@localhost:1521/FREEPDB1";
    // Credentials come from environment variables; both are null if the variables are unset.
    private static final String USERNAME = System.getenv("DB_23AI_USERNAME");
    private static final String PASSWORD = System.getenv("DB_23AI_PASSWORD");

    public static DataSource getPooledDataSource() throws SQLException {
        // Universal Connection Pool (UCP) data source backed by the Oracle JDBC driver.
        PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
        pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
        pds.setURL(URL);
        pds.setUser(USERNAME);
        pds.setPassword(PASSWORD);

        // Have getObject() return VECTOR column values as String by default.
        Properties prop = new Properties();
        prop.setProperty("oracle.jdbc.vectorDefaultGetObjectType", "String");
        pds.setConnectionProperties(prop);

        // Number of connections opened when the pool starts.
        pds.setInitialPoolSize(10);
        return pds;
    }
}
- Compile and run the OracleEmbeddingStoreExample in your IDE of choice.
- Expect output corresponding to whichever segment best matches the query "What is your favourite writer?"; in this case, likely "I love Stephen King.".
- Try experimenting with different queries (e.g., "What is your favourite sport?") to see semantic retrieval in action.
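To make the retrieval step less of a black box, here is a minimal sketch of similarity ranking in plain Java. It assumes cosine similarity as the metric and uses hand-made three-dimensional vectors in place of real embeddings. The names SimilaritySketch, Entry, Scored, and topMatches are hypothetical and belong to neither LangChain4J nor the Oracle JDBC driver. The maxResults and minScore parameters mirror the retriever settings used earlier, although the store may scale its scores differently from raw cosine similarity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: hypothetical names, not LangChain4J or Oracle API.
class SimilaritySketch {

    // A stored item: the original text plus its (toy) embedding vector.
    record Entry(String text, float[] vector) {}

    // A ranked result: the text plus its cosine similarity to the query.
    record Scored(String text, double score) {}

    // Cosine similarity: dot product divided by the product of the vector norms.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // a zero vector has no direction; treat it as unrelated
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Keep entries scoring at least minScore, best first, capped at maxResults.
    static List<Scored> topMatches(float[] query, List<Entry> entries,
                                   int maxResults, double minScore) {
        List<Scored> scored = new ArrayList<>();
        for (Entry e : entries) {
            double s = cosine(query, e.vector());
            if (s >= minScore) {
                scored.add(new Scored(e.text(), s));
            }
        }
        scored.sort(Comparator.comparingDouble(Scored::score).reversed());
        return scored.subList(0, Math.min(maxResults, scored.size()));
    }

    public static void main(String[] args) {
        // Toy vectors chosen by hand (assumption); real embeddings have hundreds of dimensions.
        List<Entry> entries = List.of(
                new Entry("I like soccer.", new float[] {0.9f, 0.1f, 0.0f}),
                new Entry("I love Stephen King.", new float[] {0.1f, 0.9f, 0.2f}));
        float[] query = {0.0f, 1.0f, 0.1f}; // stands in for the embedded question

        for (Scored hit : topMatches(query, entries, 2, 0.5)) {
            System.out.println(hit.text());
        }
    }
}
```

With these toy vectors only the Stephen King sentence clears the 0.5 threshold, so it is the only line printed. In the real example the database performs the equivalent search over the stored vectors.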
You’ve successfully built a minimal Java RAG workflow leveraging:
- Oracle Database 23ai as a vector store,
- LangChain4J Oracle extension for seamless integration,
- Lightweight ONNX-based embedding model,
- Simple JDBC setup and retrieval logic.
With this foundation, you're well-positioned to expand into more complex RAG scenarios, integrate other embedding models, or connect with LLMs for generative applications.
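As a first step toward the generative side, the retrieved text can be folded into a prompt before it is sent to a model. The sketch below shows only that assembly step in plain Java. PromptSketch and buildPrompt are hypothetical names, the prompt wording is an assumption, and no LLM is called. In the example above, the segment text could come from match.textSegment().text().

```java
import java.util.List;

// Illustrative only: hypothetical names, not LangChain4J API.
class PromptSketch {

    // Builds a prompt that lists the retrieved segments as numbered context,
    // followed by the user's question.
    static String buildPrompt(String question, List<String> retrievedSegments) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using only the context below.\n\n");
        sb.append("Context:\n");
        for (int i = 0; i < retrievedSegments.size(); i++) {
            sb.append(i + 1).append(". ").append(retrievedSegments.get(i)).append('\n');
        }
        sb.append("\nQuestion: ").append(question).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        String prompt = buildPrompt(
                "What is your favourite writer?",
                List.of("I love Stephen King."));
        System.out.print(prompt);
    }
}
```

Keeping prompt assembly in its own method makes it easy to change the instructions or the context format without touching the retrieval code.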