Demo of a Retrieval-Augmented Generation (RAG) system using MariaDB for vector search and LocalAI for embeddings and text generation.

alejandro-du/mariadb-rag-demo-java

Demo: RAG with MariaDB (Java)

This demo shows how to build a Retrieval-Augmented Generation (RAG) application using MariaDB, LocalAI, and Java.

Note: This demo uses a preview version of MariaDB which includes SQL syntax that will likely change in the next GA (stable) version.

Prerequisites

You only need Docker installed and running on your computer to run this demo.

Setup

Start the LocalAI and MariaDB services (see the docker-compose.yml file):

docker compose up -d

Download the dataset: https://www.kaggle.com/datasets/asaniczka/amazon-canada-products-2023-2-1m-products

Create a slice of the dataset. For example, to keep the first 50,000 products (50,001 lines, including the CSV header row), assuming the file was downloaded to ~/Downloads:

head -n 50001 ~/Downloads/amz_ca_total_products_data_processed.csv > ~/Downloads/slice.csv
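
A quick sanity check on the slice can catch a truncated or empty file before it is loaded. The sketch below defines a hypothetical helper, slice_rows (not part of the repository), that prints the number of data rows: all lines minus the header. It assumes that no quoted field spans multiple lines; if the dataset contains such fields, the count will be off.

```shell
# Hypothetical helper, not part of the repository.
# Prints the number of data rows in a CSV file (total lines minus the header).
# Assumption: no quoted field contains an embedded newline.
slice_rows() {
    total=$(wc -l < "$1")
    echo $(( total - 1 ))
}

# Usage: slice_rows ~/Downloads/slice.csv
```

For the 50k slice created above, slice_rows ~/Downloads/slice.csv should print 50000.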

Copy the file to the MariaDB Docker container:

docker cp ~/Downloads/slice.csv mariadb:/slice.csv

Connect to the MariaDB server:

docker exec -it mariadb mariadb -u root -p'password' demo

Load the data from the CSV file into the MariaDB database:

LOAD DATA LOCAL INFILE '/slice.csv'
INTO TABLE products
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(
    asin,
    title,
    img_url,
    product_url,
    stars,
    reviews,
    price,
    list_price,
    category_name,
    is_best_seller,
    bought_in_last_month
);

Exit the MariaDB client:

exit
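
To confirm that the load worked, the row count can be checked from the host shell without opening an interactive session. This is a minimal sketch: count_products is a hypothetical helper name, while the container name, credentials, database, and table are the ones used in the steps above.

```shell
# Hypothetical helper, not part of the repository.
# Runs a one-off query in the mariadb container and prints only the number.
count_products() {
    # -N suppresses the column header, -B selects plain batch output,
    # -e executes the given statement and exits.
    docker exec mariadb mariadb -u root -p'password' demo \
        -N -B -e 'SELECT COUNT(*) FROM products;'
}

# Usage: count_products
```

If the number printed is much lower than the size of the slice, some rows were probably rejected during the load.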

Calculate the vector embeddings:

./UpdateVectors.java

Be patient. This can take a long time, depending on your hardware and the size of the slice you took.
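
For context, an embedding is a list of numbers that a model computes from a piece of text. LocalAI exposes an OpenAI-compatible /v1/embeddings endpoint for this. The sketch below shows what a single request could look like. It is not taken from UpdateVectors.java: the helper name embed_text, the base URL http://localhost:8080 (LocalAI's default port, which docker-compose.yml may override), and the model name passed as the first argument are all assumptions.

```shell
# Hypothetical helper, not part of the repository.
# $1 = embedding model name (assumption: check the LocalAI configuration)
# $2 = text to embed; must not contain double quotes or backslashes,
#      because the JSON body is built by plain string interpolation.
embed_text() {
    curl -s "${LOCALAI_URL:-http://localhost:8080}/v1/embeddings" \
        -H 'Content-Type: application/json' \
        -d "{\"model\": \"$1\", \"input\": \"$2\"}"
}

# Usage: embed_text <model-name> "wireless headphones"
```

The response is a JSON document that contains the embedding as an array of floating-point numbers. Computing one such vector per product is what makes this step slow on large slices.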

Run the demo

Before you run the demo, check that the models downloaded successfully:

docker logs -f local-ai
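
As a complement to reading the logs, LocalAI's OpenAI-compatible /v1/models endpoint lists the models it currently knows about. The sketch below is a minimal example under one assumption: that the LocalAI API is reachable at http://localhost:8080, which is LocalAI's default port but may differ in docker-compose.yml. The helper name list_models is hypothetical.

```shell
# Hypothetical helper, not part of the repository.
# Assumption: LocalAI listens on localhost:8080; set LOCALAI_URL to override.
list_models() {
    curl -s "${LOCALAI_URL:-http://localhost:8080}/v1/models"
}

# Usage: list_models
```

An empty model list, or a connection error, suggests that the service is still starting or that the downloads have not finished.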

Start the demo:

./RagDemo.java
