This project leverages Retrieval Augmented Generation (RAG) to enhance SQL query generation for Oracle Autonomous Data Warehouse (ADW) using the OCI Generative AI service and Marqo AI. By using vector search to narrow down relevant schema data, we provide context to the LLM, enabling it to generate accurate SQL queries.
- Oracle ADW Integration: Utilize Oracle's enterprise-grade data warehouse for scalable and secure data storage.
- Marqo AI: Vectorize and search schema data efficiently.
- OCI Generative AI: Deploy and use the Llama3 70B model for cost-effective and improved SQL generation.
- Python 3.7 or higher
- Docker
- Oracle Cloud account
- Git
git clone https://github.com/ValyrianDev/RAG-SQL-Generator.git
cd RAG-SQL-Generator
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
pip install -r requirements.txt
Create a .env
file in the root directory and populate it with your database and OCI credentials:
DB_USERNAME=your_db_username
DB_PASSWORD=your_db_password
DB_WALLET_LOCATION=path_to_your_wallet
DB_CONFIG_DIR=path_to_your_wallet
DB_DSN=your_dsn
DB_SCHEMA=your_schema
OCI_COMPARTMENT_ID=your_compartment_id
OCI_CONFIG_PROFILE=your_config_profile
OCI_CONFIG_FILE=path_to_your_oci_config_file
OCI_ENDPOINT=your_oci_service_endpoint
OCI_MODEL_ID=your_model_id
Follow these steps to create and set up an Oracle ADW instance:
- Log into your OCI console.
- Navigate to
Autonomous Data Warehouse
underOracle Database
. - Create a new instance with the
Always Free
configuration. - Download your instance wallet and keep it secure.
- Navigate to
Database Actions
in your ADW instance. - Run the provided SQL script in a newly created schema to populate it with sample data.
docker pull marqoai/marqo
docker run -d -p 8882:8882 marqoai/marqo
Run the main script to fetch the schema, set up the Marqo index, and start generating SQL queries:
python main.py
- Initial Setup: Run the script and choose to refresh the database schema if needed.
- Generate SQL Queries: Enter your query when prompted. The script will search the Marqo index, use the LLM to generate SQL, and execute the query against ADW.