SQLxAI/multilingual_sql_grpo

Multilingual Text-to-SQL with GRPO and Contrastive Rewards

This repository contains the implementation of a multilingual Text-to-SQL system that fine-tunes Llama-3 models using GRPO (Group Relative Policy Optimization) with contrastive rewards.

Project Overview

This research focuses on improving Text-to-SQL capabilities across multiple languages through:

  1. Contrastive Learning: Cross-lingual embeddings using XLM-RoBERTa with custom projection layers
  2. Reinforcement Learning: Fine-tuning with GRPO using both execution and contrastive rewards
  3. Multilingual Support: Optimized for all 7 languages in MultiSpider: English (EN), Spanish (ES), German (DE), French (FR), Japanese (JA), Chinese (ZH), and Vietnamese (VI)
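The interaction between points 1 and 2 can be sketched as a single scalar reward that mixes execution correctness with cross-lingual embedding similarity. The function names and the weight `alpha` below are illustrative, not taken from this repository's `src/rewards` code:

```python
# Hypothetical sketch of combining execution and contrastive rewards for GRPO.
# The weighting scheme (alpha) and function names are assumptions for
# illustration; the repo's actual reward functions live in src/rewards.

def execution_reward(pred_rows, gold_rows):
    """1.0 if the predicted query returned the same result set as the gold query,
    0.0 otherwise (including when the prediction failed to execute, i.e. None)."""
    if pred_rows is None:
        return 0.0
    return 1.0 if set(pred_rows) == set(gold_rows) else 0.0

def combined_reward(pred_rows, gold_rows, embedding_similarity, alpha=0.7):
    """Weighted mix of execution correctness and cross-lingual similarity
    (cosine similarity of contrastive-encoder embeddings, in [-1, 1])."""
    exec_r = execution_reward(pred_rows, gold_rows)
    return alpha * exec_r + (1.0 - alpha) * embedding_similarity
```

Under this sketch, an executable but semantically imperfect query can still earn partial credit from the contrastive term, which is the usual motivation for mixing a dense similarity signal into a sparse execution reward.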

Key Components

  • Data: MultiSpider dataset (dreamerdeo/multispider on Hugging Face)
  • Base Model: Meta-Llama-3-3B-Instruct
  • Cross-lingual Encoder: XLM-RoBERTa with custom projection layers
  • Training Methods: Contrastive learning and GRPO fine-tuning
  • Evaluation Metrics: Execution Accuracy (ExecAcc), Semantic Accuracy (SemAcc), SQL length, Embedding Similarity
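Of the listed metrics, Execution Accuracy is the most mechanical: a prediction counts as correct when it returns the same rows as the gold query on the target database. A minimal stdlib-only sketch (the repo's actual evaluation lives in `src/evaluation` and may differ in details such as row ordering or multiset semantics):

```python
# Sketch of Execution Accuracy (ExecAcc) over SQLite databases, as used by
# Spider-style benchmarks: fraction of (predicted, gold) query pairs whose
# result sets match. Unexecutable predictions score zero.
import sqlite3

def exec_acc(db_path, pairs):
    """pairs: list of (pred_sql, gold_sql) strings evaluated against db_path."""
    conn = sqlite3.connect(db_path)
    correct = 0
    for pred_sql, gold_sql in pairs:
        try:
            pred = set(conn.execute(pred_sql).fetchall())
        except sqlite3.Error:
            continue  # prediction failed to execute -> counts as incorrect
        gold = set(conn.execute(gold_sql).fetchall())
        correct += pred == gold
    conn.close()
    return correct / len(pairs)
```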

Project Structure

multilingual_sql_grpo/
├── data/                   # For MultiSpider dataset and schemas
├── models/                 # For base models and fine-tuned checkpoints
├── src/                    # Core implementation code
│   ├── encoder/            # Contrastive encoder implementation
│   ├── training/           # GRPO training implementation
│   ├── evaluation/         # Evaluation metrics and scripts
│   ├── rewards/            # Reward functions (contrastive, execution)
│   └── utils/              # Utility functions
├── configs/                # Configuration files
├── notebooks/              # Jupyter notebooks for analysis
├── scripts/                # Utility scripts
└── README.md               # Project documentation

Key Features

  • GPU-Optimized Training: Efficient GRPO training implementation
  • Cross-Lingual Similarity: ~0.9 cross-lingual similarity using contrastive learning
  • Performance Improvement: ~7-10% average improvement in ExecAcc across languages
  • Ablation Studies: Comparison of models with and without contrastive rewards
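The ~0.9 cross-lingual similarity figure presumably refers to the mean cosine similarity between projected embeddings of aligned question pairs across languages. A minimal NumPy sketch, assuming XLM-RoBERTa sentence embeddings as input; the projection dimensions and function names here are illustrative:

```python
# Sketch of measuring cross-lingual similarity through a contrastive
# projection head. The learned weight matrix W and its dimensions are
# placeholders; in the repo, inputs would be XLM-RoBERTa embeddings.
import numpy as np

def project(emb, W):
    """Apply a projection and L2-normalize, as a contrastive head typically does."""
    z = emb @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def cross_lingual_similarity(emb_a, emb_b, W):
    """Mean cosine similarity over aligned sentence pairs in two languages."""
    za, zb = project(emb_a, W), project(emb_b, W)
    return float(np.mean(np.sum(za * zb, axis=-1)))
```

Because both projections are L2-normalized, the dot product of each aligned pair is exactly its cosine similarity, so the returned value lies in [-1, 1].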

Requirements

See requirements.txt for detailed dependencies.

Getting Started

  1. Setup Environment:

    pip install -r requirements.txt
    
  2. Prepare Data:

    python -m src.utils.prepare_data
    
  3. Train Contrastive Encoder:

    python -m src.encoder.train
    
  4. GRPO Fine-tuning:

    python -m src.training.grpo_trainer
    
  5. Evaluation:

    python -m src.evaluation.evaluate
    

Citation

If you use this code in your research, please cite our paper:

@inproceedings{multilingual-sql-2025,
  title={Improving Multilingual Text-to-SQL with Contrastive and Execution Rewards},
  author={},
  booktitle={Improving Multilingual Text-to-SQL with Contrastive and Execution Rewards},
  year={2025}
}

License

MIT
