emorynlp/CoSQL-LLM

Merging Models

This directory contains the code for training and merging CoCodeS, presented in "Advancing Conversational Text-to-SQL: Current Landscape and Future Directions with Large Language Models".

The scripts here prepare datasets, generate GQR summaries, fine-tune models, run zero-shot inference, and merge model checkpoints.

Create and activate Conda env

conda create -n codes python=3.8.5
conda activate codes
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt
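
After installation, it can help to confirm that the pinned packages actually resolved to the versions listed above. The following is a minimal sketch, not part of the repository; `check_versions` and `PINNED` are hypothetical names, and the check only compares version strings reported by the installed package metadata.

```python
from importlib import metadata

# Versions copied from the conda install command above.
PINNED = {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"}


def check_versions(pinned):
    """Return a {package: status} report for each pinned package."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "missing"
            continue
        # Some builds append a local tag such as "+cu117"; compare the base version only.
        base = installed.split("+")[0]
        report[package] = "ok" if base == expected else f"mismatch ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_versions(PINNED).items():
        print(f"{package}: {status}")
```

Run it inside the activated environment; any line that does not end in "ok" points to a package worth reinstalling.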

Data

Download the Spider, CoSQL, SParC, and BIRD datasets and place them under data/sft_data_collections/ using the following layout:

data/
  sft_data_collections/
    spider/
      ...
    cosql/
      ...
    sparc/
      ...
    bird/
      ...
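
Before running the preparation scripts, the layout above can be checked with a short script. This is a sketch under the assumption that each dataset lives in a directory named exactly as shown; `missing_datasets` is a hypothetical helper, not something the repository provides, and it does not inspect the contents of each directory.

```python
from pathlib import Path

# Directory names taken from the layout above.
EXPECTED_DATASETS = ("spider", "cosql", "sparc", "bird")


def missing_datasets(root, expected=EXPECTED_DATASETS):
    """Return the expected dataset directories that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets("data/sft_data_collections")
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All expected dataset directories are present.")
```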

Prepare GQR Summaries

python3 prepare_summary_prompts.py

Prepare Datasets for SFT

python3 prepare_sft_datasets.py

Fine-tuning

Follow the instructions in the CodeS GitHub repository to set up accelerate for fine-tuning, and download the sic_ckpts folder from there as well.

Example: fine-tuning the BIRD CodeS model on GQR CoSQL:

accelerate launch train_causal_lm.py \
  --per_device_train_batch_size 1 \
  --block_size 4096 \
  --seed 42 \
  --pretrained_model_name_or_path seeklhy/codes-7b-bird \
  --epochs 4 \
  --lr 5e-6 \
  --warmup_ratio 0.05 \
  --checkpointing_steps 100000 \
  --tensorboard_log_dir ./train_logs/codes-7b-spider-cosql-bird \
  --mode sft \
  --output_ckpt_dir ./ckpts/codes-7b-bird-cosql-gqr \
  --text2sql_data_dir ./data/sft_cosql_train_gpt.json \
  --table_num 6 \
  --column_num 10
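
To get a feel for how --epochs, --per_device_train_batch_size, and --warmup_ratio interact, the sketch below estimates the number of optimizer steps and warmup steps. It assumes the usual convention that warmup covers a fixed fraction of all training steps and that no gradient accumulation is used; whether train_causal_lm.py follows this exactly is not confirmed here. `estimate_steps`, the example count, and the GPU count are hypothetical.

```python
import math


def estimate_steps(num_examples, per_device_batch_size, num_processes, epochs, warmup_ratio):
    """Estimate (total_steps, warmup_steps) for a run without gradient accumulation."""
    # One step consumes one batch on every process (GPU).
    steps_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_processes))
    total_steps = steps_per_epoch * epochs
    # Assumption: warmup lasts for warmup_ratio of all steps, rounded down.
    warmup = int(total_steps * warmup_ratio)
    return total_steps, warmup


if __name__ == "__main__":
    # Hypothetical run: 1000 training examples on 4 GPUs with the flags shown above.
    total, warmup = estimate_steps(1000, 1, 4, epochs=4, warmup_ratio=0.05)
    print(f"total steps: {total}, warmup steps: {warmup}")
```

If the estimated total is below the value passed to --checkpointing_steps, a step-based checkpoint would never trigger during the run under these assumptions.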

Inference

With GQR summaries

python -u prepare_zero_shot_summaries.py \
  --llm_path path/to/ckpt \
  --sic_path ./sic_ckpts/sic_spider \
  --table_num 6 \
  --column_num 10 \
  --max_tokens 4096 \
  --max_new_tokens 256 \
  --output_path results.txt

On full history

python -u prepare_zero_shot.py \
  --llm_path path/to/ckpt \
  --sic_path ./sic_ckpts/sic_spider \
  --table_num 6 \
  --column_num 10 \
  --max_tokens 4096 \
  --max_new_tokens 256 \
  --output_path results.txt

Merge Models

To merge models, run:

python3 merge.py   --llm_path1 path/to/ckpt1   --llm_path2 path/to/ckpt2   --save_path path/to/ckpt_merged

After this, you can run inference on the new model.
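
This README does not describe what merge.py does internally. As a rough illustration of one common checkpoint-merging technique, linear interpolation of parameters, the sketch below averages two parameter dictionaries that share the same keys. `interpolate_state_dicts` and `alpha` are hypothetical names, plain floats stand in for weight tensors, and this should not be read as the repository's implementation.

```python
def interpolate_state_dicts(state_dict1, state_dict2, alpha=0.5):
    """Return alpha * state_dict1 + (1 - alpha) * state_dict2, parameter by parameter.

    Values may be floats or any array type that supports scalar
    multiplication and addition.
    """
    if state_dict1.keys() != state_dict2.keys():
        raise ValueError("Both checkpoints must have the same parameter names.")
    return {
        name: alpha * state_dict1[name] + (1.0 - alpha) * state_dict2[name]
        for name in state_dict1
    }


if __name__ == "__main__":
    # Toy "checkpoints": one weight and one bias each.
    ckpt1 = {"w": 1.0, "b": 0.0}
    ckpt2 = {"w": 3.0, "b": 2.0}
    print(interpolate_state_dicts(ckpt1, ckpt2))
```

With alpha=0.5 this is a plain average; values closer to 1.0 keep the merged model closer to the first checkpoint.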
