ENHANCING MULTILINGUAL REASONING IN LLMS: INSIGHTS FROM CROSS-LINGUISTIC CORRELATIONS AND OPTIMAL DATA PROPORTIONS

📃 Paper | 🤗 Huggingface | 📭 Contact


⛰️ Overview

  • This repository shares the code and dataset from our latest work on multilingual reasoning. In this work, we present a novel dataset construction that performs targeted language alignment to make the best use of LLMs' English reasoning abilities.
  • Using this dataset, you can fine-tune open-source LLMs into strong multilingual reasoning systems. For example, our fine-tuned LLaMA2-7B achieves superior multilingual performance, significantly outperforming baseline models of equivalent size.
  • Overall, our method effectively reduces the performance gap between English and non-English languages, offering a new paradigm for unlocking LLMs' capabilities on multilingual tasks.
  • Please note that the core contribution of our work is the idea for constructing such datasets and the open-sourced HighMath dataset itself; the code in this repository is only an example of how to use it, so adjust the usage to your actual setup (an illustrative data-format sketch follows this list).
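
Since the training pipeline below builds on stanford_alpaca, a HighMath record is presumably an instruction-tuning triple. The sketch below is purely illustrative: the field names, the example content, and the exact alignment scheme (non-English question, English reasoning) are assumptions, so inspect the released files for the real schema.

```python
# Purely illustrative (assumed) shape of one HighMath training record in an
# Alpaca-style format; the real field names and alignment scheme may differ.
example = {
    "instruction": "Juan tiene 3 manzanas y compra 2 más. ¿Cuántas tiene en total?",  # non-English question (Spanish)
    "input": "",
    "output": "Juan has 3 apples and buys 2 more, so 3 + 2 = 5. The answer is 5.",  # English reasoning, per the targeted-alignment idea
}
```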

📈 Benchmarks

Below we present LLMs' average answer accuracy (zero-shot) on multilingual reasoning benchmarks. With HighMath, our fine-tuned LLM surpasses the unaligned counterpart and the translate-training baseline by a large margin.

| System (7B) | Monolingual Supervision | Multilingual Supervision | mGSM | mSVAMP |
|---|---|---|---|---|
| HighMath (ours) | - | HighMath | 52.5 | 65.2 |
| MetaMath | MetaMathQA | - | 38.4 | 46.2 |
| MathOctopus | - | GSM8KInstruct | 40.0 | 44.1 |
| WizardMath | GSM8K & MATH | - | 23.0 | 32.5 |
| MAmmoTh | MathInstruct | - | 21.3 | 26.3 |
| RFT | GSM8k-ScRel | - | 20.6 | 31.3 |
| SFT | GSM8K | - | 22.6 | 30.9 |
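
The reported numbers are zero-shot answer accuracy averaged over the benchmark languages. Below is a minimal sketch of that metric, not the repository's actual evaluation script; the `generation`/`gold` field names and the exact matching rule are assumptions.

```python
import re

def extract_answer(generation: str):
    """Take the last number in the model's generation as its final answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", generation.replace(",", ""))
    return float(numbers[-1]) if numbers else None

def answer_accuracy(examples):
    """Fraction of examples whose extracted answer matches the gold answer."""
    correct = sum(
        1 for ex in examples
        if (pred := extract_answer(ex["generation"])) is not None
        and abs(pred - float(ex["gold"])) < 1e-4
    )
    return correct / len(examples)

# Average over the ten benchmark languages, as in the table above:
# per_language = {"En": [...], "Bn": [...], ...}   # generations grouped by language
# avg = sum(answer_accuracy(v) for v in per_language.values()) / len(per_language)
```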

📂 Dataset

The table below lists the datasets used in this project. All datasets are available through their respective URLs.

| Dataset | Usage | Size | Languages |
|---|---|---|---|
| HighMath | Training | 395,000 | En, Bn, Th, Sw, Ja, Zh, De, Fr, Ru, Es |
| MetaMathQA | Training | 395,000 | En |
| GSM8KInstruct | Training | 73,559 | En, Bn, Th, Sw, Ja, Zh, De, Fr, Ru, Es |
| mGSM | Evaluation | 2,500 | En, Bn, Th, Sw, Ja, Zh, De, Fr, Ru, Es |
| mSVAMP | Evaluation | 10,000 | En, Bn, Th, Sw, Ja, Zh, De, Fr, Ru, Es |
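
If the datasets are released on the Hugging Face Hub (🤗 link above), they can typically be pulled with the `datasets` library. The dataset ID below simply mirrors the GitHub repo name and is an assumption; check the Hub page for the real identifier, configs, and splits.

```python
from datasets import load_dataset

# Hypothetical Hub ID mirroring the GitHub repo name; replace it with the
# identifier listed on the 🤗 page linked at the top of this README.
highmath = load_dataset("DeepShareAI/HighMath", split="train")

print(len(highmath))   # expected ~395,000 examples per the table above
print(highmath[0])     # inspect one record to see the actual field names
```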

🛠️ Training

We developed our training pipeline based on the stanford_alpaca repository.

To perform fine-tuning on pre-trained LLMs, use the command below. When fine-tuning the 70B model, we use DeepSpeed to save memory; you can find our DeepSpeed configuration in the repo.

Please note that the training configuration in the repository is provided as a reference example; customize the settings according to your actual requirements and hardware.

The recommended configuration is 8xA100 GPUs.

  • Example: fine-tuning LLaMA2-7B
bash ./ds_run.sh
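
If you prefer not to use the provided script, the sketch below shows roughly what a stanford_alpaca-style fine-tuning run with a DeepSpeed config looks like in Python. All paths, field names, and hyperparameters here are illustrative assumptions rather than the settings inside ds_run.sh.

```python
# Hedged sketch of a stanford_alpaca-style supervised fine-tuning run.
# Model name, data path, field names, and hyperparameters are assumptions,
# not the actual configuration used by ds_run.sh.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL = "meta-llama/Llama-2-7b-hf"     # assumed base model
DATA = "highmath_train.json"           # assumed local JSON export of HighMath

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token

def to_text(example):
    # Concatenate prompt and answer into a single training string (assumed fields).
    return {"text": example["instruction"] + "\n" + example["output"]}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

dataset = load_dataset("json", data_files=DATA, split="train")
dataset = dataset.map(to_text)
dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

model = AutoModelForCausalLM.from_pretrained(MODEL)

args = TrainingArguments(
    output_dir="out/highmath-7b",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    learning_rate=2e-5,
    bf16=True,
    deepspeed="ds_config.json",        # plug in a DeepSpeed config (file name assumed)
)

Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```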

🌲 Citation

If you find this repository helpful, feel free to cite our paper:

@inproceedings{wang2025enhancing,
  title={Enhancing Multilingual Reasoning in {LLM}s: Insights from Cross-Linguistic Correlations and Optimal Data Proportions},
  author={Jiangkuo Wang and Suyv Ma and Mingpeng Wei},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=S6cBH99BhB}
}
