ESSAM is a novel zeroth-order fine-tuning method for improving the mathematical reasoning ability of large language models. It combines Evolution Strategies (ES) with Sharpness-Aware Minimization (SAM).
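To give a feel for how ES and SAM fit together, here is a minimal conceptual sketch. This is not the repository's implementation: the toy loss, hyperparameters (`sigma`, `pop`, `lr`, `rho`), and function names are all illustrative assumptions. It estimates gradients with antithetic ES sampling (zeroth-order, forward passes only), takes a SAM-style ascent step to a nearby "sharp" point, and then descends using the gradient estimated there:

```python
# Hedged sketch, not the authors' code: antithetic-sampling ES gradient
# estimates combined with a SAM-style worst-case perturbation, demonstrated
# on a toy quadratic loss standing in for the LLM fine-tuning objective.
import numpy as np

rng = np.random.default_rng(0)

def loss(theta):
    # Toy objective (assumption); in ESSAM this would be a forward pass of the LLM.
    return float(np.sum((theta - 1.0) ** 2))

def es_gradient(theta, sigma=0.1, pop=32):
    # Antithetic ES estimator: average eps * (L(theta + sigma*eps) - L(theta - sigma*eps))
    # over the population, scaled by 1 / (2 * sigma). Uses only loss evaluations.
    grad = np.zeros_like(theta)
    for _ in range(pop):
        eps = rng.standard_normal(theta.shape)
        grad += eps * (loss(theta + sigma * eps) - loss(theta - sigma * eps))
    return grad / (2.0 * sigma * pop)

def essam_step(theta, lr=0.05, rho=0.05):
    # SAM-style update done entirely with zeroth-order estimates:
    # 1) estimate the gradient at theta,
    # 2) ascend by rho along its direction to a nearby worst-case point,
    # 3) re-estimate the gradient there and descend from the original theta.
    g = es_gradient(theta)
    theta_adv = theta + rho * g / (np.linalg.norm(g) + 1e-12)
    g_sam = es_gradient(theta_adv)
    return theta - lr * g_sam

theta = rng.standard_normal(8)
for _ in range(200):
    theta = essam_step(theta)
# After training, theta should sit near the flat minimum at 1.0.
```

The ES estimator never touches backpropagation, which is what makes the approach memory-efficient for LLM fine-tuning; the SAM step biases it toward flat minima.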
## Installation

Using conda:

```bash
conda create -n essam_env python=3.10
conda activate essam_env
```

Or using venv:

```bash
python -m venv essam_env
source essam_env/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

## Usage

Run ESSAM on the GSM8K dataset:

```bash
bash essam_run.sh
```

Alternatively, try the accelerated variant ESSAM-F, which achieves approximately a 2× speedup while maintaining competitive performance:

```bash
bash essam-fen_run.sh
```

## Citation

If you find this work helpful in your research, please cite:
```bibtex
@misc{sun2026essamnovelcompetitiveevolution,
      title={ESSAM: A Novel Competitive Evolution Strategies Approach to Reinforcement Learning for Memory Efficient LLMs Fine-Tuning},
      author={Zhishen Sun and Sizhe Dang and Guang Dai and Haishan Ye},
      year={2026},
      eprint={2602.01003},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2602.01003},
}
```