Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning

Jitao Sang, Yuhang Wang, Jing Zhang, Yanxu Zhu, Chao Kong, Junhong Ye, Shuyu Wei and Jinlin Xiao

Department of Computer Science

Beijing Jiaotong University

Our study simulates two phases of superalignment under the weak-to-strong generalization (W2SG) framework: the development of general superhuman models and the progression towards superintelligence. In the first phase, which relies on human supervision, we improve the quality of weak supervision by combining scalable oversight with ensemble learning, narrowing the capability gap between weak teachers and strong students. In the second phase, an automatic alignment evaluator serves as the weak supervisor. Recursively updating this auto aligner strengthens the weak teacher models in step with it, enabling weak-to-strong supervision over stronger student models.

This repository contains the experimental code for the first phase described in our paper.
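
As a rough illustration of how an ensemble can improve weak supervision in the first phase, the sketch below averages the soft labels of several weak teachers before they are passed to a strong student. It is a generic soft-voting example, not the method from the paper: this README does not say which ensemble technique is used, and the function names `ensemble_soft_labels` and `harden` are hypothetical.

```python
from typing import List, Optional, Sequence


def ensemble_soft_labels(
    teacher_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of positive-class probabilities from several weak teachers.

    teacher_probs[t][i] is teacher t's probability that example i is positive.
    NOTE: generic soft voting, assumed here for illustration only.
    """
    if not teacher_probs:
        raise ValueError("at least one teacher is required")
    n_examples = len(teacher_probs[0])
    if any(len(p) != n_examples for p in teacher_probs):
        raise ValueError("all teachers must label the same examples")
    if weights is None:
        weights = [1.0] * len(teacher_probs)  # equal vote by default
    if len(weights) != len(teacher_probs):
        raise ValueError("one weight per teacher is required")
    total = float(sum(weights))
    return [
        sum(w * p[i] for w, p in zip(weights, teacher_probs)) / total
        for i in range(n_examples)
    ]


def harden(soft_labels: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn averaged probabilities into 0/1 labels for the strong student."""
    return [int(p >= threshold) for p in soft_labels]


if __name__ == "__main__":
    # Three hypothetical weak teachers scoring the same three examples.
    probs = [
        [0.9, 0.4, 0.2],
        [0.8, 0.6, 0.1],
        [0.7, 0.2, 0.3],
    ]
    soft = ensemble_soft_labels(probs)
    print(soft, harden(soft))
```

Averaging tends to cancel out the uncorrelated mistakes of individual teachers, which is the usual motivation for feeding ensembled labels, rather than a single teacher's labels, to the student.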

Get Started

Set up the environment and install the required packages by following the instructions in the weak-to-strong project.

Experimental dataset

The dataset used in our study was sampled from the SciQ dataset (see Raw Data LINK for more details). Our constructed dataset is available at the following LINK.
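
The README does not describe the sampling procedure, so the following is only a sketch of one way to draw a reproducible subset of SciQ-style records and turn each into a binary "is this answer correct?" example. The field names (`question`, `correct_answer`, `distractor1` to `distractor3`) follow the public SciQ schema; the helper names, the seed, the subset size, and the `txt` / `hard_label` output keys are assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence


def sample_subset(records: Sequence[Dict], n: int, seed: int = 0) -> List[Dict]:
    """Draw n records without replacement, reproducibly for a given seed."""
    if n > len(records):
        raise ValueError("n exceeds the number of available records")
    return random.Random(seed).sample(list(records), n)


def to_binary_example(record: Dict, rng: random.Random) -> Dict:
    """Pair the question with the correct answer (label 1) or a distractor (label 0)."""
    label = int(rng.random() < 0.5)
    if label:
        answer = record["correct_answer"]
    else:
        answer = rng.choice(
            [record["distractor1"], record["distractor2"], record["distractor3"]]
        )
    return {"txt": f"Q: {record['question']} A: {answer}", "hard_label": label}


if __name__ == "__main__":
    # Toy records in the SciQ layout; these are placeholders, not real SciQ rows.
    toy = [
        {
            "question": f"placeholder question {i}",
            "correct_answer": "right",
            "distractor1": "wrong-a",
            "distractor2": "wrong-b",
            "distractor3": "wrong-c",
        }
        for i in range(10)
    ]
    rng = random.Random(0)
    for rec in sample_subset(toy, 3, seed=0):
        print(to_binary_example(rec, rng))
```

Fixing the seed keeps the subset identical across runs, which matters when several weak teachers and strong students are compared on the same split.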

Code

The code for Sections 3, 4, and 5 of the paper lives in the src directory and will be released once it has been organized.

Acknowledgement

The open-source project weak-to-strong.
Hugging Face for their open-source transformer models.

Citation

@misc{sang2024improving,
      title={Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning}, 
      author={Jitao Sang and Yuhang Wang and Jing Zhang and Yanxu Zhu and Chao Kong and Junhong Ye and Shuyu Wei and Jinlin Xiao},
      year={2024},
      eprint={2402.00667},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
