A curated list of awesome papers and resources at the intersection of Large Language Models (LLMs) and Evolutionary Computation (EC).
🎉 News: Our survey has been released: *Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap*.
Related work and projects will be updated continuously.
If our work has been of help to you, please feel free to cite our survey. Thank you!
```bibtex
@article{wu2024evolutionary,
  title   = {Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap},
  author  = {Wu, Xingyu and Wu, Sheng-hao and Wu, Jibin and Feng, Liang and Tan, Kay Chen},
  journal = {CoRR},
  volume  = {abs/2401.10034},
  year    = {2024}
}
```
## Interdisciplinary Research on LLM and Evolutionary Computation
Name | Paper | Venue | Year | Code | Enhancement Aspect |
---|---|---|---|---|---|
OptiChat | Diagnosing Infeasible Optimization Problems Using Large Language Models | arXiv | 2023 | Python | Identify potential sources of infeasibility |
AS-LLM | Large Language Model-Enhanced Algorithm Selection: Towards Comprehensive Algorithm Representation | IJCAI | 2024 | Python | Algorithm representation and algorithm selection |
GP4NLDR | Explaining Genetic Programming Trees Using Large Language Models | arXiv | 2024 | N/A | Provide explainability for results of EA |
Singh et al. | Enhancing Decision-Making in Optimization through LLM-Assisted Inference: A Neural Networks Perspective | IJCNN | 2024 | N/A | Provide explainability for results of EA |
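Approaches like AS-LLM above match problem characteristics against LLM-derived algorithm representations to pick a solver. As a rough illustration of the underlying idea (not AS-LLM's actual pipeline), algorithm selection can be sketched as nearest-neighbour matching between a problem embedding and precomputed algorithm embeddings; the vectors and algorithm names below are made up for the toy example:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def select_algorithm(problem_embedding, algorithm_embeddings):
    """Return the algorithm whose embedding best matches the problem.

    `algorithm_embeddings` maps algorithm names to vectors, e.g. text
    embeddings of their code or descriptions produced by an LLM.
    """
    return max(algorithm_embeddings,
               key=lambda name: cosine(problem_embedding,
                                       algorithm_embeddings[name]))

# Toy example with hand-made 3-d "embeddings".
algos = {"DE":     [0.9, 0.1, 0.0],
         "PSO":    [0.1, 0.9, 0.2],
         "CMA-ES": [0.2, 0.3, 0.9]}
print(select_algorithm([0.15, 0.85, 0.1], algos))  # → PSO
```

In a real system the embeddings would come from an LLM encoding of the problem description and of each candidate algorithm, rather than hand-made vectors.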
Note: The approaches listed below primarily focus on LLM architecture search, and their search techniques are based on EAs.
Name | Paper | Venue | Year | Code | LLM |
---|---|---|---|---|---|
AutoBERT-Zero | AutoBERT-Zero: Evolving BERT Backbone from Scratch | AAAI | 2022 | Python | BERT |
SuperShaper | SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions | arXiv | 2021 | N/A | BERT |
AutoTinyBERT | AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models | ACL | 2021 | Python | BERT |
LiteTransformerSearch | LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models | NeurIPS | 2022 | Python | GPT-2 |
Klein et al. | Structural Pruning of Large Language Models via Neural Architecture Search | AutoML | 2023 | N/A | BERT |
Choong et al. | Jack and Masters of All Trades: One-Pass Learning of a Set of Model Sets from Foundation AI Models | IEEE CIM | 2023 | N/A | M2M100-418M, ResNet-18 |
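The methods above all run some form of evolutionary search over architecture or hyper-parameter configurations, scored by a (sometimes training-free) fitness signal. A minimal sketch of such a loop, where the search space, the proxy fitness, and every constant are illustrative stand-ins rather than any paper's actual setup:

```python
import random

random.seed(0)

# Hypothetical search space for a small transformer backbone.
SPACE = {"layers": [2, 4, 6, 8], "hidden": [128, 256, 512], "heads": [2, 4, 8]}

def proxy_fitness(cfg):
    # Stand-in for a training-free proxy score (LiteTransformerSearch, for
    # instance, scores candidates without training); this toy version just
    # favours mid-sized configurations.
    return -abs(cfg["layers"] - 6) - abs(cfg["hidden"] - 256) / 128

def mutate(cfg):
    # Re-sample one randomly chosen dimension of the configuration.
    child = dict(cfg)
    key = random.choice(list(SPACE))
    child[key] = random.choice(SPACE[key])
    return child

def evolve(generations=30, pop_size=8):
    pop = [{k: random.choice(v) for k, v in SPACE.items()}
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=proxy_fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # truncation selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return max(pop, key=proxy_fitness)

best = evolve()
print(best)
```

Real systems replace the toy fitness with validation accuracy, latency, or a zero-cost proxy, and use richer variation operators than single-field mutation.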
Name | Paper | Venue | Year | Code | Enhancement Aspect |
---|---|---|---|---|---|
Length-Adaptive Transformer Model | Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search | ACL | 2021 | Python | Automatically adjust the sequence length according to different computational resource constraints |
HexGen | HexGen: Generative Inference of Large-Scale Foundation Model over Heterogeneous Decentralized Environment | arXiv | 2023 | Python | Deploy generative inference services for LLMs in a heterogeneous distributed environment |
LongRoPE | LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens | arXiv | 2024 | Python | Extend the context window of LLMs to 2048k tokens |
Evolutionary Model Merge | Evolutionary Optimization of Model Merging Recipes | arXiv | 2024 | Python | Utilize CMA-ES algorithm to optimize merged LLM in both parameter and data flow space |
BLADE | BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models | arXiv | 2024 | N/A | Find soft prompts that optimize the consistency between the outputs of two models |
Self-evolution in LLM | A Survey on Self-Evolution of Large Language Models | arXiv | 2024 | Summary | Some studies on LLM self-evolution also adopt ideas from EAs |
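Evolutionary Model Merge, for example, uses CMA-ES to search for merging recipes. The core idea can be sketched with a much simpler (1+λ) evolution strategy over per-parameter interpolation weights; the "models" here are flat parameter lists and the fitness is a toy stand-in for downstream-task evaluation, not the paper's actual objective:

```python
import random

random.seed(1)

def merge(theta_a, theta_b, w):
    # Linearly interpolate two "models" (flat parameter lists) with one
    # mixing weight per parameter.
    return [wi * a + (1 - wi) * b for wi, a, b in zip(w, theta_a, theta_b)]

def fitness(model):
    # Toy stand-in for evaluating the merged model on a downstream task:
    # reward closeness to a fictitious ideal parameter vector.
    target = [0.3, 0.7, 0.5]
    return -sum((m - t) ** 2 for m, t in zip(model, target))

theta_a, theta_b = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
w = [0.5, 0.5, 0.5]                      # initial mixing weights
sigma, lam = 0.2, 8                      # mutation scale, offspring count

for _ in range(100):                     # (1+λ) evolution strategy
    offspring = [[min(1.0, max(0.0, wi + random.gauss(0, sigma)))
                  for wi in w] for _ in range(lam)]
    w = max(offspring + [w],
            key=lambda c: fitness(merge(theta_a, theta_b, c)))

print([round(wi, 2) for wi in w])        # should approach [0.3, 0.7, 0.5]
```

CMA-ES additionally adapts the full covariance of the mutation distribution, which matters when merge weights interact; the (1+λ) strategy above is only the simplest member of that family.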
Name | Paper | Venue | Year | Code | Applicable scenarios |
---|---|---|---|---|---|
Kang et al. | Towards Objective-Tailored Genetic Improvement Through Large Language Models | Workshop at ICSE | 2023 | N/A | Software Optimization |
Brownlee et al. | Enhancing Genetic Improvement Mutations Using Large Language Models | SSBSE | 2023 | N/A | Software Optimization |
TitanFuzz | Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models | ISSTA | 2023 | N/A | Software Testing |
CodaMOSA | CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models | ICSE | 2023 | Python | Software Testing |
SBSE | Search-based Optimisation of LLM Learning Shots for Story Point Estimation | SSBSE | 2023 | N/A | Software Project Planning |
Note: The methods reviewed here leverage the synergistic combination of EAs and LLMs; this combination is more versatile than LLM architecture search alone and applies to a broader range of NAS tasks.
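In the genetic-improvement entries above (e.g., Kang et al., Brownlee et al.), the LLM acts as a mutation operator proposing code edits, while a test-based fitness drives selection. The sketch below replaces the LLM call with a canned stand-in so it runs offline; `llm_mutate`, the toy fitness, and the seed program are all illustrative, not taken from those papers:

```python
import random

random.seed(2)

def llm_mutate(code):
    # Placeholder for an LLM call ("rewrite this snippet to be simpler");
    # here we apply canned edits so the sketch runs offline. In LLM-based
    # genetic improvement, the model itself proposes the edit.
    edits = [("range(0, n)", "range(n)"), ("x = x + 1", "x += 1")]
    old, new = random.choice(edits)
    return code.replace(old, new)

def fitness(code):
    # Toy fitness: shorter code that still passes its test scores higher.
    # Real GI systems run full test suites and measure runtime or energy.
    env = {}
    try:
        exec(code, env)
        assert env["count_up"](3) == 3
    except Exception:
        return float("-inf")
    return -len(code)

seed_program = (
    "def count_up(n):\n"
    "    x = 0\n"
    "    for _ in range(0, n):\n"
    "        x = x + 1\n"
    "    return x\n"
)

best = seed_program
for _ in range(10):                     # simple hill climb over patches
    candidate = llm_mutate(best)
    if fitness(candidate) > fitness(best):
        best = candidate
```

Because every accepted patch must keep the test passing, the evolved program stays functionally correct while the LLM-proposed edits improve the secondary objective.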
We hope this collection can help your work.