openevolve does not correctly clean up when there are failures that lead to retries #288

@theahura

I was recently running an experiment with openevolve while our compute cluster was, unbeknownst to me, down. This led to some interesting openevolve behavior -- it would fail, retry a bunch of times based on the config, and then, instead of actually exiting when all the retries failed, it would...keep going?

Here's an example of the output logs:

2025-10-11 21:36:42,862 - ERROR - All 1 attempts failed with error: Connection error.                                                                                                                                                                                                                                                                                                     
2025-10-11 21:36:42,862 - DEBUG - Raising connection error                                                                                                                                                                                                                                                                                                                                
2025-10-11 21:36:42,862 - ERROR - LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                                                
2025-10-11 21:36:42,862 - ERROR - LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                                                
2025-10-11 21:36:42,862 - ERROR - All 1 attempts failed with error: Connection error.                                                                                                                                                                                                                                                                                                     
2025-10-11 21:36:42,862 - ERROR - LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                                                
2025-10-11 21:36:42,863 - ERROR - LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                                                
2025-10-11 21:36:42,864 - WARNING - Iteration 147 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,864 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,864 - WARNING - Iteration 148 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,865 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,865 - WARNING - Iteration 149 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,865 - DEBUG - Using selector: EpollSelector                                                                                                                                                                                                                                                                                                                           
2025-10-11 21:36:42,865 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,865 - INFO - Sampled model: gpt-oss-120b                                                                                                                                                                                                                                                                                                                              
2025-10-11 21:36:42,865 - WARNING - Iteration 150 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,865 - DEBUG - Using selector: EpollSelector                                                                                                                                                                                                                                                                                                                           
2025-10-11 21:36:42,865 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,866 - INFO - Sampled model: gpt-oss-120b                                                                                                                                                                                                                                                                                                                              
2025-10-11 21:36:42,866 - WARNING - Iteration 151 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,866 - DEBUG - Using selector: EpollSelector                                                                                                                                                                                                                                                                                                                           
2025-10-11 21:36:42,866 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,866 - INFO - Sampled model: gpt-oss-120b                                                                                                                                                                                                                                                                                                                              
2025-10-11 21:36:42,866 - WARNING - Iteration 152 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,866 - DEBUG - Using selector: EpollSelector                                                                                                                                                                                                                                                                                                                           
2025-10-11 21:36:42,866 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,867 - INFO - Sampled model: gpt-oss-120b                                                                                                                                                                                                                                                                                                                              
2025-10-11 21:36:42,867 - WARNING - Iteration 153 error: LLM generation failed: Connection error.                                                                                                                                                                                                                                                                                         
2025-10-11 21:36:42,867 - DEBUG - Using selector: EpollSelector                                                                                                                                                                                                                                                                                                                           
2025-10-11 21:36:42,867 - DEBUG - Sampled parent 3b530fc4-8c86-4afe-809a-2773e5d8d917 and 0 inspirations from island 0
2025-10-11 21:36:42,867 - INFO - Sampled model: gpt-oss-120b                                                                                                                                                                                                                                                                                                                              
2025-10-11 21:36:42,867 - WARNING - Iteration 154 error: LLM generation failed: Connection error.     

Is there any kind of error handling that I am missing here? It seems like this is all happening inside the openevolve run function, so I can't figure out how to stop this behavior without monkeypatching the library.

More generally, I can imagine all kinds of issues with our server that I'd want openevolve to fail robustly on, e.g. if openevolve hits a max token limit or our server times out.
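For illustration, here is a rough sketch of the kind of guard I have in mind, written from the outside so it needs no monkeypatching. It is a plain `logging.Handler` that counts records matching the `Iteration N error` warnings shown in the logs above and sets a flag once a budget is exceeded. The class name `FailureBudgetHandler`, the `limit` parameter, and the regex are all made up for this example, and it assumes openevolve's records propagate to the root logger, which I have not verified. It only detects the condition; actually stopping the run still depends on how the run was launched.

```python
import logging
import re
import threading


class FailureBudgetHandler(logging.Handler):
    """Hypothetical helper: flag the run once too many iterations have failed.

    Counts log records whose message matches `pattern` (by default the
    "Iteration N error" warnings seen in the logs) and sets `tripped`
    when the count reaches `limit`.
    """

    def __init__(self, limit=5, pattern=r"Iteration \d+ error"):
        super().__init__(level=logging.WARNING)
        self.limit = limit
        self.pattern = re.compile(pattern)
        self.failures = 0
        self.tripped = threading.Event()

    def emit(self, record):
        # getMessage() applies any %-style arguments to the message first.
        if self.pattern.search(record.getMessage()):
            self.failures += 1
            if self.failures >= self.limit:
                self.tripped.set()


# Attach to the root logger (assumption: the library's loggers propagate to it).
watch = FailureBudgetHandler(limit=5)
logging.getLogger().addHandler(watch)
```

The caller can then check `watch.tripped.is_set()` from a watchdog thread or a wrapper process and terminate the run itself. A built-in option to abort after N failed iterations would obviously be cleaner than this.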
