Different handling of programs that fail to run? #192

I'm not sure how common this is, but in my experiments (a relatively big program, mostly using Gemini 2.5 Pro and Claude 4 Opus), most iterations return a program that at least runs. Non-functional programs are still fairly common, though, typically because of calls to nonexistent APIs or syntax errors. As I understand it, the common practice in this case is to assign the iteration a made-up low `combined_score`. That works, but it seems like a bit of a waste. Does anyone else think it would be useful to build in a special-case repair iteration, where the LLM gets a generated prompt along the lines of "You gave me this code XXX, but it failed with this error YYY. Try to fix the error without significant logic changes"?
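To make the idea concrete, here is a rough sketch of what such a repair iteration could look like. It is not existing project code: `run_candidate`, `build_repair_prompt`, `repair_loop`, and the `llm` callable (prompt string in, program text out) are all hypothetical names, and it assumes the candidate is a standalone Python script that can be run in a subprocess.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple

# Hypothetical sketch: none of these names come from an existing API.
REPAIR_TEMPLATE = (
    "You gave me this code:\n\n{code}\n\n"
    "but it failed with this error:\n\n{error}\n\n"
    "Try to fix the error without significant logic changes. "
    "Return only the full corrected program."
)


def run_candidate(code: str, timeout: float = 10.0) -> Optional[str]:
    """Run the program in a subprocess. Return None on success, else the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(code)
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return f"timed out after {timeout} seconds"
    if proc.returncode == 0:
        return None
    # Fall back to the exit code if the program wrote nothing to stderr.
    return proc.stderr.strip() or f"exit code {proc.returncode}"


def build_repair_prompt(code: str, error: str) -> str:
    """Fill the repair template with the failing code and its error output."""
    return REPAIR_TEMPLATE.format(code=code, error=error)


def repair_loop(
    code: str,
    llm: Callable[[str], str],  # assumed interface: prompt in, program text out
    max_repairs: int = 2,
) -> Tuple[str, Optional[str], int]:
    """Ask the LLM to fix a failing program, at most max_repairs times.

    Returns (final_code, last_error_or_None, repair_attempts_used).
    """
    error = run_candidate(code)
    attempts = 0
    while error is not None and attempts < max_repairs:
        code = llm(build_repair_prompt(code, error))
        error = run_candidate(code)
        attempts += 1
    return code, error, attempts
```

If the returned error is still not `None` after `max_repairs` attempts, the iteration could fall back to the current behavior and receive the low `combined_score`. Capping the number of repair attempts keeps the extra LLM cost per iteration bounded.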
