
Bug: --test flag doesn't attempt to fix failing tests #4214

@matt-152

Issue

When using the --test flag, the test output is added to the chat, but the AI does not attempt to fix any failing tests. As a workaround, I can replace --test with -m "/test", changing nothing else, and it works as expected.

Testing using pytest in a repo with a single file test_main.py:

# fib() is not defined, test throws an error
def test_fib():
    assert fib(0) == 0
    assert fib(1) == 1
    assert fib(2) == 1

Running with --test

tmp.cNBFMciAtZ % aider --test
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Aider v0.84.0
Main model: anthropic/claude-sonnet-4-20250514 with diff edit format, infinite output
Weak model: anthropic/claude-3-5-haiku-20241022
Git repo: .git with 0 files
Repo-map: using 4096 tokens, auto refresh
============================= test session starts ==============================
platform darwin -- Python 3.13.4, pytest-8.4.0, pluggy-1.6.0
rootdir: /private/var/folders/3d/zh24g9gj33l0dvn1cvvdk8r80000gp/T/tmp.cNBFMciAtZ
collected 1 item                                                               

test_main.py F                                                           [100%]

=================================== FAILURES ===================================
___________________________________ test_fib ___________________________________

    def test_fib():
>       assert fib(0) == 0
               ^^^
E       NameError: name 'fib' is not defined

test_main.py:2: NameError
=========================== short test summary info ============================
FAILED test_main.py::test_fib - NameError: name 'fib' is not defined
============================== 1 failed in 0.05s ===============================
Added 20 lines of output to the chat.
tmp.cNBFMciAtZ % 

Running with -m "/test"

tmp.cNBFMciAtZ % aider -m "/test"
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Aider v0.84.0
Main model: anthropic/claude-sonnet-4-20250514 with diff edit format, infinite output
Weak model: anthropic/claude-3-5-haiku-20241022
Git repo: .git with 0 files
Repo-map: using 4096 tokens, auto refresh

============================= test session starts ==============================
platform darwin -- Python 3.13.4, pytest-8.4.0, pluggy-1.6.0
rootdir: /private/var/folders/3d/zh24g9gj33l0dvn1cvvdk8r80000gp/T/tmp.cNBFMciAtZ
collected 1 item                                                               

test_main.py F                                                           [100%]

=================================== FAILURES ===================================
___________________________________ test_fib ___________________________________

    def test_fib():
>       assert fib(0) == 0
               ^^^
E       NameError: name 'fib' is not defined

test_main.py:2: NameError
=========================== short test summary info ============================
FAILED test_main.py::test_fib - NameError: name 'fib' is not defined
============================== 1 failed in 0.04s ===============================
Added 20 lines of output to the chat.
I can see that the test is failing because the fib function is not defined. The test in test_main.py is trying to call fib(0) but it can't find the function.                                                                                                                                                                 

To help you fix this, I need to see the current files. Could you please add test_main.py to the chat so I can see what imports it has and what it's trying to test? I may also need to see other files like main.py or wherever the fib function should be defined.                                                           


Tokens: 3.2k sent, 109 received. Cost: $0.01 message, $0.01 session.
tmp.cNBFMciAtZ % 

I did a little digging, and I think the issue is with these two segments:

main.py

    if args.test:
        if not args.test_cmd:
            io.tool_error("No --test-cmd provided.")
            analytics.event("exit", reason="No test command provided")
            return 1
        coder.commands.cmd_test(args.test_cmd)
        if io.placeholder:
            coder.run(io.placeholder)

commands.py

    def cmd_test(self, args):
...
            return self.cmd_run(args, True)
...
    def cmd_run(self, args, add_on_nonzero_exit=False):
...
            if add_on_nonzero_exit and exit_status != 0:
                # Return the formatted output message for test failures
                return msg
            elif add and exit_status != 0:
                self.io.placeholder = "What's wrong? Fix"

It seems that main.py expects io.placeholder to be set after running the test command, if any tests failed. However, when cmd_test invokes cmd_run, it sets add_on_nonzero_exit to True, and that particular path never sets io.placeholder.
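
The control flow described here can be reproduced in isolation. The following is a minimal sketch, not aider's actual code: FakeIO, FakeCommands, and simulate_test_flag are hypothetical stand-ins, and passing exit_status, output, and add as parameters is an assumption made so the example runs without a real test command. It only mirrors the branch structure of the two excerpts to show that the add_on_nonzero_exit path returns the message and leaves the placeholder unset.

```python
class FakeIO:
    """Hypothetical stand-in for the io object; it only tracks the placeholder."""

    def __init__(self):
        self.placeholder = None


class FakeCommands:
    """Hypothetical stand-in that mirrors the branches quoted from commands.py."""

    def __init__(self, io):
        self.io = io

    def cmd_run(self, exit_status, output, add_on_nonzero_exit=False, add=True):
        # Assumption: exit_status and output are passed in instead of running a command.
        msg = f"Test output:\n{output}"
        if add_on_nonzero_exit and exit_status != 0:
            # This branch returns early, so the placeholder is never touched.
            return msg
        elif add and exit_status != 0:
            self.io.placeholder = "What's wrong? Fix"
        return None

    def cmd_test(self, exit_status, output):
        # Like the excerpt, always passes True for add_on_nonzero_exit.
        return self.cmd_run(exit_status, output, True)


def simulate_test_flag(exit_status, output):
    """Mimic the --test branch: call cmd_test, then check only io.placeholder."""
    io = FakeIO()
    commands = FakeCommands(io)
    returned = commands.cmd_test(exit_status, output)  # return value is not used for the check
    would_run_coder = bool(io.placeholder)
    return returned, would_run_coder
```

With a failing exit status, simulate_test_flag returns a non-empty message alongside would_run_coder equal to False, which matches the observed behavior of the flag.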

Is this intended behavior? It seems like a bug to me, since as it's written the flag does nothing except run the test command.
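
If it is not intended, one possible direction would be for the --test branch to act on the value returned by cmd_test instead of checking io.placeholder. The sketch below is an assumption about how that could look, not aider's actual fix: run_test_flag, StubCommands, and StubCoder are hypothetical names that exist only to make the example runnable.

```python
def run_test_flag(coder, test_cmd):
    """Hypothetical --test handler: forward any failure message to the coder."""
    failure_msg = coder.commands.cmd_test(test_cmd)
    if failure_msg:
        # Send the test output to the model as the prompt.
        coder.run(failure_msg)
        return True
    return False


class StubCommands:
    """Hypothetical stub whose cmd_test returns a canned failure message (or None)."""

    def __init__(self, failure_msg):
        self.failure_msg = failure_msg

    def cmd_test(self, test_cmd):
        return self.failure_msg


class StubCoder:
    """Hypothetical stub that records the prompts it is asked to run."""

    def __init__(self, failure_msg):
        self.commands = StubCommands(failure_msg)
        self.prompts = []

    def run(self, with_message):
        self.prompts.append(with_message)
```

With a stub that reports a failure, run_test_flag passes the message on to coder.run; with a stub that returns None, the coder is never invoked.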

Version and model info

Aider v0.84.0
Main model: anthropic/claude-sonnet-4-20250514 with diff edit format, infinite output
Weak model: anthropic/claude-3-5-haiku-20241022
Git repo: .git with 0 files
Repo-map: using 4096 tokens, auto refresh
