New watch mode: only re-run failing tests until they pass #1033
Comments
Hi @ngryman! Thanks for the proposal! I'm not sure that this is a good idea though, as you would lose the information/feedback about whether your other tests (in the same file, or in the files affected by the modified file) still pass. If you only re-run a previously failing test, then you might break 10 other tests without knowing about it.
Hey @jfmengels,

In the algorithm I proposed, all failing tests are re-run, so if you break things you'll know. Let's call this set of tests a `batch`.

Not all tests have a direct relationship to each other. Directly related tests are often located in the same file and mostly fail fast if you break something. So there are chances that the first batch of failing tests will be the only one.

I know this proposal seems weird to read, and my English does not help, but when you think about it, it's exactly what you do manually. You rarely want your whole test suite to run on each file save. You isolate the tests you want to focus on. And when they pass, you check that the whole test suite passes and that you didn't break anything at a larger scale.

Another way to define my proposal would be to implement an evolutive test scope: test the current file first, test failing tests first.

Let's illustrate with a typical scenario:
√: Pass
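The evolutive test scope described above could be sketched roughly like this. This is only an illustration of the idea, not AVA code; `nextScope` and `runTest` are hypothetical names.

```javascript
// Hypothetical sketch of the "evolutive test scope" idea: narrow the run
// to the failing set until it is green, then widen back to the full suite.
// `runTest` is a stand-in for however the runner executes one test.

function nextScope(allTests, previousFailures) {
  // No known failures: run everything, as watch mode does today.
  if (previousFailures.length === 0) {
    return allTests;
  }
  // Otherwise narrow the scope to the tests that failed last time.
  return previousFailures;
}

function runScope(scope, runTest) {
  // Returns the subset of `scope` that failed on this run.
  return scope.filter(test => !runTest(test));
}
```

On each file save the watcher would feed the previous run's failures back into `nextScope`, so the scope shrinks to the failing batch and grows back once it passes.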
You should not worry about that, it's good enough ;)

Okay, I think I may have read the proposal a bit too fast and skipped over the fact that (almost) all tests are re-run when the failing tests pass.

You still won't get the feedback that you broke other tests as long as you don't fix all the previously failing tests. In case you start with a lot of failing tests, or a very tricky one, you might break a lot of things before you notice, and end up not knowing which parts broke the other tests. I'm not sure that's a very nice DX. I think that after a few minutes I'd get stressed out and re-run AVA just to make sure nothing else broke, but that might just be me.

There's also an issue with the algorithm: you only re-run the tests that failed, even if all previously failed tests pass this time. That means you'd have to re-run AVA a second time to know which previously passing tests will now break. So in essence, this would be the workflow you'd get (nice way of presenting it btw):
√: Pass

This will be the scenario, unless AVA re-runs tests based on the results of the previous tests, but then this becomes a pretty big change in how AVA works. Should AVA re-run everything when the failed tests are fixed, then you get the scenarios you described.

(Just FYI, I'm used to writing a lot more tests that don't deal with I/O / HTTP, so my tests all run very fast, and I don't end up using …)
Thanks for your time and feedback man.
First, I should have mentioned that while writing the second comment, I slightly changed the algorithm: if all of the tests in your `batch` pass, the whole test suite is re-run right away. So your table becomes:
Yes, and for me that's the whole point 😃! And that's already the case with the current `watch` mode.

If I may, let me explain a bit more why I think this feature would be great, instead of going down into the implementation itself. I think we should separate test-driven development from integration testing.

When you add a new feature or fix a bug on your machine, you use TDD. Then, when you're satisfied with your implementation and your unit tests pass, comes the integration testing phase, where you sanity-check that your modifications to the codebase did not add regressions. Most of the time it should be ok.

The mode/option I'm ✊ for is for TDD. It's made precisely to only test what is directly related to your feature/bug, and to discard the rest on purpose while this precise test doesn't pass.
Only skimmed the discussion so far, so my apologies if my summary is wrong. Sounds like the suggestion is to prioritize re-running the known failing tests, as well as any new test files, and to track other tests that ordinarily would have re-run (due to source changes), running those once there are no more failing tests. Right now that would require a new test run, though ideally we'd add to an existing test run. So there are some architectural limitations there.
@novemberborn Yes, my proposal was to only re-run already-failing tests until they pass. If there are technical limitations to this, I can understand. I had 3 requirements to satisfy:
I guess 3. is optional and quite complex to achieve, in the sense that it would require some sort of intelligence. That's probably totally out of scope for a test runner. But I think 1. and 2. may be achieved with minimal modifications. We could prioritize failing tests to run first, then the others. It would allow us to drastically reduce …

A bonus would be to prioritize failing tests by number of failures. If a test has been failing since …
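The "prioritize by number of failures" bonus could look something like this. A minimal sketch, assuming a hypothetical `failureCounts` map kept by the watcher; none of these names are AVA internals.

```javascript
// Hypothetical sketch: order test files so the ones that have failed the
// most consecutive runs go first. `failureCounts` maps file -> failure
// streak; files missing from the map count as zero.

function prioritize(files, failureCounts) {
  return [...files].sort(
    (a, b) => (failureCounts.get(b) || 0) - (failureCounts.get(a) || 0)
  );
}
```

Because `Array.prototype.sort` is stable, files with equal failure counts keep their original relative order.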
@ngryman I like it! The watcher is already quite complex; to be able to do what you're suggesting, I think we need to do a fair bit of refactoring. My thinking is that we should be able to start a test run and add more files to it before it finishes. That's a problem because currently (when not using the …) … That way the runner can prioritize files with failing tests, while maintaining a second set of non-failing test files that should be re-run.

If we could reliably identify individual tests (e.g. if they have a unique title) we could prioritize those over other tests in the same file.
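The two-set idea sketched above (failing files first, other changed files queued behind them, with the queue able to grow mid-run) could be illustrated like this. Names are mine, not AVA internals.

```javascript
// Hypothetical sketch of a run queue that can grow while a run is in
// progress: files with known failures are always drained first, merely
// changed files wait behind them.

class RunQueue {
  constructor() {
    this.failing = []; // files with failing tests
    this.pending = []; // changed files with no known failures
  }

  add(file, hasFailures) {
    // May be called while a run is already consuming the queue.
    (hasFailures ? this.failing : this.pending).push(file);
  }

  next() {
    // Drain the failing set before touching the pending set.
    return this.failing.shift() || this.pending.shift() || null;
  }
}
```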
This should not be a problem, as we now enforce that tests have unique titles.
When writing integration tests, it can be tempting to run them serially, with an assumed order. This means you can't just run one of the tests. The proposed behavior would break watch mode for those users, but I think that's fine. |
AVA now runs failing tests first, and we do dependency analysis. I think that's good enough. |
Hi,

I would like to open a discussion around a new feature that might be interesting to use during development.

Usually when you add a new feature or fix a bug on a project, you add a new test, watch it fail, and then code until it finally passes. This is `tdd`, nothing new. During this process, you are likely to be interested only in this particular test, for several reasons. So you end up adding/removing `only` modifiers manually. It can be quite frustrating sometimes, and it would be awesome if somehow `ava` was smart about it.

`watch` mode is already half smart, as it only re-runs tests included in the test file you modify, which is a huge improvement in day-to-day development workflow 👍 Still, I think we can push things further and make it a bit smarter by adding a more aggressive test isolation algorithm that would only re-run previously failing tests until they pass. I don't know if it should be part of the default `watch` behavior or behind another `cli` flag (i.e. `--smart-watch` 👓). But I know I would love this feature 😍.

Here is pseudo-code of the `smart-watch` algorithm, not related to your implementation: …

Basically it only re-runs tests that are failing, falling back to running all tests. It does not change the actual `watch` behavior.

This would involve keeping track of `t-1` failures, so we can know at `t` which tests have failed. I'm not familiar with `ava` internals, but I guess it can be done quite easily.

What do you think, guys?
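A minimal sketch of what such a smart-watch loop could look like, only re-running failing tests and falling back to the full suite once they pass. `runTests` and the scope handling are illustrative stand-ins, not AVA's implementation.

```javascript
// Hypothetical smart-watch loop: the t-1 failures decide the scope at t.
// `runTests(scope)` is a stand-in that runs the given tests and returns
// the ones that failed.

function smartWatch(allTests, runTests, previousFailures = []) {
  // Narrow the scope to the previous failures, if any.
  const scope = previousFailures.length > 0 ? previousFailures : allTests;
  let failures = runTests(scope);

  // The narrowed scope just went green: sanity-check the whole suite.
  if (failures.length === 0 && scope !== allTests) {
    failures = runTests(allTests);
  }
  // The returned failures become `previousFailures` on the next file save.
  return failures;
}
```

Each file save would call `smartWatch` again, feeding the previous result back in, so the run stays narrowed to the failing batch until it passes.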