
Only run tests for changed code #115

Closed
sindresorhus opened this issue Oct 31, 2015 · 19 comments · Fixed by #544
@sindresorhus
Member

Many times you only change code that affects one or a few tests, but still have to suffer through the whole test suite. Would be nice to only run the affected tests instead. This would be a huge win for people with lots of tests.

This is easy on a test file level as we can just recurse the dependency tree and see which test files depend on the changed file. It gets harder for individual tests. Here we could take advantage of code coverage, diff what's changed between runs, and check which lines were affected.
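The file-level idea above can be sketched with a minimal dependency-graph walk. This is an illustration only, not AVA's implementation: the graph is hand-written here, whereas a real tool would build it by parsing `require`/`import` statements.

```javascript
// Given a dependency graph (file → files it requires), find every test file
// that directly or transitively depends on a changed file.
function affectedTests(depGraph, testFiles, changedFile) {
  // Invert the graph: file → files that depend on it.
  const dependents = new Map();
  for (const [file, deps] of Object.entries(depGraph)) {
    for (const dep of deps) {
      if (!dependents.has(dep)) dependents.set(dep, []);
      dependents.get(dep).push(file);
    }
  }

  // Walk upward from the changed file, collecting everything affected.
  const affected = new Set([changedFile]);
  const queue = [changedFile];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of dependents.get(current) || []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }

  return testFiles.filter(test => affected.has(test));
}

// Example: test/a.js depends on lib/a.js, which depends on lib/util.js,
// so a change to lib/util.js affects only test/a.js.
const graph = {
  'test/a.js': ['lib/a.js'],
  'test/b.js': ['lib/b.js'],
  'lib/a.js': ['lib/util.js']
};

console.log(affectedTests(graph, ['test/a.js', 'test/b.js'], 'lib/util.js'));
// → ['test/a.js']
```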

Prior art

This is not something that is likely to come in the short term, but opening for discussion and brainstorming. Help making this happen more than welcome.

@sindresorhus sindresorhus added the enhancement (new functionality) and question labels Oct 31, 2015
@schnittstabil

Nice challenge! But, to clarify: computability theory tells us that we cannot simply take the diff of two source states and compute which tests are affected.

Possible approaches to solve this:

  1. we run the tests before the change and gain information from them (we may persist this in some way or other)
  2. we impose restrictions, e.g. if tests follow a certain naming pattern, we run them when matching files change

It also means we cannot do this perfectly; we would always run some tests unnecessarily. The benefit would be reducing the number of tests that need to run.

@madbence

madbence commented Nov 2, 2015

IMHO, just store the dependencies of the tests; if any of them change, we rerun the tests. There is no better (simple) way to do this.

@vadimdemedes
Contributor

This is the way I solved it in my side project: run tests under istanbul, check how coverage changes between tests, and create a mapping of test → file/line it depends on. After the initial test run, which creates a mapping for all tests, I check which lines in my project have changed, and then run only the tests that "touch" those lines.
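The coverage-mapping approach described above can be sketched roughly like this. The data shapes (`coverageByTest`, `changedLines`) are hypothetical stand-ins for what an istanbul-instrumented run would produce, not the side project's actual code:

```javascript
// coverageByTest: { testName: { file: [covered line numbers] } }
// changedLines:   { file: Set of changed line numbers }
// Select only the tests whose coverage intersects a changed line.
function testsToRun(coverageByTest, changedLines) {
  const selected = [];
  for (const [test, coverage] of Object.entries(coverageByTest)) {
    const touchesChange = Object.entries(coverage).some(([file, lines]) =>
      changedLines[file] !== undefined &&
      lines.some(line => changedLines[file].has(line))
    );
    if (touchesChange) selected.push(test);
  }
  return selected;
}

const coverage = {
  'adds numbers':  {'lib/math.js': [1, 2, 3]},
  'formats dates': {'lib/date.js': [10, 11]}
};

// Only line 2 of lib/math.js changed, so only the math test is selected.
console.log(testsToRun(coverage, {'lib/math.js': new Set([2])}));
// → ['adds numbers']
```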

@Qix-
Contributor

Qix- commented Nov 2, 2015

@vdemedes that's clever, but still not theoretically sound. We're still dealing with the halting problem here; we have no idea if the program will run differently given different inputs. Keep in mind time is an input.

This means that even though a small code change might only affect one or two tests according to the last run's data, the inputs (or the way inputs are interpreted) might change so that it actually affects many tests, which, according to the stale profiling/coverage data, do not get run. This will almost certainly introduce regressions.

As well, this will require a command line flag for re-running all tests. Removing a cache file or something isn't really user friendly, and now you're storing something in the working directory in order to keep that data. People will have to migrate to this model and include something in their gitignore.


Side note: istanbul uses V8's profiler, which can dramatically slow things down. Keep that in mind, too.

@sindresorhus
Member Author

and now you're storing something in the working directory in order to keep that data. People will have to migrate to this model and include something in their gitignore.

We would store it in ~/.cache/ava/project-name-from-package-json, not in the project directory.

@Qix-
Contributor

Qix- commented Nov 2, 2015

What if you change branches in the project?

@sindresorhus
Member Author

We would use an LRU cache, so the data wouldn't be expunged right away when switching branches.
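The LRU idea can be sketched with a plain `Map`, which iterates in insertion order. Hypothetical illustration, assuming cached coverage data keyed by branch name:

```javascript
// Keep data for the N most recently used keys; switching branches only
// evicts the least recently used entry once capacity is exceeded.
class LruCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.map = new Map(); // Map preserves insertion order.
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert to mark the key as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // Evict the least recently used key (first in insertion order).
      this.map.delete(this.map.keys().next().value);
    }
  }
}

const cache = new LruCache(2);
cache.set('main', 'coverage-main');
cache.set('feature', 'coverage-feature');
cache.get('main');                      // touch 'main' so it stays fresh
cache.set('hotfix', 'coverage-hotfix'); // evicts 'feature'
console.log([...cache.map.keys()]);     // → ['main', 'hotfix']
```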

@Qix-
Contributor

Qix- commented Nov 2, 2015

(meme image)

@Qix-
Contributor

Qix- commented Nov 2, 2015

(meme image)

@Qix-
Contributor

Qix- commented Nov 2, 2015

sigh and not even a history to prove that I'm not a potato. Thanks Github.

#BDFLAbuse #PotatoLivesMatter

@schnittstabil

We're still dealing with the halting problem here; we have no idea if the program will run differently given different inputs.

We should not worry too much about obfuscated code; the solution only needs to handle reasonable, practical code…

require(new Date().getTime() === new Date(2017,3,16).getTime() ? './upload-genisys.js' : './');

@Qix-
Contributor

Qix- commented Nov 2, 2015

@schnittstabil not quite sure what you mean there.

@schnittstabil

@Qix- I think it is OK to only consider commonly used code patterns. As you already mentioned, this feature cannot be theoretically sound.

@Qix-
Contributor

Qix- commented Nov 2, 2015

But by even introducing the possibility of skipping tests that could have been affected by a change, you're introducing the chance of regressions slipping into your code base. Further, it becomes the test suite's fault, which is not something a test suite should ever allow.

It's a sweet idea though.

@sindresorhus
Member Author

Wallaby.js somehow did it, though.

(screenshot of Wallaby.js)

@schnittstabil

Clearly this feature shouldn't be the default. I would imagine a --quick CLI flag or similar, whose --help entry may note its limitations…

We may also use this in some watch mode: run the possibly affected tests first and the remaining tests afterwards.
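The watch-mode idea above can be sketched as a simple reordering: no test is ever skipped, the likely-affected ones just run first for fast feedback. Hypothetical function names, not AVA's API:

```javascript
// Partition the full test list into likely-affected tests and the rest,
// and run the affected ones first.
function orderForWatch(allTests, likelyAffected) {
  const affected = new Set(likelyAffected);
  return [
    ...allTests.filter(t => affected.has(t)),
    ...allTests.filter(t => !affected.has(t))
  ];
}

console.log(orderForWatch(['a', 'b', 'c'], ['c']));
// → ['c', 'a', 'b']
```

Because everything still runs eventually, the heuristic being imperfect costs only feedback latency, not correctness.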

@Qix-
Contributor

Qix- commented Nov 2, 2015

+1 for watch mode. That's a great idea, @schnittstabil. However, I think the initial run in watch mode should run all tests, and subsequent triggers while watching should run only the affected tests. THAT would be cool.

@sindresorhus
Member Author

Clearly this feature shouldn't be the default. I would imagine a --quick CLI flag or similar, whose --help entry may note its limitations…

👍

Yes, the intention is to combine it with some kind of watch mode. That's where it would shine. Being able to code and see the test results almost live.

@ArtemGovorov
Contributor

Wallaby.js core developer here, here's our 2c of experience with it:

As has been mentioned a few times in this discussion, there's no theoretically waterproof solution. We use runtime analysis powered by instrumentation-collected data, plus a bazillion heuristic rules of various natures, plus framework-specific hacks, plus additional data received directly from the specific code editor we integrate with. So while a simple dependency/call-tree analysis covers about 70% of cases, the remaining 29.9999% is really painful to cover.
