Add support for batch compiling all test entry points #2138
Comments
We currently handle test suites independently as soon as we read them. Batching up all tests for a platform before taking any action on an individual suite would require some restructuring of the runner, and I don't have an estimate for the work involved. @jakemac53 do you recall if we looked at keeping state between suite compiles?
Afaik we do provide the previous compilation result when we compile each test, which isn't perfect but generally works pretty well, assuming most tests share roughly the same transitive deps. Grouping the tests into a single compile also means that no tests can run if a single test fails to compile, which could be very annoying during development of new features (it may be intentional that some of your tests can't be run yet).
This might be most useful as a flag you enable for CI, or maybe for some interactive cases where you expect everything to compile. When the code is being edited, it's more likely that running a single test suite will be more useful than running all test suites with a single compile.
This did come up in our discussions with @jensjoha, and I don't think compile-time errors preclude batch compiles, because you can just prune libraries which contain compilation errors from the dill and get a runnable dill. This way you get to have your cake and eat it too: you report all compilation errors and you run all tests without compilation errors.
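The pruning idea above can be illustrated with a small sketch. This is purely hypothetical (the issue does not specify how the CFE would implement it): given a per-library dependency graph and the set of libraries that failed to compile, keep only libraries whose transitive deps all compiled cleanly, so their tests can still run while every error is still reported.

```python
# Hypothetical sketch of "prune failing libraries from the dill":
# keep a library only if nothing in its transitive import closure failed.
def prune_failing(deps, failed):
    """deps: library -> set of directly imported libraries.
    failed: set of libraries with compile errors.
    Returns the set of libraries safe to keep in the output dill."""
    def tainted(lib, seen=None):
        seen = set() if seen is None else seen
        if lib in failed:
            return True
        seen.add(lib)
        # A library is tainted if any import (directly or transitively)
        # reaches a failed library; `seen` guards against import cycles.
        return any(d not in seen and tainted(d, seen) for d in deps.get(lib, ()))
    return {lib for lib in deps if not tainted(lib)}
```

With this split, the runner could report errors for the pruned libraries while still executing every test whose closure survived.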
We don't actually read the kernel file at all, and there is no public API for doing so, so I am not sure how we could do that. Possibly the frontend server could have a special entry point and handle this logic (as well as reporting exactly which files failed)? Either way, I am not currently convinced that the strategy we have right now is insufficient. When I run VM tests locally in real packages I never see any loading messages after the very first one; test compilation is not getting in my way or dominating the overall time to run tests in the real-world packages that we maintain. If a package has a single tiny unit test in each file, compilation could dominate more, but I don't know that we need to optimize for that scenario (and it should still perform decently anyway).
I just did some measurements for @dnfield, so I am going to include them here. If we take:

```
$ time flutter test -v test/rendering 2>&1 | tee /tmp/testing.log
...
________________________________________________________
Executed in   20.94 secs    fish    external
   usr time   46.18 secs    0.10 millis   46.18 secs
   sys time    9.84 secs    2.05 millis    9.83 secs

$ cat /tmp/testing.log | grep "Compiling.*took.*" | awk '{t+=substr($6, 1, length($6)-2)} END{print NR, t}'
71 9956
```

So there are 71 test files and ~10s spent in compilation (out of 20 seconds of total runtime). Now if I manually collapse all tests together by renaming:

```
$ flutter test -v test/rendering 2>&1 | tee /tmp/testing2.log
$ cat /tmp/testing2.log | grep "Compiling.*took.*" | awk '{t+=substr($6, 1, length($6)-2)} END{print NR, t}'
1 3736
```

So this shaves off ~6.3 seconds, which is approximately 1/3 of the total running time. This is on an M1; on GH Actions this translates into something more like 30s of savings out of 100s of total testing time, because I have observed GH to be 5-6x slower than my M1.

That being said, savings on compilation are just part of the story here. Having a mega dill allows using an isolate group for running tests (spawn the test process with the mega dill loaded as the "root" isolate, then run individual test entry points by spawning isolates from the root isolate). This would allow sharing all kinds of VM metadata and JITed code, which would lead to faster tests.
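For readers not fluent in awk, the aggregation in the pipeline above can be restated in Python. The exact log line format is an assumption inferred from the grep pattern ("Compiling ... took Nms"); the function counts matching lines and sums the reported times, mirroring `NR` and `t` in the awk script.

```python
import re

def compile_time(log_lines):
    """Sum 'Compiling <file> took <N>ms' lines from a `flutter test -v` log.

    Returns (number_of_compiles, total_milliseconds) — the same two
    numbers the awk one-liner prints. The line format is an assumption.
    """
    times = [int(m.group(1))
             for line in log_lines
             if (m := re.search(r"Compiling .* took (\d+)ms", line))]
    return len(times), sum(times)
```

Applied to the first log this would yield (71, 9956); after collapsing, (1, 3736).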
Yep, I would think this is going to be an extension to CFE/frontend server logic.
I can't speak to flutter tests; it is a different platform that likely has very different behavior, and also different constraints. That platform is developed in the flutter repo, so we might want to move this issue there?
If you think that no changes will be necessary in I can't easily reproduce any bad behaviour by doing
Well, now that you mention it, we probably would need to do something in order to enable batch compilation at all, because today the way platforms work, they are only told to "load" (which today involves compiling) individual tests. So maybe it would need to be a special kind of platform which is instead given all the tests at once to "load". But I would start by trying to improve upon the existing model (which has other benefits) first; since it works well for VM tests, it should also be able to work well for flutter tests?
Yes, this is what I was referring to in the first paragraph of #2138 (comment). Adding support for batch compilation could involve some significant changes to the test runner.
I brought this up in another context, but will put it here just to surface it again: one reason we don't compile all tests together is that they don't necessarily support the same platforms. So it might be intentional that some tests don't, for instance, compile for the VM, because they are specifically targeting web (and have dart:html etc. imports). In general I think flutter handles this a lot differently than the normal test runner, so I am not sure if this is relevant or not to flutter, but it does matter for general Dart tests.
Yes, and this will mainly limit where and how we can add batching. We filter which suites get passed to a given platform, so it should be feasible to have a filtered batch to compile. Edit: also, as long as compilation errors are only surfaced when the platform would want to run the test, the filtering shouldn't even need to happen before compiling (outside of performance concerns).
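The per-platform filtering described above amounts to grouping suites by the platforms they declare before compiling each group as one batch. A minimal sketch, with a hypothetical data model (the real runner's suite representation is more involved):

```python
# Hypothetical sketch: group test entry points into one compile batch per
# platform, so web-only tests never end up in the VM batch and vice versa.
def batch_by_platform(suites):
    """suites: list of (path, supported_platforms) pairs.
    Returns a dict: platform -> list of paths to compile together."""
    batches = {}
    for path, platforms in suites:
        for platform in platforms:
            batches.setdefault(platform, []).append(path)
    return batches
```

Each resulting list is a candidate for a single batched compile, independent of the other platforms' batches.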
I wrote a custom plugin that overrides load so that it uses something already compiled. It did not require any changes to package:test.
Currently when doing `dart test` or `flutter test` we seem to load `*_test.dart` files one by one. Would it be possible to change the implementation of test loading to support batch loading, so that all entry points are loaded together?

The idea here is to reduce compilation overhead: i.e. compile all `*_test.dart` files together into Kernel using a single CFE process and then use this Kernel blob to run individual test entry points.

/cc @jakemac53 @natebosch
/cc @jensjoha @derekxu16
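The "manually collapse all tests" experiment mentioned in the measurements can be approximated with a small generator that emits one Dart entry point importing every test file and forwarding to its `main()`. This is a rough sketch of that idea, not the proposed runner change; it assumes each test has a synchronous `main()` and no conflicting top-level names, which real suites may violate.

```python
# Hypothetical generator for a single aggregate Dart entry point: import
# every *_test.dart under a prefix and run each one inside a named group.
def aggregate_entry(test_paths):
    lines = ["import 'package:test/test.dart';"]
    for i, path in enumerate(test_paths):
        lines.append(f"import '{path}' as t{i};")
    lines.append("void main() {")
    for i, path in enumerate(test_paths):
        # Wrap each file's main() in a group so failures stay attributable.
        lines.append(f"  group('{path}', t{i}.main);")
    lines.append("}")
    return "\n".join(lines)
```

Compiling the generated file once stands in for the single-compile case measured above, at the cost of the all-or-nothing compile-error behavior discussed in the thread.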