Enhance performance of selecting tests using --include and --exclude
#3579
Comments
I have some ideas on how to speed things up. I will upload a draft of a solution as a pull request, but I doubt I will have the skill (and time) to make it through the review process.
Employ memoization to speed up tag filtering.

- This is a first draft.
- Unit tests not edited at all.
- Not sure about storing to self._memo everywhere.
- Not sure if normalization is applied everywhere needed.
+ Normalization applied in most places.
+ Memo dictionary is passed from callers, created empty if needed.
+ Memo dictionary uses normalized pattern or tag as the key.
+ Multiple copies of the same pattern for SingleTagPattern do not duplicate matchers.
+ Each such matcher has a memo dictionary to store (normalized) tag match results.
+ Calls to the same matcher with the same tag return the memoized result.
- Matcher memo should probably have a different name than result memo.

Signed-off-by: Vratko Polak <vrpolak@cisco.com>
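The commit message above can be illustrated with a small sketch. The class and method names below are illustrative, not Robot Framework's actual internals: one shared matcher per normalized pattern, with a per-matcher memo dictionary caching match results per normalized tag.

```python
# Sketch of the memoization idea described in the commit message above.
# SingleTagMatcher and its attributes are hypothetical names, not RF code.
from fnmatch import fnmatchcase

def normalize(text):
    """Robot-style normalization: ignore case, spaces and underscores."""
    return text.lower().replace(' ', '').replace('_', '')

class SingleTagMatcher:
    _instances = {}  # one shared matcher per normalized pattern

    def __new__(cls, pattern):
        key = normalize(pattern)
        if key not in cls._instances:
            inst = super().__new__(cls)
            inst.pattern = key
            inst.memo = {}  # normalized tag -> bool match result
            cls._instances[key] = inst
        return cls._instances[key]

    def match(self, tag):
        key = normalize(tag)
        if key not in self.memo:
            self.memo[key] = fnmatchcase(key, self.pattern)
        return self.memo[key]
```

With this scheme, repeated `--include` options that differ only in case or spacing share one matcher, and repeated tags hit the memo instead of re-running the pattern match.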
So I managed to create #3580
My main question is whether this attempt goes in a good direction, or whether some other approach is preferred.
I don't have time to look at this in more detail right now, so only quick comments:
We have the same problem. It takes about a minute to select even one test by a unique tag. Choosing a test by name did not improve the situation in any way, even when using the full test name.
Any updates related to this @vrpolakatcisco? Based on the comment by @skhomuti, others have encountered this issue as well, so it would be nice to fix it. I just need to know a bit about the situation to be able to reproduce it myself.
Ah, sorry for dropping this.
In our case --test was sufficient (although we have one --suite matching a whole tree), so we started looking elsewhere when trying to speed up the overall job time.
I created a quick&dirty reproducer: https://gerrit.fd.io/r/c/csit/+/31492 Hopefully it is self-explanatory enough.
I adapted the example at https://gerrit.fd.io/r/c/csit/+/31492 so that I could run it on my machine, and it took only 0.3 seconds. Could someone create a generic example demonstrating the issue that I could try running? If this isn't a problem for anyone anymore, we can also just close this issue.
The original example is still slow for me, although the slowness is less severe (better hardware, newer Python and robotframework versions). Less severe means 40-50 seconds.

Anyway, I created a more scalable reproducer with synthetic suites (no git clone needed), still https://gerrit.fd.io/r/c/csit/+/31492 but now patch set #4. For 16k test cases (one test case per suite file, plus the same amount of empty init.robot files in case that matters), in my setup robot takes around 10 seconds to go through suites, plus roughly 10 seconds per 40 --include options. The worst case for tests in our current production would be selecting 300 test cases (out of 400k possibilities) using more complicated tag expressions.
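For readers without access to the Gerrit change, a minimal, hypothetical version of the synthetic-suite generator described above could look like this (the directory layout and tag names are made up; the actual reproducer lives in the linked patch set):

```python
# Generate many one-test suite files plus empty __init__.robot files,
# so tag-filtering time can be measured without cloning a real project.
import os
import tempfile

def generate_suites(root, n_suites):
    for i in range(n_suites):
        suite_dir = os.path.join(root, f'dir{i:05d}')
        os.makedirs(suite_dir, exist_ok=True)
        # Empty init file, mirroring the reproducer's layout.
        open(os.path.join(suite_dir, '__init__.robot'), 'w').close()
        with open(os.path.join(suite_dir, f'suite{i:05d}.robot'), 'w') as f:
            f.write('*** Test Cases ***\n')
            f.write(f'Test {i:05d}\n')
            f.write(f'    [Tags]    tag{i:05d}    common\n')
            f.write('    No Operation\n')

root = tempfile.mkdtemp()
generate_suites(root, 10)
print(sum(len(files) for _, _, files in os.walk(root)))  # 20 files
```

Scaling `n_suites` to 16 000 and running `robot` with many `--include` options against the generated tree should reproduce the timing behavior described above.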
The new script is nice, but it could have had a bit better usage instructions. I initially tried using it with values After generating tests and the

Our tags are case, space and underscore insensitive, and matching them requires normalizing both patterns and tags. At the moment normalization is done each time tags are matched, which means that if there are multiple patterns, normalization is done multiple times. I changed the code so that normalization is done only once, and that dropped the execution time to ~12s. That means selecting tests now took ~5s, which is quite a bit faster than the earlier ~13s.

The change explained above is relatively simple and all our acceptance tests pass with it. There's thus no reason not to commit it. Selecting tests still takes some time if there are a lot of tests and a lot of patterns, but I consider this enhancement good enough to close this issue.
When there are a lot of tests, one way to enhance performance is using the new functionality to convert the data to JSON (#3902). For example:

    from robot.running import TestSuite

    suite = TestSuite.from_file_system('t')
    suite.to_json('t.rbt')

Running that
Tries to improve upon robotframework#3579 but makes it worse instead.
Sorry about that.
I confirm both old and new reproducer now spend roughly half the previous time on tag filtering.
I also confirm that. After some thinking I realized an approach based on --test will always be significantly faster than any approach based on --include. You cannot beat one regexp match per test case and --test item, whereas --include needs at least one regexp match per test case, test tag and --include item (not to mention ANDed subitems). [2] c375381
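The asymptotic argument above can be made concrete with numbers mentioned earlier in this thread (16 000 tests, 40 selection options, ~8 tags per test). These are ballpark match counts, not measurements:

```python
# Rough match-count comparison for the --test vs --include argument above.
tests, options, tags_per_test = 16_000, 40, 8

# --test: one name match per (test, option) pair.
test_matches = tests * options

# --include: at least one tag match per (test, tag, option) triple.
include_matches = tests * tags_per_test * options

print(test_matches)     # 640000
print(include_matches)  # 5120000
```

Even before counting ANDed subitems, the tag-based path does roughly `tags_per_test` times more pattern matches than the name-based one.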
Hello.
In the FD.io CSIT project we are running many robotframework tests. Imagine a single robot invocation selecting hundreds out of tens of thousands of test cases. Due to both historic inertia and a desire for clarity, we ended up with a scheme where we construct really long tag expressions. Basically, each test case is selected by a separate --include option, each ANDing ~8 single tags.
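As a simplified illustration of that selection scheme (the tag names are made up, and real Robot Framework tag matching also normalizes names and supports wildcard patterns), each include expression ANDs several single tags and a test matches only if it carries all of them:

```python
# Simplified model of one --include expression ANDing single tags.
# Assumes no tag itself contains the literal substring 'AND'.
def matches(test_tags, include_expr):
    """True if every tag in the ANDed expression is present on the test."""
    required = include_expr.split('AND')
    return all(tag in test_tags for tag in required)

test_tags = {'2n-skx', 'mrr', '64b', '1c', 'ip4fwd'}
print(matches(test_tags, '2n-skxANDmrrAND64b'))   # True
print(matches(test_tags, '2n-skxANDmrrAND9000b')) # False
```

With hundreds of such expressions and ~8 tags each, every test's tag set is checked against every expression, which is where the filtering time goes.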
The downside is that robot takes too long to figure out which tests are selected.
Here is an example console output [0] with all the gory details (using robotframework==3.1.2 and CPython3.6.9.final.0-64 in case that matters).
See the time gap between
14:04:31 +++ robot
and
15:07:45 ===
In the middle term we are working on a solution which uses --test instead of --include.
But I think in the long run it would be good if robotframework were faster also when dealing with tags.
[0] https://logs.fd.io/production/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-2n-skx/898/console-timestamp.log.gz