Assess test target execution time & define test schedule #2037

Closed
smlambert opened this issue Nov 3, 2020 · 5 comments

@smlambert

smlambert commented Nov 3, 2020

We have nightly and weekly targets defined in the build pipelines. We eventually want to enable the entire 'grid' of test levels x groups at the project for all platforms that we release. We will embrace some notion of graduated testing to conserve machines, as we cannot run all testing every night.

(Image: grid)

This issue will gather test duration times for all top-level test targets on all platforms for all versions, to attempt to develop a schedule that works with the current machine capacity (and/or advise where we need to augment capacity), ideally creating a script so that this can be rerun/revisited on a quarterly cadence.

Average Duration for Nightly test targets:

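| version | target | xlinux avgDuration (mins) | mac avgDuration (mins) | aix avgDuration (mins) | aarch64 avgDuration (mins) | win64 avgDuration (mins) | s390x avgDuration (mins) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| jdk8 | sanity.openjdk | 22.78 | 41.04 | 101 | 21 | 85 | 120 |
| | sanity.system | 135.49 | 158.79 | 135 | 76 | 66 | 210 |
| | extended.system | 91.29 | 128.88 | 75 | 14 | 95 | 150 |
| | sanity.perf | 47.02 | 33.08 | 8 | 14 | 55 | 127 |
| | sanity.functional | 81.5 | 83 | 120 | 223 | 83 | 128 |
| | extended.functional | 221 | 112 | 270 | 120 | 141 | 174 |
| | total | ~599 min (~10hrs) | ~557 min (~9.2hrs) | ~694 min (~11.5hrs) | ~468 min (~7.8hrs) | ~525 min (~8.75hrs) | ~909 min (~15hrs) |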
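
The totals row can be recomputed from the per-target averages. The following is a minimal sketch, assuming the jdk8 figures in the table above are copied in by hand; the names `DURATIONS_MIN` and `serial_totals` are illustrative and not part of any existing script.

```python
# Average nightly durations in minutes for jdk8, copied from the table above.
# Column order: xlinux, mac, aix, aarch64, win64, s390x.
PLATFORMS = ["xlinux", "mac", "aix", "aarch64", "win64", "s390x"]

DURATIONS_MIN = {
    "sanity.openjdk":      [22.78, 41.04, 101, 21, 85, 120],
    "sanity.system":       [135.49, 158.79, 135, 76, 66, 210],
    "extended.system":     [91.29, 128.88, 75, 14, 95, 150],
    "sanity.perf":         [47.02, 33.08, 8, 14, 55, 127],
    "sanity.functional":   [81.5, 83, 120, 223, 83, 128],
    "extended.functional": [221, 112, 270, 120, 141, 174],
}


def serial_totals(durations, platforms):
    """Sum the per-target minutes per platform (all targets run serially on one machine)."""
    totals = {platform: 0.0 for platform in platforms}
    for per_platform in durations.values():
        for platform, minutes in zip(platforms, per_platform):
            totals[platform] += minutes
    return totals


if __name__ == "__main__":
    for platform, minutes in serial_totals(DURATIONS_MIN, PLATFORMS).items():
        print(f"{platform}: ~{minutes:.0f} min (~{minutes / 60:.1f} hrs)")
```

With the values as listed, every platform reproduces its total except aix, where the per-target entries sum to 709 min rather than the ~694 min shown, so one of the aix figures may have changed after the total was written.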

Average Duration for Weekly test targets: (TBD)

| version | target | xlinux avgDuration | TRSS query |
| --- | --- | --- | --- |
| jdk8 | extended.openjdk | TBD | query |
| | special.functional | TBD | query |
| | extended.perf | TBD | query |
| | sanity.external | | |

@andrew-m-leonard

fyi: #2050

@andrew-m-leonard

#2051

@smlambert

smlambert commented Nov 24, 2020

Updated the average nightly test execution times for xlinux and mac in the table in an earlier comment. Currently, if all top-level targets were run serially on one machine, they would complete in around 10hrs for jdk8. Note: the execution time varies across platforms (for example, based on the information shared earlier in this issue, mac shows a 9.2hr average execution time). This exercise eventually needs to take that into account, but for a spitball initial estimate, we will assume a ~10hr execution time.

We do not run test targets serially, but rather try to divide and queue the top-level targets across the machines we have. The ~10hr figure is multiplied by 4 impls (hotspot, dragonwell, openj9, openj9XL), and for jdk11 by 5 impls with the addition of corretto on xlinux. We additionally run sanity.openjdk and sanity.system against upstream builds for jdk8/jdk11 on xlinux/aarch64/win64. That gives roughly 40/50/30 hrs of nightly test execution time for jdk8/jdk11/jdk15 respectively.
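
As a rough sketch, the 40/50/30 hr figures follow from multiplying the ~10hr serial estimate by the number of impls per version. The impl counts for jdk8 and jdk11 come from the paragraph above. The count of 3 for jdk15 is an assumption inferred from the 30 hr figure, and the names used here are illustrative only.

```python
SERIAL_HOURS = 10  # spitball serial estimate for one impl on one machine

# jdk8 and jdk11 counts are stated in the comment above; jdk15 = 3 is an
# assumption inferred from the 30 hr figure, not stated in the issue.
IMPLS_PER_VERSION = {"jdk8": 4, "jdk11": 5, "jdk15": 3}


def nightly_hours(serial_hours, impls_per_version):
    """Total nightly execution hours per version: serial estimate x number of impls."""
    return {version: serial_hours * impls for version, impls in impls_per_version.items()}


if __name__ == "__main__":
    print(nightly_hours(SERIAL_HOURS, IMPLS_PER_VERSION))
```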

Now to look at machine resources: execution time has to be shared across 19 xlinux machines, 7 mac machines and 3 aix machines. If all test targets were granularly divided into 1hr segments, and if no other jobs utilize test machines and all machines are online, the shortest completion time for a nightly build is execution time / number of machines. In reality, the queuing/scheduling across Jenkins resources is never entirely optimal. Nonetheless, if execution time / number of machines is too large, we will need to consider a different schedule, more machines, or running a reduced set of tests on that platform rather than the default set.

Note: since only a small percentage of functional tests are tagged for the hotspot impl (more are applicable; TODO: review and tag the set), the execution time for sanity.functional and extended.functional is reduced for that impl (by hours).

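| Version | Platform/Spec | Nightly Execution Time (impls x versions x avgDuration in hrs) | Num of Test Machines | Execution Time / Machines (shortest completion time possible) |
| --- | --- | --- | --- | --- |
| jdk8 | xlinux | 42 | | |
| jdk11 | xlinux | 52 | | |
| jdk15 | xlinux | 32 | | |
| all | xlinux | 126 hrs | 19 | 6.63 hrs |
| jdk8 | mac | 30 | | |
| jdk11 | mac | 30 | | |
| jdk15 | mac | 30 | | |
| all | mac | 90 hrs | 7 | 12.86 hrs |
| jdk8 | aix | 14 | | |
| jdk11 | aix | 14 | | |
| jdk15 | aix | 14 | | |
| all | aix | 42 hrs | 3 | 14 hrs |
| jdk8 | aarch64 | 16 | | |
| jdk11 | aarch64 | 24 | | |
| jdk15 | aarch64 | 24 | | |
| all | aarch64 | 64 hrs | 10 | 6.4 hrs |
| jdk8 | win64 + win32 | 54 | | |
| jdk11 | win64 + win32 | 45 | | |
| jdk15 | win64 + win32 | 45 | | |
| all | win64 | 144 hrs | 9 online/3 offline | 16 hrs (reduces to 12hrs if all machines online) |
| jdk8 | s390x | 45 | | |
| jdk11 | s390x | 45 | | |
| jdk15 | s390x | 45 | | |
| all | s390x | 135 hrs | 4 | 33.75 hrs |
| jdk8 | ppc64le | 24 | | |
| jdk11 | ppc64le | 24 | | |
| jdk15 | ppc64le | 24 | | |
| all | ppc64le | 72 hrs | 9 | 8 hrs |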
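
The last column of the table above is the summed nightly hours divided by the machine count. A minimal sketch of that calculation, assuming the figures are transcribed by hand from the table; the names `NIGHTLY_HOURS`, `MACHINES` and `shortest_completion` are hypothetical and not part of any existing script.

```python
# Nightly execution hours per version (jdk8, jdk11, jdk15), copied from the table above.
NIGHTLY_HOURS = {
    "xlinux": [42, 52, 32],
    "mac": [30, 30, 30],
    "aix": [14, 14, 14],
    "aarch64": [16, 24, 24],
    "win64 + win32": [54, 45, 45],
    "s390x": [45, 45, 45],
    "ppc64le": [24, 24, 24],
}

# Machines currently online per platform (win has 3 more offline).
MACHINES = {
    "xlinux": 19, "mac": 7, "aix": 3, "aarch64": 10,
    "win64 + win32": 9, "s390x": 4, "ppc64le": 9,
}


def shortest_completion(hours_by_version, machines):
    """Lower bound on nightly completion time: total hours spread evenly over machines.

    Assumes perfectly divisible work and no other jobs on the machines,
    which real Jenkins scheduling will not achieve.
    """
    return sum(hours_by_version) / machines


if __name__ == "__main__":
    for platform, hours in NIGHTLY_HOURS.items():
        bound = shortest_completion(hours, MACHINES[platform])
        print(f"{platform}: {sum(hours)} hrs / {MACHINES[platform]} machines = {bound:.2f} hrs")
```

Calling `shortest_completion(NIGHTLY_HOURS["win64 + win32"], 12)` gives the 12 hr figure for the case where the offline Windows machines are back online.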

@smlambert

In the middle of this assessment, I have also managed to locate a similar assessment done in April 2019, adding it for completeness.

(Image: Screen Shot 2020-11-25 at 10 52 12 PM)

(Image: Screen Shot 2020-11-25 at 10 52 27 PM)

@smlambert

Closing this as stale and no longer relevant.

The queries linked in the table above (which call an API in TRSS) are still valid and could be used to tabulate data in the future should it be needed, for example:

https://trss.adoptopenjdk.net/api/getTestAvgDuration?level=sanity&jdkVersion=8&group=openjdk&platform=x86-64_linux
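
If the data needs to be tabulated again, query URLs of this shape could be generated per target. This is a sketch only: it reuses the endpoint and the four query parameters from the example URL above, and it assumes that a target name such as sanity.openjdk maps to `level=sanity` and `group=openjdk`. The helper name `avg_duration_url` is hypothetical, and the response format is not shown in this issue, so no parsing is attempted.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above; it may no longer be reachable.
TRSS_AVG_DURATION_API = "https://trss.adoptopenjdk.net/api/getTestAvgDuration"


def avg_duration_url(target, jdk_version, platform):
    """Build a query URL for one top-level target, e.g. 'sanity.openjdk'.

    Assumes the target name splits into '<level>.<group>', as the example URL suggests.
    """
    level, group = target.split(".", 1)
    params = {
        "level": level,
        "jdkVersion": jdk_version,
        "group": group,
        "platform": platform,
    }
    return f"{TRSS_AVG_DURATION_API}?{urlencode(params)}"


if __name__ == "__main__":
    nightly_targets = [
        "sanity.openjdk", "sanity.system", "extended.system",
        "sanity.perf", "sanity.functional", "extended.functional",
    ]
    for target in nightly_targets:
        print(avg_duration_url(target, 8, "x86-64_linux"))
```

Only the `x86-64_linux` platform string appears in this issue; the strings for other platforms would need to be looked up in TRSS.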
