Use an already-installed Homebrew at /usr/local #24753
Merged
+52
−35
Conversation
https://community-tc.services.mozilla.com/tasks/fcbLrz33RHeshgRZGvSAjg/runs/0/logs/https%3A%2F%2Fcommunity-tc.services.mozilla.com%2Fapi%2Fqueue%2Fv1%2Ftask%2FfcbLrz33RHeshgRZGvSAjg%2Fruns%2F0%2Fartifacts%2Fpublic%2Flogs%2Flive.log#L1359

Note that the above is on macOS 10.15. Maybe previous versions provided zlib system-wide?
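For anyone checking this locally, a minimal sketch of verifying whether `pkg-config` can resolve a Homebrew-provided zlib. Homebrew's zlib formula is keg-only, so it is not on the default search path; the paths below are illustrative, not the exact CI setup:

```
# Point pkg-config at Homebrew's keg-only zlib (path is illustrative).
export PKG_CONFIG_PATH="/usr/local/opt/zlib/lib/pkgconfig"
# Check that pkg-config can now resolve zlib and report its flags.
pkg-config --modversion zlib
pkg-config --libs --cflags zlib
```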
@bors-servo r+
This adds lines such as:

```
Completed cssparser v0.27.1 custom-build in 2.4s
Completed cssparser v0.27.1 custom-build (run) in 0.6s
Completed cssparser v0.27.1 in 1.1s
```
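For comparison only (this is Cargo's own feature, not the mechanism producing the lines above), Cargo can emit a similar per-crate timing report natively:

```
# Writes an HTML per-crate timing report to target/cargo-timings/.
cargo build --timings
```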
```
(git init servo.git &&
  cd servo.git &&
  time git fetch https://github.com/servo/servo master $ARGS
); rm -rf servo.git
Full: 724.75 MiB
57s home fiber in Paris
1m25s AWS us-west-2 Oregon
3m23s Macstadium DC1 Atlanta
4m22s Macstadium DC2 Las Vegas
--depth 100: 129.00 MiB
1m21s home
1m18s AWS
1m30s Macstadium 1
1m24s Macstadium 2
--depth 50: 97.62 MiB
30s home
30s AWS
41s Macstadium 1
40s Macstadium 2
--depth 30: 92.47 MiB
17s home
18s AWS
27s Macstadium 1
26s Macstadium 2
--depth 10: 88.25 MiB
11s home
12s AWS
26s Macstadium 1
25s Macstadium 2
--depth 1: 87.53 MiB
10s home
10s AWS
22s Macstadium 1
28s Macstadium 2
```
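For reference, here is what one of the `$ARGS` variants above expands to concretely, with `--depth 30` as the example value (any of the benchmarked depths works the same way):

```
# Shallow variant of the benchmarked fetch; depth 30 is one of the values timed above.
git init servo.git
cd servo.git
git fetch --depth 30 https://github.com/servo/servo master
cd .. && rm -rf servo.git
```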
@bors-servo try=mac
bors-servo added a commit that referenced this pull request on Nov 16, 2019:
Use an already-installed Homebrew at /usr/local
This requires servo/taskcluster-config#4 to be deployed. Having the standard location helps `pkg-config` (CC #24688), and allows installing pre-compiled packages (which is much faster than compiling from source).
@bors-servo try=mac
bors-servo added a commit that referenced this pull request on Nov 16, 2019: Use an already-installed Homebrew at /usr/local
@bors-servo retry
bors-servo added a commit that referenced this pull request on Nov 17, 2019: Use an already-installed Homebrew at /usr/local
@bors-servo retry #24765
bors-servo added a commit that referenced this pull request on Nov 17, 2019: Use an already-installed Homebrew at /usr/local
Oops, I still had a browser tab open that had not updated after #24753 (comment).
SimonSapin added a commit that referenced this pull request on Nov 18, 2019:

Split WPT macOS testing into many more chunks

## Before this

Before this PR, we had roughly as many chunks as available workers. Because the number of test files is a poor estimate of the time needed to run them, we see significant variation in completion time between chunks when testing a given PR. servo/taskcluster-config#9 adds a tool to collect this data. Here are two full runs of `test_wpt` before this PR: https://community-tc.services.mozilla.com/tasks/groups/DBt9ki9gTdWmwAk-VDorzw

```
count 1, total 0:00:32, max: 0:00:32 docker 0:00:32
count 1, total 0:59:14, max: 0:59:14 macos-disabled-mac1 0:59:14
count 6, total 4:12:16, max: 1:01:14 macos-disabled-mac1 WPT 0:40:29 0:18:55 0:46:50 0:44:38 1:01:14 0:40:10
count 1, total 0:55:19, max: 0:55:19 macos-disabled-mac9 0:55:19
count 6, total 4:25:09, max: 1:01:40 macos-disabled-mac9 WPT 0:37:58 0:37:24 0:27:18 1:01:40 0:46:17 0:54:31
```

Times for a given chunk vary between 19 minutes and 61 minutes. Assuming no `try` testing, with Homu's serial scheduling of `r+` testing this means that a worker sits idle for 42 minutes and our limited CPU resources are under-utilized. When there *are* `try` PRs being tested, however, they compete with each other and with any `r+` PR for the same workers. If we get unlucky, a 61-minute task could only *start* after some other tasks have finished, increasing the overall time-to-merge a lot.

## This

This PR changes the number of chunks to be significantly more than the number of available workers. When one chunk finishes, that worker can pick up another instead of sitting idle. Now the ratio of the number of tasks to the number of workers doesn't matter: the differences in run time between tasks become somewhat of an advantage, and the distribution of work to workers evens out on average.

The number 30 is a bit arbitrary. A higher number reduces resource under-utilization, but increases the effect of per-task overhead. The git cache added in #24753 reduced that overhead, though.

Another worry I had was whether this would make worse the similar problem of unequal scheduling between processes within a task, where some CPU cores sit idle while the remaining processes finish their assigned work. This turned out not to be enough of a problem to negatively affect the total machine time: https://community-tc.services.mozilla.com/tasks/groups/VnDac92HQU6QmrpzWPCR2w

```
count 1, total 0:00:48, max: 0:00:48 docker 0:00:48
count 1, total 0:39:04, max: 0:39:04 macos-disabled-mac9 0:39:04
count 31, total 4:03:29, max: 0:15:29 macos-disabled-mac9 WPT 0:07:26 0:08:39 0:04:21 0:07:13 0:12:47 0:10:11 0:04:01 0:03:36 0:10:43 0:12:57 0:04:47 0:04:06 0:10:09 0:12:00 0:12:42 0:04:40 0:04:24 0:12:20 0:12:15 0:03:03 0:07:35 0:11:35 0:07:01 0:04:16 0:09:40 0:05:08 0:05:01 0:06:29 0:15:29 0:02:28 0:06:27
```

(4h03min is even lower than above, but seems within variation.)

## After this

#23655 proposes automatically restarting failed WPT tasks, in case the failure is intermittent. With the test suite split into more chunks we have fewer tests per chunk, and therefore a lower probability that a given chunk fails. Restarting one of them also causes less repeated work.
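For illustration, a sketch of what running the suite as 30 chunks looks like using wptrunner-style chunking flags. The real task definitions live in the Taskcluster config, so the exact invocation here is an assumption:

```
# Hypothetical sketch: run the WPT suite as 30 independent chunks.
# In CI, each iteration would be its own task, picked up by whichever
# worker is free rather than run in a loop on one machine.
for i in $(seq 1 30); do
  ./mach test-wpt --total-chunks 30 --this-chunk "$i"
done
```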
bors-servo added a commit that referenced this pull request on Nov 18, 2019: Split WPT macOS testing into many more chunks
SimonSapin added a commit that referenced this pull request on Nov 18, 2019: Split WPT macOS testing into many more chunks
bors-servo added a commit that referenced this pull request on Nov 18, 2019: Split WPT macOS testing into many more chunks
bors-servo added a commit that referenced this pull request on Nov 18, 2019: Split WPT macOS testing into many more chunks
bors-servo added a commit that referenced this pull request on Nov 18, 2019: Split WPT macOS testing into many more chunks
jdm added a commit to jdm/servo that referenced this pull request on Dec 14, 2019: Split WPT macOS testing into many more chunks
jdm added a commit to jdm/servo that referenced this pull request on Dec 20, 2019: Split WPT macOS testing into many more chunks
SimonSapin commented on Nov 15, 2019:
This requires servo/taskcluster-config#4 to be deployed.
Having the standard location helps `pkg-config` (CC #24688), and allows installing pre-compiled packages (which is much faster than compiling from source).
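A rough sketch of the difference the standard prefix makes. This is an illustration, not the CI setup; `openssl` is just an example formula:

```
# At the default /usr/local prefix, brew can install pre-compiled bottles:
brew install openssl
# At a non-standard prefix, Homebrew instead builds formulae from source,
# which is much slower. Libraries under /usr/local are also where pkg-config
# conventionally looks, which helps builds that rely on it (CC #24688).
```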