-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(test runner): improve sharding algorithm to better spread similar tests among shards #30962
base: main
Are you sure you want to change the base?
Changes from 11 commits
54bbba8
ee64b15
6543baf
59c625e
d3b4fad
4bcfc78
061f559
14ca30a
ce09f88
d42c499
b5b8174
389e571
6884fd7
ef2d35f
d33e8da
6b051cb
32169b7
fd90424
c7b58d7
6cf1217
7dd7115
cabfa74
899c068
8aee7df
843d629
2536da7
087a57f
7aa2d95
29f67f3
7b5f7c6
65a2b55
df6d05d
3fed470
050e837
f483ac6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -482,6 +482,25 @@ export default defineConfig({ | |
``` | ||
|
||
|
||
## property: TestConfig.shardingMode | ||
|
||
* since: v1.45 | ||
- type: ?<[ShardingMode]<"partition"|"round-robin"|"duration-round-robin">> | ||
|
||
Defines the algorithm to be used for sharding. Defaults to `'partition'`. | ||
* `'partition'` - divide the set of test groups by number of shards. e.g. first | ||
half goes to shard 1/2 and seconds half to shard 2/2. | ||
* `'round-robin'` - spread test groups to shards in a round-robin way. e.g. loop | ||
over test groups and always assign to the shard that has the lowest number of | ||
tests. | ||
* `'duration-round-robin'` - use duration info from `.last-run.json` to spread | ||
test groups to shards in a round-robin way. e.g. loop over test groups and | ||
always assign to the shard that has the lowest duration of tests. new tests | ||
which were not present in the last run will use an average duration time. When | ||
no `.last-run.json` could be found the behavior is identical to | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given the comment above about the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I disagree. Also please note that I don't think it would be a good to choose a different algorithm based on whether last run info is available or not. |
||
`'round-robin'`. | ||
|
||
|
||
## property: TestConfig.shardingSeed | ||
|
||
* since: v1.45 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,14 +17,15 @@ | |
import fs from 'fs'; | ||
import path from 'path'; | ||
import os from 'os'; | ||
import type { Config, Fixtures, Project, ReporterDescription } from '../../types/test'; | ||
import type { Config, Fixtures, PlaywrightTestConfig, Project, ReporterDescription } from '../../types/test'; | ||
import type { Location } from '../../types/testReporter'; | ||
import type { TestRunnerPluginRegistration } from '../plugins'; | ||
import { getPackageJsonPath, mergeObjects } from '../util'; | ||
import type { Matcher } from '../util'; | ||
import type { ConfigCLIOverrides } from './ipc'; | ||
import type { FullConfig, FullProject } from '../../types/testReporter'; | ||
import { setTransformConfig } from '../transform/transform'; | ||
import type { LastRunInfo } from '../runner/runner'; | ||
|
||
export type ConfigLocation = { | ||
resolvedConfigFile?: string; | ||
|
@@ -55,7 +56,9 @@ export class FullConfigInternal { | |
cliFailOnFlakyTests?: boolean; | ||
testIdMatcher?: Matcher; | ||
defineConfigWasUsed = false; | ||
shardingMode: Exclude<PlaywrightTestConfig['shardingMode'], undefined>; | ||
shardingSeed: string | null; | ||
lastRunInfo?: LastRunInfo; | ||
|
||
constructor(location: ConfigLocation, userConfig: Config, configCLIOverrides: ConfigCLIOverrides) { | ||
if (configCLIOverrides.projects && userConfig.projects) | ||
|
@@ -93,6 +96,7 @@ export class FullConfigInternal { | |
workers: 0, | ||
webServer: null, | ||
}; | ||
this.shardingMode = takeFirst(configCLIOverrides.shardingMode, userConfig.shardingMode, 'partition'); | ||
this.shardingSeed = takeFirst(configCLIOverrides.shardingSeed, userConfig.shardingSeed, null); | ||
for (const key in userConfig) { | ||
if (key.startsWith('@')) | ||
|
@@ -265,7 +269,7 @@ export function toReporters(reporters: BuiltInReporter | ReporterDescription[] | | |
return reporters; | ||
} | ||
|
||
export const builtInReporters = ['list', 'line', 'dot', 'json', 'junit', 'null', 'github', 'html', 'blob', 'markdown'] as const; | ||
export const builtInReporters = ['list', 'line', 'dot', 'json', 'junit', 'null', 'github', 'html', 'blob', 'markdown', 'lastrun'] as const; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we need |
||
export type BuiltInReporter = typeof builtInReporters[number]; | ||
|
||
export type ContextReuseMode = 'none' | 'when-possible'; | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -184,9 +184,13 @@ async function runTests(args: string[], opts: { [key: string]: any }) { | |
if (!config) | ||
return; | ||
|
||
if (opts.lastFailed) { | ||
if (opts.lastFailed || config.shardingMode === 'duration-round-robin') { | ||
const lastRunInfo = await readLastRunInfo(config); | ||
config.testIdMatcher = id => lastRunInfo.failedTests.includes(id); | ||
if (opts.lastFailed) | ||
config.testIdMatcher = id => lastRunInfo.failedTests.includes(id); | ||
|
||
if (config.shardingMode === 'duration-round-robin') | ||
config.lastRunInfo = lastRunInfo; | ||
} | ||
|
||
config.cliArgs = args; | ||
|
@@ -281,6 +285,7 @@ function overridesFromOptions(options: { [key: string]: any }): ConfigCLIOverrid | |
retries: options.retries ? parseInt(options.retries, 10) : undefined, | ||
reporter: resolveReporterOption(options.reporter), | ||
shard: shardPair ? { current: shardPair[0], total: shardPair[1] } : undefined, | ||
shardingMode: options.shardingMode ? options.shardingMode : undefined, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should validate that |
||
shardingSeed: options.shardingSeed ? options.shardingSeed : undefined, | ||
timeout: options.timeout ? parseInt(options.timeout, 10) : undefined, | ||
ignoreSnapshots: options.ignoreSnapshots ? !!options.ignoreSnapshots : undefined, | ||
|
@@ -359,6 +364,7 @@ const testOptions: [string, string][] = [ | |
['--reporter <reporter>', `Reporter to use, comma-separated, can be ${builtInReporters.map(name => `"${name}"`).join(', ')} (default: "${defaultReporter}")`], | ||
['--retries <retries>', `Maximum retry count for flaky tests, zero for no retries (default: no retries)`], | ||
['--shard <shard>', `Shard tests and execute only the selected shard, specify in the form "current/all", 1-based, for example "3/5"`], | ||
['--sharding-mode <mode>', `Sharding algorithm to use; "partition", "round-robin" or "duration-round-robin". Defaults to "partition".`], | ||
['--sharding-seed <seed>', `Seed string for randomizing the test order before sharding. Defaults to not randomizing the order.`], | ||
['--timeout <timeout>', `Specify test timeout threshold in milliseconds, zero for unlimited (default: ${defaultTimeout})`], | ||
['--trace <mode>', `Force tracing mode, can be ${kTraceModes.map(mode => `"${mode}"`).join(', ')}`], | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
/** | ||
* Copyright (c) Microsoft Corporation. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
import fs from 'fs'; | ||
import path from 'path'; | ||
import type { FullResult, TestCase, TestResult } from '../../types/testReporter'; | ||
import type { LastRunInfo } from '../runner/runner'; | ||
import { BaseReporter, resolveOutputFile } from './base'; | ||
|
||
type LastRunOptions = { | ||
outputFile?: string, | ||
configDir: string, | ||
_mode: 'list' | 'test' | 'merge', | ||
}; | ||
|
||
// The blob reporter adds a suffix to test ids to avoid collisions. But we need | ||
// to remove that suffix to match the test ids from the last run. | ||
function unsaltTestId(testId: string): string { | ||
return testId.split('-').map(s => s.substring(0, 20)).join('-'); | ||
} | ||
|
||
class LastRunReporter extends BaseReporter { | ||
|
||
private lastRun: LastRunInfo = { | ||
failedTests: [], | ||
status: 'passed', | ||
testDurations: {}, | ||
}; | ||
|
||
private resolvedOutputFile: string | undefined; | ||
private testId: (testId: string) => string; | ||
|
||
constructor(options: LastRunOptions) { | ||
super(); | ||
this.resolvedOutputFile = resolveOutputFile('LASTRUN', { fileName: '.last-run.json', ...options })?.outputFile; | ||
this.testId = options._mode === 'merge' ? unsaltTestId : (testId: string) => testId; | ||
} | ||
|
||
override printsToStdio() { | ||
return !this.resolvedOutputFile; | ||
} | ||
|
||
override onTestEnd(test: TestCase, result: TestResult): void { | ||
super.onTestEnd(test, result); | ||
this.lastRun.testDurations![this.testId(test.id)] = result.duration; | ||
if (result.status === 'failed') | ||
this.lastRun.failedTests.push(this.testId(test.id)); | ||
} | ||
|
||
override async onEnd(result: FullResult) { | ||
await super.onEnd(result); | ||
this.lastRun.status = result.status; | ||
await this.outputReport(this.lastRun, this.resolvedOutputFile); | ||
} | ||
|
||
async outputReport(lastRun: LastRunInfo, resolvedOutputFile: string | undefined) { | ||
const reportString = JSON.stringify(lastRun, undefined, 2); | ||
if (resolvedOutputFile) { | ||
await fs.promises.mkdir(path.dirname(resolvedOutputFile), { recursive: true }); | ||
await fs.promises.writeFile(resolvedOutputFile, reportString); | ||
} else { | ||
console.log(reportString); | ||
} | ||
} | ||
} | ||
|
||
export default LastRunReporter; | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -151,6 +151,7 @@ export class Runner { | |||||
export type LastRunInfo = { | ||||||
status: FullResult['status']; | ||||||
failedTests: string[]; | ||||||
testDurations?: { [testId: string]: number }; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We are a bit worried that this will force export type LastRunInfo = {
status: FullResult['status'],
projects: {
name: string,
files: {
path: string,
tests: {
title: string[],
duration: number,
status: TestStatus,
}[],
}[],
}[],
}; Here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, more verbose is also better for future use-cases that will consume the last-run info. Having the titles and paths in there is more user friendly… 👍 However, removing the already existing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There are no backwards compatibility concerns as of today, since we do not expect That said, you can leave it as is, and we'll iterate on it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just tried to write the LastRunInfo in the proposed format, but its not as straight forward as I thought it would be.
You mean, I don't need to proceed here to get this merged? 🙏 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In our CI setup, I'm using pipeline steps to restore the last-run.json from the last successful main build. So, in our case last-run info from previous playwright versions will be used, and it would be great if there is some level of backward compatibility... at least it should not fail when the format is incorrect. |
||||||
}; | ||||||
|
||||||
async function writeLastRunInfo(testRun: TestRun, status: FullResult['status']) { | ||||||
|
@@ -161,7 +162,12 @@ async function writeLastRunInfo(testRun: TestRun, status: FullResult['status']) | |||||
await fs.promises.mkdir(outputDir, { recursive: true }); | ||||||
const lastRunReportFile = path.join(outputDir, '.last-run.json'); | ||||||
const failedTests = testRun.rootSuite?.allTests().filter(t => !t.ok()).map(t => t.id); | ||||||
const lastRunReport = JSON.stringify({ status, failedTests }, undefined, 2); | ||||||
const testDurations = testRun.rootSuite?.allTests().reduce((map, t) => { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
if (t.results.length) | ||||||
map[t.id] = t.results.reduce((a, b) => a + b.duration, 0); | ||||||
return map; | ||||||
}, {} as { [testId: string]: number }); | ||||||
Comment on lines
+167
to
+171
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm actually not sure it is the right way to sum all the durations… maybe it makes more sense to calc the average? Or only include durations from successful test runs… 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is ok for now, we'll iterate based on feedback. |
||||||
const lastRunReport = JSON.stringify({ status, failedTests, testDurations }, undefined, 2); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since we are still writing the file here, why do we need a separate There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The Maybe it would be better to implicitly write the lastrun info when merging blob reports? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think so. From the user's perspective, lastrun just working out of the box would be easier. Whether it is a reporter internally or not is an implementation detail. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about always using the last run reporter? Then we would only need to have a single implementation…. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That definitely works. We'll have to add it here when |
||||||
await fs.promises.writeFile(lastRunReportFile, lastRunReport); | ||||||
} | ||||||
|
||||||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -14,7 +14,9 @@ | |||||
* limitations under the License. | ||||||
*/ | ||||||
|
||||||
import type { PlaywrightTestConfig } from '../../types/test'; | ||||||
import type { Suite, TestCase } from '../common/test'; | ||||||
import type { LastRunInfo } from './runner'; | ||||||
|
||||||
export type TestGroup = { | ||||||
workerHash: string; | ||||||
|
@@ -130,14 +132,38 @@ export function createTestGroups(projectSuite: Suite, workers: number): TestGrou | |||||
return result; | ||||||
} | ||||||
|
||||||
export function filterForShard(shard: { total: number, current: number }, testGroups: TestGroup[]): Set<TestGroup> { | ||||||
export function filterForShard( | ||||||
mode: PlaywrightTestConfig['shardingMode'], | ||||||
shard: { total: number, current: number }, | ||||||
testGroups: TestGroup[], | ||||||
lastRunInfo?: LastRunInfo, | ||||||
): Set<TestGroup> { | ||||||
// Note that sharding works based on test groups. | ||||||
// This means parallel files will be sharded by single tests, | ||||||
// while non-parallel files will be sharded by the whole file. | ||||||
// | ||||||
// Shards are still balanced by the number of tests, not files, | ||||||
// even in the case of non-paralleled files. | ||||||
|
||||||
if (mode === 'round-robin') | ||||||
return filterForShardRoundRobin(shard, testGroups); | ||||||
if (mode === 'duration-round-robin') | ||||||
return filterForShardRoundRobin(shard, testGroups, lastRunInfo); | ||||||
return filterForShardPartition(shard, testGroups); | ||||||
} | ||||||
|
||||||
/** | ||||||
* Shards tests by partitioning them into equal parts. | ||||||
* | ||||||
* ``` | ||||||
* [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] | ||||||
* Shard 1: ^---------^ : [ 1, 2, 3 ] | ||||||
* Shard 2: ^---------^ : [ 4, 5, 6 ] | ||||||
* Shard 3: ^---------^ : [ 7, 8, 9 ] | ||||||
* Shard 4: ^---------^ : [ 10,11,12 ] | ||||||
* ``` | ||||||
*/ | ||||||
function filterForShardPartition(shard: { total: number, current: number }, testGroups: TestGroup[]): Set<TestGroup> { | ||||||
let shardableTotal = 0; | ||||||
for (const group of testGroups) | ||||||
shardableTotal += group.tests.length; | ||||||
|
@@ -162,3 +188,44 @@ export function filterForShard(shard: { total: number, current: number }, testGr | |||||
} | ||||||
return result; | ||||||
} | ||||||
|
||||||
/** | ||||||
* Shards tests by round-robin. | ||||||
* | ||||||
* ``` | ||||||
* [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] | ||||||
* Shard 1: ^ ^ ^ : [ 1, 5, 9 ] | ||||||
* Shard 2: ^ ^ ^ : [ 2, 6,10 ] | ||||||
* Shard 3: ^ ^ ^ : [ 3, 7,11 ] | ||||||
* Shard 4: ^ ^ ^ : [ 4, 8,12 ] | ||||||
* ``` | ||||||
*/ | ||||||
function filterForShardRoundRobin( | ||||||
shard: { total: number, current: number }, | ||||||
testGroups: TestGroup[], | ||||||
lastRunInfo?: LastRunInfo, | ||||||
): Set<TestGroup> { | ||||||
|
||||||
const weights = new Array(shard.total).fill(0); | ||||||
const shardSet = new Array(shard.total).fill(0).map(() => new Set<TestGroup>()); | ||||||
const averageDuration = lastRunInfo ? Object.values(lastRunInfo?.testDurations || {}).reduce((a, b) => a + b, 1) / Math.max(1, Object.values(lastRunInfo?.testDurations || {}).length) : 0; | ||||||
const weight = (group: TestGroup) => { | ||||||
if (!lastRunInfo) | ||||||
// If we don't have last run info, we just count the number of tests. | ||||||
return group.tests.length; | ||||||
// If we have last run info, we use the duration of the tests. | ||||||
return group.tests.reduce((sum, test) => sum + Math.max(1, lastRunInfo.testDurations?.[test.id] || averageDuration), 0); | ||||||
}; | ||||||
|
||||||
// We sort the test groups by the number of tests in descending order. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
const sortedTestGroups = testGroups.slice().sort((a, b) => weight(b) - weight(a)); | ||||||
|
||||||
// Then we add each group to the shard with the smallest number of tests. | ||||||
for (const group of sortedTestGroups) { | ||||||
const index = weights.reduce((minIndex, currentLength, currentIndex) => currentLength < weights[minIndex] ? currentIndex : minIndex, 0); | ||||||
weights[index] += weight(group); | ||||||
shardSet[index].add(group); | ||||||
} | ||||||
|
||||||
return shardSet[shard.current - 1]; | ||||||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is not a really useful strategy, because it is not stable. Whenever I add a test or move it between files, all my shards will be completely new. This introduces a lot of variability on CI, which from our experience is something to avoid. I wonder whether we can make this stable somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my current project
round-robin
yields much better distribution thanpartition
. Here is why, our tests are organised something like this (simplified)…Our framework tests are pretty exhaustive and take the most time. The logged-in tests require more time than logged-out tests due to additional bootstrap and teardown. By using the
partition
algorithm the first shards require significantly more time as they have all the long-tests from either framework or logged-in suites. Withround-robin
we're already achieving a much better balanced overall runtime, even though the algorithm doesn't know anything about per test duration.I don't feel like the round-robin algorithm is "not stable". It's deterministic, given the same set of tests it always yields the same result. Also I think in CI there should be no assumption that one test goes after another, unless they are explicitly configured to execute sequentially within the same test group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's fair. If you would like to opt-in into
round-robin
and it works for you, that's great. I still think it is less stable than I would prefer, but it doesn't hurt to have it as an option anyway.