Merged
106 changes: 106 additions & 0 deletions .claude/skills/debug-e2e/SKILL.md
@@ -0,0 +1,106 @@
---
name: debug-e2e
description: Debug flaky Playwright E2E test failures from CI
---

# Debugging Flaky E2E Test Failures

Use this skill when investigating flaky Playwright E2E test failures from CI.

## Workflow

### 1. Get CI failure details

```bash
# View PR checks status
gh pr checks <PR_NUMBER>

# Get failed test logs
gh run view <RUN_ID> --log-failed 2>&1 | head -n 500

# Search for specific failure patterns
gh run view <RUN_ID> --log-failed 2>&1 | rg -C 30 "FAILED|Error:|Expected|Timed out"
```

### 2. Reproduce locally with repeat-each

Run the failing test multiple times to reproduce flaky behavior:

```bash
# Run specific test 20 times
npm run e2e -- test/e2e/<file>.e2e.ts --repeat-each=20 -g "<test name pattern>"

# Target specific browser if failure is browser-specific
npm run e2e -- test/e2e/<file>.e2e.ts --repeat-each=20 -g "<test name>" --project=<browser>
```

### 2.1 Calibrate test duration

Start small to get signal quickly, then scale up only if needed.

- 5-10 repeats on a single browser usually finish within a few minutes.
- 20+ repeats across all browsers can take much longer, especially when repeating a whole file.

Always run repeats on a single test or at most one file. Never repeat the whole suite.
Prefer narrowing with `-g` and `--project` first, then increase `--repeat-each` once the fix looks stable.
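
As a rough way to size a run before starting it: the number of test executions is the product of the tests matched, the `--repeat-each` value, and the number of browser projects selected. The sketch below is illustrative only. `estimateRun`, its field names, the worker count, and the per-test duration are assumptions and not part of this repo.

```typescript
// Hypothetical helper, not part of test/e2e/utils.ts.
interface RunPlan {
  tests: number // tests matched by the file path and -g filter
  repeatEach: number // value passed to --repeat-each
  projects: number // browser projects selected (1 when --project is set)
  workers: number // assumed number of parallel workers
  secondsPerTest: number // assumed average duration of one execution
}

function estimateRun(plan: RunPlan): { executions: number; minutes: number } {
  const executions = plan.tests * plan.repeatEach * plan.projects
  // Executions are spread across workers, so wall time is roughly batches * duration
  const batches = Math.ceil(executions / plan.workers)
  return { executions, minutes: (batches * plan.secondsPerTest) / 60 }
}

// One test, 10 repeats, one browser
const quick = estimateRun({ tests: 1, repeatEach: 10, projects: 1, workers: 2, secondsPerTest: 6 })
// A 12-test file, 20 repeats, three browsers
const slow = estimateRun({ tests: 12, repeatEach: 20, projects: 3, workers: 2, secondsPerTest: 6 })
```

With these made-up numbers the narrow run is 10 executions (about half a minute) and the broad one is 720 executions (about 36 minutes), which is why narrowing first pays off.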

### 3. Analyze failure artifacts

When tests fail, Playwright saves traces and error context:

```bash
# View error context (page snapshot at failure time)
cat test-results/<test-name>-<browser>/error-context.md

# Open trace viewer (interactive)
npx playwright show-trace test-results/<test-name>-<browser>/trace.zip
```

### 4. Common flakiness patterns

**React Aria NumberInput flakiness**: When `fill()` on a NumberInput doesn't stick, it's often because:

- The component re-renders after a prop change (e.g., `maxValue` changing)
- Hydration race conditions

**Fix pattern - NumberInput helper**:

```typescript
import { fillNumberInput } from './utils'

await fillNumberInput(input, 'value')
```
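
The body of `fillNumberInput` is not shown here. As a minimal sketch of how such a helper could work, assuming it refills the field until the value reads back correctly, consider the following. `fillUntilSticks` and `FillableInput` are hypothetical names; `fill()` and `inputValue()` mirror the Playwright `Locator` methods of the same names, so a real locator could be passed in.

```typescript
// Hypothetical sketch; the real helper in test/e2e/utils.ts may differ.
// Structural subset of Playwright's Locator used by this sketch.
interface FillableInput {
  fill(value: string): Promise<void>
  inputValue(): Promise<string>
}

async function fillUntilSticks(
  input: FillableInput,
  value: string,
  maxAttempts = 5
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await input.fill(value)
    // A re-render can reset the field, so read the value back before moving on
    if ((await input.inputValue()) === value) return attempt
  }
  throw new Error(`Value "${value}" did not stick after ${maxAttempts} attempts`)
}
```

The function returns the number of attempts it needed, which can be logged while debugging to see how often the first `fill()` is lost.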

**Form hydration issues**: Wait for a field that only renders after mount:

```typescript
await expect(page.getByRole('radiogroup', { name: 'Block size' })).toBeVisible()
```

**State change timing**: When clicking changes form state, wait for visual confirmation:

```typescript
await page.getByRole('radio', { name: 'Local' }).click()
// Wait for dependent UI to update
await expect(page.getByRole('radiogroup', { name: 'Block size' })).toBeHidden()
```

### 5. Sleep as last resort

If deterministic waits don't work, use `sleep()` from `test/e2e/utils.ts`:

```typescript
import { sleep } from './utils'

await sleep(200) // Use sparingly, prefer deterministic waits
```

### 6. Verify fix is stable

Run at least 30-50 iterations to confirm flakiness is resolved:

```bash
npm run e2e -- test/e2e/<file>.e2e.ts --repeat-each=50 -g "<test name>"
```

A good target is 0 failures across 100+ total runs. Repeat the check on every browser that failed in CI.
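
To see why the iteration count matters: if a test fails independently with probability p on each run, the chance of n clean runs in a row is (1 - p)^n. The sketch below computes the largest p that is still consistent with n clean runs at a given confidence level. `maxFailureRate` is a hypothetical name, and treating runs as independent is an assumption.

```typescript
// Hypothetical helper for reasoning about repeat counts; assumes independent runs.
// Returns the largest per-run failure probability p for which n consecutive
// passes would still occur with probability >= 1 - confidence.
function maxFailureRate(cleanRuns: number, confidence = 0.95): number {
  if (cleanRuns <= 0) return 1
  // Solve (1 - p)^n = 1 - confidence for p
  return 1 - Math.pow(1 - confidence, 1 / cleanRuns)
}

const after30 = maxFailureRate(30)
const after100 = maxFailureRate(100)
```

Under these assumptions, 30 clean runs cannot rule out a failure rate of roughly 9.5% at 95% confidence, while 100 clean runs bring the bound down to about 3%.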
1 change: 1 addition & 0 deletions AGENTS.md
@@ -22,6 +22,7 @@
# Testing code

- Run local checks before sending PRs: `npm run lint`, `npm run tsc`, `npm test run`, and `npm run e2ec`; pass `-- --ui` for Playwright UI mode or project/name filters like `npm run e2ec -- instance -g 'boot disk'`.
- You don't usually need to run all the e2e tests, so try to filter by filename. CI will run the full set.
- Keep Playwright specs focused on user-visible behavior—use accessible locators (`getByRole`, `getByLabel`), the helpers in `test/e2e/utils.ts` (`expectToast`, `expectRowVisible`, `selectOption`, `clickRowAction`), and close toasts so follow-on assertions aren’t blocked.
- Cover role-gated flows by logging in with `getPageAsUser`; exercise negative paths (e.g., forbidden actions) alongside happy paths as shown in `test/e2e/system-update.e2e.ts`.
- Consider `expectVisible` and `expectNotVisible` deprecated: prefer `expect().toBeVisible()` and `toBeHidden()` in new code.
2 changes: 1 addition & 1 deletion OMICRON_VERSION
@@ -1 +1 @@
-c765b3539203e34f65cd402f139cf604035d5993
+558f89ecd26ee7e2fcea526c2aed0a6fa637eb65
35 changes: 32 additions & 3 deletions app/api/__generated__/Api.ts

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion app/api/__generated__/OMICRON_VERSION

Some generated files are not rendered by default.

10 changes: 10 additions & 0 deletions app/api/__generated__/msw-handlers.ts

Some generated files are not rendered by default.

23 changes: 21 additions & 2 deletions app/api/__generated__/validate.ts

Some generated files are not rendered by default.

9 changes: 9 additions & 0 deletions app/components/StateBadge.tsx
@@ -90,3 +90,12 @@ export const DiskTypeBadge = (props: { diskType: DiskType; className?: string })
     {props.diskType}
   </Badge>
 )
+
+// span is here to prevent it getting underlined in the LinkCell
+export const ReadOnlyBadge = () => (
+  <span>
+    <Badge color="neutral" className="relative">
+      Read only
+    </Badge>
+  </span>
+)
33 changes: 15 additions & 18 deletions app/components/form/fields/DiskSizeField.tsx
@@ -5,23 +5,19 @@
  *
  * Copyright Oxide Computer Company
  */
-import type {
-  FieldPath,
-  FieldPathByValue,
-  FieldValues,
-  ValidateResult,
-} from 'react-hook-form'
-
-import { MAX_DISK_SIZE_GiB } from '@oxide/api'
+import type { FieldPathByValue, FieldValues, ValidateResult } from 'react-hook-form'

 import { NumberField } from './NumberField'
 import type { TextFieldProps } from './TextField'

 interface DiskSizeProps<
   TFieldValues extends FieldValues,
-  TName extends FieldPath<TFieldValues>,
-> extends TextFieldProps<TFieldValues, TName> {
-  minSize?: number
+  TName extends FieldPathByValue<TFieldValues, number>,
+> extends Omit<TextFieldProps<TFieldValues, TName>, 'min' | 'max' | 'validate'> {
+  // replace max and min with our own because original max/min allow string
+  min?: number
+  /** Undefined means no client-side limit (e.g., for local disks) */
+  max: number | undefined
   validate?(diskSizeGiB: number): ValidateResult
 }

@@ -31,7 +27,8 @@ export function DiskSizeField<
 >({
   required = true,
   name,
-  minSize = 1,
+  min = 1,
+  max,
   validate,
   ...props
 }: DiskSizeProps<TFieldValues, TName>) {
@@ -40,18 +37,18 @@ export function DiskSizeField<
       units="GiB"
       required={required}
       name={name}
-      min={minSize}
-      max={MAX_DISK_SIZE_GiB}
+      min={min}
+      max={max}
       validate={(diskSizeGiB) => {
         // Run a number of default validators
         if (Number.isNaN(diskSizeGiB)) {
           return 'Disk size is required'
         }
-        if (diskSizeGiB < minSize) {
-          return `Must be at least ${minSize} GiB`
+        if (diskSizeGiB < min) {
+          return `Must be at least ${min} GiB`
         }
-        if (diskSizeGiB > MAX_DISK_SIZE_GiB) {
-          return `Can be at most ${MAX_DISK_SIZE_GiB} GiB`
+        if (max !== undefined && diskSizeGiB > max) {
+          return `Can be at most ${max} GiB`
         }
         // Run any additional validators passed in from the callsite
         return validate?.(diskSizeGiB)