[PM-35062] llm: Add converting-xctest-to-swift-testing skill by KatherineInCode · Pull Request #2539 · bitwarden/ios

KatherineInCode · 2026-04-13T18:00:26Z

🎟️ Tracking

📔 Objective

Adds a project-level Claude skill for converting XCTest files to Swift Testing. The skill covers Bitwarden-specific patterns (processor, coordinator, and service tests), includes a do-not-convert checklist (ViewInspector, snapshot, UI automation tests), and bundles a copy of steipete's Swift Testing playbook and full Apple API documentation, with modifications, as references.

…eeding

…tion table

github-actions · 2026-04-13T18:01:44Z

🤖 Bitwarden Claude Code Review

Overall Assessment: APPROVE

Reviewed the addition of a project-level Claude Code skill for converting XCTest files to Swift Testing. The change consists of a new 508-line SKILL.md and a single-line registration entry in .claude/CLAUDE.md; there is no production, test, or dependency code involved. Technical guidance (BitwardenTestCase UI.animated coupling, assertAsyncThrows handling, struct-per-test isolation) was spot-checked against the codebase and is accurate. Prior review feedback across the PR's 12+ iterative commits has been incorporated, including the assertion-table fixes, the Equatable requirement for typed error values, the test_ prefix cleanup, and the exclusion of alert/loading overlay tests from conversion.

Code Review Details

No actionable findings. The skill documentation is internally consistent, matches the codebase's real test-helper patterns, and appropriately scopes which test files should remain in XCTest (ViewInspector, snapshot, UI automation, performance, Objective-C, alert-dependent, and loading-overlay-dependent).

github-actions · 2026-04-13T18:06:43Z

Checkmarx One – Scan Summary & Details – ae8483f3-b5fe-44f4-b163-6499b2a46443

Great job! No new security vulnerabilities introduced in this pull request

codecov · 2026-04-13T18:17:26Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.08%. Comparing base (b479386) to head (f104d61).
⚠️ Report is 20 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2539      +/-   ##
==========================================
- Coverage   87.08%   86.08%   -1.01%     
==========================================
  Files        1872     2116     +244     
  Lines      165835   181513   +15678     
==========================================
+ Hits       144417   156253   +11836     
- Misses      21418    25260    +3842

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

vvolkgang · 2026-04-13T21:12:20Z

+
+- [`Copyable`](https://developer.apple.com/documentation/Swift/Copyable?changes=_8)
+- [`CustomDebugStringConvertible`](https://developer.apple.com/documentation/Swift/CustomDebugStringConvertible?changes=_8)
+- [`CustomStringConvertible`](https://d


LLM wise we need to figure something else that isn't 14k lines of unprocessed crawled docs.

We need to be careful with these, happened to be a referral link but could as easily be a prompt injection attempt.

I'm not sure what a good solution would be when it's information the model hasn't been trained on already, but having the information contained in the docs is useful (and per the blog post, it can't just query Apple's online docs).

Agreed, and part of the reason to bring this into the repo rather than referring to the original gist was to avoid potential future prompt injection issues.

Oi vey... We're between a rock and a hard place here, huh?
Apple won't let their docs be crawled but we need Claude to have the most reliable and up-to-date patterns.

Couple questions:

Was this the raw file we tried to add to our repo? file-swift-testing_api-md

Would you be open to connecting with me this sprint to see how/if I could break this down into byte-sized skills/references for Claude Code to consume? I don't have Swift talents (yet), but I'm willing to help y'all out wherever I can for you to succeed with Claude. Lemme know

Yes. As noted, my original idea was to actually just have this skill point at steipete's playbook in general, which would also get that raw file, but that would potentially introduce a supply chain attack, so it was suggested I bring that into our repo.

Sure

That said, I have to wonder how much of this is getting updated into each successive Claude model. Steinberger wrote that guide back when Swift testing was very new, so you could assume the model hadn't been trained on the API and such yet. It wouldn't surprise me if Anthropic has found a way to crawl Apple's docs (after all, Steinberger wrote a tool for it) in terms of sourcing training data. Still runs into the problem of "we can't point at new docs easily", though.

vvolkgang · 2026-04-13T21:19:38Z

+| Pattern | Why |
+|---------|-----|
+| **ViewInspector tests** | `imports ViewInspector`; requires XCTest infrastructure |
+| **Snapshot tests** | Functions named `disabletest_*`, uses `assertSnapshot` or `SnapshotTesting`; depends on `BitwardenTestCase` |


vvolkgang · 2026-04-13T21:38:19Z

+
+### Migration Action Items
+- [ ] Ensure all developer machines and CI runners are on macOS 15+ and Xcode 16+.
+- [ ] For projects supporting Linux/Windows, add the `swift-testing` SPM package to your `Package.swift`. It's bundled in Xcode and not needed for Apple platforms.


💭 Your skill is doing a much better job addressing our needs, feels like the value add of this playbook could be consolidated and merged into it.

🤔 Part of the reason I like having the playbook more or less wholesale is that when he updates it with new information from e.g. WWDC, we can more easily update our copy, rather than having to look over the diff

theMickster

A reference file of over 14K lines of code is impossible to peer review. I also doubt that Claude will properly consume that massive file either.

Please break the work into smaller PRs and smaller reference files. Thank you much!

…ing XCTest

theMickster

Approving.

Please note that a local Claude Code review produced the following suggestions. I'm adding it because I am field testing a potential addition to our plugin marketplace and am interested in feedback as to whether or not the output is correct. I'm not super concerned about whether or not Claude chose the correct level/emoji at the moment, more focused on content correctness.

Please let me know what you think of the feedback.

Thanks!

Local multi-agent review [PM-35062] llm: Add converting-xctest-to-swift-testing skill (#2539)

Date: 2026-04-22 | Reviewed by: Claude Code (Opus 4.7)

Summary

Severity	Count
🛑 Blocker	0
⚠️ Important	0
♻️ Refactor	0
💡 Suggestion	4

Documentation-only PR adding a project-level Claude Code skill (506-line markdown) plus a 1-line registration in .claude/CLAUDE.md. No production code changes, no security impact. The skill will be executed mechanically by an LLM, so the findings target rules or examples that could mislead a mechanical pass. Finding #1 is the one worth prioritizing.

Findings

💡 Suggestions

`BitwardenTestCase` → `struct` rule drops base-class setUp defaults without flagging them

.claude/skills/converting-xctest-to-swift-testing/SKILL.md:14-20

Details

BitwardenTestCase.setUp() sets UI.animated = false and UI.sizeCategory = .large; class-level setUp() runs UI.applyDefaultAppearances(). The exclusion table covers snapshot/ViewInspector/UI/perf/ObjC tests, and the Class → Struct rule (file lines 51-53) then unconditionally drops BitwardenTestCase inheritance.

Current processor/coordinator/service tests rarely exercise the code paths that read UI.animated / UI.sizeCategory, so the practical blast radius today is small. The concern is that the skill is executed mechanically by an LLM — any future test that does reference those globals would silently lose the override with no guard against it. Add a one-line precondition to Step 2 or the exclusion table: "If the file reads UI.animated / UI.sizeCategory directly, verify the defaults are still satisfied before dropping BitwardenTestCase inheritance." Highest-priority of the suggestions because it's the only one that can cause a silent behavioral change, not just doc drift.

"Specific Error value" match requires `Equatable` conformance

.claude/skills/converting-xctest-to-swift-testing/SKILL.md:247

Details

#expect(throws: BitwardenTestError.example) (error-value form) requires the error type to conform to Equatable. A new error enum with associated values won't auto-conform and will fail to compile when a converter follows the row literally. Add a one-line note naming the precondition so readers don't chase a confusing compile error.

Parameterized examples retain `test` prefix, contradicting Step 5's own rename rule

.claude/skills/converting-xctest-to-swift-testing/SKILL.md:187,292

Details

Step 5 tells readers to remove the test_ prefix when adding @Test. The examples func testUserCount_minimum() (line 187) and func testFlavorContainsNuts(...) (line 292) keep the prefix. An LLM consuming this skill will pattern-match on the examples as much as on the rule; the inconsistency produces drift. Rename to userCount_minimum / flavorContainsNuts, or annotate them as illustrative side-by-side comparisons.

`assertAsyncThrows` row is missing the third "Notes" column cell

.claude/skills/converting-xctest-to-swift-testing/SKILL.md:250

Details

The assertion-conversion table is three columns (XCTest | Swift Testing | Notes), but the assertAsyncThrows row only supplies two cells, so the row renders malformed. Add an empty | | cell or a brief note.

theMickster · 2026-04-22T06:08:44Z

 | `planning-ios-implementation` | "plan implementation", "design approach", "architecture plan" |
 | `implementing-ios-code` | "implement", "write code", "add screen", "create feature" |
 | `testing-ios-code` | "write tests", "add test coverage", "unit test" |
+| `converting-xctest-to-swift-testing` | "convert to Swift Testing", "migrate XCTest", "xctest to swift testing" |


⛏️ In the skill frontmatter, we have different casing in the wording. I doubt it matters, but something I saw.
In the long run, I don't think that this table will be necessary in CLAUDE.md because this feels like human documentation that's a burden to maintain. I also have my doubts as to whether or not this helps Claude whatsoever in its progressive disclosure.

@SaintPatrck Are we doing this type of documenting in other places mapping skills to triggers?

Skill frontmatter

description: Convert an XCTest file to Swift Testing framework. Use when asked to "convert to Swift Testing", "migrate XCTest", "convert test file", "xctest to swift testing", "migrate tests to Swift Testing", or when explicitly asked to convert existing XCTest-based tests.

"Swift Testing" and "XCTest" are proper nouns, at least, so the capitalization nominally makes sense 🤔

Are you thinking just the front matter is sufficient for progressive disclosure?

I kinda thought that the frontmatter is sufficient but wanted Patrick to chime in too.

@theMickster the table is an additional "nudge" to encourage usage of the repo-local skills in appropriate situations. It is present in the Android repo also. After doing evals I saw a significant increase in hit-rates when the repo-local skills are present in CLAUDE.md. The improvement was enough to warrant the small maintenance burden, imo. The most we would realistically be doing for maintenance is adding/removing skills.

The table's value manifests even more when agents are executed in a parent directory and navigate down into this project directory (e.g., run /plan-implement-review from a "workspace" type of environment). Not only do they immediately get all of the project specific guidance from CLAUDE.md, they also get a nice "cheatsheet" of skills that can aid in their task(s).

🌱 That being said, my tests were done several weeks ago and the frontier models have gotten more reliable in regards to skill discovery, so it would be worth re-running those evals with the latest models. The new /skill-creator should make that relatively easy. Out of scope for this PR, obviously, but something we can easily follow-up on.

💭 Also, when_to_use was recently introduced, which may be a better place to improve hit-rates in place of this table.

TLDR; Keep. Re-assess value after adding when_to_use to skill frontmatter.

@SaintPatrck
Makes sense. Sure, but until then, will you also please add these block to the clients and server repos to match? That way we are consistent across the board?

…ssertion

…sting conversion

KatherineInCode · 2026-04-22T19:51:08Z

@theMickster The things that review skill generated up were reasonable to fix, so I've done that to clean it up a bit. Something like that skill would be really useful to run locally; is that more generally available, or are you still working on it?

KatherineInCode added 5 commits April 13, 2026 12:33

[PM-35062] llm: Add converting-xctest-to-swift-testing skill

89fcd68

[PM-35062] llm: Clean up steipete-specific content in playbook reference

c9ac104

[PM-35062] llm: Remove name reference from skill

d0e86c6

[PM-35062] llm: Instruct skill to read playbook reference before proc…

542c445

…eeding

[PM-35062] llm: Move XCTFail conversion to playbook, trim skill asser…

99b16ac

…tion table

KatherineInCode added the ai-review-vnext Request a Claude code review using the vNext workflow label Apr 13, 2026

github-actions Bot added the t:llm Change Type - LLM related change (e.g. CLAUDE.md files) label Apr 13, 2026

KatherineInCode marked this pull request as ready for review April 13, 2026 18:38

KatherineInCode requested review from a team as code owners April 13, 2026 18:38

vvolkgang reviewed Apr 13, 2026

View reviewed changes

theMickster requested changes Apr 14, 2026

View reviewed changes

KatherineInCode added 4 commits April 14, 2026 08:36

Remove testing API documentation

4659ea0

[PM-35062] llm: Note that Foundation import may be needed after remov…

3813bb0

…ing XCTest

Combine playbook into skill

32b0c55

Fix newline

c17db86

KatherineInCode requested review from theMickster and vvolkgang April 21, 2026 14:01

theMickster previously approved these changes Apr 22, 2026

View reviewed changes

KatherineInCode added 4 commits April 22, 2026 13:46

[PM-35062] llm: Fix assertAsyncThrows row in assertion conversion table

a65b257

[PM-35062] llm: Fix test prefix in Swift Testing example functions

24323e7

[PM-35062] llm: Note Equatable requirement for specific error value a…

67948ab

…ssertion

[PM-35062] llm: Exclude alert and loading overlay tests from Swift Te…

f104d61

…sting conversion

KatherineInCode dismissed theMickster’s stale review via f104d61 April 22, 2026 19:50

KatherineInCode requested a review from theMickster April 22, 2026 19:51

theMickster approved these changes Apr 23, 2026

View reviewed changes

KatherineInCode merged commit e790288 into main Apr 23, 2026
36 of 37 checks passed

KatherineInCode deleted the pm-35062/swift-testing-playbook branch April 23, 2026 14:30

Conversation

KatherineInCode commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🎟️ Tracking

📔 Objective

Uh oh!

github-actions Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🤖 Bitwarden Claude Code Review

Uh oh!

github-actions Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Great job! No new security vulnerabilities introduced in this pull request

Uh oh!

codecov Bot commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

theMickster left a comment

Choose a reason for hiding this comment

Uh oh!

theMickster left a comment

Choose a reason for hiding this comment

Summary

Findings

💡 Suggestions

BitwardenTestCase → struct rule drops base-class setUp defaults without flagging them

"Specific Error value" match requires Equatable conformance

Parameterized examples retain test prefix, contradicting Step 5's own rename rule

assertAsyncThrows row is missing the third "Notes" column cell

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SaintPatrck Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

KatherineInCode commented Apr 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

KatherineInCode commented Apr 13, 2026 •

edited

Loading

github-actions Bot commented Apr 13, 2026 •

edited

Loading

github-actions Bot commented Apr 13, 2026 •

edited

Loading

codecov Bot commented Apr 13, 2026 •

edited

Loading

`BitwardenTestCase` → `struct` rule drops base-class setUp defaults without flagging them

"Specific Error value" match requires `Equatable` conformance

Parameterized examples retain `test` prefix, contradicting Step 5's own rename rule

`assertAsyncThrows` row is missing the third "Notes" column cell

SaintPatrck Apr 22, 2026 •

edited

Loading