
[NoQA] Add agent-device glue-code skill for mobile testing#87662

Merged
rlinoz merged 15 commits into Expensify:main from callstack-internal:agent-device-local-skill on Apr 22, 2026

Conversation

@kacper-mikolajczak
Contributor

@kacper-mikolajczak kacper-mikolajczak commented Apr 10, 2026

Explanation of Change

Adds an agent-device glue-code skill (.claude/skills/agent-device/SKILL.md) that enables Claude Code to drive iOS and Android devices for local mobile development - testing, debugging, performance profiling, bug reproduction, and feature verification.

What this PR ships:

A single lean skill file that:

  1. Verifies agent-device CLI is installed (hard stop if missing, with install instructions)
  2. Points the agent to read the CLI's bundled skill files from the installed npm package for canonical device automation guidance
  3. Provides usage principles (fail fast, deviations are signal) to keep interaction developer-directed
  4. Covers the full local dev workflow - not just testing, but debugging, perf, bug repro, and feature verification
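
The "hard stop if missing" gate in step 1 can be sketched in plain shell. This is a hypothetical illustration of the described behavior, not the shipped SKILL.md content; `require_cli` is a made-up helper name:

```shell
# Hedged sketch of the pre-flight gate: hard stop with install
# instructions when a required CLI is missing. require_cli is a
# hypothetical helper, not part of the shipped skill.
require_cli() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Error: $1 is not installed. Install it with: $2" >&2
    return 1
  fi
}

# Gate on the agent-device CLI before doing any device work.
if require_cli agent-device "npm install -g agent-device"; then
  agent-device --version
fi
```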

What this PR does NOT ship:

  • No inlined copies of the upstream agent-device skill files - the CLI's bundled skills are the single source of truth
  • No autonomous QA skill (deferred to Phase 2)

Design decisions:

  • Glue-code, not vendored copy. Earlier iterations inlined the full agent-device skill tree (bootstrap, exploration, verification, debugging, coordinate-system references). Reviewer feedback correctly identified these as non-Expensify-specific. They now live exclusively in the agent-device npm package and auto-update with npm install -g agent-device.
  • Developer-directed. The skill follows the developer's lead - full test plan, step-by-step instructions, or ad-hoc questions. It fails fast on deviations and surfaces problems clearly.

Fixed Issues

$ #87030

Tests

No runtime code is changed - these are Claude Code skill definitions only.

  1. Install the CLI: npm install -g agent-device
  2. Open the repo in Claude Code
  3. Run a prompt like "test the login flow on iOS" - verify the agent-device skill is picked up
  4. Verify .claude/skills/agent-device/SKILL.md exists and contains only the pre-flight gate + usage principles (no inlined references)
  • Verify that no errors appear in the JS console

Offline tests

N/A - changes are Claude Code skill definitions only, no runtime app code affected.

QA Steps

N/A - no runtime code changes. These are developer tooling files (Claude Code skill definitions).

  • Verify that no errors appear in the JS console

PR Author Checklist

  • I linked the correct issue in the ### Fixed Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I added steps for the expected offline behavior in the Offline steps section
    • I added steps for Staging and/or Production testing in the QA steps section
    • I added steps to cover failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android: Native
    • Android: mWeb Chrome
    • iOS: Native
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text shown in the product is localized by adding it to src/languages/* files and using the translation method
      • If any non-english text was added/modified, I used JaimeGPT to get English > Spanish translation. I then posted it in #expensify-open-source and it was approved by an internal Expensify engineer. Link to Slack message:
    • I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (ie. in CONST.ts or at the top of the file that uses the constant) are defined as such
  • I verified that if a function's arguments changed that all usages have also been updated correctly
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If new assets were added or existing ones were modified, I verified that:
    • The assets are optimized and compressed (for SVG files, run npm run compress-svg)
    • The assets load correctly across all supported platforms.
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • If a new page is added, I verified it's using the ScrollView component to make it scrollable when more elements are added to the page.
  • I added unit tests for any new feature or bug fix in this PR to help automatically prevent regressions in this user flow.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.

Screenshots/Videos

Android: Native

N/A - no UI changes

Android: mWeb Chrome

N/A - no UI changes

iOS: Native

N/A - no UI changes

iOS: mWeb Safari

N/A - no UI changes

MacOS: Chrome / Safari

N/A - no UI changes

- Install callstackincubator/agent-device skill (+ bundled dogfood skill)
- Add agent-device-app-testing wrapper skill with Expensify-specific context:
  package name, sign-in flow, usage guidance, and proactive triggers
@kacper-mikolajczak changed the title from "Install agent-device skill for mobile testing" to "[NoQA] Install agent-device skill for mobile testing" on Apr 10, 2026
@kacper-mikolajczak
Contributor Author

CC @Julesssss @adhorodyski

Contributor

@Julesssss Julesssss left a comment


This isn't a full review yet, but some initial thoughts:

...se after mobile/React Native code changes

  1. This description is likely to trigger more often than I would like for the initial skill implementation. Could we restrict this so that the "use when" scope is reduced?

  2. The skills are a bit verbose in some cases. It would be good to reduce where possible; I can point out more details in a second review shortly

@kacper-mikolajczak
Contributor Author

Thanks for the feedback! I will work on adjusting the skill properly. Looking forward to further ideas from your side ❤️

- Flatten .agents/skills/ into .claude/skills/ (remove symlink indirection
  and skills-lock.json created by `npx skills add`)
- Add CLI prerequisites section to wrapper skill
- Replace .rock/cache/ CI paths with local build as primary flow
- Add agent-device-output/ to .gitignore
- Fix email pattern and dev/release package names
- Tighten trigger scope to explicit user requests only
- Reduce verbosity per reviewer feedback
Per Jules's comment: local testing is directed by user, not
prescribed by the skill. Remove step-by-step workflow - the base
agent-device skill handles interaction. Keep only the App-specific
facts that avoid repetitive lookups (package names, build commands,
sign-in creds, RN gotchas).
Reduces context overhead for the PoC. dogfood (autonomous QA) is
better suited for Phase 2/Melvin. macOS desktop and remote tenancy
references are not relevant for local mobile testing.
Comment thread .claude/skills/agent-device/SKILL.md Outdated
Comment thread .claude/skills/agent-device/references/bootstrap-install.md Outdated
Comment thread .claude/skills/agent-device/references/debugging.md Outdated
Comment thread .claude/skills/agent-device/references/coordinate-system.md Outdated
Comment thread .claude/skills/agent-device/references/exploration.md Outdated
…flow

- Replace removed scrollintoview command with scroll + re-snapshot pattern
- Add shell loop example for off-screen element discovery
- Add diff screenshot section to verification reference
- Rework app-testing skill with gated startup flow (device, metro, dev app)
- Remove release build references, enforce dev-only app policy
Remove all inlined agent-device skill files and references - the CLI's
bundled skills are the canonical source. The repo skill is now a thin
glue layer: pre-flight check, usage principles, and a pointer to read
the bundled skills from the installed package.
@kacper-mikolajczak changed the title from "[NoQA] Install agent-device skill for mobile testing" to "[NoQA] Add agent-device glue-code skill for mobile testing" on Apr 14, 2026
- Widen skill trigger to cover testing, debugging, perf, bug repro, feature verification
- Add usage principles (fail fast, deviations are signal)
- Add early-development footnote with Expensify Slack contact
- Add agent-device.json with iOS mobile defaults
@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 14, 2026

Okay, we are directly referencing agent-device skills from its installation path. This way, we don't have any loosely-dangling dependencies, e.g. in the form of inlined skills. The only thing added is a glue layer, so the agent is aware of our dependency on agent-device and wraps it with a tiny set of prerequisite instructions.

Let me know what you'd like to see next as a part of this integration! What I was thinking of:

  • Extend the "glue" layer with instructions that help streamline the automation process, including App-specific context (e.g. running the app from scratch), documentation of repetitive workflows (e.g. performing login), or cross-references to other docs (as we are just starting and any feedback is appreciated, I added a footer note)
  • Create an accompanying skill that works on top but is more task-focused. For example, record-pr-evidence would perform test steps and record videos or take screenshots that verify the implementation. This alone probably won't suffice because of long-running thinking (the recording would be too long), but we can use a middle-man in the form of record-testing-flow, which uses the replay system to first create a flow that can then be resolved more quickly
  • Take advantage of agent-device built-in features like replayability and screenshot diffing. Both of those should definitely help in our upcoming Phase 2, where we will emphasise testing/reviewing more.

Also, for the matter of performance testing, we could consider an agent-react-devtools integration, which is a really nice auxiliary package, but it covers a specific niche - not all developers are interested in performance tooling. Keeping a reference to it here as inspiration for further discussion :)


Here's how it works locally. As input, the agent was told to follow the testing steps of one of the PRs; it picked up agent-device and, after performing the list of steps, properly assessed that the App does not fulfil the testing scenario (it was a different checkout that did not contain the fixes):

Screen.Recording.Apr.14.2026.from.Online.Video.Cutter.mp4

CC @adhorodyski @Julesssss

@BartekObudzinski
Contributor

Nice progress!

For Phase 1 this covers the basics well. A few things worth considering as we iterate:

Edge cases for the current flow:

  1. Simulator/emulator not booted. The pre-flight checks if agent-device CLI is installed but not whether a target device is actually running. Could we add a quick device availability check right after the version gate? This would also handle the case where both an iOS simulator and Android emulator are running and the agent needs to pick one, or let the developer specify which platform to target.

I think no action is needed: if I correctly point out the platform, it runs correctly:
image

  2. Multi-account test scenarios. A lot of our flows require two accounts (sending money, approving reports, chat between users). How would agent-device handle cases where we need to log out, switch accounts, or drive two simulators in parallel? This comes up constantly in QA. It's worth thinking about in the future.

Ideas for future phases:

  1. Console error investigation on failure. Since we have access to React DevTools and the console, when the agent hits a failure during testing it could automatically check the console for errors. Based on the flow context and the error message, it could start investigating the root cause instead of just reporting "something went wrong." That would turn a failed test run into a useful debugging session, but I wonder if it's too much.
  2. Sentry span measurements via console. We already output Sentry span timings to the console. The agent could capture those during test flows and report performance numbers alongside functional results. That way every test run doubles as a lightweight perf check without needing a separate measurement setup.

I have just run the current implementation and it works well for Phase 1
image

@Julesssss Julesssss requested a review from rlinoz April 15, 2026 17:52
@adhorodyski
Contributor

adhorodyski commented Apr 15, 2026

I do work with react worktrees a lot here and would personally appreciate agent-react-devtools on board 👍🏻

@adhorodyski
Contributor

Sentry span measurements via console

that'd be lovely

@Julesssss
Contributor

instructions that will help streamline the automation process including App-specific context instructions

Yes, one annoyance mentioned in the docs is onboarding modals - we have a few of those. Though we also have the SKIP_ONBOARDING=true env flag.

From the docs it sounds like Batching is how we would pre-define flows for repeating smoke tests, is that correct? Or would that be replayability? Anyway I like the idea of running exploratory tests and then recording these to expand the 'testing steps?' library for future reruns.

not all developers are interested in the performance tooling

Oh interesting, as you know we are heavily focusing on performance, so that is definitely worth thinking about later 👍


How would agent-device handle cases where we need to log out, switch accounts, or drive two simulators in parallel?

Great point.

Console error investigation on failure.

Sentry span measurements via console

Interesting ideas. To avoid having too many duplicated workflows it might be preferred to trigger our existing triage agents. Enabling the triage agent to take advantage of this tool would be great though. Melvin already has the ability to verify (simple) web bugs via playwright.

@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 15, 2026

First things first, thank you all for all of the interesting feedback! There are a couple of really nifty use-cases mentioned for agent-device (and its surroundings) ❤️ However, I'd like to keep this PR lean and not inflate it too much.

We will benefit from defining which features are a necessity as part of the initial integration and getting that merged, so agent-device is already available for the developers.

Then we can carry on with another set of PRs for functionality added on top. We'd get developer feedback about the current setup in parallel and can enhance our discussion of those upcoming features this way.

Let me know what you think, and if you agree, let's define a short-list of what's still missing. Thanks!

@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 15, 2026

From the docs it sounds like Batching is how we would pre-define flows for repeating smoke tests, is that correct? Or would that be replayability? Anyway I like the idea of running exploratory tests and then recording these to expand the 'testing steps?' library for future reruns.

Good question @Julesssss! There are three related but distinct pieces here:

  • batch - runs a known sequence of commands in one shot (JSON step array). Think of it as a scripted macro - great for deterministic flows you've already figured out, but no built-in recording or assertion layer.

  • replay - plays back a recorded .ad session file. Every exploratory session automatically records actions, so you get the "explore then replay" loop for free. replay -u re-runs the script and auto-updates drifted selectors in place.

  • test - runs one or more .ad files as a suite with retries, timeouts, --fail-fast, and JUnit XML output (--report-junit). This is the regression harness - point it at a folder of .ad scripts and it gives you a pass/fail report.
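
Tying these together, a hedged sketch of how the three modes might be invoked. The file paths are made up for illustration, the flags come from the descriptions above, and the calls are guarded so nothing runs unless the CLI is actually installed:

```shell
# Illustrative only: steps/login.json and flows/ are made-up paths.
if command -v agent-device >/dev/null 2>&1; then
  # Scripted macro: run a known JSON step array in one shot
  agent-device batch steps/login.json
  # Play back a recorded session, auto-updating drifted selectors
  agent-device replay flows/login.ad -u
  # Run recorded flows as a regression suite with a JUnit report
  agent-device test flows/ --fail-fast --report-junit report.xml
else
  echo "agent-device not installed; skipping examples"
fi
```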

So, all of them have a use-case in our methodology. We could (just an idea for now):

  1. Maintain "macros" - a suite of flows that are repetitive (e.g. logging in) and make the agent reference them instead of working its way from scratch every time

Yes one annoyance mentioned in the docs is onboarding modals, we have a few that. Though we also have the SKIP_ONBOARDING=true env flag.

It should help with those in cases where we don't already have other tooling set up, like env flags.

  2. Maintain "tests" (as you mentioned) - a suite of flows that are important for verifying critical-path correctness/performance. For example, we could ask the maintainers of Sentry metrics to develop such flows; then, whenever work on a metric is performed, we can validate against them

*Naming is really arbitrary - just wanted to convey the idea :D

@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 15, 2026

Sentry span measurements via console

Agree! Let's keep it as an upcoming improvement after baseline is merged.

@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 15, 2026

Multi-account test scenarios. A lot of our flows require two accounts (sending money, approving reports, chat between users). How would agent-device handle cases where we need to log out, switch accounts, or drive two simulators in parallel? This comes up constantly in QA. It's worth to think about it in the future.

@BartekObudzinski agent-device already supports named sessions and device targeting, so you can run isolated sessions on separate simulators:

agent-device --session user-a --device "iPhone 16 Pro" open com.expensify.chat
agent-device --session user-b --device "iPhone 16" open com.expensify.chat

Each session is independently addressable via --session <name>, --device, --udid, or --serial flags.

However, coordinated cross-device orchestration (trigger on device A, assert on device B in one flow) is tracked as a backlog item (callstackincubator/agent-device#100) and not yet shipped.

For now, the practical path for two-account flows is sequential on one simulator: complete account A's actions, log out, log in as account B, verify. Not as fast as true parallel, but would it be enough for approval/send-money scenarios?

With that said, I agree we might benefit from either having pre-defined test flows for such use-cases or adding explanatory instructions in the form of an auxiliary skill. Let's track this as an upcoming improvement!

@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 15, 2026

I do work with react worktrees a lot here and would personally appreciate agent-react-devtools on board

Sure thing @adhorodyski! Given it's a distinct tool with its own setup, I'd propose a separate follow-up PR - sounds good?

Contributor

@rlinoz rlinoz left a comment


This is exciting!

Comment thread .claude/skills/agent-device/SKILL.md Outdated
Comment thread .claude/skills/agent-device/SKILL.md Outdated
The `agent-device` CLI ships with built-in skills under `skills/` in the installed package. These contain the canonical reference for device automation - bootstrap, exploration, verification, debugging, and more. Use `agent-device --help` to discover available commands and skill names. Read the skill files directly from the installed package path when you need detailed guidance:

```bash
# Find the package location
npm root -g
```
Contributor

I wish we could run commands automatically instead of asking the agent to do something...

Locally this fails with a permission error though, so I'm not sure how we could do it. Can we update the settings to allow these commands and inject the context without a tool call?

Contributor Author

@kacper-mikolajczak kacper-mikolajczak Apr 20, 2026


Good call, thanks @rlinoz! I will look into permissions issue.

Contributor Author

Hi @rlinoz, the latest PR commit uses dynamic calls to perform the pre-flight. Let me know if it works for you the same way it has worked for me.

CC @BartekObudzinski if you'd like to test it, too. Thanks!

Contributor

Tested end-to-end on iOS. Dynamic pre-flight works; devices, open, snapshot, and press all ran cleanly. LGTM

Add a Mobile Device Testing subsection parallel to Browser Testing in
CLAUDE.md, and an optional AI-assisted testing callout in README after
Platform-Specific Setup. Makes the agent-device skill discoverable
for Claude Code users without claiming it's required setup.
@kacper-mikolajczak
Contributor Author

Hi @Julesssss @rlinoz! The review remarks have been addressed; let me know what you think of the current state of the agent-device setup:

  • SKILL.md - minimal pre-flight. !`agent-device --version` and !`npm root -g` inject their output at skill load; the skill fails fast with install instructions if the CLI is missing. It points at the installed package's bundled docs - no vendored copy, and it auto-updates with npm install -g agent-device.
  • .claude/settings.json - Bash(agent-device *) pre-approved, no permission prompt on fresh checkouts.
  • CLAUDE.md + README.md - new Mobile Device Testing section paralleling Browser Testing, plus an opt-in callout in README.
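
The settings entry presumably looks something like this - a hedged sketch based on the bullet above; the exact permission-matcher syntax in .claude/settings.json may differ from the shipped file:

```json
{
  "permissions": {
    "allow": ["Bash(agent-device *)"]
  }
}
```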

Comment thread README.md
Comment on lines +51 to +52
**Optional AI-assisted mobile testing:** If you use Claude Code, the [`/agent-device` skill](.claude/skills/agent-device/SKILL.md) drives iOS and Android simulators or devices for interactive testing, debugging, and performance profiling. Requires `npm install -g agent-device`.

Contributor

NAB: We should start documenting our skills better -- outside scope here of course

Julesssss previously approved these changes Apr 21, 2026
Contributor

@Julesssss Julesssss left a comment


Looks good to me as a first step

@kacper-mikolajczak kacper-mikolajczak marked this pull request as ready for review April 21, 2026 08:28
@kacper-mikolajczak kacper-mikolajczak requested a review from a team as a code owner April 21, 2026 08:28
@melvin-bot melvin-bot Bot requested review from Julesssss and abzokhattab and removed request for a team April 21, 2026 08:28
@melvin-bot

melvin-bot Bot commented Apr 21, 2026

@abzokhattab @Julesssss One of you needs to copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button]

@kacper-mikolajczak
Contributor Author

kacper-mikolajczak commented Apr 21, 2026

Hi @rlinoz! Jules told me he is going to be OOO for the rest of the week. Let me know if you need anything on my end to push this further. Thanks!

Reassure tests are failing, but the PR does not include any code-related changes, so it is most likely a CI issue.

P.S.

Created a separate issue (#88388) for further exploration of automated flows, as we discussed as next steps.

@cretadn22
Contributor

@Julesssss since I'm assigned to the related issue, would you like me to review this PR?

@rlinoz
Contributor

rlinoz commented Apr 21, 2026

Hey @cretadn22 sorry for the ping, no need for a review.

@rlinoz
Contributor

rlinoz commented Apr 21, 2026

Reviewer Checklist

  • I have verified the author checklist is complete (all boxes are checked off).
  • I verified the correct issue is linked in the ### Fixed Issues section above
  • I verified testing steps are clear and they cover the changes made in this PR
    • I verified the steps for local testing are in the Tests section
    • I verified the steps for Staging and/or Production testing are in the QA steps section
    • I verified the steps cover any possible failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
  • I checked that screenshots or videos are included for tests on all platforms
  • I included screenshots or videos for tests on all platforms
  • I verified that the composer does not automatically focus or open the keyboard on mobile unless explicitly intended. This includes checking that returning the app from the background does not unexpectedly open the keyboard.
  • I verified tests pass on all platforms & I tested again on:
    • Android: HybridApp
    • Android: mWeb Chrome
    • iOS: HybridApp
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
  • If there are any errors in the console that are unrelated to this PR, I either fixed them (preferred) or linked to where I reported them in Slack
  • I verified there are no new alerts related to the canBeMissing param for useOnyx
  • I verified proper code patterns were followed (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick).
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text shown in the product is localized by adding it to src/languages/* files and using the translation method
    • I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I verified that this PR follows the guidelines as stated in the Review Guidelines
  • I verified other components that can be impacted by these changes have been tested, and I retested again (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar have been tested & I retested again)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (i.e. in CONST.ts or at the top of the file that uses the constant) are defined as such
  • If a new component is created I verified that:
    • A similar component doesn't exist in the codebase
    • All props are defined accurately and each prop has a /** comment above it */
    • The file is named correctly
    • The component has a clear name that is non-ambiguous and the purpose of the component can be inferred from the name alone
    • The only data being stored in the state is data necessary for rendering and nothing else
    • For Class Components, any internal methods passed to components event handlers are bound to this properly so there are no scoping issues (i.e. for onClick={this.submit} the method this.submit should be bound to this in the constructor)
    • Any internal methods bound to this are necessary to be bound (i.e. avoid this.submit = this.submit.bind(this); if this.submit is never passed to a component event handler like onClick)
    • All JSX used for rendering exists in the render method
    • The component has the minimum amount of code necessary for its purpose, and it is broken down into smaller components in order to separate concerns and functions
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why it is needed at the top of the file if the code is not self-explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used, from both a logged-in and a logged-out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • If a new page is added, I verified it's using the ScrollView component to make it scrollable when more elements are added to the page.
  • For any bug fix or new feature in this PR, I verified that sufficient unit tests are included to prevent regressions in this flow.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.
  • I have checked off every checkbox in the PR reviewer checklist, including those that don't apply to this PR.
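The class-component binding items in the checklist can be illustrated with a minimal sketch. This is hypothetical code, not from this PR: when a method is passed as a detached event handler (as in onClick={this.submit}), it is later invoked without its receiver, so `this` must be bound ahead of time in the constructor.

```typescript
// Hypothetical example (not Expensify code) of the constructor-binding rule.
class SubmitForm {
    private value = 'draft';

    constructor() {
        // Bound once here because submit is handed off as an event handler;
        // without this line, calling the detached method would throw.
        this.submit = this.submit.bind(this);
    }

    submit(): string {
        return `submitted: ${this.value}`;
    }
}

const form = new SubmitForm();
const handler = form.submit; // detached, the way an event prop stores it
console.log(handler()); // prints "submitted: draft" thanks to the bind
```

Per the checklist's second binding item, the bind is only warranted because the method is actually passed as a callback; a method that is always called as form.submit() needs no bind.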

Screenshots/Videos

Android: HybridApp
Android: mWeb Chrome
iOS: HybridApp
iOS: mWeb Safari
MacOS: Chrome / Safari

Comment thread on .claude/settings.json (Outdated)
The skill bootstraps by reading files from the installed npm package, resolved via an echo bang-command. Without this allowlist entry, every skill load prompted for permission.
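For context, a Claude Code permissions allowlist entry of the kind this comment describes might look like the following sketch. The matcher string is an assumption for illustration, not the actual entry shipped in this PR:

```json
{
  "permissions": {
    "allow": [
      "Bash(echo:*)"
    ]
  }
}
```

Entries under permissions.allow in .claude/settings.json pre-approve matching tool invocations, which is why the comment notes that the echo-based package resolution no longer prompts on every skill load.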
@kacper-mikolajczak
Contributor Author

@rlinoz PR up-to-date and ready for merge ✅

@rlinoz rlinoz merged commit f6174d4 into Expensify:main Apr 22, 2026
17 checks passed
@github-actions
Contributor

🚧 @rlinoz has triggered a test Expensify/App build. You can view the workflow run here.

@OSBotify
Contributor

✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release.

@OSBotify
Contributor

🚀 Deployed to staging by https://github.com/rlinoz in version: 9.3.62-0 🚀

platform | result
🕸 web 🕸 | success ✅
🤖 android 🤖 | success ✅
🍎 iOS 🍎 | success ✅

Bundle Size Analysis (Sentry):

