Add Freighter-mobile best practices LLM reference docs by leofelix077 · Pull Request #810 · stellar/freighter-mobile

leofelix077 · 2026-04-09T16:15:40Z

Skill Eval: `freighter-mobile-best-practices` — Benchmark Report

Date: 2026-04-09
Iterations: 1
Repo: stellar/freighter-mobile
Model: Claude Opus 4.6

Summary

On further passes, both configurations get close to 100%, as Claude automatically picks up on the rules and the entry points.

Metric	With Skill	Without Skill	Delta
Pass rate	87.1%	67.2%	+19.9pp

Assertion Results (aggregated across 3 iterations)

architecture-screen

#	Assertion	With Skill	Without Skill
1	All components use arrow function expressions (not function declarations)	1/1 (100%)	1/1 (100%)
2	All user-facing text uses t() via useAppTranslation hook	1/1 (100%)	1/1 (100%)
3	Creates proper screen directory: screens/ with index.tsx, components/, hooks/ su...	0/1 (0%)	0/1 (0%)
4	Creates typed route param list for the staking navigator	1/1 (100%)	1/1 (100%)
5	Provides both en and pt translations	0/1 (0%)	0/1 (0%)
6	Screen files use default export; hooks/helpers use named exports	1/1 (100%)	1/1 (100%)
7	Uses absolute imports from src/ root (no relative paths)	1/1 (100%)	0/1 (0%)
8	Uses route enum constants, not raw strings	1/1 (100%)	1/1 (100%)

architecture-zustand

#	Assertion	With Skill	Without Skill
1	Does not show generic error without context about what operation failed	1/1 (100%)	1/1 (100%)
2	Error handling uses normalizeError() from config/logger	1/1 (100%)	0/1 (0%)
3	Follows Zustand async pattern: set({ isLoading: true, error: null }) -> try/catc...	1/1 (100%)	1/1 (100%)
4	No direct store mutations (uses set() to create new state)	1/1 (100%)	1/1 (100%)
5	No empty catch blocks	1/1 (100%)	1/1 (100%)
6	Reports unexpected errors to Sentry (not expected validation errors)	1/1 (100%)	1/1 (100%)
7	Shows user-facing errors via Toast, NOT Alert.alert()	0/1 (0%)	0/1 (0%)
8	Uses absolute imports	1/1 (100%)	1/1 (100%)
9	Uses validateTransactionParams() before building transaction	1/1 (100%)	1/1 (100%)

code-style-hook

#	Assertion	With Skill	Without Skill
1	Error handling uses normalizeError()	1/1 (100%)	0/1 (0%)
2	Hook file uses named export (not default)	1/1 (100%)	1/1 (100%)
3	Hook name starts with use prefix: useTokenPrices	1/1 (100%)	1/1 (100%)
4	JSDoc comment present on the hook function	1/1 (100%)	1/1 (100%)
5	Return value is memoized (wrapped in useMemo)	1/1 (100%)	0/1 (0%)
6	Uses ?? (nullish coalescing) instead of || for fallback values
7	Uses absolute imports (not relative paths)	1/1 (100%)	1/1 (100%)
8	Uses arrow function expression (not function declaration)	1/1 (100%)	1/1 (100%)

code-style-naming

#	Assertion	With Skill	Without Skill
1	Changes to arrow function expression (not function declaration)	1/1 (100%)	1/1 (100%)
2	Changes to default export (it's a screen)	1/1 (100%)	1/1 (100%)
3	Fixes imports to absolute paths (not relative ../../)	1/1 (100%)	1/1 (100%)
4	List item wrapped in React.memo()	1/1 (100%)	0/1 (0%)
5	No hardcoded colors	1/1 (100%)	0/1 (0%)
6	Replaces Alert.alert with Toast for error display	1/1 (100%)	0/1 (0%)
7	Replaces Image with FastImage for remote URLs	1/1 (100%)	0/1 (0%)
8	Replaces ScrollView+map with FlatList for virtualization	1/1 (100%)	1/1 (100%)
9	Replaces StyleSheet.create with NativeWind className	1/1 (100%)	0/1 (0%)
10	Replaces hardcoded strings with t() calls	1/1 (100%)	1/1 (100%)
11	Uses stable key (not array index)	1/1 (100%)	1/1 (100%)
12	Uses useAppTranslation() instead of raw useTranslation()	1/1 (100%)	1/1 (100%)
13	Uses useShallow for multi-field Zustand selectors	1/1 (100%)	0/1 (0%)
14	Zustand stores accessed via selectors, not full destructuring	1/1 (100%)	0/1 (0%)
15	totalValue computed in useMemo	1/1 (100%)	1/1 (100%)

err-handling-retry

#	Assertion	With Skill	Without Skill
1	Does not show generic 'Something went wrong' without context	1/1 (100%)	1/1 (100%)
2	Implements retry with exponential backoff (1s, 2s, 4s, 8s, 16s) for HTTP 504	0/1 (0%)	1/1 (100%)
3	Maps Horizon error codes to translated user-facing messages using t()	1/1 (100%)	1/1 (100%)
4	Maximum 5 retry attempts	1/1 (100%)	1/1 (100%)
5	Reports unexpected errors to Sentry	1/1 (100%)	1/1 (100%)
6	Shows user-facing errors via Toast, NOT Alert.alert()	1/1 (100%)	1/1 (100%)
7	Uses absolute imports (not relative paths)	1/1 (100%)	0/1 (0%)
8	Uses normalizeError() for error normalization	1/1 (100%)	0/1 (0%)
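The retry assertions above (exponential backoff of 1s, 2s, 4s, 8s, 16s on HTTP 504, at most 5 retries) can be sketched as follows. This is a minimal illustration, not code from the repo: `backoffDelayMs`, `fetchWithRetry`, and the response shape are hypothetical names, and the `sleep` parameter is injectable only so the schedule can be checked without waiting.

```typescript
// Hypothetical helper names; not taken from the freighter-mobile codebase.
const MAX_RETRIES = 5;
const BASE_DELAY_MS = 1000;

/** Delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, 16s. */
const backoffDelayMs = (attempt: number): number =>
  BASE_DELAY_MS * Math.pow(2, attempt);

type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

/**
 * Calls `request` once, then retries only while the status is 504,
 * up to MAX_RETRIES additional attempts. Other statuses return immediately.
 */
const fetchWithRetry = async <T>(
  request: () => Promise<{ status: number; body: T }>,
  sleep: Sleep = defaultSleep,
): Promise<{ status: number; body: T }> => {
  let response = await request();
  for (
    let attempt = 0;
    attempt < MAX_RETRIES && response.status === 504;
    attempt += 1
  ) {
    await sleep(backoffDelayMs(attempt));
    response = await request();
  }
  return response;
};
```

If the last attempt still returns 504, the caller receives that response and decides how to surface it (for example, mapping it to a translated message).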

err-handling-zustand

#	Assertion	With Skill	Without Skill
1	Async actions follow set({ isLoading: true, error: null }) -> try/catch -> set r...	1/1 (100%)	0/1 (0%)
2	Clears state on account switch (not just on unmount)	1/1 (100%)	1/1 (100%)
3	Does NOT use console.error for error reporting - uses Sentry	1/1 (100%)	1/1 (100%)
4	Error handling uses normalizeError() from config/logger	1/1 (100%)	0/1 (0%)
5	No direct store mutations (no get().array.push())	1/1 (100%)	1/1 (100%)
6	Store interface defines both state fields and action functions	1/1 (100%)	1/1 (100%)
7	Uses create() from Zustand with typed interface	1/1 (100%)	1/1 (100%)
8	Uses named export for the store hook (useTransactionHistoryStore)	1/1 (100%)	1/1 (100%)
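For readers unfamiliar with the async-action pattern these assertions check (`set({ isLoading: true, error: null })`, then try/catch, then set the result or the error), here is a dependency-free sketch. Assumptions: `createStore` is a tiny stand-in for Zustand's `create()`, `normalizeError` is a local placeholder for the repo's helper in `config/logger` (whose real signature is not shown in this thread), and the store shape is invented for illustration.

```typescript
// Stand-in for a Zustand store so the sketch runs without dependencies.
type SetState<S> = (partial: Partial<S>) => void;

const createStore = <S extends object>(init: (set: SetState<S>) => S) => {
  let state: S;
  const set: SetState<S> = (partial) => {
    state = { ...state, ...partial }; // always a new object, never mutated in place
  };
  state = init(set);
  return { getState: () => state };
};

// Placeholder for the repo's normalizeError(); the real one may differ.
const normalizeError = (error: unknown): Error =>
  error instanceof Error ? error : new Error(String(error));

// Hypothetical store shape: state fields plus one async action.
interface HistoryState {
  transactions: string[];
  isLoading: boolean;
  error: string | null;
  fetchTransactions: (load: () => Promise<string[]>) => Promise<void>;
}

const historyStore = createStore<HistoryState>((set) => ({
  transactions: [],
  isLoading: false,
  error: null,
  fetchTransactions: async (load) => {
    // 1. Mark loading and clear any stale error before starting.
    set({ isLoading: true, error: null });
    try {
      // 2. Success path: store the result and stop loading.
      const transactions = await load();
      set({ transactions, isLoading: false });
    } catch (error) {
      // 3. Failure path: store a normalized message; never leave the catch empty.
      set({ error: normalizeError(error).message, isLoading: false });
    }
  },
}));
```

In the pattern described later in this thread, the component reads `error` from the store and shows the Toast; the store itself does not render UI.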

i18n-settings

#	Assertion	With Skill	Without Skill
1	All user-facing strings come from t() calls	1/1 (100%)	0/1 (0%)
2	Component name has Screen suffix	1/1 (100%)	1/1 (100%)
3	Component uses arrow function expression	1/1 (100%)	1/1 (100%)
4	Provides both English (en) and Portuguese (pt) translation entries	1/1 (100%)	1/1 (100%)
5	Screen component uses default export	1/1 (100%)	1/1 (100%)
6	Translation keys use nested dot notation	1/1 (100%)	1/1 (100%)
7	Uses NativeWind className for styling (not StyleSheet.create)	1/1 (100%)	1/1 (100%)
8	Uses absolute imports	1/1 (100%)	1/1 (100%)
9	Uses useAppTranslation() hook (not raw useTranslation)	1/1 (100%)	1/1 (100%)
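As an illustration of "nested dot notation" keys with both `en` and `pt` entries, the sketch below resolves a dotted key against a nested resource tree and lists the keys of each locale so they can be compared. The resource contents and the `translate` / `listKeys` helpers are hypothetical; the app itself goes through `t()` from `useAppTranslation()`.

```typescript
type Language = "en" | "pt";
type TranslationTree = { [key: string]: string | TranslationTree };

// Hypothetical entries; the real keys live in the repo's translation files.
const resources: Record<Language, TranslationTree> = {
  en: { settings: { title: "Settings", network: { label: "Network" } } },
  pt: { settings: { title: "Configurações", network: { label: "Rede" } } },
};

/** Resolves "a.b.c" by walking the tree; falls back to the key itself. */
const translate = (language: Language, key: string): string => {
  let node: string | TranslationTree = resources[language];
  for (const part of key.split(".")) {
    if (typeof node === "string") return key; // path is longer than the tree
    const next: string | TranslationTree | undefined = node[part];
    if (next === undefined) return key; // missing entry
    node = next;
  }
  return typeof node === "string" ? node : key;
};

/** Flattens a tree into dotted keys, e.g. "settings.network.label". */
const listKeys = (tree: TranslationTree, prefix = ""): string[] => {
  const keys: string[] = [];
  for (const name of Object.keys(tree)) {
    const value = tree[name];
    if (typeof value === "string") keys.push(prefix + name);
    else keys.push(...listKeys(value, `${prefix}${name}.`));
  }
  return keys;
};
```

Comparing `listKeys(resources.en)` with `listKeys(resources.pt)` is one cheap way to catch a key that was added to only one locale.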

nav-typed-routes

#	Assertion	With Skill	Without Skill
1	All user-facing text uses t() via useAppTranslation	1/1 (100%)	1/1 (100%)
2	Creates route enum with named constants for each screen	1/1 (100%)	1/1 (100%)
3	Creates typed param list type for the navigator	1/1 (100%)	1/1 (100%)
4	Deep link config uses correct scheme (freighterdev:// for dev)	1/1 (100%)	1/1 (100%)
5	Navigation uses enum constants, never raw strings	1/1 (100%)	1/1 (100%)
6	Optional params marked with ? in param list type	1/1 (100%)	1/1 (100%)
7	Screen components use arrow functions and default exports	1/1 (100%)	1/1 (100%)
8	navigation.navigate() calls are fully typed	1/1 (100%)	0/1 (0%)
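A rough sketch of what the route-enum and typed-param-list assertions are looking for. The route names, the `StakingStackParamList` type, and the `navigate` stand-in are illustrative assumptions; the real app would rely on React Navigation's own typed `navigation.navigate()`.

```typescript
// Hypothetical route names for a staking navigator.
enum STAKING_ROUTES {
  STAKING_HOME = "StakingHome",
  STAKING_DETAILS = "StakingDetails",
}

// Param list keyed by enum members; optional params are marked with `?`.
type StakingStackParamList = {
  [STAKING_ROUTES.STAKING_HOME]: undefined;
  [STAKING_ROUTES.STAKING_DETAILS]: { poolId: string; highlight?: boolean };
};

// Stand-in for a typed navigate(): the params type is derived from the route,
// so a raw string route or a wrong params shape fails to compile.
const navigate = <R extends keyof StakingStackParamList>(
  route: R,
  params: StakingStackParamList[R],
): { route: R; params: StakingStackParamList[R] } => ({ route, params });

const toDetails = navigate(STAKING_ROUTES.STAKING_DETAILS, { poolId: "pool-1" });
const toHome = navigate(STAKING_ROUTES.STAKING_HOME, undefined);
```

With this shape, `navigate("StakingDetails", { poolId: "pool-1" })` is rejected by the compiler, which is the point of "enum constants, never raw strings".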

performance-flatlist

#	Assertion	With Skill	Without Skill
1	FlatList has keyExtractor with stable ID (not array index)	1/1 (100%)	1/1 (100%)
2	FlatList has maxToRenderPerBatch prop	1/1 (100%)	1/1 (100%)
3	FlatList has removeClippedSubviews prop	1/1 (100%)	1/1 (100%)
4	FlatList has windowSize prop	1/1 (100%)	1/1 (100%)
5	List item component wrapped in React.memo()	1/1 (100%)	1/1 (100%)
6	No inline arrow functions in JSX (onPress, etc.)	1/1 (100%)	1/1 (100%)
7	Uses FastImage (not React Native Image) for remote token icons	1/1 (100%)	0/1 (0%)
8	Uses NativeWind className for styling (not StyleSheet.create)	0/1 (0%)	0/1 (0%)
9	Uses useShallow for multi-field Zustand selectors	1/1 (100%)	0/1 (0%)
10	Zustand store accessed via selectors, not full store destructuring	1/1 (100%)	1/1 (100%)
11	renderItem callback wrapped in useCallback	1/1 (100%)	1/1 (100%)

performance-selectors

#	Assertion	With Skill	Without Skill
1	All user-facing text uses t() via useAppTranslation	1/1 (100%)	0/1 (0%)
2	Derived values computed in useMemo	1/1 (100%)	1/1 (100%)
3	Does NOT create inline objects/arrays as props to child components	0/1 (0%)	1/1 (100%)
4	Uses NativeWind className for styling	1/1 (100%)	0/1 (0%)
5	Uses absolute imports	1/1 (100%)	1/1 (100%)
6	Uses arrow function expression for component	1/1 (100%)	1/1 (100%)
7	Uses useShallow from Zustand for selecting multiple fields	1/1 (100%)	0/1 (0%)
8	Zustand stores accessed via specific selectors, NOT destructuring entire store	1/1 (100%)	1/1 (100%)
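Why multi-field selectors need `useShallow`: a selector that returns an object builds a new reference on every call, so a reference-equality check always reports a change. The sketch below shows the idea with a hand-written shallow comparison. It does not use Zustand itself; `shallowEqual`, the state shape, and `selectView` are illustrative assumptions.

```typescript
/** Shallow comparison: same own keys and Object.is-equal values. */
const shallowEqual = (
  a: Record<string, unknown>,
  b: Record<string, unknown>,
): boolean => {
  const keysA = Object.keys(a);
  if (keysA.length !== Object.keys(b).length) return false;
  return keysA.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key]),
  );
};

// Hypothetical store state.
type BalancesState = { balances: string[]; isLoading: boolean; error: string | null };
const state: BalancesState = { balances: ["XLM"], isLoading: false, error: null };

// A multi-field selector returns a fresh object each time it runs.
const selectView = (s: BalancesState) => ({
  balances: s.balances,
  isLoading: s.isLoading,
});

const first = selectView(state);
const second = selectView(state);

// Reference check says "changed" even though nothing changed.
const changedByReference = !Object.is(first, second);
// Shallow check says "same", which is what avoids the wasted re-render.
const sameByShallow = shallowEqual(first, second);
```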

security-storage

#	Assertion	With Skill	Without Skill
1	Does NOT use AsyncStorage directly for keys, seeds, or passwords	1/1 (100%)	1/1 (100%)
2	Does not use hardcoded test keys - references environment variables for test dat...	1/1 (100%)	0/1 (0%)
3	Error handling uses normalizeError() + Sentry	1/1 (100%)	0/1 (0%)
4	Never logs key material even in DEV mode	1/1 (100%)	1/1 (100%)
5	Uses absolute imports throughout	1/1 (100%)	0/1 (0%)
6	Uses dataStorage (AsyncStorage) ONLY for non-sensitive metadata (e.g., lastBacku...	1/1 (100%)	1/1 (100%)
7	Uses secureDataStorage (keychain/keystore) for storing encrypted seed data	1/1 (100%)	1/1 (100%)

security-walletconnect

#	Assertion	With Skill	Without Skill
1	Checks and sets hasRespondedRef before responding	0/1 (0%)	0/0 (0%)
2	Error responses use JSON-RPC format { code: 5000, message: '...' }	0/1 (0%)	0/0 (0%)
3	Implements Blockaid scanning: malicious=auto-reject, suspicious/scan-failed=warn...	0/1 (0%)	0/0 (0%)
4	Never trusts dApp display names/icons for security decisions	1/1 (100%)	0/0 (0%)
5	User-facing strings wrapped in t() via useAppTranslation	1/1 (100%)	0/0 (0%)
6	Uses hasRespondedRef (React ref) to prevent duplicate responses	1/1 (100%)	0/0 (0%)
7	Uses validation functions from walletKitValidation.ts	1/1 (100%)	0/0 (0%)
8	Validates chain matches active Stellar network (stellar:pubnet or stellar:testne...	1/1 (100%)	0/0 (0%)
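The "check and set `hasRespondedRef` before responding" guard can be sketched without React by modelling the ref as a plain `{ current }` object. `respondOnce` and the `Respond` signature are hypothetical; in the app the ref would come from `useRef(false)` and the responder would be the WalletConnect response call.

```typescript
// Same shape as a React ref, so the sketch runs without React.
interface MutableRef<T> {
  current: T;
}

// Hypothetical responder; stands in for the WalletConnect response call.
type Respond = (payload: { approved: boolean }) => Promise<void>;

/**
 * Sends at most one response per request.
 * Returns true if this call responded, false if it was a duplicate.
 */
const respondOnce = async (
  hasRespondedRef: MutableRef<boolean>,
  respond: Respond,
  payload: { approved: boolean },
): Promise<boolean> => {
  if (hasRespondedRef.current) return false; // already answered: drop the duplicate
  hasRespondedRef.current = true; // set BEFORE awaiting so concurrent calls see it
  await respond(payload);
  return true;
};
```

Setting the flag before the `await` matters: if it were set afterwards, a second tap arriving while the first response is in flight would pass the check and respond again.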

styling-card

#	Assertion	With Skill	Without Skill
1	All user-facing text uses t()	1/1 (100%)	0/1 (0%)
2	Checks/uses SDS components from src/components/sds/ where applicable	1/1 (100%)	1/1 (100%)
3	Component wrapped in React.memo()	0/1 (0%)	0/1 (0%)
4	Does NOT use StyleSheet.create	1/1 (100%)	1/1 (100%)
5	Uses FastImage for the remote token icon, with resizeMode specified	0/1 (0%)	0/1 (0%)
6	Uses NativeWind className as primary styling approach	0/1 (0%)	0/1 (0%)
7	Uses absolute imports	1/1 (100%)	1/1 (100%)
8	Uses arrow function expression	1/1 (100%)	1/1 (100%)
9	Uses named export (component, not screen)	1/1 (100%)	0/1 (0%)

testing-zustand

#	Assertion	With Skill	Without Skill
1	Mocks use absolute paths matching import convention	1/1 (100%)	1/1 (100%)
2	Test file in tests/ directory mirroring src/ structure	0/1 (0%)	0/1 (0%)
3	Test file uses .test.ts extension	1/1 (100%)	1/1 (100%)
4	Tests network failure path (reports unexpected errors to Sentry)	0/1 (0%)	0/1 (0%)
5	Tests success path with proper state assertions	1/1 (100%)	1/1 (100%)
6	Tests validation error path (does NOT report expected validation errors to Sentr...	1/1 (100%)	1/1 (100%)
7	Uses renderHook and act from testing utilities	0/1 (0%)	0/1 (0%)
8	Uses useMyStore.setState() for setting up store state	1/1 (100%)	1/1 (100%)

Copilot

Pull request overview

Adds a “freighter mobile best practices” documentation skill and supporting context files to help AI agents and contributors navigate Freighter Mobile’s architecture, tooling, and development conventions.

Changes:

Introduces llms.txt and CLAUDE.md as entry-point context/reference docs for the repo.
Adds docs/skills/freighter-mobile-best-practices/ with a skill definition and focused reference guides (architecture, code style, security, WalletConnect, testing, etc.).

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 16 comments.


File	Description
llms.txt	High-level repo index for docs, dev, testing, and key concepts
CLAUDE.md	Consolidated AI agent / contributor context (tooling, structure, conventions)
docs/skills/freighter-mobile-best-practices/SKILL.md	Skill definition + reference index for best-practices topics
docs/skills/freighter-mobile-best-practices/references/architecture.md	Architecture + layering + duck/store patterns reference
docs/skills/freighter-mobile-best-practices/references/anti-patterns.md	Common mistakes/anti-pattern guidance
docs/skills/freighter-mobile-best-practices/references/code-style.md	Formatting, ESLint/Prettier rules, naming conventions
docs/skills/freighter-mobile-best-practices/references/dependencies.md	Dependency management + native dependency workflow
docs/skills/freighter-mobile-best-practices/references/error-handling.md	Error normalization + store async patterns + WC error responses
docs/skills/freighter-mobile-best-practices/references/git-workflow.md	Branching/commit/PR/release process guidance
docs/skills/freighter-mobile-best-practices/references/i18n.md	i18n framework usage + key structure + lint enforcement notes
docs/skills/freighter-mobile-best-practices/references/navigation.md	Navigator hierarchy + typing + deep links conventions
docs/skills/freighter-mobile-best-practices/references/performance.md	Performance rules/checklist and optimization guidance
docs/skills/freighter-mobile-best-practices/references/security.md	Storage tiers + auth/security-sensitive areas overview
docs/skills/freighter-mobile-best-practices/references/styling.md	NativeWind/SDS/bottom-sheet/modal styling guidance
docs/skills/freighter-mobile-best-practices/references/testing.md	Jest + Maestro structure, commands, and e2e guidance
docs/skills/freighter-mobile-best-practices/references/walletconnect.md	WalletConnect architecture, request handling, and validations


Rewrite AGENTS.md as the single AI agent entry point:
- Add glossary section with domain-specific terminology
- Add documentation link index (replaces llms.txt)
- Remove sections that duplicate best-practices reference files (code style details, branch conventions, PR instructions)
- Keep unique context: repo map, architecture orientation (ducks/nav/WC), security alert list, known complexity/gotchas, pre-submission checklist
- Delete llms.txt (content absorbed into AGENTS.md)
- Delete CLAUDE.md (content absorbed into AGENTS.md)

aristidesstaffieri · 2026-04-09T19:28:18Z

Code review

Found 1 issue:

performance.md claims "No FastImage adoption despite availability" and lists "Adopt FastImage for remote images" as a P1 action item. However, FastImage (@d11/react-native-fast-image) is already adopted and actively used in 4 files: src/components/sds/Token/index.tsx, src/helpers/validateIconUrl.ts, src/ducks/tokenIcons.ts, and src/components/analytics/DebugBottomSheet.tsx. The "Image Optimization Score: 4/10" and the P1 recommendation are stale and will give incorrect guidance.

freighter-mobile/docs/skills/freighter-mobile-best-practices/references/performance.md

Lines 144 to 150 in bab7ce4

    
    ## Image Optimization -- Score: 4/10

    No FastImage adoption despite availability. React Native's default Image has no
    HTTP caching.

    **RULE: Use FastImage for ALL remote images (token icons, NFTs, profile
    images).**

freighter-mobile/docs/skills/freighter-mobile-best-practices/references/performance.md

Lines 218 to 220 in bab7ce4

    
    | **P0**   | Add FlatList optimization props to all lists           | Improves scroll perf on 9 list components | 6/10 → 9/10  |
    | **P1**   | Adopt FastImage for remote images                      | Adds HTTP caching for all images          | 4/10 → 8/10  |
    | **P1**   | Extract 123 inline handlers to useCallback             | Stabilizes reference equality             | 6/10 → 8/10  |

🤖 Generated with Claude Code


Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CassioMG

Another finding — the Reanimated rule is too absolute given existing legacy Animated usage.

CassioMG

Another finding — the Hook Return Memoization example has multiple TypeScript errors.

CassioMG

Another finding — the getItemLayout claim doesn't match codebase reality.

CassioMG

Another finding — the Provider Layer list mentions a non-existent ThemeProvider.

CassioMG

Another finding — enableFreeze is presented as if used but isn't.

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.


CassioMG

Verifying the latest changes — 13 of my outstanding comments addressed in 24d64b3, 7 outstanding comments still open. One new red flag introduced by the SDS examples added in this commit.

….com:stellar/freighter-mobile into lf-add-freighter-mobile-best-practices-skill Co-authored-by: Copilot <copilot@github.com>

CassioMG

One small inconsistency in error-handling.md spotted

CassioMG · 2026-04-29T17:10:40Z

+
+`normalizeError()` feeds directly into Sentry for crash reporting. Always
+normalize errors before sending to Sentry to ensure consistent, actionable
+reports.


[Comment 22 — Suggestion] Sentry Integration section contradicts the earlier guidance

This says:

"normalizeError() feeds directly into Sentry for crash reporting. Always normalize errors before sending to Sentry to ensure consistent, actionable reports."

But the earlier "Error Normalization" section (lines 34-35) says:

"Use logger.error() to report errors — it normalizes and forwards to Sentry internally. Do not call Sentry.captureException() directly."

And the Rules section at line 159:

"Never call Sentry.captureException() directly — go through logger.error()..."

The phrasing in the Sentry Integration section implies the agent should "send to Sentry" themselves after normalizing — which contradicts the rule against calling Sentry directly. An agent reading the bottom section in isolation might still try to call Sentry.captureException(normalizedError).

Suggest collapsing the redundancy:

-## Sentry Integration
-
-`normalizeError()` feeds directly into Sentry for crash reporting. Always
-normalize errors before sending to Sentry to ensure consistent, actionable
-reports.
+## Sentry Integration
+
+Sentry receives normalized errors automatically when you call `logger.error()` —
+the logger normalizes via `normalizeError()` and forwards internally. You don't
+need to call Sentry yourself.

Or delete the section entirely — the same info is already covered in "Error Normalization" and the Rules section.
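To make the intended layering concrete (callers use `logger.error()`; the logger normalizes and forwards; nobody calls Sentry directly), here is a small sketch. It is an assumption-laden illustration: `createLogger`, the `(context, error)` argument order, and the `Reporter` callback are hypothetical, and the repo's real `logger.error()` and `normalizeError()` may have different signatures.

```typescript
// Stand-in for the crash reporter; in the app this role is played by Sentry.
type Reporter = (error: Error, context: string) => void;

// Placeholder normalization; the repo's normalizeError() may behave differently.
const normalizeError = (error: unknown): Error => {
  if (error instanceof Error) return error;
  if (typeof error === "string") return new Error(error);
  return new Error(JSON.stringify(error));
};

// Hypothetical logger factory: the only place that talks to the reporter.
const createLogger = (report: Reporter) => ({
  error: (context: string, error: unknown): Error => {
    const normalized = normalizeError(error); // normalize first...
    report(normalized, context); // ...then forward internally
    return normalized;
  },
});

// Call sites only ever see logger.error(); the reporter stays hidden.
const reported: string[] = [];
const logger = createLogger((error, context) => {
  reported.push(`${context}: ${error.message}`);
});

logger.error("fetchBalances", "timeout");
logger.error("signTransaction", new Error("user rejected"));
```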

* fix(troubleshooting): add emulator recreation tip for persistent memory issues
  Android Studio sometimes doesn't apply RAM changes to existing AVDs reliably; recreating the emulator is the definitive fix.
* cleanup runtime information and clipboard not pasting
* Update code review troubleshooting comments
* cleanup references naming
* clean up troubleshooting guide for stable xcode and IDE config
* breakdown commands on troubleshooting guide for better readability
* remove xcode 26 regression issue

CassioMG

Just one tiny reminder before merging.

CassioMG · 2026-04-29T18:53:38Z

@@ -0,0 +1,442 @@
+# Troubleshooting Guide: Freighter Mobile
+
+_Last updated: 2026-04-08_


[Comment 23 — Nit] Update the "Last updated" date before merging

This says 2026-04-08 but the guide has been iterated through 2026-04-29. Worth bumping to today's date so the staleness signal is accurate.

leofelix077 · 2026-05-04T13:52:37Z

Benchmark Report — 4-Config Comparison

Structured after stellar/freighter#2687 comment: how much does each layer of guidance add over a cold-start agent with no project context?

4 configurations, 2 independent agents per new config, same 4 eval tasks,
same 41 binary assertions (15 + 8 + 11 + 7).

Baseline and Minimal refs are new runs (May 2026).
Full refs and With skill reuse the Refs-only/With-skill run from the
prior benchmark (same assertions, same eval tasks).

## What each config gets

Config A — Baseline (no skill, no refs):
  Task prompt only. Agent may read existing source code to understand
  patterns but receives no explicit best-practices guidance. No SKILL.md,
  no reference files, no Quick Rules.

Config B — Minimal refs (Quick Rules only, ~30 lines):
  Task prompt + the 13-rule Quick Rules section from SKILL.md. No full
  reference files. Represents the minimum targeted guidance covering the
  most commonly missed patterns.

Config C — Full refs (all 13 reference files):
  Task prompt + AGENTS.md routing directly to all 13 reference files (no
  SKILL.md, no Quick Rules primer). Simulates the state after deleting the
  skill mechanism and moving docs/skills/.../references/ → docs/best-
  practices/ as proposed in the extension PR comment.

Config D — With skill:
  Task prompt + AGENTS.md as-is → SKILL.md (Quick Rules + routing table) →
  relevant reference files. The full skill mechanism.

All agents had access to the same source code on disk.

## Results

  Config              Eval1  Eval2  Eval3  Eval4    Total
  Baseline (A)        10/15   7/8    9/11   4/7    30/41 (73%)
  Minimal refs (B)    13/15   8/8    8/11   6/7    35/41 (85%)
  Full refs (C)       14/15   8/8    8/11   6/7    36/41 (88%)
  With skill (D)      14/15   8/8    8/11   6/7    36/41 (88%)

  A→B delta: +5 assertions (+12pp)  — Quick Rules alone
  B→C delta: +1 assertion  (+3pp)   — full reference docs over Quick Rules
  C→D delta: 0 assertions  (0pp)    — skill mechanism over full refs
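The percentages and deltas in the table can be reproduced with a few lines. Note one assumption about how the report rounds: the stated deltas match differences of the already-rounded percentages (88 − 85 = 3), whereas the unrounded B→C difference is one assertion out of 41, about 2.4pp.

```typescript
const TOTAL_ASSERTIONS = 41;

// Passed-assertion totals taken from the Results table above.
const passed = { A: 30, B: 35, C: 36, D: 36 };
type Config = keyof typeof passed;

/** Pass rate as a whole-number percentage, as shown in the table. */
const passRatePct = (config: Config): number =>
  Math.round((passed[config] / TOTAL_ASSERTIONS) * 100);

/** Delta in percentage points between two rounded pass rates. */
const deltaPp = (from: Config, to: Config): number =>
  passRatePct(to) - passRatePct(from);

/** Unrounded delta, for comparison with the rounded figure. */
const exactDeltaPp = (from: Config, to: Config): number =>
  ((passed[to] - passed[from]) / TOTAL_ASSERTIONS) * 100;
```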

## What the Quick Rules uniquely fixed (A→B)

  Assertion                          A    B    Root cause
  Toast (not Alert.alert)            ❌   ✅   Explicit rule: "never Alert.alert"
  FastImage (not Image)              ❌   ✅   Explicit rule: "FastImage for all remote images"
  useShallow for multi-field selectors ❌   ✅   Explicit rule: "Multi-field selectors require useShallow"
  Zustand via selectors              ❌   ✅   Explicit rule: "via selectors, not destructuring"
  normalizeError in catch            ❌   ✅   Explicit rule: "normalizeError() for error message"
  hasRespondedRef guard check        ❌   ✅   Explicit rule: "always check and set hasRespondedRef"
  Blockaid user-decides              ❌   ✅   Explicit rule: "malicious → warning, user decides"

  7 assertions gained; 2 regressions offset the gain slightly:
  - Minimal refs (B) failed StyleSheet.create (used for fixed-size image — not
    dynamic — but the agent internalized the rule narrowly). Baseline didn't
    trigger this because it used RN Image with className.
  - usdTotal not in useMemo: failed in both A and B (rule not in Quick Rules).

## What both A and B failed — still failing in C and D

  Assertion                       A    B    C    D    Root cause
  usdTotal in useMemo             ❌   ❌   ❌   ❌   Rule not in Quick Rules.
                                                       Only Eval 1 references
                                                       it but no config
                                                       reliably applied it.
  Default export for screen       ✅   ✅   ❌   ❌   Hallucination: C and D
                                                       verbally acknowledge
                                                       export default but write
                                                       export const. A and B
                                                       correctly exported it
                                                       this run (N=1 variance).
  FlatList windowSize             ✅   ❌†  ❌†  ❌†  B/C/D used FlashList for
  FlatList maxToRenderPerBatch    ✅   ❌†  ❌†  ❌†  50-200 items (correct per
  FlatList removeClippedSubviews  ✅   ❌†  ❌†  ❌†  docs). A used FlatList
                                                       (correct for its range)
                                                       and passed. Assertions
                                                       penalise the right answer
                                                       for B/C/D.
  walletKitValidation.ts          ❌   ❌   ❌   ❌   All configs ignored it.
                                                       Quick Rules say "validate
                                                       with walletKitValidation.ts"
                                                       but neither the rule nor
                                                       the reference file give a
                                                       concrete function example.

## The reference files and the skill add little over Quick Rules

  Layer                   Cumulative score   Delta over previous
  Baseline                30/41  (73%)       —
  + Quick Rules (~30 ln)  35/41  (85%)       +12pp
  + Full ref files        36/41  (88%)       +3pp
  + Skill mechanism       36/41  (88%)       +0pp

  The Quick Rules provide the largest single lift. The 13 reference files add
  only 1 assertion (+3pp) on top — specifically the default export rule for
  screens, which the reference files encode but the Quick Rules primer also
  explicitly state (the difference here is likely N=1 variance rather than a
  structural advantage of the reference files).

  The skill mechanism adds zero measurable value over full refs at N=1.
  This matches the extension PR prediction and the prior Refs-only vs
  With-skill run (88% vs 88%, 0pp).

## Persistent gap: walletKitValidation.ts

  Every config (baseline, minimal refs, full refs, with skill) failed to call
  walletKitValidation.ts functions for the new handler. The Quick Rules say
  "Validate all request parameters with functions from walletKitValidation.ts
  before processing" but neither that rule nor the walletconnect.md reference
  give a concrete example of which specific function to call. Without an
  example, agents default to inline validation or delegation to the existing
  approveSessionRequest.

## Persistent gap: useMemo for derived values

  usdTotal / derived computed values are not in useMemo across all configs.
  The Quick Rules have no explicit rule for this; the reference files mention
  it in context but it does not reliably transfer.

## Proposed path forward

1. The Quick Rules are the highest-value layer: 30 lines, +12pp lift over
   baseline. They should be kept and applied even if the full skill mechanism
   is removed.

2. The full reference files add marginal value (+3pp) over Quick Rules at N=1.
   At N=1 the 95% CI is ~±15pp, so this 3pp difference is within noise.
   Multiple runs would establish whether the delta is real.

3. The skill mechanism (routing layer + SKILL.md) adds zero measurable value
   over just having reference files accessible directly. The extension PR
   restructure (delete SKILL.md, move refs to docs/best-practices/, update
   AGENTS.md routing) is supported by this data — but the Quick Rules should
   be preserved somewhere in the routing chain.

4. Update Eval 3 assertions to be FlashList-aware: if the agent used FlashList,
   check for estimatedItemSize instead of windowSize/maxToRenderPerBatch.
   Current assertions penalise the architecturally correct decision for any
   config that follows the ">100 items → FlashList" guidance.

5. Add a concrete before/after example to the walletKitValidation.ts rule
   (one function call showing which validator to use for XDR). The existing
   mandatory language in walletconnect.md is not enough — agents skip it
   without an example to pattern-match.

6. Add a useMemo rule to the Quick Rules: "Derived values (totals, filters,
   transformations) computed from store data → wrap in useMemo."
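On item 2, the report does not say how the ~±15pp figure was derived. One common way to get a number of that order is the normal-approximation interval for a proportion, sketched below. It treats the 41 assertions of a single run as independent trials, which is a simplifying assumption (assertions within one eval are likely correlated), so the result is only a rough guide.

```typescript
const Z_95 = 1.96; // two-sided 95% critical value of the standard normal

/**
 * Half-width of the normal-approximation 95% interval for a pass rate,
 * in percentage points: 1.96 * sqrt(p * (1 - p) / n).
 */
const ciHalfWidthPp = (passedCount: number, total: number): number => {
  const p = passedCount / total;
  return Z_95 * Math.sqrt((p * (1 - p)) / total) * 100;
};

// Scores from the report: Baseline 30/41, With skill 36/41.
const baselineHalfWidth = ciHalfWidthPp(30, 41);
const withSkillHalfWidth = ciHalfWidthPp(36, 41);
```

Under these assumptions the half-width comes out at roughly 10 to 14pp depending on the pass rate, the same order of magnitude as the report's estimate and well above the 3pp B→C difference.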

Per-Eval Breakdown

Eval 1 — code-style-naming

Assertion	Baseline	Minimal refs	Full refs	With skill
Arrow fn expression	✅	✅	✅	✅
Default export (it's a screen)	✅	✅	❌	❌
Absolute imports	✅	✅	✅	✅
List item in `React.memo()`	✅	✅	✅	✅
No hardcoded colors	✅	✅	✅	✅
Replaces `Alert.alert` with Toast	❌	✅	✅	✅
Replaces `Image` with `FastImage`	❌	✅	✅	✅
Replaces `ScrollView+map` with `FlatList`	✅	✅	✅	✅
`StyleSheet.create` → NativeWind `className`	✅	❌	✅	✅
Hardcoded strings → `t()`	✅	✅	✅	✅
Stable key (not index)	✅	✅	✅	✅
`useAppTranslation`, not raw `useTranslation`	✅	✅	✅	✅
`useShallow` for multi-field selectors	❌	✅	✅	✅
Zustand via selectors, not destructuring	❌	✅	✅	✅
`usdTotal` in `useMemo`	❌	❌	❌	❌
Score	10/15	13/15	14/15	14/15

Default export: A and B correctly exported this run; C and D produced the export-default hallucination (verbally correct, code wrong). N=1 variance.
usdTotal in useMemo: failed in all 4 configs — not in Quick Rules.

Eval 2 — architecture-zustand

Assertion	Baseline	Minimal refs	Full refs	With skill
`{ isLoading: true, error: null }` before `try`	✅	✅	✅	✅
`normalizeError()` from `config/logger`	❌	✅	✅	✅
`logger.error()`, not `console.error`	✅	✅	✅	✅
Error via Toast pattern (store sets `error`, component calls `showToast`)	✅	✅	✅	✅
No direct store mutations	✅	✅	✅	✅
`create<StoreState>()` with typed interface	✅	✅	✅	✅
Named export for store hook	✅	✅	✅	✅
Absolute imports	✅	✅	✅	✅
Score	7/8	8/8	8/8	8/8

Baseline missed normalizeError — wrote manual err.message extraction instead. All other configs used it correctly.

Eval 3 — performance-flatlist

Assertion	Baseline	Minimal refs	Full refs	With skill
`keyExtractor` stable ID	✅	✅	✅	✅
`maxToRenderPerBatch`	✅	❌ †	❌ †	❌ †
`removeClippedSubviews`	✅	❌ †	❌ †	❌ †
`windowSize`	✅	❌ †	❌ †	❌ †
List item in `React.memo()`	✅	✅	✅	✅
No inline arrow functions in JSX	✅	✅	✅	✅
`FastImage` for remote icons	✅	✅	✅	✅
NativeWind `className`	✅	✅	✅	✅
`useShallow` for multi-field selectors	❌	✅	✅	✅
Zustand via selectors	❌	✅	✅	✅
`renderItem` in `useCallback`	✅	✅	✅	✅
Score	9/11	8/11	8/11	8/11

† B/C/D used FlashList for 50–200 items — the correct decision per the ">100 items → FlashList" guidance. windowSize, maxToRenderPerBatch, and removeClippedSubviews are FlatList-only props; the assertions penalise the correct architectural decision for any config that follows the guidance.

Baseline (A) used FlatList (no guidance, defaulted to the familiar component) and happened to include all three performance props — scoring higher on Eval 3 than the guided configs while missing useShallow and Zustand selectors.
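A FlashList-aware version of these assertions could branch on which list component the agent chose. The sketch below is one possible shape; the `ListUsage` record and `missingListProps` helper are hypothetical, since the eval harness format is not shown in this thread. The prop names come from the report: the three FlatList props above, and `estimatedItemSize` for FlashList.

```typescript
type ListComponent = "FlatList" | "FlashList";

// Hypothetical summary of what the grader extracted from the agent's output.
interface ListUsage {
  component: ListComponent;
  props: string[];
}

// Props each component is expected to set, per the report's proposal.
const REQUIRED_PROPS: Record<ListComponent, string[]> = {
  FlatList: ["windowSize", "maxToRenderPerBatch", "removeClippedSubviews"],
  FlashList: ["estimatedItemSize"],
};

/** Returns the required props that are absent for the chosen component. */
const missingListProps = (usage: ListUsage): string[] =>
  REQUIRED_PROPS[usage.component].filter(
    (prop) => usage.props.indexOf(prop) === -1,
  );
```

With this check, a FlashList answer that sets `estimatedItemSize` passes, while a FlatList answer is still held to the three virtualization props.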

Eval 4 — security-walletconnect

Assertion	Baseline	Minimal refs	Full refs	With skill
Checks and sets `hasRespondedRef` before responding	❌	✅	✅	✅
Blockaid: `malicious`/`suspicious`/`scan-failed` → warning, user decides; `benign` → proceed	❌	✅	✅	✅
Never trusts dApp display names for security	✅	✅	✅	✅
User-facing strings via `t()` / `useAppTranslation`	✅	✅	✅	✅
`hasRespondedRef` is a React `useRef`	✅	✅	✅	✅
Uses validation functions from `walletKitValidation.ts`	❌	❌	❌	❌
Validates chain matches active Stellar network	✅	✅	✅	✅
Score	4/7	6/7	6/7	6/7

Baseline auto-rejected malicious Blockaid results rather than letting the user decide. The Quick Rules state "user decides" explicitly; all guided configs handled this correctly.
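The Blockaid rule as stated in the Eval 4 assertion (`malicious` / `suspicious` / `scan-failed` lead to a warning where the user decides; `benign` proceeds) reduces to a small mapping. The type and function names below are illustrative, not the repo's.

```typescript
type ScanResult = "benign" | "suspicious" | "malicious" | "scan-failed";
type Decision = "proceed" | "warn-user-decides";

/**
 * Maps a scan result to the next step. Nothing is auto-rejected:
 * every non-benign outcome shows a warning and leaves the choice to the user.
 */
const decideFromScan = (result: ScanResult): Decision => {
  switch (result) {
    case "benign":
      return "proceed";
    case "suspicious":
    case "malicious":
    case "scan-failed":
      return "warn-user-decides";
  }
};
```

The baseline agent's behaviour corresponds to returning a third "auto-reject" outcome for `malicious`, which is exactly what the corrected Quick Rules rule out.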

walletKitValidation.ts: failed in all 4 configs. The rule exists in both Quick Rules and walletconnect.md but no concrete function example is given. Agents defaulted to inline XDR string validation or delegation to approveSessionRequest.

Post-benchmark doc updates (commit `0d9541b`)

Three changes were committed after the Refs-only/With-skill run, addressing
findings from the report and the Blockaid doc conflict discovered during setup:

Change	Addresses
SKILL.md — Quick Rules added	Formalises the 13 primer rules into SKILL.md.
walletconnect.md — mandatory walletKitValidation.ts language	walletKitValidation.ts rule now bolded and required. No concrete per-function example yet.
error-handling.md — store/toast pattern example added	Adds code example for the store-sets-error / component-calls-showToast pattern.
SKILL.md — Blockaid fix	Corrects Quick Rules to match code: malicious → user decides (not auto-reject).

Outstanding from proposed path forward:

Items 1–3: not yet acted on (restructure, more iterations, FlashList evals)
Item 5 partially done: mandatory language added; no concrete function example
Items 4 and 6: new items added from the 4-config run above

leofelix077 · 2026-05-04T13:56:11Z

@CassioMG unlike the extension, mobile doesn't seem to show much difference between skill and bare refs, so I moved them out of the skills folder and left only the AGENTS.md file

CassioMG

One small consistency finding from the final pass.

CassioMG · 2026-05-04T21:40:44Z

+```tsx
+if (hasRespondedRef.current) return;
+hasRespondedRef.current = true;
+await walletKit.respondSessionRequest({ ... });


[Comment 24 — Suggestion] Use the rejectSessionRequest helper here for consistency with error-handling.md

The hasRespondedRef anti-replay example here still uses the raw walletKit.respondSessionRequest({ ... }), but error-handling.md:165-168 (updated in commit 06ffd5cb) uses the rejectSessionRequest helper for the exact same pattern:

// error-handling.md
if (hasRespondedRef.current) return;
hasRespondedRef.current = true;
await rejectSessionRequest({ sessionRequest, message });

The helper was added in Comment 15 to keep callers from re-implementing the JSON-RPC error structure. Worth aligning walletconnect.md to the same pattern so both files show consistent guidance.

Suggested:

+import { rejectSessionRequest } from "helpers/walletKitUtil";
+
 if (hasRespondedRef.current) return;
 hasRespondedRef.current = true;
-await walletKit.respondSessionRequest({ ... });
+await rejectSessionRequest({ sessionRequest, message });

CassioMG · 2026-05-04T21:42:59Z

@CassioMG unlike the extension, mobile doesn't seem to show much difference between skill and bare refs, so I moved them out of the skills folder and left only the AGENTS.md file

@leofelix077 removing the SKILL sounds good to me, thanks for running the benchmark.

I think the PR should be good to merge as soon as the 3 open comments are resolved (here, here and here).

Could you please update the PR title and description to reflect the current state of the PR now that it's not adding a SKILL? Thanks

leofelix077 · 2026-05-05T17:26:26Z

@CassioMG made the adjustments. Will merge it then.

add v1 of freighter mobile best practices skill

1ca3f2d

Copilot AI review requested due to automatic review settings April 9, 2026 16:15

leofelix077 self-assigned this Apr 9, 2026

Copilot AI reviewed Apr 9, 2026


leofelix077 added 3 commits April 9, 2026 13:52

adjust styling and comments from copilot review

808fe43

Adjust code review and update agentic files best practices

8914c2a

leofelix077 requested review from CassioMG, aristidesstaffieri and piyalbasu April 13, 2026 13:30

minor adjustments from copilot code review

8fc38b5