Skip to content

feat: reset phone auth rate limits#157

Open
k9ert wants to merge 16 commits into
mainfrom
kn/redis_key_deletion
Open

feat: reset phone auth rate limits#157
k9ert wants to merge 16 commits into
mainfrom
kn/redis_key_deletion

Conversation

@k9ert

@k9ert k9ert commented Jul 29, 2025

Copy link
Copy Markdown

If a user tries to log in with their phone number too often, they can get blocked by phone-based Redis rate limits. Support needs a narrow admin API operation to unblock that phone auth flow without exposing arbitrary Redis key search or deletion.

This PR:

  • Adds a phoneRateLimitReset admin GraphQL mutation.
  • Resets only the two phone auth limiters via the existing IRateLimitService reset implementation.
  • Uses the existing Phone scalar and SuccessPayload response shape.
  • Adds local dev helper scripts for admin API access.
  • Adds focused unit and integration coverage for the app and GraphQL paths.

Validation run locally:

TEST=./test/unit/app/admin/reset-phone-rate-limit.spec.ts pnpm test:unit
nix develop -c buck2 test //dev:check-sdls

@github-actions github-actions Bot added the core label Jul 29, 2025
@k9ert k9ert requested a review from dolcalmi July 29, 2025 10:57

@dolcalmi dolcalmi left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check this keybase://chat/blinkbtc#dev/1819

@k9ert k9ert changed the title feat: search and remove redis keys feat: search and reset redis limiter keys Jul 29, 2025
@openoms openoms marked this pull request as draft August 6, 2025 08:27
@k9ert k9ert force-pushed the kn/redis_key_deletion branch from 5983ae0 to f21c53d Compare May 18, 2026 09:47
@k9ert k9ert requested a review from Copilot May 19, 2026 07:19

@blink-claw-bot blink-claw-bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: feat: search and reset redis limiter keys

Summary

Clean, well-structured PR that adds an admin mutation to reset phone-based rate limiter keys in Redis. Follows the repo's hexagonal architecture correctly (domain config → service → app → GraphQL). Security model is sound — only hardcoded limiter key prefixes can be deleted, behind SYSTEM_CONFIG access control.


HIGH

None.

MEDIUM

1. PR title/description says "search and reset" but only reset is implemented

The description states "Creates two admin-endpoints to search and remove limiter based redis-keys" — but the diff only contains one endpoint (phoneRateLimitReset mutation). There's no search/query endpoint. If search is planned for a follow-up, the PR description should be updated to reflect what's actually in this PR. If it was intended to be here, it's missing.

2. Unit test doesn't mock @/services/tracing

reset-phone-rate-limit.ts imports addAttributesToCurrentSpan from @/services/tracing. The unit test mocks @/domain/rate-limit and @/services/rate-limit but not @/services/tracing. If there's no global auto-mock for tracing in the test setup, this will fail when the module tries to load OpenTelemetry internals. Worth verifying the unit test actually passes as-is.

LOW

3. nit: get-access-token.sh echoes client secret to stdout

Lines 49-50 print ADMIN_API_CLIENT_SECRET to the terminal. Fine for local dev, but could surprise someone running this in a CI log or shared screen. Consider gating behind a --verbose flag or removing it.

4. nit: setup-service.sh sources dev/helpers/auth.sh and calls login_user but doesn't use broker_token

The script calls login_user "broker_token" "+16505554336" "000000" — the resulting token variable isn't used anywhere in the script. Is this a leftover or a prerequisite side-effect (e.g., creating the user in Kratos)? A comment explaining why would help.


What looks good

  • Security: Rate limit reset is properly scoped — only requestCodeAttemptPerPhoneNumber and loginAttemptPerLoginIdentifier configs, no arbitrary Redis key access. Input validated via the Phone scalar. Protected by systemConfig access rule.
  • Testing: Good coverage — unit tests with mocks verify the happy path and error handling; integration tests verify actual Redis behavior and the full GraphQL flow including invalid input.
  • Architecture: Follows the repo's layered conventions. App layer is thin, GraphQL types are cleanly separated, tracing span attributes are added.
  • GraphQL schema: PhoneRateLimitResetPayload follows existing patterns with errors + success.

Verdict

Solid PR. The missing search endpoint (MEDIUM #1) should be addressed — either update the PR description or add the endpoint. No blocking code issues.

🤖 Review by Claude Opus 4 via Blink-Claw-Bot

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an admin GraphQL mutation + supporting app logic to reset phone-number-based Redis rate limiter state (useful for unblocking users who are rate-limited), along with local-dev helper scripts, Nix tooling, and tests.

Changes:

  • Added phoneRateLimitReset admin GraphQL mutation + input/payload types and wired it into admin mutations/permissions.
  • Implemented Admin.resetPhoneRateLimit app function to reset the two phone-related limiters.
  • Added unit + integration tests, plus local dev scripts and added redis to flake.nix.

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
flake.nix Adds redis to the dev shell/tooling set.
dev/bin/setup-service.sh Creates a Hydra client and writes .env.local for local admin API access.
dev/bin/get-access-token.sh Fetches an OAuth2 access token using the client credentials from .env.local.
core/api/src/app/admin/reset-phone-rate-limit.ts App-layer function to reset phone-related rate limit keys.
core/api/src/app/admin/index.ts Exposes the new admin app function via Admin exports.
core/api/src/graphql/admin/root/mutation/phone-rate-limit-reset.ts New admin GraphQL mutation resolver for resetting phone rate limits.
core/api/src/graphql/admin/mutations.ts Registers the new mutation and applies the systemConfig access rule.
core/api/src/graphql/admin/types/object/phone-rate-limit-reset-input.ts Adds GraphQL input type for the mutation.
core/api/src/graphql/admin/types/payload/phone-rate-limit-reset.ts Adds GraphQL payload type for the mutation response.
core/api/src/graphql/admin/schema.graphql Updates the admin schema SDL with the new mutation + types.
core/api/test/unit/app/admin/reset-phone-rate-limit.spec.ts Unit tests for the app-layer reset behavior.
core/api/test/integration/app/admin/reset-phone-rate-limit.spec.ts Integration test verifying Redis limiter reset behavior at the app layer.
core/api/test/integration/graphql/admin/phone-rate-limit-reset.spec.ts Integration test verifying the GraphQL mutation resets both limiters and handles invalid phone.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread core/api/src/app/admin/reset-phone-rate-limit.ts Outdated
Comment thread core/api/src/graphql/admin/types/payload/phone-rate-limit-reset.ts Outdated
Comment thread core/api/src/graphql/admin/root/mutation/phone-rate-limit-reset.ts
@k9ert k9ert changed the title feat: search and reset redis limiter keys feat: reset phone auth rate limits May 19, 2026
@k9ert

k9ert commented May 19, 2026

Copy link
Copy Markdown
Author

Addressing Blink-Claw-Bot review items:

  • PR title/description mentioned search: updated title/body to reflect the narrowed scope. This PR now only exposes phoneRateLimitReset; no Redis search or arbitrary key removal remains.
  • Unit test/tracing concern: removed the @/services/tracing import from reset-phone-rate-limit.ts, so the unit test no longer loads tracing and no phone PII is recorded.
  • get-access-token.sh printed client credentials: removed the client ID/secret output.
  • setup-service.sh broker_token side effect: added a short comment explaining the login primes the local dev auth stack before creating the admin API client; also removed the raw Hydra client response from stdout because it contains the secret.

Validation after these changes:

TEST=./test/unit/app/admin/reset-phone-rate-limit.spec.ts pnpm test:unit
nix develop -c buck2 test //dev:check-sdls

@k9ert k9ert marked this pull request as ready for review May 19, 2026 08:35
@k9ert k9ert requested a review from dolcalmi May 19, 2026 08:36
@k9ert k9ert requested a review from blink-claw-bot June 9, 2026 13:15

@blink-claw-bot blink-claw-bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: feat: reset phone auth rate limits

Summary

Re-reviewed current head c92a098 after the latest follow-up commit. The prior review comments look addressed: the PR description now matches the narrow reset-only behavior, the stale tracing concern no longer applies to the current implementation, and CI is green.

Findings

None.

Notes

  • Reset scope is narrow: only requestCodeAttemptPerPhoneNumber and loginAttemptPerLoginIdentifier are reset.
  • The admin mutation is gated behind SYSTEM_CONFIG, which is appropriate for support/admin recovery tooling.
  • The Phone scalar validation and SuccessPayload response shape match existing GraphQL patterns.
  • Unit and integration coverage are present for both the app service and GraphQL path.

CI checked via GitHub: all required jobs passing.

🤖 Review by GPT-5 via Blink-Claw-Bot

@dolcalmi dolcalmi left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, just minor change and some questions... dont forget to run buck2 build //dev:supergraph

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an e2e test, move to bats

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this coverage into BATS in bats/core/api/admin.bats and added bats/admin-gql/phone-rate-limit-reset.gql. The Apollo integration spec was removed. Also ran buck2 build //dev:supergraph successfully.

Comment thread flake.nix Outdated
grpcurl
buf
netcat
redis

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this required by scripts?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed redis from flake.nix. The BATS test uses the existing redis_cli helper, which executes redis-cli inside the Redis container, so no host Redis package is needed.

Comment thread dev/bin/get-access-token.sh Outdated

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this? is not enough with admin helpers?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed dev/bin/get-access-token.sh. The existing admin BATS helpers already create Hydra clients/tokens and exec_admin_graphql handles authenticated admin API calls.

@blink-claw-bot blink-claw-bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: feat: reset phone auth rate limits

Summary

Re-reviewed current head 47721ae after the June 10 follow-up commits. The post-approval changes move the GraphQL coverage into BATS, remove the extra helper script/flake change, keep generated credentials out of stdout, and pin shell-quote for the audit advisory.

HIGH

None.

MEDIUM

None.

LOW

None.

Notes

  • Reset scope remains narrow: only requestCodeAttemptPerPhoneNumber and loginAttemptPerLoginIdentifier are reset via the existing limiter service; there is no exposed arbitrary Redis key search/deletion.
  • Admin authorization is gated by SYSTEM_CONFIG, with the existing privileged client/scope checks.
  • Phone input goes through the existing Phone scalar, and the app path no longer traces/logs the phone number.
  • Test coverage now includes the app-layer Redis reset path plus admin GraphQL/BATS coverage for success and invalid-phone behavior.
  • CI checked via GitHub: all reported jobs are passing.

Model: GPT-5

Review by Blink-Claw-Bot

@k9ert k9ert force-pushed the kn/redis_key_deletion branch from 47721ae to 75b6ade Compare June 23, 2026 08:30
@k9ert k9ert force-pushed the kn/redis_key_deletion branch from 978bb60 to 6c7f829 Compare June 30, 2026 09:29
@blink-claw-bot

Copy link
Copy Markdown

Critical re-review of current head 6c7f829, focusing on the changes after dolcalmi's CHANGES_REQUESTED review on c92a098.

I would not treat the current PR as cleanly scoped yet. The core phone-rate-limit reset implementation looks narrow and reasonable, but the PR now carries unrelated work.

Findings:

  • Out of scope: apps/pay/components/parse-pos-payment/index.tsx:53 and :143
    This is a Pay POS route/query-param behavior fix. It may be a good fix, but it is not related to resetting phone auth rate limits and should be split into its own PR.

  • Out of scope: apps/pay/cypress/e2e/point-of-sale.cy.ts:3-63
    This adds POS username/default-wallet fixture setup. Useful test stabilization, but not part of the admin phone-rate-limit reset feature.

  • Out of scope / suspicious CI workaround: dev/core-bundle/integration-env.json:19 and dev/core-bundle/serve-env.json:19
    This changes LNURL_SERVER_INTERNAL_URL to an empty string. That changes LNURL integration behavior in local/core-bundle envs and is unrelated to this PR. I would split or justify this separately.

  • Borderline but defensible: the Cypress flushRedis() -> resetAuthRateLimits(identifier) changes across consent/dashboard/pay/voucher, for example apps/pay/cypress/support/commands.ts:59-71.
    This is auth-rate-limit related and less destructive than FLUSHALL, so it can be justified if the PR description explicitly says the E2E auth setup now uses targeted auth limiter resets. It is still broader than the production mutation.

  • Still slightly questionable: dev/bin/setup-service.sh:1-24 remains.
    The old get-access-token.sh helper was removed, but this script still creates local admin client credentials. It is related to exercising the admin API locally, but not necessary for the production feature. I would either explain why it belongs here or split it.

What looks good:

  • core/api/src/app/admin/reset-phone-rate-limit.ts:7-20 resets only requestCodeAttemptPerPhoneNumber and loginAttemptPerLoginIdentifier.
  • core/api/src/graphql/admin/root/mutation/phone-rate-limit-reset.ts uses the shared SuccessPayload and existing error mapping.
  • BATS coverage in bats/core/api/admin.bats:173-194 checks success and invalid-phone behavior.
  • Current CI is green.

Recommendation: remove/split the Pay POS changes and the LNURL env changes before merge. The Cypress targeted Redis reset changes can stay only if the PR description calls that out as test-auth stabilization for this feature.

🤖 Review by GPT-5 Codex via Blink-Claw-Bot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants