Skip to content

[ENG-158] feat: Abuse protection for OTP based login#3632

Open
praffq wants to merge 2 commits intodevelopfrom
ENG-158-otp-abuse-protection
Open

[ENG-158] feat: Abuse protection for OTP based login#3632
praffq wants to merge 2 commits intodevelopfrom
ENG-158-otp-abuse-protection

Conversation

@praffq
Copy link
Copy Markdown
Contributor

@praffq praffq commented May 4, 2026

Proposed Changes

  • enchancements to enforce more strict control over OTP

Associated Issue

  • Link to issue here, explain how the proposed solution will solve the reported issue/ feature request.

Architecture changes

  • Remove this section if not used

Merge Checklist

  • Tests added/fixed
  • Update docs in /docs
  • Linting Complete
  • Any other necessary step

Only PR's with test cases included and passing lint and test pipelines will be reviewed

@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins

Summary by CodeRabbit

  • New Features

    • Stronger OTP protections: per-phone lockout after repeated failures, per-window send limits, and automatic invalidation of OTPs after too many wrong attempts.
    • Latest unused OTP is enforced for verification; successful verify marks it used and returns access.
  • Chores

    • Added persistent tracking of failed OTP attempts and new OTP-related configuration controls.
  • Tests

    • Comprehensive tests covering send/login flows, rate limits, lockout timing, and edge cases.

@praffq praffq requested a review from a team as a code owner May 4, 2026 08:32
Copy link
Copy Markdown

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

📝 Walkthrough

Walkthrough

OTP send/login now enforce per-phone send limits, aggregated failure-based lockouts, and per-OTP failed-attempt tracking. Login verification uses transactions and row-level locks to mark OTPs used or increment failures, with new validation paths for exhausted attempts.

Changes

OTP Login Rate-Limiting and Failure Tracking

Layer / File(s) Summary
Configuration
config/settings/base.py
Removed repeat-window settings and added OTP_SEND_WINDOW_MINUTES, OTP_MAX_SENDS_PER_WINDOW, OTP_MAX_VERIFY_ATTEMPTS, OTP_MAX_FAILURES, OTP_LOCKOUT_MINUTES (keeps OTP_LENGTH = 5).
Data Model
care/facility/models/patient.py
Added failed_attempts = models.PositiveSmallIntegerField(default=0) to PatientMobileOTP.
Database Schema
care/facility/migrations/0485_patientmobileotp_failed_attempts.py
Migration adds failed_attempts column (PositiveSmallIntegerField, default=0) to patientmobileotp.
Core Implementation
care/emr/api/otp_viewsets/login.py
Added OTPLoginView.failure_count(phone_number) aggregating failed_attempts over lockout window. send now enforces per-window send caps and rejects when failure_count ≥ OTP_MAX_FAILURES. login also checks failure_count, verifies latest unused OTP within transaction.atomic() + select_for_update(), marks OTP used on success, increments failed_attempts on mismatch, marks used and returns field-specific “request a new OTP” error when per-OTP verify attempts reach OTP_MAX_VERIFY_ATTEMPTS.
Tests
care/emr/tests/test_otp_login.py
New OTPLoginFlowTests covers send/create behavior, successful login and OTP marking, OTP reuse prevention, per-OTP verify caps, cross-OTP aggregated lockout, time-window unlocking, send-rate limits, multi-OTP precedence, and no-active-OTP edge case.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description is incomplete. While it mentions OTP control enhancements, it lacks specifics about the changes, doesn't include the issue link, and the merge checklist remains entirely unchecked despite comprehensive test additions. Expand the 'Proposed Changes' section with concrete details about failure tracking, lockout windows, and rate limiting; add the associated issue link; and ideally mark completed checklist items like 'Tests added/fixed' as done.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and specifically describes the main change: implementing abuse protection for OTP-based login, which aligns with the substantial refactoring of OTP throttling, lockout mechanisms, and failure tracking throughout the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ENG-158-otp-abuse-protection

Review rate limit: 4/5 reviews remaining, refill in 12 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.68%. Comparing base (589baaa) to head (45b86b9).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3632      +/-   ##
===========================================
+ Coverage    75.55%   75.68%   +0.12%     
===========================================
  Files          479      479              
  Lines        22958    22981      +23     
  Branches      2369     2374       +5     
===========================================
+ Hits         17347    17393      +46     
+ Misses        5040     5016      -24     
- Partials       571      572       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
care/emr/api/otp_viewsets/login.py (1)

71-102: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The send throttle is still raceable.

sent_otps.count() and the subsequent insert are not serialized per phone number, so concurrent requests can all observe the same count and create more rows than OTP_MAX_SENDS_PER_WINDOW. For a DDoS-protection change, that leaves the easiest bypass right on the send path. Please move this to an atomic counter/lock keyed by phone number (Redis/cache is probably the least painful option here).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/otp_viewsets/login.py` around lines 71 - 102, The current
send-throttle is raceable because you call
PatientMobileOTP.objects.filter(...).count() then insert; replace that with an
atomic counter keyed by phone number (e.g., cache key "otp_send:{phone_number}")
using your Redis/django cache: perform an atomic cache.incr(key) (if new, set
TTL to settings.OTP_SEND_WINDOW_MINUTES*60) and treat the returned value as the
current window count; if it exceeds settings.OTP_MAX_SENDS_PER_WINDOW, decrement
(cache.decr) and raise the ValidationError instead of proceeding; only create
PatientMobileOTP (otp_obj) after the successful incr check, and if SMS/send
fails rollback by decrementing the counter so failed sends don’t consume quota.
Ensure you reference PatientMobileOTP, otp_obj,
settings.OTP_MAX_SENDS_PER_WINDOW and settings.OTP_SEND_WINDOW_MINUTES when
implementing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 52-59: The failure_count function (PatientMobileOTP.failure_count)
incorrectly enforces lockout by summing failed_attempts on rows with recent
modified_date, which lets failures expire per-row rather than by actual failure
time; change the design to enforce a true lockout window by adding a
lockout_until (DateTimeField) on PatientMobileOTP (or a per-phone aggregate
model) that is set when failed_attempts reaches settings.OTP_MAX_FAILURES and
cleared after lockout expires, then update the login logic to check
lockout_until before allowing attempts; alternatively, if you prefer per-attempt
history, store individual attempt timestamps and compute failures by counting
attempts within the last settings.OTP_LOCKOUT_MINUTES instead of summing
modified_date on rows—update functions that reference failure_count and any
unlock logic to use lockout_until or the per-attempt timestamp count.
- Around line 71-79: The current send-cap check in the login view (the sent_otps
queryset) excludes used OTPs by filtering is_used=False, allowing a client to
request-and-consume OTPs indefinitely; update the check to count all OTPs for
that phone_number within the time window (remove the is_used=False filter or
otherwise include both used and unused entries) so sent_otps.count() enforces
OTP_MAX_SENDS_PER_WINDOW correctly; keep the same variable names (sent_otps) and
the same time window logic using PatientMobileOTP and
settings.OTP_SEND_WINDOW_MINUTES/OTP_MAX_SENDS_PER_WINDOW.

In `@care/facility/migrations/0485_patientmobileotp_failed_attempts.py`:
- Around line 11-17: The migration currently only adds the failed_attempts
column but not the database indexes needed by hot OTP paths; update
care/facility/migrations/0485_patientmobileotp_failed_attempts.py to also add
two composite indexes on the patientmobileotp table — one on ["phone_number",
"created_date"] and another on ["phone_number", "modified_date"] — by adding
migrations.AddIndex(...) entries (using models.Index with fields and explicit
unique index names like "patientmobileotp_phone_created_idx" and
"patientmobileotp_phone_modified_idx") so the queries in
care/emr/api/otp_viewsets/login.py that filter/order by phone_number +
created_date/modified_date are supported efficiently.

---

Outside diff comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 71-102: The current send-throttle is raceable because you call
PatientMobileOTP.objects.filter(...).count() then insert; replace that with an
atomic counter keyed by phone number (e.g., cache key "otp_send:{phone_number}")
using your Redis/django cache: perform an atomic cache.incr(key) (if new, set
TTL to settings.OTP_SEND_WINDOW_MINUTES*60) and treat the returned value as the
current window count; if it exceeds settings.OTP_MAX_SENDS_PER_WINDOW, decrement
(cache.decr) and raise the ValidationError instead of proceeding; only create
PatientMobileOTP (otp_obj) after the successful incr check, and if SMS/send
fails rollback by decrementing the counter so failed sends don’t consume quota.
Ensure you reference PatientMobileOTP, otp_obj,
settings.OTP_MAX_SENDS_PER_WINDOW and settings.OTP_SEND_WINDOW_MINUTES when
implementing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f28e1cc1-43f0-4a5c-adf4-020f62f4d7ce

📥 Commits

Reviewing files that changed from the base of the PR and between 589baaa and aff7eda.

📒 Files selected for processing (5)
  • care/emr/api/otp_viewsets/login.py
  • care/emr/tests/test_otp_login.py
  • care/facility/migrations/0485_patientmobileotp_failed_attempts.py
  • care/facility/models/patient.py
  • config/settings/base.py

Comment thread care/emr/api/otp_viewsets/login.py
Comment thread care/emr/api/otp_viewsets/login.py
Comment on lines +11 to +17
operations = [
migrations.AddField(
model_name="patientmobileotp",
name="failed_attempts",
field=models.PositiveSmallIntegerField(default=0),
),
]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Add indexes for the new OTP hot paths.

care/emr/api/otp_viewsets/login.py now filters and orders this table by phone_number + created_date/modified_date on every send/login, but the migration only adds the counter column. Under the traffic this PR is trying to absorb, that turns the database into the bottleneck a little faster than we'd like. Please add composite indexes alongside this field.

Suggested migration shape
     operations = [
         migrations.AddField(
             model_name="patientmobileotp",
             name="failed_attempts",
             field=models.PositiveSmallIntegerField(default=0),
         ),
+        migrations.AddIndex(
+            model_name="patientmobileotp",
+            index=models.Index(
+                fields=["phone_number", "is_used", "-created_date"],
+                name="pmo_phone_used_created_idx",
+            ),
+        ),
+        migrations.AddIndex(
+            model_name="patientmobileotp",
+            index=models.Index(
+                fields=["phone_number", "-modified_date"],
+                name="pmo_phone_modified_idx",
+            ),
+        ),
     ]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
operations = [
migrations.AddField(
model_name="patientmobileotp",
name="failed_attempts",
field=models.PositiveSmallIntegerField(default=0),
),
]
operations = [
migrations.AddField(
model_name="patientmobileotp",
name="failed_attempts",
field=models.PositiveSmallIntegerField(default=0),
),
migrations.AddIndex(
model_name="patientmobileotp",
index=models.Index(
fields=["phone_number", "is_used", "-created_date"],
name="pmo_phone_used_created_idx",
),
),
migrations.AddIndex(
model_name="patientmobileotp",
index=models.Index(
fields=["phone_number", "-modified_date"],
name="pmo_phone_modified_idx",
),
),
]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/facility/migrations/0485_patientmobileotp_failed_attempts.py` around
lines 11 - 17, The migration currently only adds the failed_attempts column but
not the database indexes needed by hot OTP paths; update
care/facility/migrations/0485_patientmobileotp_failed_attempts.py to also add
two composite indexes on the patientmobileotp table — one on ["phone_number",
"created_date"] and another on ["phone_number", "modified_date"] — by adding
migrations.AddIndex(...) entries (using models.Index with fields and explicit
unique index names like "patientmobileotp_phone_created_idx" and
"patientmobileotp_phone_modified_idx") so the queries in
care/emr/api/otp_viewsets/login.py that filter/order by phone_number +
created_date/modified_date are supported efficiently.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
care/emr/api/otp_viewsets/login.py (1)

52-59: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

failure_count() still doesn't model a real lockout window.

This is still the same row-level window problem from the earlier review: you sum all failed_attempts from rows whose latest modified_date is recent, so failures expire in chunks per row instead of by actual attempt time. Slightly worse, the successful-login save on Line 126 refreshes modified_date, which can keep old failures “fresh” after a success. This needs lockout_until or per-attempt timestamps rather than modified_date as a proxy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/otp_viewsets/login.py` around lines 52 - 59, failure_count() is
using PatientMobileOTP.modified_date and per-row failed_attempts which creates a
row-level expiry window and is broken when modified_date is updated on success;
change the design to track attempts or an explicit lockout timestamp: add either
a separate OTPAttempt model with a timestamp for each failed attempt and rewrite
failure_count() to count attempts in the time window, or add a lockout_until
DateTimeField on PatientMobileOTP and update it on failure/success; update the
code paths that currently touch modified_date (e.g., the successful-login save
near the current success handler) to stop refreshing modified_date or to
set/clear lockout_until instead so old failures don’t remain “fresh.”
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 71-77: The current read-then-write using PatientMobileOTP
(sent_otps.count() vs later insert) is racy; wrap the check+insert in an atomic,
phone-scoped critical section. Use Django's transaction.atomic() and obtain a
phone-specific lock before re-checking the count: either create a small sentinel
model (e.g., OTPSendLock with phone_number PK) and call get_or_create(...) then
OTPSendLock.objects.select_for_update() to lock that row, or use a Redis-backed
cache.lock for the phone number; inside that locked transaction re-query
PatientMobileOTP for the window, enforce settings.OTP_MAX_SENDS_PER_WINDOW, and
only then create the new PatientMobileOTP row. Ensure you reference
PatientMobileOTP, sent_otps, and settings.OTP_MAX_SENDS_PER_WINDOW when making
the guarded re-check and insert.
- Around line 116-126: The current logic in the login view (PatientMobileOTP
handling) filters is_used=False before ordering so an older unused OTP can be
validated later; change the flow in the login handler that uses PatientMobileOTP
so you first SELECT the newest OTP for the phone (use
.select_for_update().filter(phone_number=data.phone_number).order_by("-created_date").first()
without pre-filtering is_used), then compare only against that newest row
(otp_object.otp == data.otp) and if it matches mark that row is_used=True and
save, and also atomically invalidate any older unused OTP rows for that phone
(e.g., a bulk update on PatientMobileOTP.filter(phone_number=..., is_used=False,
created_date__lt=otp_object.created_date).update(is_used=True,
modified_date=...)) so older OTPs cannot be accepted later; keep the
select_for_update lock to prevent races during this update.

---

Duplicate comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 52-59: failure_count() is using PatientMobileOTP.modified_date and
per-row failed_attempts which creates a row-level expiry window and is broken
when modified_date is updated on success; change the design to track attempts or
an explicit lockout timestamp: add either a separate OTPAttempt model with a
timestamp for each failed attempt and rewrite failure_count() to count attempts
in the time window, or add a lockout_until DateTimeField on PatientMobileOTP and
update it on failure/success; update the code paths that currently touch
modified_date (e.g., the successful-login save near the current success handler)
to stop refreshing modified_date or to set/clear lockout_until instead so old
failures don’t remain “fresh.”
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 368f6e34-d2f2-44c5-b7a5-d336d66898c0

📥 Commits

Reviewing files that changed from the base of the PR and between aff7eda and 45b86b9.

📒 Files selected for processing (1)
  • care/emr/api/otp_viewsets/login.py

Comment on lines 71 to +77
sent_otps = PatientMobileOTP.objects.filter(
created_date__gte=(timezone.now() - timedelta(settings.OTP_REPEAT_WINDOW)),
is_used=False,
created_date__gte=(
timezone.now() - timedelta(minutes=settings.OTP_SEND_WINDOW_MINUTES)
),
phone_number=data.phone_number,
)
if sent_otps.count() >= settings.OTP_MAX_REPEATS_WINDOW:
if sent_otps.count() >= settings.OTP_MAX_SENDS_PER_WINDOW:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make the send-cap check atomic.

count() here and the later OTP insert are not synchronized. Parallel requests for the same phone can all observe the same count and each create a new row, so OTP_MAX_SENDS_PER_WINDOW is still bypassable if someone is even mildly motivated. This needs a phone-scoped lock / transactional guard / dedicated rate-limit primitive, not a plain read-then-write.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/otp_viewsets/login.py` around lines 71 - 77, The current
read-then-write using PatientMobileOTP (sent_otps.count() vs later insert) is
racy; wrap the check+insert in an atomic, phone-scoped critical section. Use
Django's transaction.atomic() and obtain a phone-specific lock before
re-checking the count: either create a small sentinel model (e.g., OTPSendLock
with phone_number PK) and call get_or_create(...) then
OTPSendLock.objects.select_for_update() to lock that row, or use a Redis-backed
cache.lock for the phone number; inside that locked transaction re-query
PatientMobileOTP for the window, enforce settings.OTP_MAX_SENDS_PER_WINDOW, and
only then create the new PatientMobileOTP row. Ensure you reference
PatientMobileOTP, sent_otps, and settings.OTP_MAX_SENDS_PER_WINDOW when making
the guarded re-check and insert.

Comment on lines +116 to +126
otp_object = (
PatientMobileOTP.objects.select_for_update()
.filter(phone_number=data.phone_number, is_used=False)
.order_by("-created_date")
.first()
)

if otp_object:
if otp_object.otp == data.otp:
otp_object.is_used = True
otp_object.save(update_fields=["is_used", "modified_date"])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Stale OTPs can come back from the dead.

Because Line 118 filters is_used=False before ordering, a successful login only consumes the newest row. If OTP1 and OTP2 both exist, using OTP2 on Line 125 leaves OTP1 unused; the next login will fetch OTP1 and accept it. That breaks both “latest OTP wins” and “one-time” semantics. Always validate against the newest OTP for the phone, or invalidate all older unused OTPs when issuing/consuming one.

Possible localized fix
         with transaction.atomic():
             otp_object = (
                 PatientMobileOTP.objects.select_for_update()
-                .filter(phone_number=data.phone_number, is_used=False)
+                .filter(phone_number=data.phone_number)
                 .order_by("-created_date")
                 .first()
             )

-            if otp_object:
-                if otp_object.otp == data.otp:
+            if otp_object and not otp_object.is_used:
+                if otp_object.otp == data.otp:
                     otp_object.is_used = True
                     otp_object.save(update_fields=["is_used", "modified_date"])
                     token = PatientToken()
                     token["phone_number"] = data.phone_number
                     return Response({"access": str(token)})
                 otp_object.failed_attempts += 1
                 if otp_object.failed_attempts >= settings.OTP_MAX_VERIFY_ATTEMPTS:
                     otp_object.is_used = True
                     expired = True
                 otp_object.save(
                     update_fields=["failed_attempts", "is_used", "modified_date"]
                 )
+            else:
+                raise ValidationError({"otp": "Please request a new OTP."})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/otp_viewsets/login.py` around lines 116 - 126, The current logic
in the login view (PatientMobileOTP handling) filters is_used=False before
ordering so an older unused OTP can be validated later; change the flow in the
login handler that uses PatientMobileOTP so you first SELECT the newest OTP for
the phone (use
.select_for_update().filter(phone_number=data.phone_number).order_by("-created_date").first()
without pre-filtering is_used), then compare only against that newest row
(otp_object.otp == data.otp) and if it matches mark that row is_used=True and
save, and also atomically invalidate any older unused OTP rows for that phone
(e.g., a bulk update on PatientMobileOTP.filter(phone_number=..., is_used=False,
created_date__lt=otp_object.created_date).update(is_used=True,
modified_date=...)) so older OTPs cannot be accepted later; keep the
select_for_update lock to prevent races during this update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants