[ENG-158] feat: Abuse protection for OTP based login#3632
[ENG-158] feat: Abuse protection for OTP based login#3632
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
📝 WalkthroughWalkthroughOTP send/login now enforce per-phone send limits, aggregated failure-based lockouts, and per-OTP failed-attempt tracking. Login verification uses transactions and row-level locks to mark OTPs used or increment failures, with new validation paths for exhausted attempts. ChangesOTP Login Rate-Limiting and Failure Tracking
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 4/5 reviews remaining, refill in 12 minutes. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #3632 +/- ##
===========================================
+ Coverage 75.55% 75.68% +0.12%
===========================================
Files 479 479
Lines 22958 22981 +23
Branches 2369 2374 +5
===========================================
+ Hits 17347 17393 +46
+ Misses 5040 5016 -24
- Partials 571 572 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
care/emr/api/otp_viewsets/login.py (1)
71-102:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftThe send throttle is still raceable.
sent_otps.count()and the subsequent insert are not serialized per phone number, so concurrent requests can all observe the same count and create more rows thanOTP_MAX_SENDS_PER_WINDOW. For a DDoS-protection change, that leaves the easiest bypass right on the send path. Please move this to an atomic counter/lock keyed by phone number (Redis/cache is probably the least painful option here).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@care/emr/api/otp_viewsets/login.py` around lines 71 - 102, The current send-throttle is raceable because you call PatientMobileOTP.objects.filter(...).count() then insert; replace that with an atomic counter keyed by phone number (e.g., cache key "otp_send:{phone_number}") using your Redis/django cache: perform an atomic cache.incr(key) (if new, set TTL to settings.OTP_SEND_WINDOW_MINUTES*60) and treat the returned value as the current window count; if it exceeds settings.OTP_MAX_SENDS_PER_WINDOW, decrement (cache.decr) and raise the ValidationError instead of proceeding; only create PatientMobileOTP (otp_obj) after the successful incr check, and if SMS/send fails rollback by decrementing the counter so failed sends don’t consume quota. Ensure you reference PatientMobileOTP, otp_obj, settings.OTP_MAX_SENDS_PER_WINDOW and settings.OTP_SEND_WINDOW_MINUTES when implementing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 52-59: The failure_count function (PatientMobileOTP.failure_count)
incorrectly enforces lockout by summing failed_attempts on rows with recent
modified_date, which lets failures expire per-row rather than by actual failure
time; change the design to enforce a true lockout window by adding a
lockout_until (DateTimeField) on PatientMobileOTP (or a per-phone aggregate
model) that is set when failed_attempts reaches settings.OTP_MAX_FAILURES and
cleared after lockout expires, then update the login logic to check
lockout_until before allowing attempts; alternatively, if you prefer per-attempt
history, store individual attempt timestamps and compute failures by counting
attempts within the last settings.OTP_LOCKOUT_MINUTES instead of summing
modified_date on rows—update functions that reference failure_count and any
unlock logic to use lockout_until or the per-attempt timestamp count.
- Around line 71-79: The current send-cap check in the login view (the sent_otps
queryset) excludes used OTPs by filtering is_used=False, allowing a client to
request-and-consume OTPs indefinitely; update the check to count all OTPs for
that phone_number within the time window (remove the is_used=False filter or
otherwise include both used and unused entries) so sent_otps.count() enforces
OTP_MAX_SENDS_PER_WINDOW correctly; keep the same variable names (sent_otps) and
the same time window logic using PatientMobileOTP and
settings.OTP_SEND_WINDOW_MINUTES/OTP_MAX_SENDS_PER_WINDOW.
In `@care/facility/migrations/0485_patientmobileotp_failed_attempts.py`:
- Around line 11-17: The migration currently only adds the failed_attempts
column but not the database indexes needed by hot OTP paths; update
care/facility/migrations/0485_patientmobileotp_failed_attempts.py to also add
two composite indexes on the patientmobileotp table — one on ["phone_number",
"created_date"] and another on ["phone_number", "modified_date"] — by adding
migrations.AddIndex(...) entries (using models.Index with fields and explicit
unique index names like "patientmobileotp_phone_created_idx" and
"patientmobileotp_phone_modified_idx") so the queries in
care/emr/api/otp_viewsets/login.py that filter/order by phone_number +
created_date/modified_date are supported efficiently.
---
Outside diff comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 71-102: The current send-throttle is raceable because you call
PatientMobileOTP.objects.filter(...).count() then insert; replace that with an
atomic counter keyed by phone number (e.g., cache key "otp_send:{phone_number}")
using your Redis/django cache: perform an atomic cache.incr(key) (if new, set
TTL to settings.OTP_SEND_WINDOW_MINUTES*60) and treat the returned value as the
current window count; if it exceeds settings.OTP_MAX_SENDS_PER_WINDOW, decrement
(cache.decr) and raise the ValidationError instead of proceeding; only create
PatientMobileOTP (otp_obj) after the successful incr check, and if SMS/send
fails rollback by decrementing the counter so failed sends don’t consume quota.
Ensure you reference PatientMobileOTP, otp_obj,
settings.OTP_MAX_SENDS_PER_WINDOW and settings.OTP_SEND_WINDOW_MINUTES when
implementing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f28e1cc1-43f0-4a5c-adf4-020f62f4d7ce
📒 Files selected for processing (5)
care/emr/api/otp_viewsets/login.pycare/emr/tests/test_otp_login.pycare/facility/migrations/0485_patientmobileotp_failed_attempts.pycare/facility/models/patient.pyconfig/settings/base.py
| operations = [ | ||
| migrations.AddField( | ||
| model_name="patientmobileotp", | ||
| name="failed_attempts", | ||
| field=models.PositiveSmallIntegerField(default=0), | ||
| ), | ||
| ] |
There was a problem hiding this comment.
Add indexes for the new OTP hot paths.
care/emr/api/otp_viewsets/login.py now filters and orders this table by phone_number + created_date/modified_date on every send/login, but the migration only adds the counter column. Under the traffic this PR is trying to absorb, that turns the database into the bottleneck a little faster than we'd like. Please add composite indexes alongside this field.
Suggested migration shape
operations = [
migrations.AddField(
model_name="patientmobileotp",
name="failed_attempts",
field=models.PositiveSmallIntegerField(default=0),
),
+ migrations.AddIndex(
+ model_name="patientmobileotp",
+ index=models.Index(
+ fields=["phone_number", "is_used", "-created_date"],
+ name="pmo_phone_used_created_idx",
+ ),
+ ),
+ migrations.AddIndex(
+ model_name="patientmobileotp",
+ index=models.Index(
+ fields=["phone_number", "-modified_date"],
+ name="pmo_phone_modified_idx",
+ ),
+ ),
]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| operations = [ | |
| migrations.AddField( | |
| model_name="patientmobileotp", | |
| name="failed_attempts", | |
| field=models.PositiveSmallIntegerField(default=0), | |
| ), | |
| ] | |
| operations = [ | |
| migrations.AddField( | |
| model_name="patientmobileotp", | |
| name="failed_attempts", | |
| field=models.PositiveSmallIntegerField(default=0), | |
| ), | |
| migrations.AddIndex( | |
| model_name="patientmobileotp", | |
| index=models.Index( | |
| fields=["phone_number", "is_used", "-created_date"], | |
| name="pmo_phone_used_created_idx", | |
| ), | |
| ), | |
| migrations.AddIndex( | |
| model_name="patientmobileotp", | |
| index=models.Index( | |
| fields=["phone_number", "-modified_date"], | |
| name="pmo_phone_modified_idx", | |
| ), | |
| ), | |
| ] |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@care/facility/migrations/0485_patientmobileotp_failed_attempts.py` around
lines 11 - 17, The migration currently only adds the failed_attempts column but
not the database indexes needed by hot OTP paths; update
care/facility/migrations/0485_patientmobileotp_failed_attempts.py to also add
two composite indexes on the patientmobileotp table — one on ["phone_number",
"created_date"] and another on ["phone_number", "modified_date"] — by adding
migrations.AddIndex(...) entries (using models.Index with fields and explicit
unique index names like "patientmobileotp_phone_created_idx" and
"patientmobileotp_phone_modified_idx") so the queries in
care/emr/api/otp_viewsets/login.py that filter/order by phone_number +
created_date/modified_date are supported efficiently.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
care/emr/api/otp_viewsets/login.py (1)
52-59:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
failure_count()still doesn't model a real lockout window.This is still the same row-level window problem from the earlier review: you sum all
failed_attemptsfrom rows whose latestmodified_dateis recent, so failures expire in chunks per row instead of by actual attempt time. Slightly worse, the successful-login save on Line 126 refreshesmodified_date, which can keep old failures “fresh” after a success. This needslockout_untilor per-attempt timestamps rather thanmodified_dateas a proxy.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@care/emr/api/otp_viewsets/login.py` around lines 52 - 59, failure_count() is using PatientMobileOTP.modified_date and per-row failed_attempts which creates a row-level expiry window and is broken when modified_date is updated on success; change the design to track attempts or an explicit lockout timestamp: add either a separate OTPAttempt model with a timestamp for each failed attempt and rewrite failure_count() to count attempts in the time window, or add a lockout_until DateTimeField on PatientMobileOTP and update it on failure/success; update the code paths that currently touch modified_date (e.g., the successful-login save near the current success handler) to stop refreshing modified_date or to set/clear lockout_until instead so old failures don’t remain “fresh.”
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 71-77: The current read-then-write using PatientMobileOTP
(sent_otps.count() vs later insert) is racy; wrap the check+insert in an atomic,
phone-scoped critical section. Use Django's transaction.atomic() and obtain a
phone-specific lock before re-checking the count: either create a small sentinel
model (e.g., OTPSendLock with phone_number PK) and call get_or_create(...) then
OTPSendLock.objects.select_for_update() to lock that row, or use a Redis-backed
cache.lock for the phone number; inside that locked transaction re-query
PatientMobileOTP for the window, enforce settings.OTP_MAX_SENDS_PER_WINDOW, and
only then create the new PatientMobileOTP row. Ensure you reference
PatientMobileOTP, sent_otps, and settings.OTP_MAX_SENDS_PER_WINDOW when making
the guarded re-check and insert.
- Around line 116-126: The current logic in the login view (PatientMobileOTP
handling) filters is_used=False before ordering so an older unused OTP can be
validated later; change the flow in the login handler that uses PatientMobileOTP
so you first SELECT the newest OTP for the phone (use
.select_for_update().filter(phone_number=data.phone_number).order_by("-created_date").first()
without pre-filtering is_used), then compare only against that newest row
(otp_object.otp == data.otp) and if it matches mark that row is_used=True and
save, and also atomically invalidate any older unused OTP rows for that phone
(e.g., a bulk update on PatientMobileOTP.filter(phone_number=..., is_used=False,
created_date__lt=otp_object.created_date).update(is_used=True,
modified_date=...)) so older OTPs cannot be accepted later; keep the
select_for_update lock to prevent races during this update.
---
Duplicate comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 52-59: failure_count() is using PatientMobileOTP.modified_date and
per-row failed_attempts which creates a row-level expiry window and is broken
when modified_date is updated on success; change the design to track attempts or
an explicit lockout timestamp: add either a separate OTPAttempt model with a
timestamp for each failed attempt and rewrite failure_count() to count attempts
in the time window, or add a lockout_until DateTimeField on PatientMobileOTP and
update it on failure/success; update the code paths that currently touch
modified_date (e.g., the successful-login save near the current success handler)
to stop refreshing modified_date or to set/clear lockout_until instead so old
failures don’t remain “fresh.”
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 368f6e34-d2f2-44c5-b7a5-d336d66898c0
📒 Files selected for processing (1)
care/emr/api/otp_viewsets/login.py
| sent_otps = PatientMobileOTP.objects.filter( | ||
| created_date__gte=(timezone.now() - timedelta(settings.OTP_REPEAT_WINDOW)), | ||
| is_used=False, | ||
| created_date__gte=( | ||
| timezone.now() - timedelta(minutes=settings.OTP_SEND_WINDOW_MINUTES) | ||
| ), | ||
| phone_number=data.phone_number, | ||
| ) | ||
| if sent_otps.count() >= settings.OTP_MAX_REPEATS_WINDOW: | ||
| if sent_otps.count() >= settings.OTP_MAX_SENDS_PER_WINDOW: |
There was a problem hiding this comment.
Make the send-cap check atomic.
count() here and the later OTP insert are not synchronized. Parallel requests for the same phone can all observe the same count and each create a new row, so OTP_MAX_SENDS_PER_WINDOW is still bypassable if someone is even mildly motivated. This needs a phone-scoped lock / transactional guard / dedicated rate-limit primitive, not a plain read-then-write.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@care/emr/api/otp_viewsets/login.py` around lines 71 - 77, The current
read-then-write using PatientMobileOTP (sent_otps.count() vs later insert) is
racy; wrap the check+insert in an atomic, phone-scoped critical section. Use
Django's transaction.atomic() and obtain a phone-specific lock before
re-checking the count: either create a small sentinel model (e.g., OTPSendLock
with phone_number PK) and call get_or_create(...) then
OTPSendLock.objects.select_for_update() to lock that row, or use a Redis-backed
cache.lock for the phone number; inside that locked transaction re-query
PatientMobileOTP for the window, enforce settings.OTP_MAX_SENDS_PER_WINDOW, and
only then create the new PatientMobileOTP row. Ensure you reference
PatientMobileOTP, sent_otps, and settings.OTP_MAX_SENDS_PER_WINDOW when making
the guarded re-check and insert.
| otp_object = ( | ||
| PatientMobileOTP.objects.select_for_update() | ||
| .filter(phone_number=data.phone_number, is_used=False) | ||
| .order_by("-created_date") | ||
| .first() | ||
| ) | ||
|
|
||
| if otp_object: | ||
| if otp_object.otp == data.otp: | ||
| otp_object.is_used = True | ||
| otp_object.save(update_fields=["is_used", "modified_date"]) |
There was a problem hiding this comment.
Stale OTPs can come back from the dead.
Because Line 118 filters is_used=False before ordering, a successful login only consumes the newest row. If OTP1 and OTP2 both exist, using OTP2 on Line 125 leaves OTP1 unused; the next login will fetch OTP1 and accept it. That breaks both “latest OTP wins” and “one-time” semantics. Always validate against the newest OTP for the phone, or invalidate all older unused OTPs when issuing/consuming one.
Possible localized fix
with transaction.atomic():
otp_object = (
PatientMobileOTP.objects.select_for_update()
- .filter(phone_number=data.phone_number, is_used=False)
+ .filter(phone_number=data.phone_number)
.order_by("-created_date")
.first()
)
- if otp_object:
- if otp_object.otp == data.otp:
+ if otp_object and not otp_object.is_used:
+ if otp_object.otp == data.otp:
otp_object.is_used = True
otp_object.save(update_fields=["is_used", "modified_date"])
token = PatientToken()
token["phone_number"] = data.phone_number
return Response({"access": str(token)})
otp_object.failed_attempts += 1
if otp_object.failed_attempts >= settings.OTP_MAX_VERIFY_ATTEMPTS:
otp_object.is_used = True
expired = True
otp_object.save(
update_fields=["failed_attempts", "is_used", "modified_date"]
)
+ else:
+ raise ValidationError({"otp": "Please request a new OTP."})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@care/emr/api/otp_viewsets/login.py` around lines 116 - 126, The current logic
in the login view (PatientMobileOTP handling) filters is_used=False before
ordering so an older unused OTP can be validated later; change the flow in the
login handler that uses PatientMobileOTP so you first SELECT the newest OTP for
the phone (use
.select_for_update().filter(phone_number=data.phone_number).order_by("-created_date").first()
without pre-filtering is_used), then compare only against that newest row
(otp_object.otp == data.otp) and if it matches mark that row is_used=True and
save, and also atomically invalidate any older unused OTP rows for that phone
(e.g., a bulk update on PatientMobileOTP.filter(phone_number=..., is_used=False,
created_date__lt=otp_object.created_date).update(is_used=True,
modified_date=...)) so older OTPs cannot be accepted later; keep the
select_for_update lock to prevent races during this update.
Proposed Changes
Associated Issue
Architecture changes
Merge Checklist
/docsOnly PR's with test cases included and passing lint and test pipelines will be reviewed
@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins
Summary by CodeRabbit
New Features
Chores
Tests