Batch User Model Loading Queries. 32% speedup #2361

olirice · 2026-01-30T22:17:00Z

What kind of change does this PR introduce?

Batches User Model queries to reduce from 3:1

No worries if you don't want to merge this, I was just playing around to see how hard it'd be to make the endpoints less chatty in case we want to separate Auth from the DB instance at some point.

Context

A bunch of frequently hit endpoints like

/token
/user

call findUser, which goes to

tx.Eager().Q().Where(query, args...).First(obj)

which eagerly loads the user with relations like this:

SELECT * FROM users WHERE instance_id = ? and id = ?
SELECT * FROM identities WHERE user_id = ?
SELECT * FROM factors WHERE user_id = ?

Authenticated endpoints run those 3 queries twice. Once in non-transactional pre-flight checks, and again transactionally for a total of 6 round trips

This PR combines the 3 queries into 1 by returning JSON for identities and factors.

select u.*,
  coalesce((select json_agg(json_build_object(
    'identity_id', i.id,
    'id', i.provider_id,
    'user_id', i.user_id,
    'identity_data', i.identity_data,
    'provider', i.provider,
    'last_sign_in_at', i.last_sign_in_at,
    'created_at', i.created_at,
    'updated_at', i.updated_at,
    'email', i.email
  )) from identities i where i.user_id = u.id), '[]') as identities_json,
  coalesce((select json_agg(json_build_object(
    'id', f.id,
    'user_id', f.user_id,
    'created_at', f.created_at,
    'updated_at', f.updated_at,
    'status', f.status,
    'friendly_name', f.friendly_name,
    'factor_type', f.factor_type,
    'secret', f.secret,
    'phone', f.phone,
    'last_challenged_at', f.last_challenged_at,
    'web_authn_credential', f.web_authn_credential
  )) from factors f where f.user_id = u.id), '[]') as factors_json
from users u
where ...

Here the results

Before (Eager loading - 3 queries)

BenchmarkFindUserWithRefreshToken-10    	    1713	    646805 ns/op	   90468 B/op	    1166 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    2026	    717935 ns/op	   91345 B/op	    1166 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    1587	    663158 ns/op	   89972 B/op	    1165 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    1756	    666331 ns/op	   91112 B/op	    1166 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    1737	    709838 ns/op	   91344 B/op	    1166 allocs/op

Average: ~681 µs/op, ~90.6 KB/op, ~1166 allocs/op

After (JSON aggregation - 1 query)

BenchmarkFindUserWithRefreshToken-10    	    2802	    461595 ns/op	   50848 B/op	     718 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    2827	    447097 ns/op	   50969 B/op	     718 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    2990	    449752 ns/op	   50814 B/op	     718 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    2913	    492917 ns/op	   51128 B/op	     718 allocs/op
BenchmarkFindUserWithRefreshToken-10    	    2720	    460875 ns/op	   51381 B/op	     718 allocs/op

Average: ~462 µs/op, ~51.0 KB/op, ~718 allocs/op

I did not include the benchmarking script for the users model in this PR. If that's a best practice I can add it.

Performance Difference

Metric	Before	After	Improvement
Time	681 µs	462 µs	32% faster
Memory	90.6 KB	51.0 KB	44% reduction
Allocations	1166	718	38% reduction
Queries	3	1	67% reduction

Query Plan

The biggest danger of updating the user loading query is if the query plan drops off index at high scale. To test that I created 1M users, 1M identities, and 250k MFA factors and checked the query plan with explain analyze

The original pattern of 3 queries had

Index Scan using users_pkey - 0.040 ms
AND
Index Scan using identities_user_id_idx - 0.028 ms
AND
Index Scan using unique_phone_factor_per_user - 0.027 ms

and the updated query was

Index Scan using users_pkey on users u  (cost=0.42..25.36 rows=1 width=2894) (actual time=0.062..0.062 rows=1 loops=1)
  Index Cond: (id = 'c8522e5b-9b5c-42f7-a4b8-31fbc2b21ef8'::uuid)
  Filter: (instance_id = '00000000-0000-0000-0000-000000000000'::uuid)
  Buffers: shared hit=11
  SubPlan 1
    ->  Aggregate  (cost=8.45..8.46 rows=1 width=32) (actual time=0.050..0.050 rows=1 loops=1)
          Buffers: shared hit=4
          ->  Index Scan using identities_user_id_idx on identities i  (cost=0.42..8.44 rows=1 width=239) (actual time=0.002..0.003 rows=1 loops=1)
                Index Cond: (user_id = u.id)
                Buffers: shared hit=4
  SubPlan 2
    ->  Aggregate  (cost=8.45..8.46 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=1)
          Buffers: shared hit=3
          ->  Index Scan using unique_phone_factor_per_user on mfa_factors f  (cost=0.42..8.44 rows=1 width=170) (actual time=0.003..0.003 rows=0 loops=1)
                Index Cond: (user_id = u.id)
                Buffers: shared hit=3
Planning:
  Buffers: shared hit=9
Planning Time: 0.099 ms
Execution Time: 0.091 ms

Which shows that all joins and conditions are on-index at scale.

Total time spent in database is reduced 5% but the real benefit is reducing the number of roundtrips and contention for the connection pool

IMPORTANT

I have never contributed to Auth before so please double check everything

coveralls · 2026-01-31T19:28:53Z

Pull Request Test Coverage Report for Build 21566888187

Details

55 of 59 (93.22%) changed or added relevant lines in 1 file are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.06%) to 68.91%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
internal/models/user.go	55	59	93.22%

Totals
Change from base Build 21450104510:	0.06%
Covered Lines:	14948
Relevant Lines:	21692

💛 - Coveralls

depthfirst-app · 2026-01-31T19:42:27Z

internal/models/user.go

+    'web_authn_credential', f.web_authn_credential
+  )) from ` + Factor{}.TableName() + ` f where f.user_id = u.id), '[]') as factors_json
+from ` + User{}.TableName() + ` u
+where ` + query


🟠 Severity: HIGH

SQL Injection Risk via String Concatenation: The query parameter is concatenated directly into the SQL string before being passed to RawQuery. While current callers use hardcoded strings, this pattern is dangerous because it breaks the parameterization contract. The old code used tx.Eager().Q().Where(query, args...) which properly handled parameterization through the ORM. Future developers might pass dynamically constructed queries, creating SQL injection vectors.
Helpful? Add 👍 / 👎

💡 Fix Suggestion

Suggestion: This SQL injection risk requires an architectural change to the findUser function. Consider one of these approaches:

Use parameterized WHERE clause builder: Instead of accepting a raw query string, accept structured parameters (e.g., field names and operators) and build the WHERE clause programmatically with proper escaping.

Implement query validation/whitelisting: Add a whitelist of allowed query patterns at the start of findUser to ensure only safe, hardcoded queries are accepted. Reject any query that doesn't match the whitelist.

Revert to ORM-based approach: Consider reverting to the original tx.Eager().Q().Where(query, args...) pattern which properly handles parameterization through the ORM, accepting the performance trade-off for better security.

Create specific finder methods: Instead of a generic findUser helper, create specific methods (findUserByEmail, findUserByPhone, etc.) that construct their own safe SQL queries, eliminating the need to pass query strings as parameters.

The current implementation breaks the parameterization contract by concatenating user-provided strings directly into SQL, even though current callers use hardcoded strings. This creates a dangerous pattern that future developers might misuse.

yeah, i don't love passing raw sql around, but that seems to be the pattern. any suggestions would be great. This could easily be a pair of uuids instead but I don't want to change the functions interface in this PR

depthfirst-app · 2026-01-31T21:18:26Z

internal/models/user.go

+    'id', f.id,
+    'user_id', f.user_id,
+    'created_at', f.created_at,
+    'updated_at', f.updated_at,
+    'status', f.status,
+    'friendly_name', f.friendly_name,
+    'factor_type', f.factor_type,
+    'secret', f.secret,


🔴 Severity: CRITICAL

MFA Secret Exposure - The TOTP secret field is included in JSON aggregation, bypassing the json:"-" protection in the Factor struct. This exposes the shared secret used to generate TOTP codes. The original Eager loading respected this tag and excluded secrets. Remove this line and web_authn_credential (line 654) to prevent exposure in API responses, logs, or caches.
Helpful? Add 👍 / 👎

💡 Fix Suggestion

Suggestion: Remove the 'secret' field (line 651) and 'web_authn_credential' field (line 654) from the JSON aggregation query. These fields are marked with json:"-" tags in the Factor struct to prevent exposure in API responses. Including them in the SQL JSON aggregation bypasses this protection and exposes sensitive MFA secrets that could be logged, cached, or transmitted in API responses. The corrected query should only include non-sensitive Factor fields that are safe to serialize.

⚠️ Experimental Feature: This code suggestion is automatically generated. Please review carefully.

Suggested change

'id', f.id,

'user_id', f.user_id,

'created_at', f.created_at,

'updated_at', f.updated_at,

'status', f.status,

'friendly_name', f.friendly_name,

'factor_type', f.factor_type,

'secret', f.secret,

'id', f.id,

'user_id', f.user_id,

'created_at', f.created_at,

'updated_at', f.updated_at,

'status', f.status,

'friendly_name', f.friendly_name,

'factor_type', f.factor_type,

'phone', f.phone,

'last_challenged_at', f.last_challenged_at

)) from ` + Factor{}.TableName() + ` f where f.user_id = u.id), '[]') as factors_json

this was already being loaded when we used *. Its not a change in behavior. I'm not clear on the implications of removing it but feel free

…gregation Reduces FindUserWithRefreshToken from 3 separate queries (user + identities + factors) to a single query using json_agg subqueries. This optimization impacts the /token endpoint (~45% of total traffic) and /user endpoint by eliminating 2 database round-trips per call. Since both of those endpoints call FindUserWithRefreshToken twice it removes 4 database roundtrips per request. Performance impact: - Query execution: 399µs → 209µs (47.6% faster) - Memory allocations: 18.4KB → 7.1KB (61% reduction) - Allocation count: 299 → 117 allocs (61% reduction) - /token throughput: +20.3% (55.18 vs 45.87 req/s) in local testing - /token latency: -16.8% (181ms vs 218ms) Replaces Pop ORM .Eager() pattern with explicit SQL column enumeration and coalesce(json_agg()) for related entities. No changes to User struct or API.

olirice requested a review from a team as a code owner January 30, 2026 22:17

olirice marked this pull request as draft January 30, 2026 22:28

olirice marked this pull request as ready for review January 31, 2026 19:37

depthfirst-app bot reviewed Jan 31, 2026

View reviewed changes

olirice force-pushed the or/query_reduction branch from a517e59 to 1e078e4 Compare February 1, 2026 17:05

olirice force-pushed the or/query_reduction branch from 1e078e4 to b407d87 Compare February 1, 2026 17:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Batch User Model Loading Queries. 32% speedup #2361

Batch User Model Loading Queries. 32% speedup #2361

Uh oh!

olirice commented Jan 30, 2026 •

edited

Loading

Uh oh!

coveralls commented Jan 31, 2026 •

edited

Loading

Uh oh!

depthfirst-app bot Jan 31, 2026

Uh oh!

olirice Jan 31, 2026

Uh oh!

depthfirst-app bot Jan 31, 2026

Uh oh!

olirice Jan 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Batch User Model Loading Queries. 32% speedup #2361

Are you sure you want to change the base?

Batch User Model Loading Queries. 32% speedup #2361

Uh oh!

Conversation

olirice commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What kind of change does this PR introduce?

Context

Before (Eager loading - 3 queries)

After (JSON aggregation - 1 query)

Performance Difference

Query Plan

IMPORTANT

Uh oh!

coveralls commented Jan 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Test Coverage Report for Build 21566888187

Details

💛 - Coveralls

Uh oh!

depthfirst-app bot Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

olirice Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

depthfirst-app bot Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

olirice Jan 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

olirice commented Jan 30, 2026 •

edited

Loading

coveralls commented Jan 31, 2026 •

edited

Loading