While doing performance analysis for the distributed index, compareHashAndPassword was high on the CPU profile until user ramp-up completed. This is partly working as intended - bcrypt is deliberately slow/expensive, to make online password-guessing attacks expensive.
To address this, we maintain a 10000-entry cache of recent password hashes plus their SHA-1 digests. As users connect and authenticate for the first time, they gradually fill this cache.
However, when the cache fills up, we drop it completely and start again from empty. At that point every active user triggers the full bcrypt call on their next basic auth API call, so under high load we'd expect a large CPU spike.
If there are more than 10000 concurrent users, we'd be repeatedly dropping and recreating the cache.
At minimum, users anticipating high load should be using session-based auth, not basic auth.
I'd like to review whether there's a more efficient way to manage this cache (or whether I'm missing a subtlety in the implementation). We may not want the overhead of a full LRU cache, but we should find a way to avoid CPU spikes when it fills up.
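One possible middle ground, as a sketch rather than a definitive fix: instead of dropping the full map at capacity, evict a random half of the entries. Go randomizes map iteration order, so ranging over the map and deleting the first N keys approximates random eviction with no LRU bookkeeping, and a cache miss after eviction costs one bcrypt call per evicted user rather than one per active user all at once.

```go
package main

// evictHalf deletes roughly half the entries from the cache in place.
// Go's randomized map iteration start point makes the eviction
// approximately random, with no extra per-entry metadata.
func evictHalf(cache map[string]struct{}) {
	toEvict := len(cache) / 2
	for key := range cache {
		if toEvict == 0 {
			break
		}
		delete(cache, key)
		toEvict--
	}
}
```

This spreads the re-verification cost over time: after each eviction pass, only half the active users miss the cache, instead of all of them simultaneously.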
One additional note - we're not specifying a capacity for the cache map. We should probably do so, to avoid the allocation and GC work as the map is internally resized while filling up.
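That fix can be sketched in one line (the capacity constant is assumed): passing a size hint to `make` lets the runtime allocate enough buckets up front, so inserting the first 10000 entries never triggers an incremental map grow.

```go
package main

const cacheCapacity = 10000 // assumed cache size from the discussion above

// newCache preallocates the map for cacheCapacity entries, avoiding
// repeated internal resizing (and the associated allocation/GC work)
// as the cache fills during user ramp-up.
func newCache() map[string]struct{} {
	return make(map[string]struct{}, cacheCapacity)
}
```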