Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential Denial of Service in JPA Server via history operation #2641

Closed
jamesagnew opened this issue May 6, 2021 · 0 comments · Fixed by #2642
Closed

Potential Denial of Service in JPA Server via history operation #2641

jamesagnew opened this issue May 6, 2021 · 0 comments · Fixed by #2642

Comments

@jamesagnew
Copy link
Collaborator

A weakness in our handling of FHIR history operations has been reported.

Specifically, on a server with a very large number of resources, if the history operation is executed by many clients (e.g. 200) concurrently, the server becomes unresponsive and ultimately consumes a large amount of disk and becomes unstable.

Our investigation has revealed that the root cause is a SELECT COUNT(...) query that executes at the start of all _history operations. Essentially, anytime a _history is executed, the server executes 2 SQL statements (statements here are approximate):

  1. A SELECT COUNT(pid) FROM hfj_res_ver is performed to supply a value for Bundle.total
  2. A SELECT * FROM hfj_res_ver ORDER BY updated DESC LIMIT 100 is performed to supply the contents

The second query is executed against an index and is very fast. The first query by its nature requires a full index scan and is slow. Executing it 200 concurrent times quickly overwhelms the database and leads to timeouts, exceptions, and eventually instability.

The proposed fix is as follows:

  • A new DaoConfig setting is added. This setting introduces a "history count mode" with 3 options:
    • Cached. This is the new default: A loading cache will be used for history counts, meaning that counts are stored in RAM for up to one minute, and the loading cache blocks all but one client thread per JVM from actually performing the count. This effectively throttles access to the database.
    • Not cached. This is the status quo and does exhibit the weakness described here, but may be appropriate in scenarios where users are trusted and accuracy is always required.
    • No count. This setting avoids the count query entirely, saving time and avoiding this weakness at the expense of not including any total in the response.

A huge thanks to Zachary Minneker at Security Innovation who discovered and submitted a responsible disclosure of this issue.

This issue will be resolved for the upcoming 5.4.0 release. A CVE number is forthcoming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant