-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test out the performance of batch-export using a sizeable test database #39
Comments
I tried running the pipeline over the weekend, unfortunately, I'm experiencing some internal server error calling Using FHIR version 1.0.0-SNAPSHOT and OMR Version: 2.1.2 @bashir2 @ibacher @mozzy11 @jecihjoy @corneliouzbett Type Exception Report
Message Servlet execution threw an exception
Description The server encountered an unexpected condition that prevented it from fulfilling the request.
Exception
javax.servlet.ServletException: Servlet execution threw an exception
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:101)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:101)
org.openmrs.module.fhir2.web.filter.ForwardingFilter.doFilter(ForwardingFilter.java:60)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.fhir2.web.filter.AuthenticationFilter.doFilter(AuthenticationFilter.java:70)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.spa.filter.SpaFilter.doFilter(SpaFilter.java:54)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.web.filter.ModuleFilter.doFilter(ModuleFilter.java:54)
org.openmrs.web.filter.OpenmrsFilter.doFilterInternal(OpenmrsFilter.java:108)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
org.springframework.orm.hibernate4.support.OpenSessionInViewFilter.doFilterInternal(OpenSessionInViewFilter.java:150)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
Root Cause
java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/JsonNode
ca.uhn.fhir.parser.JsonParser.createJsonWriter(JsonParser.java:166)
ca.uhn.fhir.parser.JsonParser.doEncodeResourceToWriter(JsonParser.java:183)
ca.uhn.fhir.parser.BaseParser.encodeResourceToWriter(BaseParser.java:419)
ca.uhn.fhir.parser.BaseParser.encodeResourceToWriter(BaseParser.java:403)
ca.uhn.fhir.rest.server.RestfulServerUtils.streamResponseAsResource(RestfulServerUtils.java:925)
ca.uhn.fhir.rest.server.RestfulResponse.streamResponseAsResource(RestfulResponse.java:87)
ca.uhn.fhir.rest.server.interceptor.ExceptionHandlingInterceptor.handleException(ExceptionHandlingInterceptor.java:103)
ca.uhn.fhir.rest.server.interceptor.ExceptionHandlingInterceptor.handleException(ExceptionHandlingInterceptor.java:63)
ca.uhn.fhir.rest.server.RestfulServer.handleRequest(RestfulServer.java:1118)
ca.uhn.fhir.rest.server.RestfulServer.doGet(RestfulServer.java:363)
ca.uhn.fhir.rest.server.RestfulServer.service(RestfulServer.java:1722)
javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
org.openmrs.module.web.ModuleServlet.service(ModuleServlet.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:101)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:101)
org.openmrs.module.fhir2.web.filter.ForwardingFilter.doFilter(ForwardingFilter.java:60)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.fhir2.web.filter.AuthenticationFilter.doFilter(AuthenticationFilter.java:70)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.spa.filter.SpaFilter.doFilter(SpaFilter.java:54)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.web.filter.ModuleFilter.doFilter(ModuleFilter.java:54)
org.openmrs.web.filter.OpenmrsFilter.doFilterInternal(OpenmrsFilter.java:108)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
org.springframework.orm.hibernate4.support.OpenSessionInViewFilter.doFilterInternal(OpenSessionInViewFilter.java:150)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
Root Cause
java.lang.ClassNotFoundException: com.fasterxml.jackson.databind.JsonNode
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1955)
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1798)
org.openmrs.module.ModuleClassLoader.loadClass(ModuleClassLoader.java:571)
java.lang.ClassLoader.loadClass(ClassLoader.java:351)
ca.uhn.fhir.parser.JsonParser.createJsonWriter(JsonParser.java:166)
ca.uhn.fhir.parser.JsonParser.doEncodeResourceToWriter(JsonParser.java:183)
ca.uhn.fhir.parser.BaseParser.encodeResourceToWriter(BaseParser.java:419)
ca.uhn.fhir.parser.BaseParser.encodeResourceToWriter(BaseParser.java:403)
ca.uhn.fhir.rest.server.RestfulServerUtils.streamResponseAsResource(RestfulServerUtils.java:925)
ca.uhn.fhir.rest.server.RestfulResponse.streamResponseAsResource(RestfulResponse.java:87)
ca.uhn.fhir.rest.server.interceptor.ExceptionHandlingInterceptor.handleException(ExceptionHandlingInterceptor.java:103)
ca.uhn.fhir.rest.server.interceptor.ExceptionHandlingInterceptor.handleException(ExceptionHandlingInterceptor.java:63)
ca.uhn.fhir.rest.server.RestfulServer.handleRequest(RestfulServer.java:1118)
ca.uhn.fhir.rest.server.RestfulServer.doGet(RestfulServer.java:363)
ca.uhn.fhir.rest.server.RestfulServer.service(RestfulServer.java:1722)
javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
org.openmrs.module.web.ModuleServlet.service(ModuleServlet.java:70)
javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:101)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:101)
org.openmrs.module.fhir2.web.filter.ForwardingFilter.doFilter(ForwardingFilter.java:60)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.fhir2.web.filter.AuthenticationFilter.doFilter(AuthenticationFilter.java:70)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.spa.filter.SpaFilter.doFilter(SpaFilter.java:54)
org.openmrs.module.web.filter.ModuleFilterChain.doFilter(ModuleFilterChain.java:70)
org.openmrs.module.web.filter.ModuleFilter.doFilter(ModuleFilter.java:54)
org.openmrs.web.filter.OpenmrsFilter.doFilterInternal(OpenmrsFilter.java:108)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
org.springframework.orm.hibernate4.support.OpenSessionInViewFilter.doFilterInternal(OpenSessionInViewFilter.java:150)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.openmrs.web.filter.StartupFilter.doFilter(StartupFilter.java:105)
org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
Note The full stack trace of the root cause is available in the server logs.
Apache Tomcat/7.0.94```
|
@kimaina , are you trying to mean the pipeline cant query R3 Resources from the OpenMRS server ?? if i got you right ?? |
Hi @mozzy11 Thanks for following up! Running locally using refApp works fine for both (R4 and R3) but against prod does not! |
is the production server Updated with the latest build of the Fhir module ?? |
We are currently using FHIR2 version 1.0.0-SNAPSHOT (deployed a few wks ago) and OMR Version: 2.1.2 |
@kimaina , the latest version is 1.2.0-SNAPSHOT . we have made quite many bug fixes since 1.0.0-SNAPSHOT and released 1.0.0 , 1.1.0 . |
Thanks, @mozzy11 will keep you posted on this! |
sure , |
This commit in particular is likely relevant to this. Jackson2 was added to openmrs-core in 2.1.3. |
Thanks @ibacher |
Updates on this task, This pr solves the error |
A few more updates on this: We are currently hitting timeouts (504) for endpoints with large volume of data e.g: /ws/fhir2/R3/Encounter?_count=1 or Patient or Obs. Need to do further investigations! |
Upon further investigation, hitting an endpoint like
which makes the API call to timeout when you have a large dataset (with or without setting I tried running the query against the reporting MySQL server (containing over 300M obs), and it took more than 10 minutes, and had to terminate the query. Given that we are setting
|
Nope, that’s exactly right. Note that the FHIR _count really refers to the number of results per query page rather than anything else. There is an optional _total parameter that we currently don’t support, but which could be used to turn the calculation of the total count off. That would require some redesign of our paging provider to properly support. The issue is that what we’re currently doing is basically requesting the full list of UUIDs matching a query and then paging across that list, with the idea that that’s relatively performant and provides a consistent paging view at the expense of occasionally reporting data that no longer matches a query (if the underlying data were modified). It strikes me that to best serve the bulk workflow, what we really want is some means of indicating that a given search request only ever wants a single page of data, that way we can eliminate some overhead (I’m open to suggestions about how we represent this). Now the other part is (somehow) efficiently getting the total number of results so that the total number of requests (i.e., how many total pages) can be worked out. For this, I’m a bit stuck. For this particular query, we could speed things up by having an index on the voided and uuid columns, which has the advantage of basically precalculating the results (I’m assuming that if you look at the EXPLAIN output for the query on your live database, you’ll see it’s doing a full table scan). That works for the situation where we’re extracting all of the Obs, but not if we’re doing any additional filtering (a fundamental limitation of MySQL is that indexes are not composable, i.e., it cannot take advantage of the index I previously described if there are other filtering criteria). But what I’d really want to push for this use case is implementing the FHIR Bulk Data IG described here. It’s a more complicated song-and-dance for both the client and server (the request is submitted asynchronously and then polled until complete; the resulting data needs to be separately fetched as an ndjson file—on this point, we can be flexible), but it does allow us to do this kind of heavy-duty querying without worrying about web request timeouts. It also gives us a space to come up with some less-naïve handling for bulk data requests (i.e., we can cache a number of lookups that happen as part of the translation process, etc.). This also could make it sensible to create appropriate database-level indexes for the necessary queries (I appreciate that adding indexes to the AMPATH database is going to be a painful process, but it’s the easiest way to make queries more efficient). |
Thanks @kimaina for the analysis. For the query, I understand it is Hibernate who composes it but it seems the Note that with the initial query we not only need the @ibacher, I agree with your suggestion around the bulk export but I think the right place for this is to be implemented inside the |
Yes, that functionality definitely belongs in the FHIR2 module. It's just being able to handle the FHIR asynchronous requests part has some (small) impact on the client (i.e., it's not a simple request -> response flow). Also, we need to actually implement it.
I would very much like to explore exporting large datasets via a non-JSON format, though with Parquet, we're likely to run up against the same limitations (R3 only unless we build it).
In this particular case it is; however, as a general rule, it necessary to avoid duplicate UUIDs when there are joined tables. I haven't really experimented with this, but I assumed the query optimiser could determine that the
I would suggest that (i) is likely to be less performant, though it would work work. This is largely because of the overhead of needing to do a HTTP request-per-resource and (read-only) database transaction per-resource, but it might be interesting to compare the performance as an actual experiment. (ii) might be worth exploring too, although that does put a lot of work on the analytics engine that properly belongs with the rest of the FHIR work. A better way to approach that might be to see if we generalise the translation layer to rely less directly on OpenMRS objects. |
|
This is extremely fast, executes in < 0.1 secs, possibly due to the first index |
This took more than 2 mins, canceled it! |
can this be achieved with a simple query that does take into account the index? e.g see the above query
|
On indexing it is also possible to force a particular index https://dev.mysql.com/doc/refman/5.6/en/index-hints.html |
Okay just to wrap up what we discussed: If other simple solutions do not work, let's go with approach (i), i.e., we grab the list of |
My guess is that this is fast only because you are grabbing a limited number of |
Also for documentation: my initial reaction to @kimaina above findings is to try out the version of the FHIR module in this branch and see if that gets performance to a quasi-acceptable level. That branch should change this query: select this_.uuid as y0_
from obs this_
where this_.uuid in (
select this_.uuid as y0_
from obs this_
where this_.voided=0)
group by this_.uuid To this: select this_.uuid
from obs this_
where this_.voided=0
Yeah, if that query is only asking for a subset of uuids, then we'd need to look elsewhere for a fix. I just realised that the index this is relying on |
I agree, streaming in 300 million UUID would still make the API call to timeout |
Test out the performance of batch-export using a sizeable test database within AMPATH. Metrics to report
The text was updated successfully, but these errors were encountered: