Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale connections in JDBC connection pool cause requests to fail #1485

Closed
punktilious opened this issue Sep 8, 2020 · 3 comments · Fixed by #1555
Closed

Stale connections in JDBC connection pool cause requests to fail #1485

punktilious opened this issue Sep 8, 2020 · 3 comments · Fixed by #1555
Assignees
Labels
bug Something isn't working
Milestone

Comments

@punktilious
Copy link
Collaborator

Describe the bug
After running load against a FHIR server then letting it sit idle for a long (currently unknown) period of time, a new request will fail when it attempts to use a stale connection from the connection pool.

After the stale connection is purged from the pool, subsequent requests succeed.

To Reproduce
Steps to reproduce the behavior:

  1. Record the datasource configuration used for the test (fhirProxyDataSource in WLP_HOME/usr/servers/fhir-server/server.xml)
  2. Run a concurrent load against the FHIR server
  3. Let the FHIR server sit idle for minutes
  4. Issue a request and see it fail. If it doesn't fail, then try the sequence again, but increase the time where the server was idle.

Expected behavior
Requests should succeed, even after an extended period without activity.

Additional context
Sep 6 09:17:22 redacted redacted ERROR J2CA0027E: An exception occurred while invoking end on an XA Resource Adapter from DataSource jdbc/fhirProxyDataSource, within transaction ID {XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54),
data(0000017463906730000000016a51607c790d826051a2074afb3ddb5085aaeaf504d0db9d0000017463906730000000016a51607c790d826051a2074afb3ddb5085aaeaf504d0db9d000000010000000000000000000000000001)} : com.ibm.db2.jcc.am.XaException: [jcc][t4][10401][12066][4.26.14] XA exception: XA_RBOTHER ERRORCODE=-4228, SQLSTATE=null
at com.ibm.db2.jcc.am.b7.c(b7.java:561)
at com.ibm.db2.jcc.am.b7.c(b7.java:590)
at com.ibm.db2.jcc.t4.a3.a(a3.java:1741)
at com.ibm.db2.jcc.t4.a3.a(a3.java:1690)
at com.ibm.db2.jcc.t4.a3.a(a3.java:1601)
at com.ibm.db2.jcc.t4.a4.end(a4.java:630)
at com.ibm.ws.rsadapter.impl.WSRdbXaResourceImpl.end(WSRdbXaResourceImpl.java:417)
at com.ibm.ejs.j2c.XATransactionWrapper.end(XATransactionWrapper.java:380)
at com.ibm.ws.Transaction.JTA.JTAResourceBase.end(JTAResourceBase.java:218)
at com.ibm.tx.jta.impl.RegisteredResources.sendEnd(RegisteredResources.java:941)
at com.ibm.tx.jta.impl.RegisteredResources.distributeEnd(RegisteredResources.java:920)
at com.ibm.tx.jta.impl.TransactionImpl.internalRollback(TransactionImpl.java:1627)
at com.ibm.tx.jta.impl.TransactionImpl.internalRollback(TransactionImpl.java:1597)
at com.ibm.tx.jta.impl.TransactionImpl.rollback(TransactionImpl.java:1065)
at com.ibm.tx.jta.impl.TranManagerImpl.rollback(TranManagerImpl.java:195)
at com.ibm.tx.jta.impl.TranManagerSet.rollback(TranManagerSet.java:138)
at com.ibm.tx.jta.impl.UserTransactionImpl.rollback(UserTransactionImpl.java:242)
at com.ibm.tx.jta.embeddable.impl.EmbeddableUserTransactionImpl.rollback(EmbeddableUserTransactionImpl.java:148)
at com.ibm.ws.transaction.services.UserTransactionService.rollback(UserTransactionService.java:88)
at com.ibm.fhir.persistence.jdbc.connection.FHIRUserTransactionAdapter.end(FHIRUserTransactionAdapter.java:105)
at com.ibm.fhir.operation.healthcheck.HealthcheckOperation.doInvoke(HealthcheckOperation.java:62)
at com.ibm.fhir.server.operation.spi.AbstractOperation.invoke(AbstractOperation.java:54)
at com.ibm.fhir.server.util.FHIRRestHelper.doInvoke(FHIRRestHelper.java:1027)
at com.ibm.fhir.server.resources.Operation.invoke(Operation.java:84)
at com.ibm.fhir.server.resources.Operation$Proxy$_$$_WeldClientProxy.invoke(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.ibm.ws.jaxrs20.cdi.component.JaxRsFactoryImplicitBeanCDICustomizer.serviceInvoke(JaxRsFactoryImplicitBeanCDICustomizer.java:339)
at com.ibm.ws.jaxrs20.server.LibertyJaxRsServerFactoryBean.performInvocation(LibertyJaxRsServerFactoryBean.java:641)
at com.ibm.ws.jaxrs20.server.LibertyJaxRsInvoker.performInvocation(LibertyJaxRsInvoker.java:160)
at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)
at com.ibm.ws.jaxrs20.server.LibertyJaxRsInvoker.invoke(LibertyJaxRsInvoker.java:273)
at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:210)
at com.ibm.ws.jaxrs20.server.LibertyJaxRsInvoker.invoke(LibertyJaxRsInvoker.java:444)
at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:112)
at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:61)
at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:99)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:316)
at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:123)
at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:273)
at com.ibm.ws.jaxrs20.endpoint.AbstractJaxRsWebEndpoint.invoke(AbstractJaxRsWebEndpoint.java:136)
at com.ibm.websphere.jaxrs.server.IBMRestServlet.handleRequest(IBMRestServlet.java:146)
at com.ibm.websphere.jaxrs.server.IBMRestServlet.doGet(IBMRestServlet.java:112)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:686)
at com.ibm.websphere.jaxrs.server.IBMRestServlet.service(IBMRestServlet.java:96)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1230)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:729)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:426)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.invokeTarget(WebAppFilterChain.java:182)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:93)
at com.ibm.fhir.server.filter.rest.FHIRRestServletFilter.doFilter(FHIRRestServletFilter.java:139)
at javax.servlet.http.HttpFilter.doFilter(HttpFilter.java:127)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:201)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:90)
at com.ibm.fhir.server.filter.rest.FHIRRestAuthorizationServletFilter.doFilter(FHIRRestAuthorizationServletFilter.java:86)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:201)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:90)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:1001)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1139)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1010)
at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:75)
at com.ibm.ws.webcontainer40.servlet.CacheServletWrapper40.handleRequest(CacheServletWrapper40.java:83)
at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:938)
at com.ibm.ws.webcontainer.osgi.DynamicVirtualHost$2.run(DynamicVirtualHost.java:279)
at com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink$TaskWrapper.run(HttpDispatcherLink.java:1134)
at com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink.wrapHandlerAndExecute(HttpDispatcherLink.java:415)
at com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink.ready(HttpDispatcherLink.java:374)
at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:546)
at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.handleNewRequest(HttpInboundLink.java:480)
at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.processRequest(HttpInboundLink.java:345)
at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.ready(HttpInboundLink.java:316)
at com.ibm.ws.http.channel.h2internal.H2HttpInboundLinkWrap.ready(H2HttpInboundLinkWrap.java:97)
at com.ibm.ws.http.channel.h2internal.H2StreamProcessor$Http2Ready.run(H2StreamProcessor.java:1686)
at com.ibm.ws.threading.internal.ExecutorServiceImpl$RunnableWrapper.run(ExecutorServiceImpl.java:239)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:823)

@punktilious punktilious added the bug Something isn't working label Sep 8, 2020
@csandersdev
Copy link
Contributor

csandersdev commented Sep 8, 2020

My DB2 datasource in fhir-server-config.json is pretty vanilla. See below...

    "default": {
        "type": "db2",
        "tenantKey": "**",
        "connectionProperties": {
            "serverName": "***",
            "portNumber": 31300,
            "user": "***",
            "password": "***",
            "databaseName": "bludb",
            "currentSchema": "FHIRDATA",
            "driverType": 4,
            "sslConnection": true
        }
    }

@lmsurpre lmsurpre added this to the Sprint 18 milestone Sep 16, 2020
prb112 added a commit that referenced this issue Oct 5, 2020
Signed-off-by: Paul Bastide <pbastide@us.ibm.com>
@prb112
Copy link
Contributor

prb112 commented Oct 5, 2020

PR #1555
This is a different interpretation of the fix. The DataSource is checked to see it's valid, and then passes it to the persistence layer otherwise it passes one that's valid.

I tested this locally by starting a db2 server
connecting the healthcheck
Stopping the db2 server
connecting and seeing a msg on XAER_RMFAIL (-7), while this is not the exact error, the behavior is similar.

I then set the validation to 30s (we don't want something too low, I've gotten into this trap before at scale)

Then I tested via healthcheck, and all pass.

@prb112 prb112 self-assigned this Oct 5, 2020
prb112 added a commit that referenced this issue Oct 5, 2020
Stale connections in JDBC connection pool cause requests to fail #1485
@lmsurpre lmsurpre reopened this Oct 6, 2020
@lmsurpre
Copy link
Member

I deployed the server, let it sit idle for a few hours, then made a GET request to read a Patient resource.
It worked and the logs look clean. Lets consider this one fixed and we can re-open if we see it again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants