NPE in LTPAConfigurationImpl.loadConfig #7448

jwalcorn · 2019-05-08T06:01:01Z

I've owned a simple Stock Trader application for the past couple of years. Almost all of the microservices run on Liberty in Docker containers on Kubernetes (I've run it in ICP, IKS, and OCP). All has been fine as I moved up to each new Liberty release, until the new 19.0.0.4 came out last week. First a tech sales guy reported it, then I reproduced the problem myself. All I have to do is build the Docker container for my trader UI microservice against 19.0.0.4, and the server will fail to start, with an NPE early on as the security service tries to initialize. If I go back to 19.0.0.3, all is fine. The log output is as follows:

[INFO    ] CWWKS0007I: The security service is starting...
[ERROR   ] CWWKE0701E: bundle com.ibm.ws.security.token.ltpa:1.0.27.cl190420190419-0642 (67)[com.ibm.ws.security.token.ltpa.LTPAConfiguration(156)] : The activate method has thrown an exception java.lang.NullPointerException
    at com.ibm.ws.security.token.ltpa.internal.LTPAConfigurationImpl.loadConfig(LTPAConfigurationImpl.java:115)
    at [internal classes]

I'll attach the full messages.log, a trace.log with *=info:com.ibm.ws.security.*=all:com.ibm.ws.webcontainer.security.*=all, and my server.xml (the full code is in GitHub at https://github.com/IBMStockTrader/trader). Note my app actually uses JWT, not LTPA, but it appears Liberty loads the LTPA stuff whenever the security service starts. Please let me know if any other traces or other info are needed. I'm holding off on pushing 19.0.0.4-based images to DockerHub until this is fixed, so that at least the people just using the pre-built images are OK - but some people prefer to build it themselves, and they will be broken now.

jwalcorn · 2019-05-08T06:04:51Z

logs.zip

jwalcorn · 2019-05-08T06:05:59Z

Discussed in #was-security: https://ibm-cloud.slack.com/archives/C324NP6H5/p1556903477018100

jwalcorn · 2019-05-08T06:18:28Z

A few random thoughts, which may or may not be relevant:

- As I don't actually use LTPA, I'm not copying an ltpa.keys into the Docker container.
- Perhaps the server is generating one on the fly?
- I do have a key.jks and validationKeystore.jks that I refer to in my server.xml.
- This is my only microservice with a validationKeystore.jks (and the only one failing; could be coincidence).
- I believe there may be an expired key in my key.jks (but that hasn't been an issue so far).
- I've heard Liberty defaults to .p12 format for keystores now (new in 19.0.0.4). I'm still using .jks format, which I understand should still be supported for backward compatibility.
- I have appSecurity-2.0 being loaded, as well as mpJwt-1.1.
- I'm also using jwtSso-1.0 (this is my only microservice doing so, and the only one failing; could be coincidence).
- I'm just using basicRegistry.
- I've reproduced this outside of Kubernetes, just by doing a docker run trader:latest on my Mac.

teddyjtorres · 2019-05-08T17:16:21Z

It has to be determined why config admin provided a null value for the "expiration" attribute.

teddyjtorres · 2019-05-08T17:36:41Z

The service is not passed the attribute values from metatype. The trace shows only these properties at activate:

[5/7/19 16:23:00:866 UTC] 00000020 id=26fe8f0b om.ibm.ws.security.token.ltpa.internal.LTPAConfigurationImpl > activate Entry
org.apache.felix.scr.impl.manager.ComponentContextImpl@1ac6e462
{service.vendor=IBM, component.name=com.ibm.ws.security.token.ltpa.LTPAConfiguration, tokenType=Ltpa2, component.id=185}

tjwatson · 2019-05-09T13:51:05Z

Is the server configuration available to reproduce this?

tjwatson · 2019-05-09T18:51:25Z

recreate.zip

Unzip recreate.zip and run the recreate7448.sh script to reproduce.

tjwatson · 2019-05-09T19:17:09Z

Add the following to the end of your Dockerfile to work around:

RUN server start --clean && server stop

This also has the added benefit of doing some work in your docker build to cache the configuration changes of your application so it doesn't have to happen each time you start a new container based on your image.
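
A minimal sketch of where that line would sit, assuming a Dockerfile that copies a server configuration into the image and then runs installUtility against defaultServer. The base image tag, the COPY paths, and the WAR name are placeholders for illustration and are not taken from the trader repository:

```dockerfile
# Hypothetical base image; use whichever Liberty image you already build from.
FROM websphere-liberty:kernel

# Hypothetical paths: the server configuration and the application archive.
COPY server.xml /config/server.xml
COPY target/app.war /config/apps/app.war

# Installs the features named in the defaultServer configuration.
RUN installUtility install --acceptLicense defaultServer

# Workaround from this issue: start once with --clean, then stop, so the
# server discards any stale cache and re-caches its configuration at build time.
RUN server start --clean && server stop
```

The warmup line goes last so that it runs after every step that changes the server configuration or installed features.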

tjwatson · 2019-05-10T13:02:22Z

An alternative workaround is to not run installUtility against your defaultServer configuration. Instead, list the features you want to install, for example:

RUN installUtility install --acceptLicense jwtSso-1.0 logstashCollector-1.0
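
A sketch of a Dockerfile using this alternative, assuming the server needs only the two features named above. The base image tag and the COPY paths are placeholders for illustration, not taken from the trader repository:

```dockerfile
# Hypothetical base image; use whichever Liberty image you already build from.
FROM websphere-liberty:kernel

# Hypothetical paths: the server configuration and the application archive.
COPY server.xml /config/server.xml
COPY target/app.war /config/apps/app.war

# Name the features explicitly instead of pointing installUtility at the
# defaultServer configuration.
RUN installUtility install --acceptLicense jwtSso-1.0 logstashCollector-1.0
```

The trade-off is that the feature list in the Dockerfile has to be kept in step with the features listed in server.xml by hand.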

arthurdm · 2019-05-10T20:46:51Z

hey @tjwatson - I am curious as to why one of the workarounds is to install the features individually vs doing installUtility install --acceptLicense defaultServer. Can you please elaborate on that?

as fyi:

The Docker images for OL and WL already have a server start/stop warmup optimization. Although it doesn't put the application classes in the cache, it does improve startup considerably. It adds about 60-80 MB to the Docker layer, though.

So if you do another server warmup, it would duplicate most of the cache and bloat the image even more. Yes, it would then have a better cache, but that is probably not worth the double footprint hit. And you would have to worry about changing the group permissions (see the last part of this command)

tjwatson · 2019-05-10T21:05:39Z

> hey @tjwatson - I am curious as to why one of the workarounds is to install the features individually vs doing installUtility install --acceptLicense defaultServer. Can you please elaborate on that?

Running installUtility against a server configuration trashes the configurations persisted in that server's workarea. For defaultServer, these are the configurations persisted by the warmup in the Docker images for OL and WL, which you reference below. A cached restart depends on those cached configurations for its performance improvement, so once installUtility has trashed them the relaunch is doomed to fail, and that failure is the NPE from this issue. Also see the design issue #7491, which I opened as fallout from this issue.

If you instead give installUtility an explicit list of features rather than a server configuration, it does not trash the workarea of defaultServer, which preserves the configuration data we need to start successfully again.

> as fyi:
>
> The Docker images for OL and WL already have a server start/stop warmup optimization. Although it doesn't put the application classes in the cache, it does improve startup considerably. It adds about 60-80 MB to the Docker layer, though.
>
> So if you do another server warmup, it would duplicate most of the cache and bloat the image even more. Yes, it would then have a better cache, but that is probably not worth the double footprint hit. And you would have to worry about changing the group permissions (see the last part of this command)

OpenLiberty/open-liberty#7448

jwalcorn · 2019-05-13T16:18:48Z

I tried the workaround of starting (with --clean) and stopping the server at the end of my Dockerfile (https://github.com/IBMStockTrader/trader/blob/master/Dockerfile), and it seems to be working. Thanks for the quick suggestion. I'll be happy to test the real fix whenever it's ready, and remove the workaround. Thanks again!

jwalcorn · 2019-05-13T16:22:08Z

So a working 19.0.0.4-based image of my Trader UI microservice is in DockerHub now, available via "docker pull ibmstocktrader/trader:latest". My colleague Greg Hintermeister is about to deploy it to IKS and try it.

marikaj123 · 2019-05-15T13:15:21Z

Tom will provide Docker to John to test.

tjwatson · 2019-05-15T20:05:18Z

I provided John @jwalcorn with a Dockerfile in my fork of trader at https://github.com/tjwatson/trader/tree/LibertyIssue7448

This does require --build-arg to be used when doing a docker build to pass DOWNLOAD_OPTIONS with the --user and --password for accessing the liberty build.

marikaj123 · 2019-05-16T16:14:11Z

@jwalcorn - Let us know whether the fix passed verification and the issue can be closed.

jwalcorn · 2019-05-16T18:23:57Z

Just tested it built against the latest 19.0.0.5 GM-candidate Liberty build, without the workaround, and all is good!

marikaj123 · 2019-05-16T18:30:10Z

@jwalcorn - Thank you, and thank you too, Tom. Please close the issue, since the code passed testing with the green release build.

LibbyBot added the Needs member attention label May 8, 2019

tjwatson mentioned this issue May 10, 2019

Always add metatype to work around issue 7448 #7487

Merged

tjwatson mentioned this issue May 10, 2019

Protect the workarea of the server for different launch scenarios #7491

Closed

tjwatson self-assigned this May 10, 2019

tjwatson added the release bug, regression, and team:OSGi Infrastructure labels and removed the Needs member attention label May 10, 2019

jwalcorn pushed a commit to IBMStockTrader/trader that referenced this issue May 13, 2019

Workaround Liberty bug

8d27f3c

OpenLiberty/open-liberty#7448

tjwatson closed this as completed May 16, 2019

LibbyBot added the release:19005 label May 20, 2019

dazavala mentioned this issue Sep 13, 2019

7491 Revert the workaround for issue 7448 #8943

Merged
