(Release 3.3) Egeria container fails to start #89

Closed
1 task done
planetf1 opened this issue Oct 28, 2021 · 10 comments · Fixed by odpi/egeria#5856 or odpi/egeria#5860

@planetf1
Member

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When testing the egeria 3.3 release, the egeria container is failing to start with the error below

$ kubectl logs lab-odpi-egeria-lab-dev-0 [14:57:39]
Starting the Java application using /opt/jboss/container/java/run/run-java.sh ...
INFO exec java -XX:+UseParallelOldGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+ExitOnOutOfMemoryError -XX:MaxMetaspaceSize=1g -cp "." -jar /deployments/server/server-chassis-spring-3.3-SNAPSHOT.jar
Project Egeria - Open Metadata and Governance
    ____   __  ___ ___    ______   _____                                 ____   _         _     ___
   / __ \ /  |/  //   |  / ____/  / ___/ ___   ____ _   __ ___   ____   / _  \ / / __    / /  / _ /__   ____ _  _
  / / / // /|_/ // /| | / / __    \__ \ / _ \ / __/| | / // _ \ / __/  / /_/ // //   |  / _\ / /_ /  | /  _// || |
 / /_/ // /  / // ___ |/ /_/ /   ___/ //  __// /   | |/ //  __// /    /  __ // // /  \ / /_ /  _// / // /  / / / /
 \____//_/  /_//_/  |_|\____/   /____/ \___//_/    |___/ \___//_/    /_/    /_/ \__/\//___//_/   \__//_/  /_/ /_/

:: Powered by Spring Boot (v2.5.5) ::

2021-10-28 13:57:15.111 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 9443 (https)
2021-10-28 13:57:29.413 ERROR 1 --- [ main] o.s.boot.SpringApplication : Application run failed

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'modelConverterRegistrar' defined in class path resource [org/springdoc/core/SpringDocConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springdoc.core.converters.ModelConverterRegistrar]: Factory method 'modelConverterRegistrar' threw exception; nested exception is java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:658) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:638) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1352) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1195) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:582) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:944) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918) ~[spring-context-5.3.10.jar!/:5.3.10]
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583) ~[spring-context-5.3.10.jar!/:5.3.10]
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:145) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:434) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:338) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1332) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.odpi.openmetadata.serverchassis.springboot.OMAGServerPlatform.main(OMAGServerPlatform.java:93) ~[classes!/:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.Launcher.launch(Launcher.java:108) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.PropertiesLauncher.main(PropertiesLauncher.java:467) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springdoc.core.converters.ModelConverterRegistrar]: Factory method 'modelConverterRegistrar' threw exception; nested exception is java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:653) ~[spring-beans-5.3.10.jar!/:5.3.10]
... 27 common frames omitted
Caused by: java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at io.swagger.v3.core.util.Json.mapper(Json.java:13) ~[swagger-core-2.1.11.jar!/:2.1.11]
at io.swagger.v3.core.converter.ModelConverters.<init>(ModelConverters.java:31) ~[swagger-core-2.1.11.jar!/:2.1.11]
at io.swagger.v3.core.converter.ModelConverters.<clinit>(ModelConverters.java:23) ~[swagger-core-2.1.11.jar!/:2.1.11]
at org.springdoc.core.converters.ModelConverterRegistrar.<init>(ModelConverterRegistrar.java:42) ~[springdoc-openapi-common-1.5.12.jar!/:1.5.12]
at org.springdoc.core.SpringDocConfiguration.modelConverterRegistrar(SpringDocConfiguration.java:229) ~[springdoc-openapi-common-1.5.12.jar!/:1.5.12]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.10.jar!/:5.3.10]
... 28 common frames omitted
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.dataformat.yaml.YAMLFactory
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476) ~[na:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589) ~[na:na]
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:151) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) ~[na:na]
... 38 common frames omitted

Expected Behavior

chassis should launch ok

Steps To Reproduce

No response

Environment

- Egeria:
- OS:
- Java:
- Browser (for UI issues):
- Additional connectors and integration:

Any Further Information?

No response

@planetf1
Member Author

This could be similar to an issue experienced by @sarbull & @marius-patrascu at https://lfaifoundation.slack.com/archives/C01F40J2XA8/p1635152393030500

However in that case a) I did not see the issue running the container b) the issue was resolved after a clean build

Need to review 793a346b69d9373153ebf5b1ecfe288d0d7ebbec again. There could be some relationship with Jackson versions.

It looks as if we may have a dependency on com.fasterxml.jackson.dataformat:jackson-dataformat-yaml but are not including it.
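
One way to confirm a suspicion like this is to look inside the built jar. The following is a minimal sketch, assuming the chassis is packaged in the standard Spring Boot executable-jar layout, where dependencies sit as nested jars under BOOT-INF/lib/. The helper names `find_nested_jars` and `build_demo_jar` are hypothetical, and the demo archive is stand-in data, not the real chassis jar.

```python
import io
import zipfile


def find_nested_jars(fat_jar, artifact_prefix):
    """Return nested library jars whose file name starts with artifact_prefix.

    Assumes the Spring Boot executable-jar layout, where dependencies are
    stored as nested jars under BOOT-INF/lib/. `fat_jar` may be a path or a
    binary file-like object.
    """
    with zipfile.ZipFile(fat_jar) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith("BOOT-INF/lib/")
            and name.endswith(".jar")
            and name.rsplit("/", 1)[-1].startswith(artifact_prefix)
        )


def build_demo_jar(entries):
    """Build an in-memory zip standing in for a fat jar (demo data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for entry in entries:
            archive.writestr(entry, b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Stand-in for a chassis jar built without the YAML dataformat library.
    demo = build_demo_jar([
        "BOOT-INF/lib/swagger-core-2.1.11.jar",
        "BOOT-INF/lib/springdoc-openapi-common-1.5.12.jar",
    ])
    matches = find_nested_jars(demo, "jackson-dataformat-yaml-")
    print(matches or "jackson-dataformat-yaml is missing from BOOT-INF/lib")
```

Against a real build, the first argument would be the path to the chassis jar; an empty result would be consistent with the ClassNotFoundException in the log above.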

@planetf1 planetf1 self-assigned this Oct 28, 2021
@planetf1
Member Author

planetf1 commented Oct 28, 2021

Further update: on closer inspection, my local container image was old, predating the commit referenced above. As such, my test regarding the Slack message would have been invalid (I had just moved from docker to podman, and the process is a little different). That doesn't mean the code didn't work; I don't know.

Of note is that our automated tests ran ok, including the merge tests. It is unclear why, since in both a local test and a container test the failure was immediate and fatal.

I'm not yet going to propose simply backing the change out. I will look at whether the packaging has changed and whether we need to include an additional library/dependency.

Affects master+release

@planetf1
Member Author

This may be the cause

When the Elasticsearch support was added, an exclusion was added to the springdoc dependency:

        <dependency>
            <groupId>org.springdoc</groupId>
            <artifactId>springdoc-openapi-ui</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>com.fasterxml.jackson.dataformat</groupId>
                    <artifactId>jackson-dataformat-yaml</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

This exclusion was added in 32ed9f248e6294f4b48db42ab65382ae4dec8e6a (part of odpi/egeria#5733) and may be causing the issue, as it could cause the required class to be left out of the jar. But it could also be coincidence.

@planetf1
Member Author

planetf1 commented Oct 28, 2021

Removed the above exclusion. Note that it was not excluded in the Gradle build.

With this done, the assembly fails with:

17:15:04,699 [INFO] --- maven-enforcer-plugin:3.0.0-M3:enforce (enforce-versions) @ open-metadata-assemblies ---
17:15:05,499 [WARNING] Rule 5: org.apache.maven.plugins.enforcer.RequireUpperBoundDeps failed with message:
Failed while enforcing RequireUpperBoundDeps. The error(s) are [
Require upper bound dependencies error for com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.10.4 paths to dependency are:
+-org.odpi.egeria:open-metadata-assemblies:3.4-SNAPSHOT
  +-org.odpi.egeria:elasticsearch-integration-connector:3.4-SNAPSHOT
    +-org.elasticsearch:elasticsearch-x-content:7.15.0
      +-com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.10.4
and
+-org.odpi.egeria:open-metadata-assemblies:3.4-SNAPSHOT
  +-org.odpi.egeria:server-chassis-spring:3.4-SNAPSHOT
    +-org.springdoc:springdoc-openapi-ui:1.5.12
      +-org.springdoc:springdoc-openapi-webmvc-core:1.5.12
        +-org.springdoc:springdoc-openapi-common:1.5.12
          +-io.swagger.core.v3:swagger-integration:2.1.11
            +-io.swagger.core.v3:swagger-core:2.1.11
              +-com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.12.1
]

This is an inconsistency between the elasticsearch integration connector and swagger.

Crucially we must have a working core egeria platform - so the options are:

  • back-level (downgrade) components - but our policy is to keep up to date as much as we can to reduce security impacts. Jackson in particular has had many exposures in the past
  • remove swagger - though it's far from perfect, we know plenty of people are using this
  • force a common version between the two usages (our usual approach). Notably, we already take the same approach for many Jackson components, and pinning the version here maintains consistency with the broad spectrum of Jackson libraries we use
  • if this doesn't work, the integration connector could use a shaded library to ensure it is using its preferred version

The third option above is the most desirable, so I will create a PR along those lines.
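
The conflict reported by the enforcer, and the idea of forcing one common version, can be illustrated with a small sketch. It takes the two dependency paths from the log above, finds any artifact requested at more than one version, and proposes the highest as the version to pin. This is only an illustration: Maven's actual resolution and the RequireUpperBoundDeps rule are more involved, picking the highest version is an assumption rather than something stated in this issue, and the function names are hypothetical.

```python
from collections import defaultdict

# The two dependency paths reported by the enforcer, as group:artifact:version.
DEPENDENCY_PATHS = [
    [
        "org.odpi.egeria:open-metadata-assemblies:3.4-SNAPSHOT",
        "org.odpi.egeria:elasticsearch-integration-connector:3.4-SNAPSHOT",
        "org.elasticsearch:elasticsearch-x-content:7.15.0",
        "com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.10.4",
    ],
    [
        "org.odpi.egeria:open-metadata-assemblies:3.4-SNAPSHOT",
        "org.odpi.egeria:server-chassis-spring:3.4-SNAPSHOT",
        "org.springdoc:springdoc-openapi-ui:1.5.12",
        "org.springdoc:springdoc-openapi-webmvc-core:1.5.12",
        "org.springdoc:springdoc-openapi-common:1.5.12",
        "io.swagger.core.v3:swagger-integration:2.1.11",
        "io.swagger.core.v3:swagger-core:2.1.11",
        "com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.12.1",
    ],
]


def parse_version(version):
    """Turn '2.12.1' into (2, 12, 1) so versions compare numerically.

    Only handles plain dotted numeric versions (no qualifiers).
    """
    return tuple(int(part) for part in version.split("."))


def find_conflicts(dependency_paths):
    """Map each artifact requested at several versions to its highest version."""
    versions = defaultdict(set)
    for path in dependency_paths:
        for coordinate in path:
            group, artifact, version = coordinate.split(":")
            versions[f"{group}:{artifact}"].add(version)
    # parse_version is only applied to artifacts that actually conflict.
    return {
        artifact: max(found, key=parse_version)
        for artifact, found in versions.items()
        if len(found) > 1
    }


if __name__ == "__main__":
    for artifact, pinned in find_conflicts(DEPENDENCY_PATHS).items():
        print(f"{artifact}: pin to {pinned}")
```

For the paths above, the only artifact flagged is com.fasterxml.jackson.dataformat:jackson-dataformat-yaml, requested at 2.10.4 by the Elasticsearch connector and at 2.12.1 by swagger-core.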

cc: @marius-patrascu @lpalashevski - is this ok for you?

Since this is a complete breakage of the chassis, I will merge this PR if tests pass, and if we need to change the fix we will do so in a subsequent PR. I will then continue with release testing (including the open lineage notebook).

planetf1 referenced this issue in planetf1/egeria Oct 28, 2021
…n/version incompat

Signed-off-by: Nigel Jones <nigel.l.jones+git@gmail.com>
@planetf1
Member Author

After fix:

 chassis                                                                                         [17:41:23]
 Project Egeria - Open Metadata and Governance
    ____   __  ___ ___    ______   _____                                 ____   _         _     ___
   / __ \ /  |/  //   |  / ____/  / ___/ ___   ____ _   __ ___   ____   / _  \ / / __    / /  / _ /__   ____ _  _
  / / / // /|_/ // /| | / / __    \__ \ / _ \ / __/| | / // _ \ / __/  / /_/ // //   |  / _\ / /_ /  | /  _// || |
 / /_/ // /  / // ___ |/ /_/ /   ___/ //  __// /   | |/ //  __// /    /  __ // // /  \ / /_ /  _// / // /  / / / /
 \____//_/  /_//_/  |_|\____/   /____/ \___//_/    |___/ \___//_/    /_/    /_/ \__/\//___//_/   \__//_/  /_/ /_/

 :: Powered by Spring Boot (v2.5.5) ::

2021-10-28 17:41:33.290  INFO 41487 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 9443 (https)
2021-10-28 17:41:40.823  INFO 41487 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 9443 (https) with context path ''

Thu Oct 28 17:41:35 BST 2021 No OMAG servers listed in startup configuration
Thu Oct 28 17:41:40 BST 2021 OMAG server platform ready for more configuration

planetf1 referenced this issue in planetf1/egeria Oct 28, 2021
…n/version incompat

Signed-off-by: Nigel Jones <nigel.l.jones+git@gmail.com>
planetf1 referenced this issue in odpi/egeria Oct 28, 2021
#5855 Fix chassis start failure caused by jackson dataformat version
planetf1 referenced this issue in odpi/egeria Oct 28, 2021
(Release 3.3) #5855 Fix chassis start failure (incompat versions of jackson dataformat)
@planetf1
Member Author

With the main builds having rerun

planetf1 referenced this issue in planetf1/egeria Oct 29, 2021
Signed-off-by: Nigel Jones <nigel.l.jones+git@gmail.com>
planetf1 referenced this issue in planetf1/egeria Oct 29, 2021
Signed-off-by: Nigel Jones <nigel.l.jones+git@gmail.com>
planetf1 referenced this issue in odpi/egeria Oct 29, 2021
#5855 add publish to quay.io for release pipeline
planetf1 referenced this issue in odpi/egeria Oct 29, 2021
(Release 3.3) #5855 add publish to quay.io for release pipeline
@planetf1
Member Author

This is still occurring with the latest helm charts, i.e. for 3.3:

  Normal   Started         24s (x2 over 53s)  kubelet            Started container egeria
  Normal   Pulled          24s                kubelet            Successfully pulled image "quay.io/odpi/egeria:3.3-SNAPSHOT" in 1.082747989s

giving the error.

Yet running the containers standalone with podman now works correctly for both the 3.3 and latest (3.4-SNAPSHOT) images; prior to the fixes here, that too was failing. The updated 'docker-compose' environment is also working!

This means we still have a problem in OpenShift, which must be one of:

  • Caching issues with deployment of the chart (despite the logs reporting the image was pulled)
  • Specific environment issue in the k8s configuration

Transferring this issue to the charts repo now that the underlying issue is resolved, and reopening

@planetf1 planetf1 reopened this Oct 29, 2021
@planetf1 planetf1 transferred this issue from odpi/egeria Oct 29, 2021
@lpalashevski lpalashevski pinned this issue Nov 1, 2021
@planetf1
Member Author

planetf1 commented Nov 1, 2021

Located the issue - the log above reports the image is being pulled from '3.3-SNAPSHOT' (which has the bug) rather than '3.3' (which contains the final fix)
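
A check like this can be scripted. The following is a rough sketch that extracts the image and tag from kubelet "Successfully pulled image" events such as the one quoted earlier. The function name is hypothetical, and splitting on the last colon is a simplification that does not handle digest references or untagged images on a registry with a port.

```python
import re

# Matches the quoted image reference in a kubelet "Pulled" event message.
PULLED_IMAGE = re.compile(r'Successfully pulled image "([^"]+)"')


def pulled_images(events_text):
    """Return (repository, tag) pairs for each pulled image in the event text."""
    results = []
    for reference in PULLED_IMAGE.findall(events_text):
        # Simplification: treat everything after the last colon as the tag.
        repository, _, tag = reference.rpartition(":")
        results.append((repository, tag))
    return results


if __name__ == "__main__":
    events = (
        '  Normal   Pulled          24s                kubelet            '
        'Successfully pulled image "quay.io/odpi/egeria:3.3-SNAPSHOT" in 1.082747989s'
    )
    for repository, tag in pulled_images(events):
        marker = " (snapshot build)" if tag.endswith("-SNAPSHOT") else ""
        print(f"{repository} -> {tag}{marker}")
```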

@planetf1
Member Author

planetf1 commented Nov 1, 2021

In 'values.yaml', imageDefaults.tag was set to '3.3' but egeria.version was set to '3.3-SNAPSHOT'. It is egeria.version that is used for all core Egeria images, whilst imageDefaults.tag is used for non-Egeria images.
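
The selection rule described here can be sketched as follows. This is a simplified model of the behaviour described in this comment, not the chart's actual template logic. The values are represented as a plain dictionary standing in for the parsed 'values.yaml', and the function name is hypothetical.

```python
def resolve_image_tag(values, is_core_egeria_image):
    """Pick the tag for an image using the rule described in this comment.

    Core Egeria images follow egeria.version; all other images follow
    imageDefaults.tag. This is a simplified model, not the chart's templates.
    """
    if is_core_egeria_image:
        return values["egeria"]["version"]
    return values["imageDefaults"]["tag"]


if __name__ == "__main__":
    # The mismatched settings described above.
    values = {
        "egeria": {"version": "3.3-SNAPSHOT"},
        "imageDefaults": {"tag": "3.3"},
    }
    print("core egeria image tag:", resolve_image_tag(values, True))
    print("other image tag:", resolve_image_tag(values, False))
```

Under this model, changing imageDefaults.tag alone leaves the core Egeria images on the snapshot tag, which matches the pull reported in the event log.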

planetf1 added a commit to planetf1/egeria-charts that referenced this issue Nov 1, 2021
Signed-off-by: Nigel Jones <nigel.l.jones+git@gmail.com>
planetf1 added a commit that referenced this issue Nov 1, 2021
@planetf1
Member Author

planetf1 commented Nov 1, 2021

Fixed in #90

@planetf1 planetf1 closed this as completed Nov 1, 2021
@planetf1 planetf1 unpinned this issue Nov 24, 2021