Erratic crashes with "java.lang.NoSuchFieldError: IS_OSGI_RUNNING" when starting ETL by web app #1462

Closed
dr0i opened this issue Oct 7, 2022 · 16 comments · Fixed by metafacture/metafacture-fix#264

@dr0i
Member

dr0i commented Oct 7, 2022

Starting an ETL process by invoking the webhook may result in:

Exception in thread "AlmaMarcXmlFix2lobidJsonEs" java.lang.NoSuchFieldError: IS_OSGI_RUNNING
at org.eclipse.emf.ecore.impl.EPackageRegistryImpl.createGlobalRegistry(EPackageRegistryImpl.java:55)
at org.eclipse.emf.ecore.EPackage$Registry.<clinit>(EPackage.java:75)
at org.eclipse.xtext.xbase.XbaseStandaloneSetup.createInjectorAndDoEMFRegistration(XbaseStandaloneSetup.java:27)
at org.eclipse.xtext.xbase.XbaseStandaloneSetup.doSetup(XbaseStandaloneSetup.java:22)
at org.metafacture.metafix.FixStandaloneSetupGenerated.createInjectorAndDoEMFRegistration(FixStandaloneSetupGenerated.java:21)
at org.metafacture.metafix.validation.XtextValidator.getResource(XtextValidator.java:62)
at org.metafacture.metafix.validation.XtextValidator.getValidatedResource(XtextValidator.java:67)
at org.metafacture.metafix.FixStandaloneSetup.parseFix(FixStandaloneSetup.java:31)
at org.metafacture.metafix.Metafix.lambda$getRecordTransformer$0(Metafix.java:162)
at java.util.HashMap.computeIfAbsent(HashMap.java:1127)
at org.metafacture.metafix.Metafix.getRecordTransformer(Metafix.java:162)
at org.metafacture.metafix.Metafix.<init>(Metafix.java:111)
at org.lobid.resources.run.AlmaMarcXmlFix2lobidJsonEs.receiverThread(AlmaMarcXmlFix2lobidJsonEs.java:206)
at org.lobid.resources.run.AlmaMarcXmlFix2lobidJsonEs.access$800(AlmaMarcXmlFix2lobidJsonEs.java:42)

This happens rather erratically, regardless of whether the web app is started with sbt start $port or with the monit start script (which executes activator). Sometimes opening a new terminal, or invoking sbt clean and starting the web app, changes the behaviour, but not consistently.

As @blackwinter has pointed out, EMFPlugin.IS_OSGI_RUNNING has existed since version 2.23, and metafix updated this dependency from 2.17 to 2.26 in January. So one fix is to downgrade this dependency in metafix; a better solution, found by @blackwinter, is to declare an explicit dependency on org.eclipse.emf:org.eclipse.emf.common in metafix/build.gradle.

dr0i added this to Backlog in lobid board via automation Oct 7, 2022
dr0i self-assigned this Oct 7, 2022
dr0i moved this from Backlog to Review in lobid board Oct 7, 2022
dr0i changed the title from 'Erratically " java.lang.NoSuchFieldError: IS_OSGI_RUNNING" when starting web app' to 'Erratic crashes with "java.lang.NoSuchFieldError: IS_OSGI_RUNNING" when starting ETL by web app' Oct 7, 2022

@blackwinter
Member

FTR: We've seen NoSuchFieldError before, but this seems unrelated.

@blackwinter
Member

Reproducible with the following Dockerfile (on branch dr0i/lobid-resources@1435-addAlmaFixProductioWorkflow):

FROM ubuntu:latest

ENV DEBIAN_FRONTEND noninteractive
RUN apt update && apt install -y git maven openjdk-8-jdk-headless unzip wget && update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

WORKDIR /setup
RUN wget http://downloads.typesafe.com/typesafe-activator/1.3.10/typesafe-activator-1.3.10-minimal.zip && unzip typesafe-activator-1.3.10-minimal.zip
RUN git clone https://github.com/metafacture/metafacture-fix.git && cd metafacture-fix && ./gradlew publishToMavenLocal

WORKDIR /app
ADD pom.xml .
ADD src src
RUN mvn install -DskipTests=true

WORKDIR web
ADD web .
RUN /setup/activator-1.3.10-minimal/bin/activator compile

ENTRYPOINT ["/setup/activator-1.3.10-minimal/bin/activator"]
CMD ["start"]

[pts/0]$ docker build -t lobid-resources .
[pts/0]$ docker run -t -i --rm -p 9000 --name lobid-resources lobid-resources
# waiting for "Listening for HTTP on /0.0.0.0:9000"
[pts/1]$ port=$(docker port lobid-resources | sed -n '1s/.*://p')
[pts/1]$ curl localhost:$port/resources/webhook/basedump-almafix?token=123
[pts/0] Exception in thread "AlmaMarcXmlFix2lobidJsonEs" java.lang.NoSuchFieldError: IS_OSGI_RUNNING

Can't confirm the erratic nature of it, though.

@blackwinter
Member

The issue appears to be that org.eclipse.xtext:org.eclipse.xtext:2.26.0 depends on org.eclipse.emf:org.eclipse.emf.common:2.17.0, so it gets selected by sbt over org.eclipse.emf:org.eclipse.emf.common:2.26.0.

$ docker run -t -i --rm lobid-resources evicted
[warn] Found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[...]
[warn]  * org.eclipse.emf:org.eclipse.emf.common:2.17.0 is selected over {[2.26.0,3.0.0), [2.17.0,3.0.0)}
[warn]      +- org.eclipse.xtext:org.eclipse.xtext.util:2.26.0    (depends on 2.17.0)
[warn]      +- org.eclipse.xtext:org.eclipse.xtext:2.26.0         (depends on 2.17.0)
[warn]      +- org.eclipse.emf:org.eclipse.emf.ecore:2.20.0       (depends on [2.17.0,3.0.0))
[warn]      +- org.eclipse.emf:org.eclipse.emf.ecore:2.28.0       (depends on [2.26.0,3.0.0))
[...]
[warn] Run 'evicted' to see detailed eviction warnings
[info] Here are other depedency conflicts that were resolved:
[...]
[info]  * org.eclipse.emf:org.eclipse.emf.ecore:2.28.0 is selected over 2.20.0
[info]      +- org.eclipse.xtext:org.eclipse.xtext.util:2.26.0    (depends on 2.20.0)
[info]      +- org.eclipse.emf:org.eclipse.emf.ecore.xmi:2.16.0   (depends on 2.20.0)

But Gradle resolves to the same version without exhibiting the error (in metafacture-fix):

$ ./gradlew :metafix:dependencyInsight --dependency=org.eclipse.emf.common
org.eclipse.emf:org.eclipse.emf.common:2.17.0 (by constraint)
  Variant compile:
    | Attribute Name                 | Provided | Requested    |
    |--------------------------------|----------|--------------|
    | org.gradle.status              | release  |              |
    | org.gradle.category            | library  | library      |
    | org.gradle.libraryelements     | jar      | classes      |
    | org.gradle.usage               | java-api | java-api     |
    | org.gradle.dependency.bundling |          | external     |
    | org.gradle.jvm.environment     |          | standard-jvm |
    | org.gradle.jvm.version         |          | 8            |

org.eclipse.emf:org.eclipse.emf.common:2.17.0
+--- org.eclipse.xtext:org.eclipse.xtext:2.26.0
|    +--- compileClasspath
|    \--- org.eclipse.xtext:org.eclipse.xtext.common.types:2.26.0
|         \--- org.eclipse.xtext:org.eclipse.xtext.xbase:2.26.0
|              \--- compileClasspath
+--- org.eclipse.xtext:org.eclipse.xtext.util:2.26.0
|    \--- org.eclipse.xtext:org.eclipse.xtext:2.26.0 (*)
\--- org.eclipse.xtext:xtext-dev-bom:2.26.0
     \--- compileClasspath

org.eclipse.emf:org.eclipse.emf.common:[2.17.0,3.0.0) -> 2.17.0
\--- org.eclipse.emf:org.eclipse.emf.ecore:2.20.0
     +--- org.eclipse.xtext:xtext-dev-bom:2.26.0
     |    \--- compileClasspath
     +--- org.eclipse.xtext:org.eclipse.xtext.util:2.26.0
     |    \--- org.eclipse.xtext:org.eclipse.xtext:2.26.0
     |         +--- compileClasspath
     |         \--- org.eclipse.xtext:org.eclipse.xtext.common.types:2.26.0
     |              \--- org.eclipse.xtext:org.eclipse.xtext.xbase:2.26.0
     |                   \--- compileClasspath
     \--- org.eclipse.emf:org.eclipse.emf.ecore.xmi:2.16.0 (requested org.eclipse.emf:org.eclipse.emf.ecore:[2.18.0,3.0.0))
          +--- org.eclipse.xtext:xtext-dev-bom:2.26.0 (*)
          \--- org.eclipse.xtext:org.eclipse.xtext:2.26.0 (*)

(*) - dependencies omitted (listed previously)

So this doesn't explain the difference in behaviour between web app and CLI.

@blackwinter
Member

@fsteeg: Do you have any idea?

@blackwinter
Member

It can be solved with an explicit dependency on org.eclipse.emf:org.eclipse.emf.common:2.26.0:

--- Dockerfile.orig     2022-10-05 13:34:06.260475349 +0200
+++ Dockerfile  2022-10-05 13:30:57.588223177 +0200
@@ -5,7 +5,7 @@
 
 WORKDIR /setup
 RUN wget http://downloads.typesafe.com/typesafe-activator/1.3.10/typesafe-activator-1.3.10-minimal.zip && unzip typesafe-activator-1.3.10-minimal.zip
-RUN git clone https://github.com/metafacture/metafacture-fix.git && cd metafacture-fix && ./gradlew publishToMavenLocal
+RUN git clone https://github.com/metafacture/metafacture-fix.git && cd metafacture-fix && sed -i '/xtext\.xbase:/i\  implementation "org.eclipse.emf:org.eclipse.emf.common:${versions.xtext}"' metafix/build.gradle && ./gradlew publishToMavenLocal
 
 WORKDIR /app
 ADD pom.xml .

$ docker build -t lobid-resources-explicit .
$ docker run -t -i --rm lobid-resources-explicit evicted
[warn] Found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[...]
[warn]  * org.eclipse.emf:org.eclipse.emf.common:2.26.0 is selected over {[2.26.0,3.0.0), [2.17.0,3.0.0), 2.17.0}
[warn]      +- org.metafacture:metafix:0.4.0-SNAPSHOT             (depends on 2.26.0)
[warn]      +- org.eclipse.xtext:org.eclipse.xtext.util:2.26.0    (depends on 2.17.0)
[warn]      +- org.eclipse.xtext:org.eclipse.xtext:2.26.0         (depends on 2.17.0)
[warn]      +- org.eclipse.emf:org.eclipse.emf.ecore:2.20.0       (depends on [2.17.0,3.0.0))
[warn]      +- org.eclipse.emf:org.eclipse.emf.ecore:2.28.0       (depends on [2.26.0,3.0.0))
[...]
[warn] Run 'evicted' to see detailed eviction warnings
[info] Here are other depedency conflicts that were resolved:
[...]
[info]  * org.eclipse.emf:org.eclipse.emf.ecore:2.28.0 is selected over 2.20.0
[info]      +- org.eclipse.xtext:org.eclipse.xtext.util:2.26.0    (depends on 2.20.0)
[info]      +- org.eclipse.emf:org.eclipse.emf.ecore.xmi:2.16.0   (depends on 2.20.0)

But this definitely feels like a band-aid...

@fsteeg
Member

fsteeg commented Oct 10, 2022

emf.common:2.17.0 gets selected by sbt over 2.26.0 [...] @fsteeg: Do you have any idea?

Hm, I tried a few things (thanks for the Dockerfile for reproduction!). I suspected that using the old activator was somehow affecting the dependency resolution behavior, but had no success with the latest SBT either. There must be some different (or missing) implementation of the OSGi greaterOrEqual config in (Ivy-based) SBT vs. Gradle. I think adding the explicit dependency is a good fix, basically clarifying the greaterOrEqual dependency in Xtext for ourselves.

@blackwinter
Member

Thanks for the analysis! Although it's still rather unsatisfying because Gradle resolves to the same version.

Anyway, let's go with the workaround then 😞

@blackwinter
Member

Let's step back for a moment: What exactly are the differences between web app and CLI?

  • Build tool: sbt vs. Maven (Gradle in metafacture-fix)
  • Additional dependencies as specified in web/build.sbt
  • Anything else?

I wasn't able to get sbt-dependency-graph to install just now, but I may try again tomorrow in order to look at the actual dependencies.

@blackwinter
Member

Approaching the issue from a different angle: Who's expecting that there is such a field IS_OSGI_RUNNING and then complaining if it can't be found? I don't even understand how this error comes into existence in the first place.

@dr0i
Member Author

dr0i commented Oct 11, 2022

re "sbt-dependency-graph":

mkdir -p ~/.sbt/0.13/plugins/;
echo 'addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.10.0-RC1")' >> ~/.sbt/0.13/plugins/plugins.sbt;
sbt dependencyTree;

@blackwinter
Member

Thanks! I've tried with web/plugins.sbt before, which didn't work.

@fsteeg
Member

fsteeg commented Oct 11, 2022

Who's expecting that there is such a field IS_OSGI_RUNNING and then complaining if it can't be found?

From what I understand it's some (dynamically loaded) OSGi module in EMF (which Xtext is based on). And SBT seems to resolve the lower bound of the EMF dependency in Xtext, resulting in a version without IS_OSGI_RUNNING made available to a module expecting it. I don't quite understand where or how the other module is coming in, the one expecting IS_OSGI_RUNNING. It might somehow be related to different behavior of the Maven repo (used by the lobid-resources and metafacture-fix builds) and the Ivy repo (used by SBT). Like somehow transitively, via Maven, the newer version is used, while directly, via SBT/Ivy, it's the older version?
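To see which copy of EMFPlugin a given classpath actually provides, and whether that copy declares the field, a small reflective probe might help. This is only a sketch and was not used in this thread: ClasspathProbe and its methods are hypothetical names, and the result is only meaningful if the probe runs with the same classpath and class loader as the web app.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from and whether it declares a field. */
class ClasspathProbe {

    /** Returns true if the named class can be loaded and declares the given field. */
    static boolean hasField(String className, String fieldName) {
        try {
            // initialize=false: inspect the class without running its static initializer
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            type.getDeclaredField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, or a placeholder text. */
    static String origin(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself usually have no code source
            return source == null ? "(JDK runtime)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults match the class and field named in this issue
        String className = args.length > 0 ? args[0] : "org.eclipse.emf.common.EMFPlugin";
        String fieldName = args.length > 1 ? args[1] : "IS_OSGI_RUNNING";
        System.out.println(className + " loaded from " + origin(className));
        System.out.println("declares " + fieldName + ": " + hasField(className, fieldName));
    }
}
```

If the reported jar is org.eclipse.emf.common 2.17.0 and the field is missing, any class compiled against a newer EMF that reads EMFPlugin.IS_OSGI_RUNNING would be expected to fail at link time with exactly this NoSuchFieldError.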

A specific exclusion (of some of the Xtext, Xbase, and/or EMF dependencies) should also do the trick. I've been trying that in pom.xml and build.sbt, but without success. It seems we need (or needed; it seems to work without it now) an exclusion of org.eclipse.xtext/xtext-dev-bom in metafacture-playground, but that doesn't seem to help here either. Sorry, I'm out of ideas right now. For a future perspective, as mentioned elsewhere, we might consider replacing Xtext in metafacture-fix with direct ANTLR usage, to have less upstream complexity and be consistent with metafacture-flux.

@blackwinter
Member

While both web app (sbt) and CLI (Maven) resolve to the same org.eclipse.emf:org.eclipse.emf.common:2.17.0, there are some differences:

--- mvn 14:19:35.963557968 +0200
+++ sbt 14:42:45.175024451 +0200
+cglib:cglib-nodep:2.1_3
+ch.qos.logback:logback-classic:1.1.3
+ch.qos.logback:logback-core:1.1.3
+com.fasterxml:classmate:1.0.0
+com.fasterxml.jackson.datatype:jackson-datatype-jdk8:2.5.4
+com.fasterxml.jackson.datatype:jackson-datatype-jsr310:2.5.4
+com.github.jsonld-java:jsonld-java-jena:0.4.1
-com.google.code.findbugs:jsr305:1.3.7
+com.google.code.findbugs:jsr305:3.0.2
+com.google.code.gson:gson:2.3.1
+com.google.errorprone:error_prone_annotations:2.3.4
+com.google.guava:failureaccess:1.0.1
+com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-guava
+com.google.inject.extensions:guice-assistedinject:4.0
+com.google.j2objc:j2objc-annotations:1.3
-commons-io:commons-io:2.5
+commons-io:commons-io:2.6
-com.ning:async-http-client:1.9.33
+com.ning:async-http-client:1.9.40
+com.novocode:junit-interface:0.11
+com.typesafe:config:1.3.0
+com.typesafe.netty:netty-http-pipelining:1.1.4
+com.typesafe.play:build-link:2.4.11
+com.typesafe.play:play-exceptions:2.4.11
+com.typesafe.play:play-netty-utils:2.4.11
-io.netty:netty:3.10.5.Final
+io.netty:netty:3.10.6.Final
+javax.transaction:jta:1.1
+javax.validation:validation-api:1.1.0.Final
+net.java.dev.jna:jna:4.1.0
+net.java.dev.jna:jna-platform:4.1.0
+net.jodah:typetools:0.4.3
+net.sf.ehcache:ehcache-core:2.6.11
+net.sourceforge.cssparser:cssparser:0.9.16
+net.sourceforge.htmlunit:htmlunit:2.18
+net.sourceforge.htmlunit:htmlunit-core-js:2.17
+net.sourceforge.nekohtml:nekohtml:1.9.22
+oauth.signpost:signpost-commonshttp4:1.2.1.2
+oauth.signpost:signpost-core:1.2.1.2
+org.apache.commons:commons-exec:1.3
+org.apache.commons:commons-lang3:3.4
-org.apache.jena:jena-arq:2.10.1
-org.apache.jena:jena-core:2.10.1
-org.apache.jena:jena-iri:0.9.6
+org.apache.jena:jena-arq:2.11.1
+org.apache.jena:jena-core:2.11.1
+org.apache.jena:jena-iri:1.0.1
+org.apache.tomcat:tomcat-servlet-api:8.0.21
+org.checkerframework:checker-qual:2.11.1
-org.eclipse.emf:org.eclipse.emf.ecore:2.20.0
+org.eclipse.emf:org.eclipse.emf.ecore:2.28.0
+org.eclipse.jetty:jetty-io:9.2.12.v20150709
+org.eclipse.jetty:jetty-util:9.2.12.v20150709
+org.eclipse.jetty.websocket:websocket-api:9.2.12.v20150709
+org.eclipse.jetty.websocket:websocket-client:9.2.12.v20150709
+org.eclipse.jetty.websocket:websocket-common:9.2.12.v20150709
+org.fluentlenium:fluentlenium-core:0.10.9
+org.hibernate:hibernate-validator:5.0.3.Final
+org.javassist:javassist:3.19.0-GA
+org.jboss.logging:jboss-logging:3.2.1.Final
+org.joda:joda-convert:1.7
+org.lobid:lobid-resources:0.4.1-thread-SNAPSHOT
-org.metafacture:metafacture-commons:5.3.2
+org.metafacture:metafacture-commons:5.4.0
-org.metafacture:metafacture-framework:5.3.2
+org.metafacture:metafacture-framework:5.4.0
-org.metafacture:metafacture-javaintegration:5.3.2
+org.metafacture:metafacture-javaintegration:5.4.0
+org.mockito:mockito-core:1.9.5
-org.ow2.asm:asm:9.1
+org.objenesis:objenesis:1.0
+org.ow2.asm:asm:9.2
-org.slf4j:jcl-over-slf4j:1.6.4
+org.reflections:reflections:0.9.9
+org.scala-sbt:test-interface:1.0
+org.seleniumhq.selenium:selenium-api:2.48.2
+org.seleniumhq.selenium:selenium-chrome-driver:2.48.2
+org.seleniumhq.selenium:selenium-edge-driver:2.48.2
+org.seleniumhq.selenium:selenium-firefox-driver:2.48.2
+org.seleniumhq.selenium:selenium-htmlunit-driver:2.48.2
+org.seleniumhq.selenium:selenium-ie-driver:2.48.2
+org.seleniumhq.selenium:selenium-java:2.48.2
+org.seleniumhq.selenium:selenium-leg-rc:2.48.2
+org.seleniumhq.selenium:selenium-remote-driver:2.48.2
+org.seleniumhq.selenium:selenium-safari-driver:2.48.2
+org.seleniumhq.selenium:selenium-support:2.48.2
+org.slf4j:jcl-over-slf4j:1.7.26
+org.slf4j:jul-to-slf4j:1.7.21
+org.springframework:spring-beans:4.1.6.RELEASE
+org.springframework:spring-context:4.1.6.RELEASE
+org.springframework:spring-core:4.1.6.RELEASE
+org.w3c.css:sac:1.3
+org.webbitserver:webbit:0.4.14
-xalan:xalan:2.7.0
+xalan:serializer:2.7.2
+xalan:xalan:2.7.2

Most notably:

  • org.metafacture:metafacture-*: Maven uses 5.3.2, sbt mixes 5.3.2 and 5.4.0
  • org.eclipse.emf:org.eclipse.emf.ecore: Maven uses 2.20.0, sbt uses 2.28.0

@blackwinter
Member

Sorry, I'm out of ideas right now.

Me too. I'll try pinning org.eclipse.emf.ecore instead of org.eclipse.emf.common in metafacture-fix as that at least appears to have some foundation.
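One possible reading of that foundation: in the sbt eviction report earlier in this thread, org.eclipse.emf.ecore 2.28.0 asks for org.eclipse.emf.common in [2.26.0,3.0.0), while ecore 2.20.0 asks for [2.17.0,3.0.0). The sketch below checks the selected emf.common versions against both ranges. VersionRangeCheck and its methods are hypothetical names, and the parsing is a deliberate simplification that handles only purely numeric versions and ranges with both bounds given, not the full Maven or Ivy range syntax.

```java
import java.util.Arrays;

/** Hypothetical helper: checks dotted numeric versions against ranges such as "[2.26.0,3.0.0)". */
class VersionRangeCheck {

    /** Compares two dotted numeric versions; missing segments count as 0. */
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    private static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    /** Returns true if the version lies within a range that has both bounds. */
    static boolean satisfies(String version, String range) {
        String r = range.trim();
        boolean lowerInclusive = r.startsWith("[");
        boolean upperInclusive = r.endsWith("]");
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int low = compare(version, bounds[0]);
        int high = compare(version, bounds[1]);
        return (lowerInclusive ? low >= 0 : low > 0)
            && (upperInclusive ? high <= 0 : high < 0);
    }

    public static void main(String[] args) {
        // Ranges as listed for ecore 2.20.0 and ecore 2.28.0 in the eviction report
        String[] ranges = {"[2.17.0,3.0.0)", "[2.26.0,3.0.0)"};
        for (String version : new String[] {"2.17.0", "2.26.0"}) {
            for (String range : ranges) {
                System.out.println(version + " in " + range + ": " + satisfies(version, range));
            }
        }
    }
}
```

Under these assumptions 2.17.0 falls outside the range requested by ecore 2.28.0 but inside the one requested by ecore 2.20.0, which would fit the observation that the sbt classpath (ecore 2.28.0) fails while the Maven and Gradle classpaths (ecore 2.20.0) do not.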

For a future perspective, as mentioned elsewhere, we might consider replacing Xtext in metafacture-fix with direct ANTLR usage, to have less upstream complexity and be consistent with metafacture-flux.

👍

blackwinter added a commit to metafacture/metafacture-fix that referenced this issue Oct 11, 2022
…es#1462)

It appears that sbt resolves to a newer version (2.28.0) than Maven/Gradle (2.20.0), which might cause some confusion in Xtext EMF registration.

@blackwinter
Member

I'll try pinning org.eclipse.emf.ecore instead of org.eclipse.emf.common in metafacture-fix as that at least appears to have some foundation.

I've verified this workaround (same steps as before) and opened metafacture/metafacture-fix#264.

@dr0i: Can you do the functional review for the pull request?

@dr0i
Member Author

dr0i commented Oct 11, 2022

Functional review: ✅
