Error Could not initialize class ch.systemsx.cisd.hdf5.CharacterEncoding on AffineExport on h5 file #8

boazmohar · 2022-04-21T21:01:27Z

I am trying to do an AffineExport with spark:

~/spark-janelia/flintstone.sh 4 \
/groups/spruston/home/moharb/BigStitcher-Spark/target/BigStitcher-Spark-0.0.2-SNAPSHOT.jar \
net.preibisch.bigstitcher.spark.AffineFusion \
-x '/groups/mousebrainmicro/mousebrainmicro/data/Lightsheet/20210812_AG/ML_Rendering-test/aligned_data.xml' \
-o  '/nrs/svoboda/moharb/test_ML.n5' -d '/s0'

And get this error:

2022-04-21 15:45:37,731 [task-result-getter-0] ERROR [TaskSetManager]: Task 1 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 78, 10.36.107.42, executor 0): java.lang.NoClassDefFoundError: Could not initialize class ch.systemsx.cisd.hdf5.CharacterEncoding
	at ch.systemsx.cisd.hdf5.HDF5BaseReader.<init>(HDF5BaseReader.java:143)
	at ch.systemsx.cisd.hdf5.HDF5BaseReader.<init>(HDF5BaseReader.java:126)
	at ch.systemsx.cisd.hdf5.HDF5ReaderConfigurator.reader(HDF5ReaderConfigurator.java:86)
	at ch.systemsx.cisd.hdf5.HDF5FactoryProvider$HDF5Factory.openForReading(HDF5FactoryProvider.java:54)
	at ch.systemsx.cisd.hdf5.HDF5Factory.openForReading(HDF5Factory.java:55)
	at bdv.img.hdf5.Hdf5ImageLoader.open(Hdf5ImageLoader.java:183)
	at bdv.img.hdf5.Hdf5ImageLoader.getSetupImgLoader(Hdf5ImageLoader.java:381)
	at bdv.img.hdf5.Hdf5ImageLoader.getSetupImgLoader(Hdf5ImageLoader.java:79)
	at net.preibisch.bigstitcher.spark.util.ViewUtil.getTransformedBoundingBox(ViewUtil.java:32)
	at net.preibisch.bigstitcher.spark.AffineFusion.lambda$call$7b7a6284$1(AffineFusion.java:268)
	at org.apache.spark.api.java.JavaRDDLike.$anonfun$foreach$1(JavaRDDLike.scala:351)
	at org.apache.spark.api.java.JavaRDDLike.$anonfun$foreach$1$adapted(JavaRDDLike.scala:351)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
	at org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:986)
	at org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:986)
	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2139)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:127)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

I can open it in Fiji and look at the data with BigStitcher without an issue.
The xml is in:
/groups/mousebrainmicro/mousebrainmicro/data/Lightsheet/20210812_AG/ML_Rendering-test/aligned_data.xml
Any idea what to do?
Found this, might be related.

Thanks,
Boaz


carshadi · 2022-04-22T23:07:52Z

Hi @boazmohar and @StephanPreibisch, I'm getting the same error on a SLURM cluster running a standalone Spark cluster.

java info:

openjdk version "1.8.0_332"
OpenJDK Runtime Environment (Zulu 8.62.0.19-CA-linux64) (build 1.8.0_332-b09)
OpenJDK 64-Bit Server VM (Zulu 8.62.0.19-CA-linux64) (build 25.332-b09, mixed mode)

mvn:

Maven home: /home/cameron.arshadi/opt/apache-maven-3.8.5
Java version: 1.8.0_332, vendor: Azul Systems, Inc., runtime: /allen/scratch/aindtemp/cameron.arshadi/tools/jvm/zulu8.62.0.19-ca-jdk8.0.332-linux_x64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-1160.15.2.el7.x86_64", arch: "amd64", family: "unix"

submit command:

spark-submit --master ${MASTER_URL} \
             --total-executor-cores $((SLURM_NTASKS * SLURM_CPUS_PER_TASK)) \
             --class net.preibisch.bigstitcher.spark.AffineFusion \
             --deploy-mode client \
             --verbose \
             --conf spark.executor.instances=${SLURM_NTASKS_PER_NODE} \
             --conf spark.executor.cores=${SLURM_CPUS_PER_TASK} \
             --conf spark.executor.memory=${SPARK_MEM} \
             --conf spark.default.parallelism=${PARALLELISM} \
             /allen/scratch/aindtemp/cameron.arshadi/tools/jars/BigStitcher-Spark-0.0.2-SNAPSHOT.jar \
             -x "/allen/scratch/aindtemp/data/anatomy/exm-hemi-brain/aligned_data.xml" \
             -o "/allen/scratch/aindtemp/data/anatomy/exm-hemi-brain-fused.n5" \
             -d "/ch0/s0" \
             --blockSize "256,256,256" \
             --preserveAnisotropy \
             --UINT16 \
             --minIntensity 0.0 \
             --maxIntensity 65535.0 \
             --channelId 0

Using spark-3.2.1

This didn't happen when running locally with --master local[32]

StephanPreibisch · 2022-04-26T13:33:40Z

@trautmane, do you have some time to look at that?

StephanPreibisch · 2022-04-26T13:38:30Z

@mkitti - that sounds familiar, did we discuss this? Is it the problem where HDF5 creates a local tmp directory?

mkitti · 2022-04-26T13:41:16Z

I will check the pom this afternoon.

StephanPreibisch · 2022-04-26T13:43:47Z

thanks @mkitti!

mkitti · 2022-04-26T15:06:08Z

ch.systemsx.cisd.hdf5.CharacterEncoding definitely does exist:
https://sissource.ethz.ch/sispub/jhdf5/-/blob/master/source/java/ch/systemsx/cisd/hdf5/CharacterEncoding.java

mkitti · 2022-04-26T15:10:43Z

The reported line number is slightly off. CharacterEncoding should be on line 141
https://sissource.ethz.ch/sispub/jhdf5/-/blob/master/source/java/ch/systemsx/cisd/hdf5/HDF5BaseReader.java#L141

mkitti · 2022-04-26T15:17:20Z

We may need to take a close look at your classpaths. Also, is either of you running on Debian or Ubuntu? Is it possible that you have an old version of the libsis-jhdf5-java Debian package installed and present on your default classpath?

mkitti · 2022-04-26T15:43:52Z

The current pom actually imports jhdf5 14.12.6. The above source links are for 19.04.

mkitti · 2022-04-26T15:48:20Z

Line 143 lines up with older jhdf5 source at https://svnsis.ethz.ch/repos/cisd/jhdf5/trunk/source/java/ch/systemsx/cisd/hdf5/HDF5BaseReader.java
https://svnsis.ethz.ch/repos/cisd/jhdf5/tags/release/14.12.x/14.12.6/jhdf5/source/java/ch/systemsx/cisd/hdf5/HDF5BaseReader.java

        this.encodingForNewDataSets =
                useUTF8CharEncoding ? CharacterEncoding.UTF8 : CharacterEncoding.ASCII;

carshadi · 2022-04-26T19:48:37Z

Hi @mkitti ,

echo $CLASSPATH returns an empty string for me.

cat /etc/os-release

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

ldconfig -p | grep libsis-jhdf5-java returns nothing on the cluster login node

boazmohar · 2022-04-26T20:10:08Z

This is @mkitti with @boazmohar:
The problem is, as @trautmane found before, that [lib]jhdf5.so gets extracted to a common temporary directory when parallel jobs are run. Multiple workers may try to extract the native shared library to the same directory at the same time, and the extractions conflict.
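
For context, a `NoClassDefFoundError: Could not initialize class ...` is what the JVM reports when a class's static initializer already failed on an earlier attempt in the same JVM, so the original cause is logged only the first time in each executor. A minimal sketch for locating it, assuming the executor logs are reachable on disk. The log directory (`$SPARK_HOME/work` on a standalone cluster) and the function name `find_init_failures` are illustrative, and that the first failure appears as an `UnsatisfiedLinkError` is an assumption, not something confirmed in this thread:

```shell
#!/usr/bin/env bash
# Search executor logs for the first-time failure that precedes
# "Could not initialize class". The log directory is a hypothetical
# placeholder, e.g. "$SPARK_HOME/work" on a standalone Spark cluster.
find_init_failures() {
    local log_dir="$1"
    # -r: recurse into subdirectories, -n: print line numbers,
    # -E: extended regex so the two error names can be alternated.
    grep -rnE 'ExceptionInInitializerError|UnsatisfiedLinkError' "$log_dir"
}

# Example call (path is an assumption):
#   find_init_failures "$SPARK_HOME/work"
```

Each match is printed as `file:line:text`, which makes it easy to see which executor hit the failure first.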

Per https://unlimited.ethz.ch/display/JHDF/JHDF5+FAQ#JHDF5FAQ-Whataretheoptionstoprovidethenativelibraries? we can provide a JVM option to point Java to a pre-extracted location of the file.

In @boazmohar's case, we prepended SUBMIT_ARGS="--conf spark.executor.extraJavaOptions=-Dnative.libpath.jhdf5=/groups/spruston/home/moharb/libjhdf5.so", which fixes this issue.

We extracted libjhdf5.so from native/jhdf5/amd64-Linux inside the jhdf5 JAR file, which you can open as a zip file.
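
The two steps above (extract once, then point the executors at the extracted file) can be sketched as follows. This is a hedged sketch, not the exact commands we ran: the function names are hypothetical, the JAR entry path `native/jhdf5/amd64-Linux/libjhdf5.so` is inferred from the directory named above, and the destination is assumed to be on a filesystem every executor can read.

```shell
#!/usr/bin/env bash
# Sketch: pre-extract libjhdf5.so once and have every executor load that copy.
# All paths passed to these functions are placeholders for your own site.

# Extract the Linux x86-64 native library from the jhdf5 JAR (a zip archive).
# -j drops the directory structure, -o overwrites without prompting.
# The entry path inside the JAR is an assumption based on this thread.
extract_jhdf5_lib() {
    local jar="$1" dest_dir="$2"
    mkdir -p "$dest_dir"
    unzip -j -o "$jar" 'native/jhdf5/amd64-Linux/libjhdf5.so' -d "$dest_dir"
}

# Build the Spark option that makes executors load the pre-extracted file
# instead of each unpacking its own copy.
jhdf5_executor_conf() {
    local lib_path="$1"
    printf '%s\n' "--conf spark.executor.extraJavaOptions=-Dnative.libpath.jhdf5=${lib_path}"
}

# Example usage (paths are hypothetical):
#   extract_jhdf5_lib /path/to/jhdf5.jar /shared/tools/lib
#   SUBMIT_ARGS="$(jhdf5_executor_conf /shared/tools/lib/libjhdf5.so)"
```

With plain spark-submit, the same `--conf` string can be passed directly on the command line instead of through SUBMIT_ARGS.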

carshadi · 2022-04-26T21:59:01Z

Confirming the above also works on my end

--conf "spark.executor.extraJavaOptions=-Dnative.libpath.jhdf5=/allen/scratch/aindtemp/cameron.arshadi/tools/lib/libjhdf5.so"

mkitti · 2022-05-23T20:02:39Z

It may be useful to consider using native.caching.libpath here. If the jhdf5 library does not exist, then this will extract it to the specified path. If it does exist, it will check the version and refresh it if needed. If the currently extracted version is correct, it will just use that.

StephanPreibisch closed this as completed May 5, 2022
