could not find XmlIoBasicImgLoader implementation for format bdv.n5 #2
Comments
Thanks @boazmohar ... @tpietzsch we have seen this behavior before, do you remember what the reason was? Thanks :)
Not your fault @boazmohar ... this is about the annotations that are used to find ImgLoader implementations at runtime ...
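For readers unfamiliar with this mechanism: implementations are tagged with an annotation carrying their format string, and the library looks them up at runtime from an index generated at compile time. The sketch below is an illustrative analogy only, not the real spim_data API; the names `FormatIo`, `N5Io`, and `findIoForFormat` are hypothetical stand-ins.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;
import java.util.List;

public class LoaderRegistry {

    // Hypothetical stand-in for an annotation like the real @ImgLoaderIo.
    @Retention(RetentionPolicy.RUNTIME)
    @interface FormatIo {
        String format();
    }

    // Hypothetical loader implementation tagged with its format string.
    @FormatIo(format = "bdv.n5")
    static class N5Io {
    }

    // Look up the implementation whose annotation matches the requested
    // format. In the real library the candidate list comes from an index
    // on the classpath; if a fat-jar build drops or overwrites that index,
    // the list is effectively empty and the lookup fails with exactly the
    // kind of "could not find ... implementation" error seen in this issue.
    static Class<?> findIoForFormat(String format, List<Class<?>> candidates) {
        for (Class<?> c : candidates) {
            FormatIo ann = c.getAnnotation(FormatIo.class);
            if (ann != null && ann.format().equals(format)) {
                return c;
            }
        }
        throw new RuntimeException(
                "could not find XmlIo implementation for format " + format);
    }

    public static void main(String[] args) {
        Class<?> io = findIoForFormat("bdv.n5", Arrays.<Class<?>>asList(N5Io.class));
        System.out.println(io.getSimpleName()); // prints "N5Io"
    }
}
```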
Hi @boazmohar, I found how we solved the same problem before ... I updated the pom.xml. Can you please build with
and use the BigStitcher-Spark-0.0.1-SNAPSHOT.jar that is created? It should work ...
@StephanPreibisch it changed the location of the error; it is now on the workers running the task: 2021-12-26 10:06:29,223 [task-result-getter-2] ERROR [TaskSetManager]: Task 35 in stage 0.0 failed 4 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 35 in stage 0.0 failed 4 times, most recent failure: Lost task 35.3 in stage 0.0 (TID 129, 10.36.107.23, executor 6): mpicbg.spim.data.SpimDataInstantiationException: could not find XmlIoBasicImgLoader implementation for format bdv.n5
at mpicbg.spim.data.generic.sequence.ImgLoaders.createXmlIoForFormat(ImgLoaders.java:72)
at mpicbg.spim.data.generic.sequence.XmlIoAbstractSequenceDescription.fromXml(XmlIoAbstractSequenceDescription.java:110)
My bad, there was an issue with access to my home folder, putting the
Cool, did you already try it on a bigger dataset?
It is failing again, same error.
Which same error? I thought it worked? If the same error persists we need to solve it …
Well, it worked on the example dataset ONCE:
Full log: But the worker stdout still has this error: Here is one where the dataset is bigger and it failed. Maybe it is running out of memory and the error is unrelated?
Hi, there is still a bug when creating the fat JAR ... I will need to examine what is going on
Hi @boazmohar, can you please pull the newest version, call
It should load and display one of the stacks ... works for me, meaning it can find the N5 ImgLoader
Of course adjust the XML path and maybe try it locally on your machine ... and then in an interactive session on the cluster. If that all works, so should the Spark resaving (I think).
It is working locally and from NoMachine on the cluster.
Maybe it is related to some race condition?
@trautmane do you have any ideas? Thanks 😊
I think it can be solved by adding this line:
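For context: errors like this in fat JARs are very often caused by the shading step letting one jar's `META-INF` service/annotation index overwrite another's, so the merged jar loses the entry for the N5 loader. A hedged sketch of the usual maven-shade-plugin remedy follows; this is an assumption about the fix referred to above, not a quote of the actual line that was added:

```xml
<!-- Sketch only: the ServicesResourceTransformer merges
     META-INF/services entries from all shaded jars instead of
     letting one jar's index file overwrite another's. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    </transformers>
  </configuration>
</plugin>
```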
Could you give this a try @mzouink? That would be awesome ...
Done, @boazmohar, I added the code to a new branch
We are making progress. Now the problem is inside the jobs.
Hi @StephanPreibisch and @boazmohar - I'm looking at all of this for the first time now under the assumption that the problem still exists. Please let me know if that assumption is incorrect.
Yes, still an issue, thanks for looking into it! @boazmohar - I just left a question for you on slack ...
Can confirm this solved the issue! Still had some memory issues, but I used 5 cores per task and
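For anyone hitting the same memory problems: the Spark knobs usually tuned in this situation look like the sketch below. The values and main class are illustrative assumptions; the actual settings used above were not recorded in the thread.

```shell
# Sketch only: give each executor more heap and reserve 5 cores per
# task (so fewer concurrent tasks share one executor's memory).
spark-submit \
  --executor-memory 20g \
  --conf spark.task.cpus=5 \
  --class <main-class> \
  BigStitcher-Spark-0.0.1-SNAPSHOT.jar
```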
Awesome, thanks @trautmane! Could you please merge this into master?
I still do not understand why it worked locally with Spark though, do you know?
Thanks again!!
You should post the image and statistics on Twitter @boazmohar 😊 Thanks so much for pushing this!! Now only the cloud version is missing @mzouink 😊
Thanks @trautmane, makes sense ... I didn't realize on my phone that it was in another repo, @tpietzsch has to approve it and cut a new release so we can distribute, I don't want to do it without him. It always creates this ripple-effect through many Fiji plugins ...
|
No, no ripple-effect expected... I just didn't have time to think it through and fix it...
This is now merged and released in |
This is fixed by 0110317.
Hi @StephanPreibisch,
Tried running this on the Janelia cluster and got this error.
I am using the latest spark-janelia from here
This is how I built the repo and ran it
These are the driver logs with the error
Same happens for nonRigid, am I doing something wrong?
Thanks!
Boaz