
Error 44 on good files #675

Closed
sandhitsu opened this issue Dec 15, 2018 · 31 comments

Comments

@sandhitsu commented Dec 15, 2018

Following up on thread #671:
version 1.1.1 continues to have this issue. Certain files that look good and are viewable in a viewer like ITK-SNAP (see attached) are flagged with:

"We were unable to read this file. Make sure it contains data (fileSize > 0 kB) and is not corrupted, incorrectly named, or incorrectly symlinked."

@chrisgorgo (Contributor) commented Dec 15, 2018

Could you share the files?

@chrisgorgo (Contributor) commented Dec 15, 2018

Or at least one example

@dorianps commented Dec 15, 2018

We can't share the original file, unfortunately.

@sandhitsu, can you empty the image (zero out the voxels) and check it again? If bids-validator still complains, we can share the empty image with @chrisfilo.

@dorianps commented Dec 15, 2018

If the empty file resolves the problem, then the problem is the original file. We can share the original file, but only privately with @chrisfilo (not online), and we would need to replace the subject ID with some random number. We need to respect the data agreements to the letter, unfortunately.

@sandhitsu (Author) commented Dec 15, 2018

Zeroing out the image does not solve the problem. Here is the zero image.
sub-XXXX_ses-01_run-01_T1w.nii.gz

@chrisgorgo (Contributor) commented Dec 15, 2018

Could not replicate:

me@christop ~/Downloads $ node --version
v11.4.0
me@christop ~/Downloads $ bids-validator --version
1.1.1
me@christop ~/Downloads $ bids-validator test_ds
This dataset appears to be BIDS compatible.
        Summary:              Available Tasks:    Available Modalities:
        2 Files, 51.94KB                          T1w
        1 - Subject
        1 - Session

If you have any questions please post on https://neurostars.org/tags/bids

test_ds.zip

Perhaps a permissions issue, or something to do with a network drive?

PS Storing personally identifiable information in participant labels is a bad and dangerous practice. I edited your original post to remove the screenshot, which included the participant label.

@sandhitsu (Author) commented Dec 15, 2018

Oops, I thought I had cropped the title bar so that the ID wouldn't show; it must have been in the info tab then. Sorry about that. I'll try your zip file test. Thanks for looking into this.

@sandhitsu (Author) commented Dec 15, 2018

The downloaded zip file passes validation. Here is what is happening: when I copy the subject directory elsewhere to make a single-subject directory tree, it passes, but as part of the larger dataset, it fails.

@sandhitsu (Author) commented Dec 15, 2018

The software is pretty fast checking thousands of files. I don't know the details of how disk I/O is implemented, but this will sound weird -- it feels like either the software or the disk runs out of gas!


@sandhitsu (Author) commented Jan 14, 2019

Following up on this after further experimentation. It turns out that the file-reading issue mentioned above happens with the standalone command-line version 1.1.1, but not with the Docker version. To summarize: some images are not readable (empty or non-empty doesn't matter) by the command-line version ONLY when they are part of a larger dataset tree. In a single-subject dataset there is no error. I have copied the data onto multiple mounted filesystems, and that doesn't matter.

@chrisgorgo (Contributor) commented Jan 14, 2019

Is there a difference in the version of Node.js between the container and bare metal installation?

What do you mean by "they are part of a larger dataset tree"?

Could you share the Docker command you used?

@sandhitsu (Author) commented Jan 14, 2019

I'll have to get back to you on your first question later.

What I mean is that the same subject directory tree passes validation when it is the only subject in the dataset tree, but throws an error when that subject directory is one of many. A few weeks ago, when you couldn't replicate the error with the sample dataset I provided, this is what happened: I couldn't replicate it either when testing it alone as a single-subject dataset, but when I put it back into the larger dataset, the error reappeared.

I am using

docker run -ti --rm -v $PWD:/data:ro bids/validator --verbose /data

@chrisgorgo (Contributor) commented Jan 14, 2019

Could you try

docker run -ti --rm -v $PWD:/data:ro bids/validator:1.1.1 --verbose /data

?

@sandhitsu (Author) commented Jan 14, 2019

I did. It appeared to download and then run. Did not throw the error.

@dorianps commented Jan 14, 2019

The local Node.js is at v10.15.0. We upgraded it recently to make sure that is not the problem.

Not sure how to find the Node version within the Docker image. None of these works:

[dorian@chdimri ~]$ docker run -ti bids/validator:latest node --version
0.0.0
[dorian@chdimri ~]$ docker run -ti bids/validator:1.1.1 node --version
0.0.0
[dorian@chdimri ~]$ docker run -ti bids/validator:1.1.1 node
node does not exist
[dorian@chdimri ~]$ docker run -ti bids/validator:1.1.1 "node --version"
node --version does not exist
[dorian@chdimri ~]$ docker run -ti bids/validator:1.1.1 /bin/bash
/bin/bash does not exist
[dorian@chdimri ~]$ docker run -ti bids/validator:1.1.1 /bin/bash -c "node --version"
/bin/bash does not exist
[dorian@chdimri ~]$ docker run -ti --rm bids/validator:1.1.1 /bin/bash -c "node --version"
/bin/bash does not exist
[dorian@chdimri ~]$ node --version
v10.15.0

The Dockerfile shows that you are using the node v8.11.3 image.

@chrisgorgo (Contributor) commented Jan 14, 2019

docker run -ti --rm --entrypoint=node bids/validator:1.1.1 --version

@dorianps commented Jan 14, 2019

[dorian@chdimri ~]$ docker run -ti --rm --entrypoint=node bids/validator:1.1.1 --version
Unable to find image 'bids/validator:1.1.1' locally
1.1.1: Pulling from bids/validator
a073c86ecf9e: Pull complete
becc6a89816a: Pull complete
fa183c3e7c21: Pull complete
e2dada1dea71: Pull complete
df496f65d26c: Pull complete
Digest: sha256:66c42b3748d6dcf4f64cc85d0821870a5b3d87882b3cdee22ca4ce1b052425bd
Status: Downloaded newer image for bids/validator:1.1.1
v8.11.3

@chrisgorgo (Contributor) commented Jan 14, 2019

Well, I am not sure what is going on. Things should work with 10.15. Unfortunately, I cannot replicate your issue locally.

@dorianps commented Jan 14, 2019

Obviously the best option would be to give you the data, but that may require weeks of preparing user agreements and signatures. Do you think an interactive session via WebEx would help, where we type in the commands you need to investigate the source of the problem?

@chrisgorgo (Contributor) commented Jan 14, 2019

It would help, but unfortunately I do not have the resources to provide that level of user support. Best to stick with Docker for now. Maybe another user will run into this and will be able to share the data, or maybe the next refactoring will make this issue go away.

@dorianps commented Jan 15, 2019

Just an update.

Since Docker works and the local npm install does not, we tried broadening the permissions of the npm installation folders and of the data folder. None of this worked.

We then observed that only 1281 NIfTI files out of 5000+ show an access error, and of course all of them are present and accessible. From the subject IDs we could guess that file access was interrupted at some point during the run. To verify whether this is the issue, we split the dataset in two parts and, surprise, everything works fine.

So, this is definitely a file-access issue that appears when thousands of files are accessed rapidly. This can probably be attributed to Python. I am not sure how to test this, or whether Python 3 would be compatible with npm/bids-validator. So far I have tried the system Python 2.7.5 and the Python 2.7.14 that came with Anaconda 2.

@chrisfilo, any idea on this? Have you ever tried validating datasets with many files (i.e., 5000+)?
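[Editor's note: one quick way to test the rapid-file-access hypothesis above is to rerun the validator under a deliberately low soft open-file limit. This is a sketch, not from the thread; the validator invocation is commented out and the dataset path is the one mentioned later in the thread.]

```shell
# Lower the soft open-file limit in a subshell so the change does not
# outlive the experiment, then point the validator at the dataset.
# If the "unable to read" errors multiply, descriptor exhaustion is likely.
(
  ulimit -Sn 256                               # soft limit for this subshell only
  ulimit -Sn                                   # confirm the new value
  # bids-validator /DATA/dorian/converted_2019
)
```

Lowering the soft limit never needs privileges, so this experiment is safe to run as a regular user.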


@dorianps commented Jan 16, 2019

For what it's worth, I verified that the NIfTI headers can be read for all NIfTI files in the BIDS folder. I tried both PrintHeader from ANTs and c3d from ITK-SNAP.

[dorian@chdimri dorian]$ time find /DATA/dorian/converted_2019/ -iname '*.nii' -exec sh -c 'PrintHeader {} | grep Bounding' \; > /DATA/dorian/PrintHeader_test.log

real    10m5.339s
user    4m40.305s
sys     6m4.294s

[dorian@chdimri dorian]$ time find /DATA/dorian/converted_2019/ -iname '*.nii' -exec sh -c 'c3d {} -info' \; > /DATA/dorian/C3D_test.log

real    14m9.197s
user    6m42.885s
sys     7m24.460s

[dorian@chdimri dorian]$ cat PrintHeader_test.log | grep Bounding | wc -l
5367
[dorian@chdimri dorian]$ cat C3D_test.log | grep Image | wc -l
5367

@chrisgorgo (Contributor) commented Jan 16, 2019

Any ideas @DaNish808?

@dorianps commented Jan 16, 2019

Yes, it is indeed hitting the user open-file limit, and I managed to resolve the issue. Once I increased my user file limit from the default 4096 to 10000, the validator works fine.

[dorian@chdimri ~]$ ulimit -Hn
10000
[dorian@chdimri ~]$ bids-validator /DATA/dorian/converted_2019/ --ignoreWarnings
This dataset appears to be BIDS compatible.
        Summary:                     Available Tasks:        Available Modalities:
        10740 Files, 113.03GB                                T1w
        445 - Subjects                                       T2w
        7 - Sessions

If you have any questions please post on https://neurostars.org/tags/bids

The question is why the file handles are kept open by Node.js or bids-validator. There is a somewhat similar thread on the Node.js side:
nodejs/node#4386

P.S. Note that increasing the user file limit needs admin privileges, so it's not feasible for regular users on institutional clusters. I also checked the user file limit on a computing cluster at a major institution, and it is simply 4096.
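[Editor's note: as general shell background, not specific to bids-validator — the soft limit is what processes actually run against, and a non-root user can usually raise it up toward the hard ceiling; only raising the hard limit itself requires admin rights.]

```shell
# Inspect the per-process open-file limits for the current shell.
ulimit -Sn    # soft limit: the value processes actually hit
ulimit -Hn    # hard limit: the ceiling the soft limit may be raised to

# A non-root user can raise the soft limit up to the hard limit, e.g.:
#   ulimit -Sn 10000
# Raising the hard limit itself (e.g. via /etc/security/limits.conf)
# requires admin privileges.
```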

@dorianps commented Jan 16, 2019

ulimit -Hn if you want to check it yourself.


@nellh referenced this issue Jan 18, 2019
@rwblair closed this Jun 18, 2019