Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: MatNWB fails to read certain DANDIsets #481

Closed
yarikoptic opened this issue Dec 14, 2022 · 7 comments
Closed

[Bug]: MatNWB fails to read certain DANDIsets #481

yarikoptic opened this issue Dec 14, 2022 · 7 comments
Assignees

Comments

@yarikoptic
Copy link

What happened?

While preparing healthchecking of dandisets, we ran into matnwb jobs never completing (dandi/dandisets-healthstatus#8), and if ran directly, errorring out with
Unable to resolve the name 'types.ndx_dandi_icephys.DandiIcephysMetadata'. (reported by @jwodder )
when ran on e.g. sub-mouse-AWOAY/sub-mouse-AWOAY_ses-20190829-sample-1_slice-20190829-slice-1_cell-20190829-sample-1_icephys.nwb from DANDI 000008 .

Steps to Reproduce

here were complete invocations while from FUSE'd dandisets


dandi     732837 93.4  0.5 6286668 383544 pts/15 Sl+  Dec05 9481:36           /mnt/backup/apps/MATLAB/R2022b/bin/glnxa64/MATLAB -batch nwb = nwbRead('/mnt/backup/dandi/dandisets-healthstatus/dandisets-fuse/000008/sub-mouse-AWOAY/sub-mouse-AWOAY_ses-20190829-sample-1_slice-20190829-slice-1_cell-20190829-sample-1_icephys.nwb', 'savedir', '/mnt/fast/dandi/dandisets-healthstatus') -nodesktop -prefersoftwareopengl
dandi     832307 93.4  0.5 6293812 337156 pts/15 Sl+  Dec05 9480:59           /mnt/backup/apps/MATLAB/R2022b/bin/glnxa64/MATLAB -batch nwb = nwbRead('/mnt/backup/dandi/dandisets-healthstatus/dandisets-fuse/000008/sub-mouse-BVPYH/sub-mouse-BVPYH_ses-20181121-sample-6_slice-20181121-slice-3_cell-20181121-sample-6_icephys.nwb', 'savedir', '/mnt/fast/dandi/dandisets-healthstatus') -nodesktop -prefersoftwareopengl


### Error Message

_No response_

### Operating System

Linux

### Matlab Version

R2022b

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/neurodatawithoutborders/matnwb/blob/master/.github/CODE_OF_CONDUCT.md)
- [X] Have you ensured this bug was not already [reported](https://github.com/NeurodataWithoutBorders/matnwb/issues)?
@lawrence-mbf
Copy link
Collaborator

Testing both files appear to work on the current master branch. Perhaps a recent fix resolved this issue?

@lawrence-mbf
Copy link
Collaborator

I will state though, that AWOAY_ses-20190829-sample-1 did take a little bit longer to read though no error was thrown there either.

@yarikoptic
Copy link
Author

FTR, we finally have finished a complete run through all dandisets to check what we can or not open: from https://github.com/dandi/dandisets-healthstatus/blob/main/000008/status.yaml#L2671

overall with that matnwb we managed to read 103/7087 datasets/files, and failed to read 86/17816 datasets/files. (so unfortunately more failed than succeeded in terms of number of files). See https://github.com/dandi/dandisets-healthstatus#readme for more details.
It would be great to see how well current master version works whenever it gets released (code we use to install matnwb here)

@lawrence-mbf lawrence-mbf changed the title [Bug]: [Bug]: MatNWB fails to read certain DANDIsets Jan 17, 2023
@lawrence-mbf
Copy link
Collaborator

Hi @yarikoptic is there anything that would block using the tip of master for MatNWB?

@yarikoptic
Copy link
Author

yarikoptic commented Feb 2, 2023

looking at results of the most recent rerun on 000008 https://github.com/dandi/dandisets-healthstatus/blob/main/results/000008/status.yaml#L1337 it seems we did have that recent release matnwb: v2.6.0.0 but got all files fail as before, e.g.

Asset: sub-mouse-AAYYT/sub-mouse-AAYYT_ses-20180420-sample-2_slice-20180420-slice-2_cell-20180420-sample-2_icephys.nwb
Output:
    Unable to resolve the name 'types.ndx_dandi_icephys.DandiIcephysMetadata'.

    Error in io.parseGroup (line 85)
        parsed = eval([Type.typename '(kwargs{:})']);

    Error in io.parseGroup (line 38)
        subg = io.parseGroup(filename, group, Blacklist);

    Error in io.parseGroup (line 38)
        subg = io.parseGroup(filename, group, Blacklist);

    Error in nwbRead (line 59)
    nwb = io.parseGroup(filename, h5info(filename), Blacklist);

so - decided to reproduce "directly". Wrote this scriptie to test against multiple versions of matnwb on a file from dandi (used git-annex although could just download directly from the archive I guess):

#!/bin/bash

set -eu

cd "$(mktemp -d ${TMPDIR:-/tmp}/matnwb-XXXXXXX)"
pwd

ds="$1"
f="$2"

shift 
shift

git clone https://github.com/dandisets/$ds
( cd $ds; 
  echo "Version of the dandiset: ";
  git describe --always
  git annex get "$f"
)

mkdir out

git clone https://github.com/NeurodataWithoutBorders/matnwb
cd matnwb

set -x
for v in "$@"; do 
	git checkout $v; 
	echo "Running using: "; git describe --tags; 
	matlab -nodesktop -sd $PWD -batch 'generateCore()'; 
	MATLABPATH=$PWD time matlab -nodesktop -batch "nwb = nwbRead('../$ds/$f', 'savedir', '../out')" || echo "Exited with $?"
done

pwd
and it resulted in failures if ran on released or master version of matnwb
(base) dandi@drogon:~/proj$ ./checknwb.sh 000008 sub-mouse-ZZVQT/sub-mouse-ZZVQT_ses-20181003-sample-9_slice-20181003-slice-9_cell-20181003-sample-9_icephys.nwb v2.6.0.0 master
/tmp/matnwb-0gIBFpU
Cloning into '000008'...
remote: Enumerating objects: 92335, done.
remote: Counting objects: 100% (47585/47585), done.
remote: Compressing objects: 100% (26044/26044), done.
remote: Total 92335 (delta 16734), reused 41732 (delta 16130), pack-reused 44750
Receiving objects: 100% (92335/92335), 12.28 MiB | 21.17 MiB/s, done.
Resolving deltas: 100% (31598/31598), done.
Version of the dandiset: 
0.211014.0809-7-gcbbc79d2b

  Remote origin not usable by git-annex; setting annex-ignore

  https://github.com/dandisets/000008/config download failed: Not Found
get sub-mouse-ZZVQT/sub-mouse-ZZVQT_ses-20181003-sample-9_slice-20181003-slice-9_cell-20181003-sample-9_icephys.nwb (from web...) 
(scanning for annexed files...)      
ok
(recording state in git...)
Cloning into 'matnwb'...
remote: Enumerating objects: 8841, done.
remote: Counting objects: 100% (2004/2004), done.
remote: Compressing objects: 100% (811/811), done.
remote: Total 8841 (delta 1327), reused 1623 (delta 1189), pack-reused 6837
Receiving objects: 100% (8841/8841), 27.43 MiB | 22.00 MiB/s, done.
Resolving deltas: 100% (5702/5702), done.
+ for v in "$@"
+ git checkout v2.6.0.0
Note: switching to 'v2.6.0.0'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 4bb8ef2 Merge pull request #482 from NeurodataWithoutBorders/schema-260-release
+ echo 'Running using: '
Running using: 
+ git describe --tags
v2.6.0.0
+ matlab -nodesktop -sd /tmp/matnwb-0gIBFpU/matnwb -batch 'generateCore()'
+ MATLABPATH=/tmp/matnwb-0gIBFpU/matnwb
+ time matlab -nodesktop -batch 'nwb = nwbRead('\''../000008/sub-mouse-ZZVQT/sub-mouse-ZZVQT_ses-20181003-sample-9_slice-20181003-slice-9_cell-20181003-sample-9_icephys.nwb'\'', '\''savedir'\'', '\''../out'\'')'
Unable to resolve the name 'types.ndx_dandi_icephys.DandiIcephysMetadata'.

Error in io.parseGroup (line 85)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
 
Command exited with non-zero status 1
23.69user 1.27system 0:21.11elapsed 118%CPU (0avgtext+0avgdata 1128812maxresident)k
392inputs+1808outputs (18major+249145minor)pagefaults 0swaps
+ echo 'Exited with 1'
Exited with 1
+ for v in "$@"
+ git checkout master
Previous HEAD position was 4bb8ef2 Merge pull request #482 from NeurodataWithoutBorders/schema-260-release
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
+ echo 'Running using: '
Running using: 
+ git describe --tags
v2.6.0.0-3-g16a5411
+ matlab -nodesktop -sd /tmp/matnwb-0gIBFpU/matnwb -batch 'generateCore()'
+ MATLABPATH=/tmp/matnwb-0gIBFpU/matnwb
+ time matlab -nodesktop -batch 'nwb = nwbRead('\''../000008/sub-mouse-ZZVQT/sub-mouse-ZZVQT_ses-20181003-sample-9_slice-20181003-slice-9_cell-20181003-sample-9_icephys.nwb'\'', '\''savedir'\'', '\''../out'\'')'
Unable to resolve the name 'types.ndx_dandi_icephys.DandiIcephysMetadata'.

Error in io.parseGroup (line 85)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
 
Command exited with non-zero status 1
23.28user 1.18system 0:20.83elapsed 117%CPU (0avgtext+0avgdata 1111008maxresident)k
4040inputs+1784outputs (37major+246652minor)pagefaults 0swaps
+ echo 'Exited with 1'
Exited with 1
+ pwd
/tmp/matnwb-0gIBFpU/matnwb

please advise since original issue seems to remain, at least for this randomly selected file from 000008

@lawrence-mbf
Copy link
Collaborator

lawrence-mbf commented Feb 2, 2023

The read works for me but I am doing so without the custom savedir. I'm guessing since we're using ../out as the class directory, it should also be on the MATLAB path.

dandibot pushed a commit to dandi/dandisets-healthstatus that referenced this issue Feb 2, 2023
@yarikoptic
Copy link
Author

confirming that fixing up the script (now at https://github.com/dandi/dandisets-healthstatus/blob/main/code/checknwb.sh)

results in clean run for both released and not version, looking like
+ matlab -nodesktop -sd /tmp/matnwb-KhiKdQa/matnwb -batch 'generateCore()'
+ MATLABPATH=/tmp/matnwb-KhiKdQa/matnwb:/tmp/matnwb-KhiKdQa/matnwb/../out
+ time matlab -nodesktop -batch 'nwb = nwbRead('\''../000008/sub-mouse-ZZVQT/sub-mouse-ZZVQT_ses-20181003-sample-9_slice-20181003-slice-9_cell-20181003-sample-9_icephys.nwb'\'', '\''savedir'\'', '\''../out'\'')'

nwb = 

  NwbFile with properties:

                                             nwb_version: '2.2.5'
                                        file_create_date: [1x1 types.untyped.DataStub]
                         general_source_script_file_name: []
                                              identifier: 'cb09e8b3-d26b-4fb9-b402-f1c9371f0030'
                                     session_description: 'Current clamp square'
                                      session_start_time: 2018-10-03T00:00:00.000000-04:00
                               timestamps_reference_time: 2018-10-03T00:00:00.000000-04:00
                                             acquisition: [80x1 types.untyped.Set]
                                                analysis: [0x1 types.untyped.Set]
                                                 general: [1x1 types.untyped.Set]
                                 general_data_collection: []
                                         general_devices: [1x1 types.untyped.Set]
                          general_experiment_description: 'Current clamp square'
                                    general_experimenter: [1x1 types.untyped.DataStub]
                             general_extracellular_ephys: [0x1 types.untyped.Set]
                  general_extracellular_ephys_electrodes: []
                                     general_institution: 'Baylor College of Medicine'
                             general_intracellular_ephys: [1x1 types.untyped.Set]
     general_intracellular_ephys_experimental_conditions: []
                   general_intracellular_ephys_filtering: []
    general_intracellular_ephys_intracellular_recordings: []
                 general_intracellular_ephys_repetitions: []
       general_intracellular_ephys_sequential_recordings: []
     general_intracellular_ephys_simultaneous_recordings: []
                 general_intracellular_ephys_sweep_table: [1x1 types.core.SweepTable]
                                        general_keywords: []
                                             general_lab: 'Tolias'
                                           general_notes: []
                                    general_optogenetics: [0x1 types.untyped.Set]
                                  general_optophysiology: [0x1 types.untyped.Set]
                                    general_pharmacology: []
                                        general_protocol: []
                            general_related_publications: []
                                      general_session_id: '20181003_sample_9'
                                          general_slices: []
                                   general_source_script: []
                                        general_stimulus: []
                                         general_subject: [1x1 types.ndx_dandi_icephys.CreSubject]
                                         general_surgery: []
                                           general_virus: []
                                               intervals: [0x1 types.untyped.Set]
                                        intervals_epochs: []
                                 intervals_invalid_times: []
                                        intervals_trials: []
                                              processing: [0x1 types.untyped.Set]
                                                 scratch: [0x1 types.untyped.Set]
                                   stimulus_presentation: [79x1 types.untyped.Set]
                                      stimulus_templates: [0x1 types.untyped.Set]
                                                   units: []

20.11user 1.06system 0:19.25elapsed 109%CPU (0avgtext+0avgdata 941552maxresident)k
152inputs+1488outputs (9major+200975minor)pagefaults 0swaps

so let's consider issue for now addressed, we will fix up our side, and do full pass to give more detailed report (if anything fails) later on in a follow up issue. Thank you @lawrence-mbf for support!

yarikoptic added a commit to dandi/dandisets-healthstatus that referenced this issue Feb 2, 2023
apparently it should be included to be able to find those classes.
Ref: NeurodataWithoutBorders/matnwb#481 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants