Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Cannot define property .. in class .. because the property has already been defined in the superclass .. #491

Closed
2 tasks done
yarikoptic opened this issue Feb 10, 2023 · 18 comments
Assignees

Comments

@yarikoptic
Copy link

What happened?

originally mentioned in #490 (comment)

in the fresh run of https://github.com/dandi/dandisets-healthstatus saw in the logs

the error
Asset: sub-744912845/sub-744912845_ses-766640955.nwb
Output:
    {^HCannot define property 'strain' in class
    'types.ndx_aibs_ecephys.EcephysSpecimen' because the property has already been
    defined in the superclass 'types.core.Subject'.

    Error in io.parseGroup (line 85)
        parsed = eval([Type.typename '(kwargs{:})']);

    Error in io.parseGroup (line 38)
        subg = io.parseGroup(filename, group, Blacklist);

    Error in io.parseGroup (line 38)
        subg = io.parseGroup(filename, group, Blacklist);

    Error in nwbRead (line 59)
    nwb = io.parseGroup(filename, h5info(filename), Blacklist);
    }^H 

which then reproduced against master on that sample file

(dandisets-2) dandi@drogon:~/cronlib/dandisets-healthstatus/code$ ./checknwb.sh 000022 sub-744912845/sub-744912845_ses-766640955.nwb master
/tmp/matnwb-TJq1ZBO
Cloning into '000022'...
remote: Enumerating objects: 8581, done.
remote: Counting objects: 100% (8581/8581), done.
remote: Compressing objects: 100% (3139/3139), done.
remote: Total 8581 (delta 4437), reused 7798 (delta 3654), pack-reused 0
Receiving objects: 100% (8581/8581), 790.29 KiB | 3.24 MiB/s, done.
Resolving deltas: 100% (4437/4437), done.
Version of the dandiset:
06ef421

  Remote origin not usable by git-annex; setting annex-ignore

  https://github.com/dandisets/000022/config download failed: Not Found
get sub-744912845/sub-744912845_ses-766640955.nwb (from web...)
ok
(recording state in git...)
Cloning into 'matnwb'...
remote: Enumerating objects: 8841, done.
remote: Counting objects: 100% (2004/2004), done.
remote: Compressing objects: 100% (811/811), done.
remote: Total 8841 (delta 1327), reused 1623 (delta 1189), pack-reused 6837
Receiving objects: 100% (8841/8841), 27.43 MiB | 7.06 MiB/s, done.
Resolving deltas: 100% (5702/5702), done.
+ for v in "$@"
+ git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
+ echo 'Running using: '
Running using: 
+ git describe --tags
v2.6.0.0-3-g16a5411
+ matlab -nodesktop -sd /tmp/matnwb-TJq1ZBO/matnwb -batch 'generateCore()'
+ MATLABPATH=/tmp/matnwb-TJq1ZBO/matnwb:/tmp/matnwb-TJq1ZBO/matnwb/../out
+ time matlab -nodesktop -batch 'f='\''../000022/sub-744912845/sub-744912845_ses-766640955.nwb'\''; disp(util.getSchemaVersion(f)); nwb = nwbRead(f, '\''savedir'\'', '\''../out'\'')'
2.2.2
Cannot define property 'strain' in class
'types.ndx_aibs_ecephys.EcephysSpecimen' because the property has already been
defined in the superclass 'types.core.Subject'.

Error in io.parseGroup (line 85)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
 
Command exited with non-zero status 1
17.10user 1.03system 0:17.62elapsed 102%CPU (0avgtext+0avgdata 898684maxresident)k
10752inputs+1520outputs (18major+194751minor)pagefaults 0swaps
+ echo 'Exited with 1'
Exited with 1
+ pwd
/tmp/matnwb-TJq1ZBO/matnwb
(dandisets-2) dandi@drogon:~/cronlib/dandisets-healthstatus$ grep -l 'because the property has already been' results/*/2023.02.0[2-9]*.log
results/000021/2023.02.05.11.32.02_matnwb_nwbRead_errors.log
results/000022/2023.02.05.19.19.37_matnwb_nwbRead_errors.log
(dandisets-2) dandi@drogon:~/cronlib/dandisets-healthstatus$ grep -h 'because the property has already been' results/*/2023.02.0[2-9]*.log | sort | uniq -c
      8     'types.ndx_aibs_ecephys.EcephysProbe' because the property has already been
    371     'types.ndx_aibs_ecephys.EcephysSpecimen' because the property has already been

Steps to Reproduce

see above

Error Message

No response

Operating System

Linux

Matlab Version

R2022b Update 1 (9.13.0.2080170) 64-bit (glnxa64)

Code of Conduct

@lawrence-mbf
Copy link
Collaborator

Ironically I cannot reproduce this.

I get a completely different error:
Error using assert
Unexpected properties {unit}.

Your schema version may be incompatible with the file.  Consider checking the schema version of the file with `util.getSchemaVersion(filename)`
and comparing with the YAML namespace version present in nwb-schema/core/nwb.namespace.yaml

Error in types.util.checkUnset (line 13)
assert(isempty(dropped),...

Error in types.hdmf_common.VectorData (line 26)
            types.util.checkUnset(obj, unique(varargin(1:2:end)));

Error in io.parseDataset (line 81)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 22)
    dataset = io.parseDataset(filename, datasetInfo, fullPath, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);

The error that you have may indicate another Path issue. Would it be possible to check that generateCore is simply never called before reading a NWB file? For your test case, could you also ensure that the +types module in the MatNWB installation directory only contains +types/+util and +types/+untyped?

The error I got speaks to a possible bug in the data generation.

@lawrence-mbf lawrence-mbf self-assigned this Feb 13, 2023
@yarikoptic
Copy link
Author

yarikoptic commented Feb 13, 2023

  • re different behavior -- was it using my reproducer or some other custom way to build up/invoke?
  • note that .nwb files are annexed, so symlinks to the actual content files. not sure if relevant but begs the question - does matnwb de-reference paths and do something path related later on?
  • here is a full listing of files in that temp directory http://www.oneukrainian.com/tmp/matnwb-TJq1ZBO-ls.txt

has both

edit 1 : more correct grep for the same statement

❯ grep -e '+types/+util' matnwb-TJq1ZBO-ls.txt | head -n 1
   391704       4 drwxr-xr-x   3 dandi    dandi          4096 Feb 10 17:36 ./matnwb/+types/+util
❯ grep -e '+types/+untyped' matnwb-TJq1ZBO-ls.txt | head -n 1
   391678       4 drwxr-xr-x   3 dandi    dandi          4096 Feb 10 17:36 ./matnwb/+types/+untyped

@lawrence-mbf
Copy link
Collaborator

was it using my reproducer?

No, this is also on a Windows machine anyway. To sum up what I did:

  • clone from master (no changes, up to date).
  • Ensure that +types only contains +types/+util and +types/untyped. This is equivalent to just cloning a new repository in master
  • Call nwb = nwbRead('<file location>'); while working directory is in the matnwb install directory. No Path modifications are made here.

does matnwb de-reference paths and do something path related later on?

MatNWB only adds to the path using addpath at most. These are to use the YAML parsing JAR (using javaaddpath) and to use the ISO8601 external mathworks package (using addpath). These do not appear to be relevant to your issue though.

Looking at the file listing, the following directories should be deleted in ./matnwb/+types:

  • +core
  • +hdmf_common
  • +hdmf_experimental

@yarikoptic
Copy link
Author

are you suggesting to delete them manually or just stating that something in the generateCore should be fixed?

@yarikoptic
Copy link
Author

FWIW, theoretically the reproducer should work even on Windows happen you install git-annex and datalad in msys2 env... will try later some time to see if there is some hidden gotchas

@lawrence-mbf
Copy link
Collaborator

are you suggesting to delete them manually...

Yes. This issue only came about because generateCore was called without the custom ./out directory. I wonder if I should just add a script that clears the namespace for the user. I'd imagine it's not comfortable for people to modify their installation directories like that.

@yarikoptic
Copy link
Author

yarikoptic commented Feb 14, 2023

FWIW

  • ideally users should not call any script manually -- "installation distribution" should provide all necessary pieces or, if necessary, produce/update them at run time (e.g. somewhere within ~/.cache/matnwb/{version_id_of_some_kind}
  • AFAIK we have specified savedir, out folder for only because were instructed to do so. If it is not necessary/recommended for testing basic functionality of nwbRead - I would be happy to remove that from our test.

With that in mind I have tried without savedir to the same error as you

(dandisets-2) dandi@drogon:/tmp/matnwb-TJq1ZBO/matnwb$ MATLABPATH=$PWD:$PWD/../out time matlab -nodesktop -batch "f='../000022/sub-744912845/sub-744912845_ses-766640955.nwb'; disp(util.getSchemaVersion(f)); nwb = nwbRead(f)"
2.2.2
Error using assert
Unexpected properties {unit}.

Your schema version may be incompatible with the file.  Consider checking the
schema version of the file with `util.getSchemaVersion(filename)` and comparing
with the YAML namespace version present in nwb-schema/core/nwb.namespace.yaml

Error in types.util.checkUnset (line 13)
assert(isempty(dropped),...

Error in types.hdmf_common.VectorData (line 26)
            types.util.checkUnset(obj, unique(varargin(1:2:end)));

Error in io.parseDataset (line 81)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 22)
    dataset = io.parseDataset(filename, datasetInfo, fullPath, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
 
Command exited with non-zero status 1

so I guess you got different error because you called differently.

as for this error, seems like other Windows users encounter: #492 and I got in #490 .

Do you see the same error and can reproduce if you do specify savedir?

@lawrence-mbf
Copy link
Collaborator

ideally users should not call any script manually

Correct, you do not need generateCore if you're reading files. This issue only comes out of calling generateCore, then using savedir when calling nwbRead. To fix this issue requires either a script to run or to manually delete +types/+core, +types/+hdmf_*, or other extensions which are erroneously generated.

AFAIK we have specified savedir, out folder for only because were instructed to do so...

That's fine. In fact, for this specific purpose of testing multiple files with a single installation, you should continue using savedir and point to a directory that gets cleaned up after every read. This ensures that the installation directory is not polluted with extraneous or inaccurate files and properly emulates a fresh clone of a repository.

@yarikoptic
Copy link
Author

Correct, you do not need generateCore if you're reading files.

well -- the documentation doesn't say that it is "optional" -- https://github.com/NeurodataWithoutBorders/matnwb#step-2-generate-the-api pretty much states to do that. (next step for extensions is listed to be optional so we do not)

That's fine. In fact, for this specific purpose of testing multiple files with a single installation, you should continue using savedir and point to a directory that gets cleaned up after every read.

ok, we can keep running generateCore + "continue using savedir" and ensure cleaning it up (saveDir) after each run. But I am still confused about the

To fix this issue requires either a script to run or to manually delete +types/+core, +types/+hdmf_*, or other extensions which are erroneously generated.

what "script to run"? shouldn't it be fixed within matnwb? Note that, as reproducer shows, we ran for that file in a completely clean fresh installation so there is no "pollution" of any kind.

@lawrence-mbf
Copy link
Collaborator

the documentation doesn't say that it is "optional"

Yeah, that should be clarified. You only need to run generateCore and generateExtension if you're writing files. For reading files it's unnecessary anyway.

ok, we can keep running generateCore + "continue using savedir"

I advise not doing that, frankly, as generateCore is what "pollutes" the namespace with NWB schema 2.6.0 which might not be what the file was written in. That's the primary cause of the issue since that's just how MATLAB path priority works.

@lawrence-mbf
Copy link
Collaborator

lawrence-mbf commented Feb 15, 2023

Sorry, I need to complete the thought and actually offer something constructive:

Just don't use generateCore if you're going to use nwbRead(..., 'savedir', ...).

@yarikoptic
Copy link
Author

Sorry for being a pain, I am just trying to arrive to some logical conclusion of how to instruct users "generally" on how to use matnwb to load or write .nwb files.

Just don't use generateCore if you're going to use nwbRead(..., 'savedir', ...).

That would boil to overall distinction of two "modes" (with or without generateCore) of how matnwb's nwbRead should be installed/used (not using/using savedir), correct?

@lawrence-mbf
Copy link
Collaborator

Most users should be fine with just nwbRead without savedir which works just fine with or without generateCore (assuming a properly embedded schema).

I'd suggest only using savedir for advanced automation over a lot of NWB files with different schema versions (i.e. our test suite CI). I guess the fundamental question is "where do you want your class files to be?". generateCore and nwbRead both do that and if you're not consistent you'll run into these kinds of MATLAB path issues. If you don't care, don't use savedir.

In the case of your CI, I think my concern was about false negatives where the install directory doesn't really represent a clean git clone MatNWB installation.

@yarikoptic
Copy link
Author

In the case of your CI, I think my concern was about false negatives where the install directory doesn't really represent a clean git clone MatNWB installation.

we do it clean (see here) -- remove if prior one exists, download tarball of matnwb from github , extract, do generateCore() -- everything as the README.md teaches us.

I'd suggest only using savedir for advanced automation over a lot of NWB files with different schema versions (i.e. our test suite CI).

That is exactly us - automation over a lot of NWB from different datasets, hence difference schema versions! And how then we should take advantage of savedir - should we just not do generateCore and point savedir to the same location (how does it then work for different versions?)?

@yarikoptic
Copy link
Author

anyways , back to current issue, if I do not generateCore (or to the same effect just git clean -dfx in existing matnwb checkout) and do not use savedir,

I still get that error we both observed
(dandisets-2) dandi@drogon:/tmp/matnwb-TJq1ZBO/matnwb$ MATLABPATH=$PWD time matlab -nodesktop -batch "f='../000022/sub-744912845/sub-744912845_ses-766640955.nwb'; disp(util.getSchemaVersion(f)); nwb = nwbRead(f)"
2.2.2
Error using assert
Unexpected properties {unit}.

Your schema version may be incompatible with the file.  Consider checking the
schema version of the file with `util.getSchemaVersion(filename)` and comparing
with the YAML namespace version present in nwb-schema/core/nwb.namespace.yaml

Error in types.util.checkUnset (line 13)
assert(isempty(dropped),...

Error in types.hdmf_common.VectorData (line 26)
            types.util.checkUnset(obj, unique(varargin(1:2:end)));

Error in io.parseDataset (line 81)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 22)
    dataset = io.parseDataset(filename, datasetInfo, fullPath, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
 
Command exited with non-zero status 1
19.63user 1.09system 0:19.44elapsed 106%CPU (0avgtext+0avgdata 915580maxresident)k
10800inputs+1480outputs (12major+198358minor)pagefaults 0swaps

so the file has 2.2.2 and there seems to be 2.2.2 available within matnwb:

(dandisets-2) dandi@drogon:/tmp/matnwb-TJq1ZBO/matnwb$ ls nwb-schema/
2.0.2  2.1.0  2.2.0  2.2.1  2.2.2  2.2.3  2.2.4  2.2.5  2.3.0  2.4.0  2.5.0  2.6.0

also IIRC there should be a copy of schema within nwb file ... how do we open/load such a file?

@lawrence-mbf
Copy link
Collaborator

so the file has 2.2.2 and there seems to be 2.2.2 available within matnwb:

This is not relevant in case of nwbRead because:

also IIRC there should be a copy of schema within nwb file ... how do we open/load such a file?

nwbRead actually internally calls the equivalent of generateCore pointing to the embedded schema and all extensions. This is why these pathing issues can easily occur if you're using savedir.

Looking at the file again, I have to apologize because I thought I had read this file successfully before though I had not. In fact, this error is tied this old issue: #43 where an anonymous VectorData defined by the Units table adds fields. Still thinking about fixes there but a cheap workaround is possible to allow for ignoring this class of error possibly.

@yarikoptic
Copy link
Author

if you think it is at large duplicate of #43 -- feel free to close. I've subscribed to #43 as well for now.

so the file has 2.2.2 and there seems to be 2.2.2 available within matnwb:

This is not relevant in case of nwbRead because:

also IIRC there should be a copy of schema within nwb file ... how do we open/load such a file?

nwbRead actually internally calls the equivalent of generateCore pointing to the embedded schema and all extensions.

then error message suggesting some incompatibility ("Your schema version may be incompatible with the file.") is even more confusing, and may be this message should be made more specific/adequate to different cases of problems.

@lawrence-mbf
Copy link
Collaborator

Issue now duplicate of #238

@lawrence-mbf lawrence-mbf closed this as not planned Won't fix, can't repro, duplicate, stale May 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants