add config file option #86

bendichter · 2022-03-02T18:28:28Z

restructure available_checks
add config json schema
add example config file
add configure_checks function and corresponding test

TODO: expose to command line


bendichter · 2022-03-02T18:34:11Z

@CodyCBakerPhD I started to build out the feature for a custom configuration file that changes the severity of checks. In doing so, I simplified the available_checks dictionary from:

{Importance: {nd_type: [checks]}}

to

{Importance: [checks]}

The nd_type is now pulled from an attribute of the check. This is a simplification without loss of function for the library, and it makes the config feature easier to implement.
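The restructuring described above can be sketched as follows. This is a minimal illustration, not the library's code: the `Importance` members, the `make_check` helper, and the use of `dict` and `list` as stand-in neurodata types are all assumptions made for the example.

```python
from enum import Enum


class Importance(Enum):
    # Illustrative levels only; the library's actual enum may differ.
    CRITICAL = 2
    BEST_PRACTICE_VIOLATION = 1


def make_check(name, importance, neurodata_type):
    """Hypothetical helper: build a stand-in check carrying its metadata as attributes."""

    def check(obj):
        return None

    check.__name__ = name
    check.importance = importance
    check.neurodata_type = neurodata_type
    return check


# Old layout: {Importance: {nd_type: [checks]}}
nested = {
    Importance.CRITICAL: {dict: [make_check("check_a", Importance.CRITICAL, dict)]},
    Importance.BEST_PRACTICE_VIOLATION: {
        list: [make_check("check_b", Importance.BEST_PRACTICE_VIOLATION, list)],
        dict: [make_check("check_c", Importance.BEST_PRACTICE_VIOLATION, dict)],
    },
}

# New layout: {Importance: [checks]}; the type is read from each check's attribute.
flat = {
    importance: [check for checks in by_type.values() for check in checks]
    for importance, by_type in nested.items()
}
```

No information is lost in the flattening, because each check still exposes its own `neurodata_type`.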

CodyCBakerPhD · 2022-03-02T18:43:27Z

@bendichter 👍 Yup, looking over it now.

The main reason for pre-sorting by neurodata type was to avoid the cost of inspecting it manually at each step of the outer iteration, but I can re-analyze it and see if there's an even better way to structure that.


CodyCBakerPhD · 2022-03-02T21:12:01Z

As I look at the primary iteration usage of ignore and select, would these make sense to include somehow as config options used to shorten the full available_checks? Or even possibly importance_threshold?

That way, we could move those nested if checks outside of the iteration and instead pass a shortened collection of checks into inspect_nwb, reducing the number of operations that get multiplied. It would also make the input arguments to inspect_nwb much simpler, since it would only expect some pre-configured specification of checks to run on the NWBFile.

Extreme example: we only want to select a couple checks. Currently the code still iterates over the entire registry of check functions for each object in the NWBFile.

for nwbfile_object in nwbfile.objects.values():  # could have a large number of objects
    for check_function in check_functions:  # could have a large number of functions
        if issubclass(type(nwbfile_object), check_function.neurodata_type):
            if ignore is not None and check_function.__name__ in ignore:  # this line gets hit every time
                continue
            if select is not None and check_function.__name__ not in select:
                continue
            output = check_function(nwbfile_object)  # only reaches here a couple of times
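One way the proposed pre-filtering could look is sketched below. The helper names `filter_checks` and `run_checks` and the two toy checks are hypothetical, and built-in types stand in for neurodata types; the point is only that `ignore` and `select` are applied once, before the nested loop.

```python
def filter_checks(check_functions, ignore=None, select=None):
    """Return only the checks that survive the ignore/select filters."""
    return [
        check
        for check in check_functions
        if (ignore is None or check.__name__ not in ignore)
        and (select is None or check.__name__ in select)
    ]


def run_checks(objects, check_functions):
    """Apply each pre-filtered check to every object of a matching type."""
    results = []
    for obj in objects:
        for check in check_functions:
            if isinstance(obj, check.neurodata_type):
                results.append((check.__name__, check(obj)))
    return results


# Toy checks for illustration; each carries its target type as an attribute.
def check_is_nonempty(obj):
    return len(obj) > 0


check_is_nonempty.neurodata_type = list


def check_has_name(obj):
    return "name" in obj


check_has_name.neurodata_type = dict

# Filtering happens once, so the inner loop only sees the selected check.
selected = filter_checks([check_is_nonempty, check_has_name], select=["check_has_name"])
results = run_checks([[1, 2], {"name": "x"}, {}], selected)
```

With a small `select` list, the inner loop runs over a handful of checks per object instead of the full registry.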

CodyCBakerPhD · 2022-03-02T23:30:58Z

Oh, I also think it would help the design of the configuration approach (and probably other things too...) if the initial registry available_checks was a dictionary whose keys are the check function names and whose values are the check functions themselves. An advantage of this would be the ability to take those string names from the JSON and directly index like available_checks[check_function_name].

With this approach, the default configuration would simply be the available_checks structured into what they are now (an OrderedDict of Importance with lists of check functions as values at each level), but any other config would follow the same structure +/- manipulation of importance levels, inclusion or exclusion of particular check names, etc.

What do you think?
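A rough sketch of that idea, under stated assumptions: the `register` decorator, the string importance labels, and `check_description` are invented for illustration and are not taken from the library.

```python
from collections import OrderedDict

# Hypothetical importance labels, ordered from most to least severe.
IMPORTANCE_LEVELS = ["CRITICAL", "BEST_PRACTICE_VIOLATION", "BEST_PRACTICE_SUGGESTION"]

available_checks = {}  # registry: check function name -> check function


def register(importance):
    """Hypothetical decorator that records a check under its own name."""

    def decorator(func):
        func.importance = importance
        available_checks[func.__name__] = func
        return func

    return decorator


@register("CRITICAL")
def check_data_orientation(obj):
    return None


@register("BEST_PRACTICE_SUGGESTION")
def check_description(obj):
    return None


def default_config(registry):
    """Group registered check names by importance, mirroring the current layout."""
    config = OrderedDict((level, []) for level in IMPORTANCE_LEVELS)
    for name, func in registry.items():
        config[func.importance].append(name)
    return config


config = default_config(available_checks)
# A name read from a JSON or YAML config indexes straight into the registry.
looked_up = available_checks["check_data_orientation"]
```

Any custom config would then have the same shape as `config`, with names moved between levels or left out.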


bendichter · 2022-03-03T01:59:23Z

@CodyCBakerPhD yes, I think that could make things simpler

bendichter · 2022-03-03T02:01:41Z

So the config logic would simply be something like:

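import copy

# available_checks is assumed to map check names to check functions
checks = copy.copy(available_checks)
for importance, func_names in config.items():
    for func_name in func_names:
        # re-assign the importance of each check named in the config
        checks[func_name].importance = importance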

CodyCBakerPhD · 2022-03-03T16:42:26Z

So the config logic would simply be something like:

Yeah, pretty much. Looks much cleaner.


bendichter · 2022-03-03T16:48:20Z

@CodyCBakerPhD I went even further and just made available_checks a list of checks


CodyCBakerPhD · 2022-03-03T17:13:44Z

I went even further and just made available_checks a list of checks

@bendichter Yup, this all looks really good now.

CodyCBakerPhD · 2022-03-03T17:16:14Z

Also, I just remembered Dorota pointed out that it may be for the best to only allow elevation of Importance levels via the config to prevent people from being able to forcibly downgrade important things; do you agree with that?
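An elevation-only rule of this kind could be enforced with a small guard like the one below. This is only a sketch: the `Importance` values and the `elevate` helper are assumptions for illustration, not the library's API.

```python
from enum import IntEnum


class Importance(IntEnum):
    # Illustrative ordering: a larger value means more severe.
    BEST_PRACTICE_SUGGESTION = 0
    BEST_PRACTICE_VIOLATION = 1
    CRITICAL = 2


def elevate(current, requested):
    """Return the requested level, refusing any attempt to downgrade."""
    if requested < current:
        raise ValueError(
            f"Cannot lower importance from {current.name} to {requested.name}."
        )
    return requested


# Raising a violation to critical is allowed.
new_level = elevate(Importance.BEST_PRACTICE_VIOLATION, Importance.CRITICAL)
```

A config loader could call such a guard for every check it re-assigns, so that a downgrade fails loudly instead of silently hiding an important issue.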

Also also, Yarik mentioned having the config be able to specify SKIP type behavior on a per-NWBFile level when this config is meant to be run on a directory. Any ideas for how that could go into this schema?

bendichter · 2022-03-03T20:18:06Z

Also also, Yarik mentioned having the config be able to specify SKIP type behavior on a per-NWBFile level when this config is meant to be run on a directory. Any ideas for how that could go into this schema?

I don't know what you mean

CodyCBakerPhD · 2022-03-03T20:37:43Z

From #37

It is needed since some of such ignores are "per dataset" and thus it should be possible to specify what to ignore in a specific dataset (or dandiset) in DANDI land, instead of relying on user to know what command line options to provide.

Unless he means to apply the SKIP rule over all files in that directory and not NWBFile-specific.

Example: /000003/file1.nwb has a check_data_orientation violation that is later confirmed to be OK so check_data_orientation is added to the SKIP field of the config used when running on the entire dataset 000003. But /000003/file2.nwb also had a check_data_orientation violation that was confirmed to be a user mistake.

bendichter · 2022-03-03T22:16:41Z

@CodyCBakerPhD OK, I understand. Skipping checks for specific files. I'd say if you want to do that you can run your own script and do something like

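import glob
import os

for file in glob.glob(os.path.join(path, '*.nwb')):
    # glob returns paths that include the directory, so compare the base name
    if os.path.basename(file) == 'bad_file.nwb':
        ignore = ["check_data_orientation"]
    else:
        ignore = None
    inspect_nwb(file, ignore=ignore)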


CodyCBakerPhD · 2022-03-04T17:23:01Z

@bendichter Two minor suggestions for schema-pertaining items.

Otherwise LGTM

codecov-commenter · 2022-03-04T17:33:45Z

Codecov Report

Merging #86 (e311f69) into dev (47698e8) will decrease coverage by 0.75%.
The diff coverage is 90.36%.

@@            Coverage Diff             @@
##              dev      #86      +/-   ##
==========================================
- Coverage   94.31%   93.55%   -0.76%     
==========================================
  Files          12       12              
  Lines         422      481      +59     
==========================================
+ Hits          398      450      +52     
- Misses         24       31       +7

Flag	Coverage Δ
unittests	`93.55% <90.36%> (-0.76%)`	⬇️


Impacted Files	Coverage Δ
nwbinspector/nwbinspector.py	`80.64% <80.00%> (-2.54%)`	⬇️
nwbinspector/utils.py	`96.15% <93.33%> (-3.85%)`	⬇️
nwbinspector/checks/tables.py	`100.00% <100.00%> (ø)`
nwbinspector/register_checks.py	`98.85% <100.00%> (+0.02%)`	⬆️

restructure available_checks

e426359

add config json schema add example config file add configure_checks function and corresponding test TODO: expose to command line

bendichter requested a review from CodyCBakerPhD March 2, 2022 18:28

pre-commit-ci bot and others added 2 commits March 2, 2022 18:30

[pre-commit.ci] auto fixes from pre-commit.com hooks

6389ea2

for more information, see https://pre-commit.ci

Merge branch 'dev' into add_config

7ad60f1

CodyCBakerPhD reviewed Mar 2, 2022

nwbinspector/config.schema.json (outdated)

CodyCBakerPhD reviewed Mar 2, 2022

nwbinspector/nwbinspector.py (outdated)

CodyCBakerPhD reviewed Mar 2, 2022

nwbinspector/nwbinspector.py

CodyCBakerPhD added the "category: enhancement" label (improvements of code or code behavior) Mar 2, 2022

CodyCBakerPhD assigned bendichter Mar 2, 2022

CodyCBakerPhD added this to the DANDI-related checks and Intention milestone Mar 2, 2022

Update nwbinspector/config.schema.json

05d814e

Co-authored-by: Cody Baker <51133164+CodyCBakerPhD@users.noreply.github.com>

bendichter commented Mar 3, 2022

nwbinspector/nwbinspector.py (outdated)

Update nwbinspector/nwbinspector.py

bd811fe

bendichter commented Mar 3, 2022

nwbinspector/nwbinspector.py (outdated)

bendichter and others added 4 commits March 2, 2022 20:07

Update nwbinspector/nwbinspector.py

b0332cf

add SKIP

1dc4758

include config in cli (untested)

Merge remote-tracking branch 'origin/add_config' into add_config

e020175

# Conflicts: # nwbinspector/nwbinspector.py

[pre-commit.ci] auto fixes from pre-commit.com hooks

dd755aa

for more information, see https://pre-commit.ci

bendichter and others added 3 commits March 3, 2022 12:45

reformat available_checks to be flat

d5a90b5

Merge remote-tracking branch 'origin/add_config' into add_config

ffacaac

[pre-commit.ci] auto fixes from pre-commit.com hooks

78ba3d4

for more information, see https://pre-commit.ci

add PyYAML to setup requirements

b1fe733

Merge remote-tracking branch 'origin/add_config' into add_config

b38e272

CodyCBakerPhD reviewed Mar 3, 2022

nwbinspector/dandi.inspector_config.yaml (outdated)

CodyCBakerPhD reviewed Mar 3, 2022

requirements.txt

CodyCBakerPhD mentioned this pull request Mar 3, 2022

NWBInspector Integration dandi/dandi-cli#924

Open

3 tasks

Merge branch 'dev' into add_config

64dc1b7

Update nwbinspector/dandi.inspector_config.yaml

00784ce

Co-authored-by: Cody Baker <51133164+CodyCBakerPhD@users.noreply.github.com>

bendichter marked this pull request as ready for review March 4, 2022 17:00

bendichter requested a review from CodyCBakerPhD March 4, 2022 17:00

CodyCBakerPhD reviewed Mar 4, 2022

nwbinspector/config.schema.json (outdated)

CodyCBakerPhD reviewed Mar 4, 2022

nwbinspector/nwbinspector.py

CodyCBakerPhD self-requested a review March 4, 2022 17:23

CodyCBakerPhD approved these changes Mar 4, 2022


bendichter added 2 commits March 4, 2022 13:23

add version to config.schema.json

f88bcc9

add config validation

e311f69

CodyCBakerPhD merged commit dda7735 into dev Mar 6, 2022

CodyCBakerPhD deleted the add_config branch March 6, 2022 02:50

CodyCBakerPhD mentioned this pull request Mar 9, 2022

Add flag for DANDI intended upload #35

Closed

yarikoptic mentioned this pull request Mar 21, 2022

[Feature]: per dataset configuration file to skip/ignore false positives #117

Open

2 tasks
