BUG: predict empty file handling #634

nhoglen · 2023-02-25T01:38:01Z

Describe the bug
predict breaks when converting segments to crowsetta seq if no segments were found in the file.

To Reproduce

vak prep check_this_recording.toml runs fine.

vak predict check_this_recording.toml runs until vak.core.predict - INFO - converting predictions to annotations then throws this error:

Traceback (most recent call last):
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\Scripts\vak-script.py", line 9, in <module>
    sys.exit(main())
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\vak\__main__.py", line 48, in main
    cli.cli(command=args.command, config_file=args.configfile)
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\vak\cli\cli.py", line 49, in cli
    COMMAND_FUNCTION_MAP[command](toml_path=config_file)
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\vak\cli\cli.py", line 18, in predict
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\vak\cli\predict.py", line 50, in predict
    core.predict(
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\vak\core\predict.py", line 244, in predict
    seq = crowsetta.Sequence.from_keyword(
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\crowsetta\sequence.py", line 382, in from_keyword
    labels) = cls._validate_onsets_offsets_labels(onsets_s,
  File "C:\Users\Nerissa\anaconda3\envs\vak-env4\lib\site-packages\crowsetta\sequence.py", line 224, in _validate_onsets_offsets_labels
    raise ValueError('must provide either onset_inds and offset_inds, or '
ValueError: must provide either onset_inds and offset_inds, or onsets_s and offsets_s

I did a little bit of digging and determined that audio segments without any calls are tripping up the crowsetta conversion process. Specifically, at line 239 of the vak predict.py file, there is a check for empty segments that is going awry because labels is turning up an empty list rather than a None object.

if labels is None and onsets_s is None and offsets_s is None:
                # handle the case when all time bins are predicted to be unlabeled
                # see https://github.com/NickleDave/vak/issues/383
                continue
seq = crowsetta.Sequence.from_keyword(
                labels=labels, onsets_s=onsets_s, offsets_s=offsets_s
            )

Inspecting the variables at this step shows that onset_s and offset_s are None as expected, but not labels.

(Pdb) print(len(labels))
0
(Pdb) print(len(onsets_s))
*** TypeError: object of type 'NoneType' has no len()
(Pdb) print(len(offsets_s))
*** TypeError: object of type 'NoneType' has no len()

Expected behavior
In my previous vak install (version 0.4.0, I think), this ran to completion and generated a full set of predictions.

Desktop (please complete the following information):

Operating System: Windows 10
Versions: vak 0.8.0, TweetyNet 0.9.0, crowsetta 3.4.1

Additional context
I will email the files to reproduce; the zip is too big to upload here.

The text was updated successfully, but these errors were encountered:

NickleDave · 2023-02-26T01:18:39Z

Sorry you had this issue @nhoglen and thank you for this very thorough bug report.

I'll know for sure when I test with the data but my hunch is you have correctly identified the issue; we do not correctly handle the case where there are no predicted segments because we don't look for the empty list

This was supposed to be fixed by #394 but clearly it wasn't!

Could you please share the data some other way? I wasn't able to open the sharepoint link. Will ask by email too but wanted to reply here (and thank you profusely for making this easy for me to fix 😼 😁 )

edit: Box link you sent worked, nvm!

second edit: just to cross-link, this was originally posted on the forum: https://forum.vocalpy.org/t/error-in-last-step-of-generating-predictions/54

NickleDave · 2023-02-26T03:16:18Z

🤔 I'm not able to replicate this bug -- vak predict runs fine for me on Linux using the files you provided.

Could you please reply with the following files attached?

the .toml configuration file you're using
the output of running the following command in your conda environment: conda env export > environment.yml
as well as conda list --explicit > spec-file.txt (also in the conda environment you are using when you get this bug)

I'm attaching the .toml file I used as well as the csv of annotations that was produced, in a .tar

predict-results.tar.gz

fwiw I have the same versions of vak, tweetynet, and crowsetta installed. I can try making an env on Windows with conda files just in case for some reason the platform is the source of the issue.

Thank you! 🙏

nhoglen · 2023-02-28T17:57:23Z

Oops, sorry I missed this -- I'm not used to looking out for things on GitHub.

I have found that getting the environment set up right on Windows takes some finagling, so I would not be surprised if there are weird version issues baked in there.

The attached zip has the listed files. environment-etc.zip

I'm happy to keep poking at this and will keep a closer eye on GitHub/figure out how to get it to send more notifications.

NickleDave · 2023-03-03T03:19:02Z

Thank you @nhoglen! That helped.
It was an option in the TOML config file that I was missing, the min_segment_dur.
Once I added that to the config I had, I was able to replicate your error.

I now also get the same exception that you first reported on the forum:

Traceback (most recent call last):
  File "/home/pimienta/miniconda3/envs/vak8-env/bin/vak", line 10, in <module>
    sys.exit(main())
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/vak/__main__.py", line 48, in main
    cli.cli(command=args.command, config_file=args.configfile)
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/vak/cli/cli.py", line 49, in cli
    COMMAND_FUNCTION_MAP[command](toml_path=config_file)
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/vak/cli/cli.py", line 18, in predict
    predict(toml_path=toml_path)
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/vak/cli/predict.py", line 50, in predict
    core.predict(
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/vak/core/predict.py", line 243, in predict
    seq = crowsetta.Sequence.from_keyword(
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/crowsetta/sequence.py", line 382, in from_keyword
    labels) = cls._validate_onsets_offsets_labels(onsets_s,
  File "/home/pimienta/miniconda3/envs/vak8-env/lib/python3.8/site-packages/crowsetta/sequence.py", line 224, in _validate_onsets_offsets_labels
    raise ValueError('must provide either onset_inds and offset_inds, or '
ValueError: must provide either onset_inds and offset_inds, or onsets_s and offsets_s
(vak8-env)  ✘ pimienta@pop-os  ~/Documents/data/vocal/bug-report

It does look like this is in fact a bug I introduced when I added a new feature here:
8c6aee1
where for some reason that's not clear to me now I changed this function to return an empty string when there were no predicted segments after post-processing (e.g. with a min. segment dur):

vak/src/vak/transforms/labeled_timebins/functional.py

Line 191 in 8c6aee1

return '', None, None

whereas the same condition would have caused the previous version of this function would have returned all Nones:

vak/src/vak/labeled_timebins.py

Line 441 in 25a0c8a

return None, None, None

Not sure if changing it back will break any other changes I made--will find out when I test--but my plan is to change it back for now and release a bugfix version.

BUG: Have `to_segments` return all Nones for no segments, fix #634

nhoglen added the BUG Something isn't working label Feb 25, 2023

NickleDave self-assigned this Feb 25, 2023

NickleDave closed this as completed in 56e6154 Mar 3, 2023

NickleDave added a commit that referenced this issue Mar 3, 2023

MRG: #636 from vocalpy/bugfix-to-segments-return-all-nones

86e023f

BUG: Have `to_segments` return all Nones for no segments, fix #634

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: predict empty file handling #634

BUG: predict empty file handling #634

nhoglen commented Feb 25, 2023

NickleDave commented Feb 26, 2023 •

edited

Loading

NickleDave commented Feb 26, 2023 •

edited

Loading

nhoglen commented Feb 28, 2023

NickleDave commented Mar 3, 2023

BUG: predict empty file handling #634

BUG: predict empty file handling #634

Comments

nhoglen commented Feb 25, 2023

NickleDave commented Feb 26, 2023 • edited Loading

NickleDave commented Feb 26, 2023 • edited Loading

nhoglen commented Feb 28, 2023

NickleDave commented Mar 3, 2023

NickleDave commented Feb 26, 2023 •

edited

Loading

NickleDave commented Feb 26, 2023 •

edited

Loading