"BrokenPipeError" on Windows #293

Closed
NickleDave opened this issue Jan 27, 2021 · 16 comments
Labels: BUG (Something isn't working)

@NickleDave
Collaborator

@yardencsGitHub reports the following error on Windows when running vak learncurve.
This occurs right when training starts, first batch after tqdm progress bar pops up.

Full error attached as a .txt file
err_msg_output.txt

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\torch\__init__.py", line 117, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "c:\anaconda3\envs\vak-clone\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.

  0%|                                                                                         | 0/8520 [00:52<?, ?it/s]
Traceback (most recent call last):
  File "c:\anaconda3\envs\vak-clone\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\anaconda3\envs\vak-clone\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda3\envs\vak-clone\Scripts\vak.exe\__main__.py", line 7, in <module>
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\__main__.py", line 44, in main
    config_file=args.configfile)
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\cli\cli.py", line 31, in cli
    learning_curve(toml_path=config_file)
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\cli\learncurve.py", line 75, in learning_curve
    logger=logger,
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\core\learncurve.py", line 502, in learning_curve
    **window_dataset_kwargs
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\core\train.py", line 289, in train
    device=device)
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\engine\model.py", line 409, in fit
    ckpt_step)
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\vak\engine\model.py", line 109, in _train
    for ind, batch in enumerate(progress_bar):
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\tqdm\std.py", line 1166, in __iter__
    for obj in iterable:
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\torch\utils\data\dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\torch\utils\data\dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "c:\anaconda3\envs\vak-clone\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __init__
    w.start()
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "c:\anaconda3\envs\vak-clone\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
@NickleDave NickleDave added the BUG Something isn't working label Jan 27, 2021
@NickleDave NickleDave self-assigned this Jan 27, 2021
@NickleDave
Collaborator Author

@yardencsGitHub what happens if you set num_workers = 0 in the learncurve section?
Does the error message change?

I see similar errors reported in the issue linked below. My guess is that it has to do with some interaction of low-level Torch code, Windows, and a custom dataloader:
pytorch/pytorch#2341
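
A minimal sketch of what the `num_workers = 0` test amounts to, for anyone reproducing this from code rather than the config file. With zero workers, a PyTorch `DataLoader` loads batches in the main process, so no child process is spawned and nothing has to be pickled across a pipe. The helper name `pick_num_workers` and its platform check are hypothetical, not part of vak or torch; this is a diagnostic aid, not a fix.

```python
import sys


def pick_num_workers(requested: int, platform: str = sys.platform) -> int:
    """Return a worker count for a DataLoader (hypothetical helper, not part of vak).

    On Windows, multiprocessing starts workers with "spawn", and each new
    worker re-imports torch; that import is where the traceback above fails.
    Falling back to 0 keeps all data loading in the main process.
    """
    if requested < 0:
        raise ValueError("requested must be >= 0")
    if platform.startswith("win"):
        return 0
    return requested


if __name__ == "__main__":
    # The result would be passed on as DataLoader(..., num_workers=n).
    n = pick_num_workers(4)
    print(f"using num_workers={n} on {sys.platform}")
```

If the error disappears with zero workers, that points at worker start-up rather than at the dataset code itself.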

@NickleDave
Collaborator Author

Can you please also run

$ conda env export > environment.yml
$ conda list --explicit > spec-file.txt

and reply with those files as an attachment?

@yardencsGitHub
Collaborator

Attaching files
spec-file.zip

@NickleDave
Collaborator Author

@yardencsGitHub I'm getting a similar error on Windows 10 (see traceback below), but on the prep step, and it's thrown by a different library. So it seems like it's not torch-specific.

I created a Python 3.6 conda environment, installed pytorch and torchvision from the pytorch channel, and then let pip install everything else when I did pip install vak==0.4.0dev1

Going to try setting up a development environment on Windows to see if it's something conda specific or not

It might be the case that I can relax the restrictions on which versions of our dependencies we use, so that on Windows we use previous versions that we already know worked. I will also test that out in a dev environment.

Here's the full traceback:

[                                        ] | 0% Completed |  0.4s
Traceback (most recent call last):
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Nickl\anaconda3\envs\vak-env\Scripts\vak.exe\__main__.py", line 7, in <module>
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\__main__.py", line 44, in main
    config_file=args.configfile)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\cli\cli.py", line 19, in cli
    prep(toml_path=config_file)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\cli\prep.py", line 109, in prep
    logger=logger,
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\core\prep.py", line 211, in prep
    logger=logger)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\io\dataframe.py", line 137, in from_files
    labelset=labelset)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\io\audio.py", line 218, in to_spect
    spect_files = list(bag.map(_spect_file))
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\bag\core.py", line 1426, in __iter__
    return iter(self.compute())
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\base.py", line 279, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\base.py", line 561, in compute
    results = schedule(dsk, keys, **kwargs)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\multiprocessing.py", line 228, in get
    **kwargs
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\local.py", line 487, in get_async
    raise_exception(exc, tb)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\multiprocessing.py", line 119, in reraise
    raise exc
dask.multiprocessing.OSError: [WinError 1455] The paging file is too small for this operation to complete.

Traceback
---------
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\dask\local.py", line 221, in execute_task
    task, data = loads(task_info)
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\__init__.py", line 13, in <module>
    from . import __main__
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\__main__.py", line 8, in <module>
    from .cli import cli
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\cli\__init__.py", line 4, in <module>
    from .cli import cli
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\cli\cli.py", line 1, in <module>
    from .eval import eval
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\cli\eval.py", line 4, in <module>
    from .. import config
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\config\__init__.py", line 2, in <module>
    from . import models
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\config\models.py", line 6, in <module>
    from ..engine.model import Model
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\engine\__init__.py", line 1, in <module>
    from . import model
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\vak\engine\model.py", line 3, in <module>
    import torch
  File "c:\users\nickl\anaconda3\envs\vak-env\lib\site-packages\torch\__init__.py", line 117, in <module>
    raise err

@NickleDave
Collaborator Author

I tried to set up a dev environment on Windows using poetry, but as far as I can tell it's not possible right now to declare torch + torchvision as dependencies using poetry, because reasons

python-poetry/poetry#2247

mainly having to do with how pytorch provides its distributions
pytorch/pytorch#26340
pytorch/pytorch#25639

I even tried a workaround like this, but no dice:
pytorch/pytorch#25639 (comment)

Even just creating a new project with poetry and declaring just torch and torchvision as the only dependencies fails at the poetry install step

@NickleDave
Collaborator Author

I think for now the fix might be to add back setup.py, build Windows-specific wheels, and then upload those to PyPI.
Going to give it a try

@NickleDave
Collaborator Author

NickleDave commented Feb 3, 2021

related discussion here, for future reference:

python-poetry/poetry#1391 (comment)

spinalcordtoolbox/spinalcordtoolbox#1526

@NickleDave
Collaborator Author

@yardencsGitHub can you please also reply with environment.yml and spec-file.txt for the environment you have on Windows where vak + tweetynet is working?
Just trying to figure out what specific versions to use

On Windows I was able to build a source distribution from a setup.py file and successfully install it but haven't had a chance to test whether it runs yet

@NickleDave
Collaborator Author

NickleDave commented Feb 5, 2021

after fighting with this more, I think the bug is not due to any change in how we build the distribution package.
It's some nasty interaction of our dependencies, Python versions, and conda environments.

Ways that I get a similar bug on Windows:

  • conda, Python 3.6 or 3.8, install pytorch and torchvision from the pytorch conda channel

so basically there's some sort of .DLL bug in the current conda binaries for torch/torchvision on Windows?

Ways that work on Windows -- do not give me the bug

  • conda, install pytorch and torchvision with pip as pytorch.org instructs, using the -f flag:
    pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
  • using python venv and then doing the same

edit: tried again with conda python==3.6 and dask is hanging weirdly. Seems like python==3.8 works better

so for conda do

C:\You> conda create -n vak-env python==3.8
C:\You> conda activate vak-env
(vak-env) C:\You> pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
(vak-env) C:\You> pip install vak==0.4.0.dev1
(vak-env) C:\You> pip install tweetynet

for venv do

C:\You> python -m venv .venv
C:\You> .venv/Scripts/activate.ps1  # activate.bat if you are not using powershell
(.venv) C:\You> pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
(.venv) C:\You> pip install vak==0.4.0.dev1
(.venv) C:\You> pip install tweetynet
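
Since the working and failing setups differ only in how torch and torchvision were installed, it can help to confirm which tool actually installed them in a given environment. Below is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`). The helper name `installer_of` is hypothetical. It reads the `INSTALLER` file from the package metadata: pip writes `pip` there, and conda-built packages often record `conda`, but that is not guaranteed, so treat a missing value as "unknown".

```python
from importlib import metadata
from typing import Optional


def installer_of(package: str) -> Optional[str]:
    """Return the tool recorded in the package's INSTALLER metadata file.

    Returns None if the package is not installed or records no installer.
    (Hypothetical helper for illustration; not part of vak.)
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")
    return text.strip() if text else None


if __name__ == "__main__":
    for name in ("torch", "torchvision"):
        print(name, "->", installer_of(name))
```

Run it inside the activated environment; in an environment built with the steps above, both packages would be expected to report `pip`.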

@yardencsGitHub
Collaborator

yardencsGitHub commented Feb 5, 2021

> @yardencsGitHub can you please also reply with environment.yml and spec-file.txt for the environment you have on Windows where vak + tweetynet is working?
> Just trying to figure out what specific versions to use
>
> On Windows I was able to build a source distribution from a setup.py file and successfully install it but haven't had a chance to test whether it runs yet

I installed newer versions of vak + tweetynet in that environment.
Will try to roll back..
In this env. I'm running vak from a local cloned repo.
I believe I pip installed tweetynet and after an update I got crashes.
Those were simple mismatches between crowsetta and vak that I debugged (variable name changes)
(I also needed to pip install soundfile)
attached are the environment files (it works now)
spec-file.zip

@yardencsGitHub
Collaborator

yardencsGitHub commented Feb 5, 2021

> after fighting with this more, I think the bug is not due to any change in how we build the distribution package.
> It's some nasty interaction of our dependencies, Python versions, and conda environments.
>
> Ways that I get a similar bug on Windows:
>
>   • conda, Python 3.6 or 3.8, install pytorch and torchvision from the pytorch conda channel
>
> so basically there's some sort of .DLL bug in the current conda binaries for torch/torchvision on Windows?
>
> Ways that work on Windows -- do not give me the bug
>
>   • conda, install pytorch and torchvision with pip as the pytorch.org instructs, using the f flag:
>     pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
>   • using python venv and then doing the same
>
> edit: tried again with conda python==3.6 and dask is hanging weirdly. Seems like python==3.8 works better
>
> so for conda do
>
> C:\You> conda create -n vak-env python==3.8
> C:\You> conda activate vak-env
> (vak-env) C:\You> pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
> (vak-env) C:\You> pip install vak==0.4.0.dev1
> (vak-env) C:\You> pip install tweetynet
>
> for venv do
>
> C:\You> python -m venv .venv
> C:\You> .venv/Scripts/activate.ps1  # activate.bat if you are not using powershell
> (.venv) C:\You> pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
> (.venv) C:\You> pip install vak==0.4.0.dev1
> (.venv) C:\You> pip install tweetynet

This worked for me in the beginning but crashed in the first validation attempt:

training TweetyNet
epoch 1 / 1
Epoch 1, batch 249. Loss: 0.4351. Global step: 249: 3%|▊ | 248/8520 [01:30<15:03, 9.15it/s]
Step 250 is a validation step; computing metrics on validation set

Epoch 1, batch 249. Loss: 0.4351. Global step: 249: 3%|▊ | 248/8520 [01:40<15:03, 9.15it/s]
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\anaconda3\envs\vak-pip-py38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "c:\anaconda3\envs\vak-pip-py38\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\__init__.py", line 117, in <module>
    raise err
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.

@NickleDave
Collaborator Author

NickleDave commented Feb 6, 2021

Thank you @yardencsGitHub for the update and for attaching the zip file with the environment files.

From the spec-file, it looks like both pytorch and torchvision were installed into that environment by conda, not using pip

...
https://conda.anaconda.org/pytorch/win-64/pytorch-1.4.0-py3.6_cuda101_cudnn7_0.tar.bz2
...
https://conda.anaconda.org/pytorch/win-64/torchvision-0.5.0-py36_cu101.tar.bz2
...
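
The same check can be done mechanically. The sketch below assumes the explicit spec file lists one package URL per line in the form `<host>/<channel>/<platform>/<archive>`, as in the two entries above. The function name `packages_by_channel` is hypothetical.

```python
from typing import List
from urllib.parse import urlparse


def packages_by_channel(spec_lines: List[str], channel: str) -> List[str]:
    """List archive names in an explicit spec file that come from `channel`.

    Hypothetical helper for illustration; assumes URL paths of the form
    /<channel>/<platform>/<archive>.
    """
    found = []
    for line in spec_lines:
        line = line.strip()
        # Skip blanks, comments, and the "@EXPLICIT" marker line.
        if not line or line.startswith("#") or line.startswith("@"):
            continue
        parts = urlparse(line).path.strip("/").split("/")
        if len(parts) >= 3 and parts[0] == channel:
            found.append(parts[-1])
    return found


if __name__ == "__main__":
    # The two entries quoted from the attached spec file.
    spec = [
        "https://conda.anaconda.org/pytorch/win-64/pytorch-1.4.0-py3.6_cuda101_cudnn7_0.tar.bz2",
        "https://conda.anaconda.org/pytorch/win-64/torchvision-0.5.0-py36_cu101.tar.bz2",
    ]
    for name in packages_by_channel(spec, "pytorch"):
        print(name)
```

Any archive it prints was installed from the named conda channel rather than with pip.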

I'm pretty sure it's the conda packages that are giving us this DLL-related error on Windows

> Ways that I get a similar bug on Windows:
>
>   • conda, Python 3.6 or 3.8, install pytorch and torchvision from the pytorch conda channel
>
> so basically there's some sort of .DLL bug in the current conda binaries for torch/torchvision on Windows?

Can you please try creating a new conda environment where you use pip to install both torch and torchvision, then install both vak and tweetynet with pip as well? Like so:

C:\You> conda create -n vak-env python==3.8
C:\You> conda activate vak-env
(vak-env) C:\You> pip install torch===1.7.1 torchvision===0.8.2 -f https://download.pytorch.org/whl/torch_stable.html
(vak-env) C:\You> pip install vak==0.4.0.dev1
(vak-env) C:\You> pip install tweetynet

and see if you still get the DLL error or any other error about "the paging file"?

@yardencsGitHub
Collaborator

Traceback (most recent call last):
  File "c:\anaconda3\envs\vak-pip-py38\lib\runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\anaconda3\envs\vak-pip-py38\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda3\envs\vak-pip-py38\Scripts\vak.exe\__main__.py", line 7, in <module>
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\__main__.py", line 43, in main
    cli(command=args.command,
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\cli\cli.py", line 31, in cli
    learning_curve(toml_path=config_file)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\cli\learncurve.py", line 56, in learning_curve
    core.learning_curve(model_config_map,
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\core\learncurve.py", line 485, in learning_curve
    train(model_config_map,
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\core\train.py", line 282, in train
    model.fit(train_data=train_data,
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\engine\model.py", line 405, in fit
    self._train(train_data,
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\engine\model.py", line 128, in _train
    metric_vals = self._eval(val_data)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\engine\model.py", line 198, in _eval
    for ind, batch in enumerate(progress_bar):
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\tqdm\std.py", line 1166, in __iter__
    for obj in iterable:
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\utils\data\dataloader.py", line 1065, in _next_data
    return self._process_data(data)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
    data.reraise()
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\_utils.py", line 428, in reraise
    raise self.exc_type(msg)
MemoryError: Caught MemoryError in DataLoader worker process 3.
Original Traceback (most recent call last):
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\datasets\vocal_dataset.py", line 72, in __getitem__
    spect_dict = files.spect.load(spect_path)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\vak\files\spect.py", line 88, in load
    spect_dict = constants.SPECT_FORMAT_LOAD_FUNCTION_MAP[spect_format](spect_path)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\scipy\io\matlab\mio.py", line 226, in loadmat
    matfile_dict = MR.get_variables(variable_names)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\scipy\io\matlab\mio5.py", line 333, in get_variables
    res = self.read_var_array(hdr, process)
  File "c:\anaconda3\envs\vak-pip-py38\lib\site-packages\scipy\io\matlab\mio5.py", line 293, in read_var_array
    return self._matrix_reader.array_from_header(header, process)
  File "mio5_utils.pyx", line 671, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header
  File "mio5_utils.pyx", line 701, in scipy.io.matlab.mio5_utils.VarReader5.array_from_header
  File "mio5_utils.pyx", line 775, in scipy.io.matlab.mio5_utils.VarReader5.read_real_complex
  File "mio5_utils.pyx", line 448, in scipy.io.matlab.mio5_utils.VarReader5.read_numeric
  File "mio5_utils.pyx", line 353, in scipy.io.matlab.mio5_utils.VarReader5.read_element
  File "streams.pyx", line 174, in scipy.io.matlab.streams.ZlibInputStream.read_string
  File "streams.pyx", line 150, in scipy.io.matlab.streams.ZlibInputStream.read_into
  File "streams.pyx", line 137, in scipy.io.matlab.streams.ZlibInputStream._fill_buffer
MemoryError

@NickleDave
Collaborator Author

@yardencsGitHub in case it helps, here are the two conda envs I tested on Windows (not Ubuntu WSL2!), attached as a zip

the one that is just called vakpy38.environment.yaml does work -- I was able to prep a dataset and train a single TweetyNet model in that environment.
This is a case where I installed torch and torchvision with pip using the -f flag pointing at a file on the pytorch server as they explain on their "getting started" page.
You could try and use this to create the same env on your machine if you want.

the one called vakpy38-condatorch.evironment.yml does not work -- I find that dask hangs forever when I try to prep a dataset. Not clear why, I had to Ctrl-C to kill it.

Attaching them both in case we want to try and do more forensics at some point

vakenvs.zip

@yardencsGitHub
Collaborator

So far running with WSL was smooth.
I got a crash because of an error during loading of a Matlab file. (see traceback below)
This did not reproduce when loading (using scipy.io.loadmat) manually.

Since I'm repeating experiments it's impossible to repeat just a part of the experiment (The original and new training set durations need to be identical). This means that if such errors happen I need to rerun the whole experiment.
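
Because a failure partway through forces a full rerun, one cheap safeguard is a pre-flight pass that tries to open every spectrogram file before training starts. The sketch below is a hypothetical helper (`find_unreadable` is not part of vak), and it only catches files that are missing or unreadable at check time. It would not catch a transient failure like the one above that did not reproduce when loading manually.

```python
from pathlib import Path
from typing import Iterable, List


def find_unreadable(paths: Iterable[str]) -> List[str]:
    """Return the paths that cannot be opened for reading right now.

    Hypothetical pre-flight check for illustration; not part of vak.
    """
    bad = []
    for p in paths:
        try:
            # Opening and reading one byte is enough to surface a missing
            # file, a permissions problem, or an unavailable mount.
            with Path(p).open("rb") as f:
                f.read(1)
        except OSError:
            bad.append(str(p))
    return bad


if __name__ == "__main__":
    # Placeholder path, purely illustrative.
    missing = find_unreadable(["does-not-exist.mat"])
    print("unreadable files:", missing)
```

In practice the list of paths would come from wherever the dataset records its spectrogram files, and training would start only if the returned list is empty.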

Traceback:
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/bin/vak", line 8, in <module>
    sys.exit(main())
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/__main__.py", line 43, in main
    cli(command=args.command,
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/cli/cli.py", line 31, in cli
    learning_curve(toml_path=config_file)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/cli/learncurve.py", line 56, in learning_curve
    core.learning_curve(model_config_map,
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/core/learncurve.py", line 485, in learning_curve
    train(model_config_map,
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/core/train.py", line 282, in train
    model.fit(train_data=train_data,
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/engine/model.py", line 405, in fit
    self._train(train_data,
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/engine/model.py", line 109, in _train
    for ind, batch in enumerate(progress_bar):
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/tqdm/std.py", line 1166, in __iter__
    for obj in iterable:
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
    return self._process_data(data)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
OSError: Caught OSError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/scipy/io/matlab/mio.py", line 39, in _open_file
    return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/d/YardenBirds/llb16/annotated/llb16_1486_2018_05_10_06_12_13.wav.mat'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/datasets/window_dataset.py", line 211, in __getitem__
    window, labelvec = self.__get_window_labelvec(idx)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/datasets/window_dataset.py", line 191, in __get_window_labelvec
    spect_dict = files.spect.load(spect_path)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/vak/files/spect.py", line 88, in load
    spect_dict = constants.SPECT_FORMAT_LOAD_FUNCTION_MAP[spect_format](spect_path)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/scipy/io/matlab/mio.py", line 224, in loadmat
    with _open_file_context(file_name, appendmat) as f:
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/scipy/io/matlab/mio.py", line 17, in _open_file_context
    f, opened = _open_file(file_like, appendmat, mode)
  File "/home/yardenc/anaconda3/envs/vak-pytorch-conda/lib/python3.8/site-packages/scipy/io/matlab/mio.py", line 47, in _open_file
    raise IOError(
OSError: Reader needs file name or open file-like object

NickleDave added a commit that referenced this issue Mar 9, 2021
installing `torch` on windows is complicated and there's
no simple fix using `poetry`, see discussion on #293
@NickleDave
Collaborator Author

Closing this as stale. We can search through closed issues if we end up having to deal with this again. I think the conda-forge package may help as stated in #381
