Error executing job with overrides: ['exp=base_test'] #11

Open
manman25 opened this issue Nov 28, 2022 · 10 comments

@manman25

manman25 commented Nov 28, 2022

Hello!
I tried to run "python train.py exp=base_test" in Eshell, but got the errors shown below:

image

@jbmaxwell

jbmaxwell commented Dec 16, 2022

Yeah, I'm having the same issue. I can see that datamodule isn't in config.yaml, so I guess it's just broken...
Not a huge deal to write a new training script.

@manman25
Author

> Yeah, I'm having the same issue. I can see that datamodule isn't in config.yaml, so I guess it's just broken... Not a huge deal to write a new training script.

Thanks. So you're suggesting that I need to write a new training script with a datamodule?

@gauravkuppa

I am facing this issue right now too! Can you please provide some pointers on what needs to be changed? The config yaml file or train.py?

@flavioschneider
Member

I updated base_test.yaml; it should work now. Make sure to have the latest versions installed (pip install audio-diffusion-pytorch audio-encoders-pytorch -U).
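
To confirm which versions are actually installed in the active environment, something like the following could be used. It is a minimal sketch that relies only on the standard library's importlib.metadata (Python 3.8+); the helper name installed_versions is made up for illustration and is not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Package names are taken from the pip command in the comment above.
    wanted = ["audio-diffusion-pytorch", "audio-encoders-pytorch"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it with the same interpreter (and virtual environment) that is used for train.py, otherwise it reports on a different set of packages.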

@jbmaxwell

Okay, running now. Thanks!

I'm curious: since I'm not familiar with Hydra or OmegaConf (yet), I'm not really clear on what I'm training on when running base_test.yaml. Is this training on my files at DIR_DATA in my .env, or is it something from YouTube (just looking at the contents of base_test.datamodule)?

Any clarification appreciated.

@flavioschneider
Member

It downloads a single song from YouTube as a test. You can see the URL in the base_test.yaml config.

@jbmaxwell

jbmaxwell commented Dec 21, 2022

Okay, I assumed it must download something, but does it only use that song during this test training? Mostly I'm confused about the step of adding my data path to .env if it's just going to train on one YouTube song.

What would I change to use my own data (i.e., the path I specified in .env)?
I see that base_small.yaml references WAVDataset... is that what I'd use to point to DIR_DATA?

UPDATE: I've been looking at train.py, but I see that I can probably get a bit further with the notebook...

@manman25
Author

> I updated base_test.yaml; it should work now. Make sure to have the latest versions installed (pip install audio-diffusion-pytorch audio-encoders-pytorch -U).
image

Hello. I have tried to run this script again, but our server is in China, where browsing or downloading from YouTube is blocked. (In the screenshot you can see that the shell reports a "DownloadError".)
So what can I change to point the data path at another URL to download the song from (i.e., the path I specified in .env or base_test.yaml)?

@mbuckler

mbuckler commented Jan 12, 2023

> I updated base_test.yaml; it should work now. Make sure to have the latest versions installed (pip install audio-diffusion-pytorch audio-encoders-pytorch -U).

Thank you for the attempted fix! Unfortunately the current commit (d5f6870) still experiences this error. I've put the full error below in text form so that it can be more easily searched by other people with the same issue:

Full error:

(venv7) quoththeraver@ip-26-0-139-220:~/workspace/audio-diffusion-pytorch-trainer$ python train.py exp=base_test
[2023-01-12 20:35:59,628][main.utils][INFO] - Disabling python warnings!
Global seed set to 12345
[2023-01-12 20:35:59,637][__main__][INFO] - Instantiating datamodule <main.module_base.Datamodule>.
Error executing job with overrides: ['exp=base_test']
Traceback (most recent call last):
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 644, in _locate
    obj = getattr(obj, part)
AttributeError: module 'main' has no attribute 'module_base'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 650, in _locate
    obj = import_module(mod)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/main/module_base.py", line 3, in <module>
    import librosa
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/__init__.py", line 209, in <module>
    from . import core
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/core/__init__.py", line 5, in <module>
    from .convert import *  # pylint: disable=wildcard-import
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/core/convert.py", line 7, in <module>
    from . import notation
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/core/notation.py", line 8, in <module>
    from ..util.exceptions import ParameterError
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/util/__init__.py", line 77, in <module>
    from .utils import *  # pylint: disable=wildcard-import
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/util/utils.py", line 9, in <module>
    import numba
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/numba/__init__.py", line 42, in <module>
    from numba.np.ufunc import (vectorize, guvectorize, threading_layer,
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/numba/np/ufunc/__init__.py", line 3, in <module>
    from numba.np.ufunc.decorators import Vectorize, GUVectorize, vectorize, guvectorize
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/numba/np/ufunc/decorators.py", line 3, in <module>
    from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 134, in _resolve_target
    target = _locate(target)
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 658, in _locate
    raise ImportError(
ImportError: Error loading 'main.module_base.Datamodule':
SystemError('initialization of _internal failed without raising an exception')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 110, in <module>
    main()
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/main.py", line 90, in decorated_main
    _run_hydra(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 222, in run_and_report
    raise ex
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 219, in run_and_report
    return func()
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 25, in main
    datamodule = hydra.utils.instantiate(config.datamodule, _convert_="partial")
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 226, in instantiate
    return instantiate_node(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 333, in instantiate_node
    _target_ = _resolve_target(node.get(_Keys.TARGET), full_key)
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 139, in _resolve_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error locating target 'main.module_base.Datamodule', set env var HYDRA_FULL_ERROR=1 to see chained exception.
full_key: datamodule

Specifically, this happens when following the install and run instructions in the readme (with a fresh virtual environment). Interestingly, I was able to successfully run the experiment on my local machine which already had plenty of packages installed, so there is clearly something wrong with the requirements. When I install the packages below (from my local machine's full pip freeze) I am able to run the experiment on the remote machine.

requirements_FULL.txt

I'd submit a PR with this updated requirements.txt, but obviously that would be pretty messy since not all of these packages are required. In case the issue was simply versioning, I made a version of the repo's requirements.txt with explicit versions based on what I had in my local environment (see below), but unfortunately that also failed, which means that there are some missing packages.

requirements with versions (fails).txt

Because I am now currently unblocked I think I'll end my debugging here, but I wanted to share this info with you guys so that future people are unblocked and maybe someone will go through each package to see where the problem is :P
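
For anyone picking this up, a small import check can narrow down which package breaks in a fresh environment before Hydra wraps the failure in an InstantiationException. This is a minimal sketch using only the standard library; check_imports is a hypothetical helper, and the module names are simply the ones that appear in the traceback above. A numba/NumPy version mismatch is one commonly reported cause of the "_internal failed" SystemError, but that is an assumption here and is not confirmed in this thread.

```python
import importlib


def check_imports(module_names):
    """Try importing each module; return {name: None on success, else error text}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # SystemError and ImportError both matter here
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Ordered from the innermost failing import in the traceback outward.
    # "main.module_base" only resolves when run from the repository root.
    for name, err in check_imports(["numba", "librosa", "main.module_base"]).items():
        print(f"{name}: {err or 'ok'}")
```

The first module in the list that reports an error is the one to investigate; everything after it fails only because it depends on that import.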

@EmilianPostolache

EmilianPostolache commented May 12, 2023

When I hit this error, it was because datamodule in the Hydra config was a child of exp and not a root key. As such, it had to be accessed as config.exp.datamodule and not config.datamodule, and the same went for all keys under exp. The problem was that # @package _global_ was missing at the beginning of the yaml configuration! So if you delete that line by mistake, this error happens!
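
The effect of that directive can be illustrated without Hydra, using plain dictionaries. This is only a toy model of where the experiment file's keys end up in the final config; it is not Hydra's implementation, place_experiment is a hypothetical helper, and the seed root key is just a stand-in for whatever the root config contains.

```python
def place_experiment(root_cfg, exp_cfg, global_package):
    """Toy model of where keys from an exp/*.yaml file land in the final config.

    global_package=True mimics a file starting with `# @package _global_`;
    False mimics the default, where keys are nested under the group name `exp`.
    """
    merged = dict(root_cfg)
    if global_package:
        merged.update(exp_cfg)         # keys become root keys
    else:
        merged["exp"] = dict(exp_cfg)  # keys are nested under `exp`
    return merged


root = {"seed": 12345}  # illustrative root key only
exp = {"datamodule": {"_target_": "main.module_base.Datamodule"}}

with_directive = place_experiment(root, exp, global_package=True)
without_directive = place_experiment(root, exp, global_package=False)

print("datamodule" in with_directive)     # True: config.datamodule resolves
print("datamodule" in without_directive)  # False: only config.exp.datamodule exists
```

In the second case, train.py's lookup of config.datamodule has nothing to find at the root, which matches the symptom described above.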
