I am trying to learn NeMo from "tutorials/01_NeMo_Models.ipynb".
At the end of the notebook, after creating the NeMoGPTv2 class, I try to create a model with: model = NeMoGPTv2(cfg=cfg.model)
This fails with the following error:
  File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-67-1b7caab869c2>", line 1, in <module>
    model = NeMoGPTv2(cfg=cfg.model)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "<ipython-input-31-f04b7157a9ba>", line 3, in __init__
    super().__init__(cfg=cfg, trainer=trainer)
  File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/nemo/core/classes/modelPT.py", line 154, in __init__
    self.setup_multiple_validation_data(val_data_config=cfg.validation_ds)
  File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/nemo/core/classes/modelPT.py", line 539, in setup_multiple_validation_data
    model_utils.resolve_validation_dataloaders(model=self)
  File "/home/shabs/anaconda3/envs/NeMo/lib/python3.11/site-packages/nemo/utils/model_utils.py", line 293, in resolve_validation_dataloaders
    model.setup_validation_data(cfg.validation_ds)
  File "<ipython-input-66-0c8f18429ac6>", line 23, in setup_validation_data
    vocab = f.read().split('')[:-1] # the -1 here is for the dangling token in the file
            ^^^^^^^^^^^^^^^^^^
ValueError: empty separator
In the modified version below, split() is called without a separator argument, so it splits on runs of whitespace (spaces, tabs, newlines) instead of on an empty string. This resolved the ValueError caused by the empty separator. Modify the tutorial as follows:
class NeMoGPTv2(NeMoGPT):

    def setup_training_data(self, train_data_config: OmegaConf):
        self.vocab = None
        self._train_dl = self._setup_data_loader(train_data_config)

        # Save the vocab into a text file for now
        with open('vocab.txt', 'w') as f:
            for token in self.vocab:
                f.write(f"{token}")

        # This is going to register the file into .nemo!
        # When you later use .save_to(), it will copy this file into the tar file.
        self.register_artifact('vocab_file', 'vocab.txt')

    def setup_validation_data(self, val_data_config: OmegaConf):
        vocab_file = self.register_artifact('vocab_file', 'vocab.txt')

        with open(vocab_file, 'r') as f:
            vocab = f.read().split()[:-1]  # Split based on whitespace characters
            self.vocab = vocab

        self._validation_dl = self._setup_data_loader(val_data_config)

    def setup_test_data(self, test_data_config: OmegaConf):
        # This is going to try to find the same file, and if it fails,
        # it will use the copy in .nemo
        vocab_file = self.register_artifact('vocab_file', 'vocab.txt')

        with open(vocab_file, 'r') as f:
            vocab = []
            vocab = f.read().split()[:-1]  # the -1 here is for the dangling token in the file
            self.vocab = vocab

        self._test_dl = self._setup_data_loader(test_data_config)
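
One caveat with the whitespace-based version: split() with no argument discards whitespace and never produces a trailing empty entry, so the [:-1] slice drops a real token, and tokens written with nothing between them cannot be separated again when the file is read. A minimal sketch of a round trip that avoids both problems, assuming an explicit delimiter that never occurs inside a token. The SEP marker and the helpers save_vocab / load_vocab are hypothetical names for illustration, not part of NeMo or the tutorial:

```python
import os
import tempfile

SEP = "<SEP>"  # hypothetical delimiter; assumed never to appear inside a token


def save_vocab(path, vocab):
    """Write every token followed by SEP, so the file ends with a dangling SEP."""
    # newline="" disables newline translation, so a "\n" token survives as-is.
    with open(path, "w", encoding="utf-8", newline="") as f:
        for token in vocab:
            f.write(f"{token}{SEP}")


def load_vocab(path):
    """Read the tokens back; [:-1] drops the empty string after the last SEP."""
    with open(path, "r", encoding="utf-8", newline="") as f:
        return f.read().split(SEP)[:-1]


# Character-level vocab that includes whitespace tokens.
vocab = ["\n", " ", "a", "b"]
path = os.path.join(tempfile.mkdtemp(), "vocab.txt")
save_vocab(path, vocab)
restored = load_vocab(path)

# An empty separator is what raised the error in the traceback above.
try:
    "a b".split("")
    empty_sep_error = None
except ValueError as err:
    empty_sep_error = str(err)
```

The same pair of calls could stand in for the write loop in setup_training_data and the read in setup_validation_data / setup_test_data, as long as the delimiter used for writing and for split() is identical.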