Hi, I was trying to run my code on the source code repo. Everything seems fine until it runs into the `batch_by_size` function at line 220 in `fairseq/data/data_utils.py`:
```python
# Relies on the module-level imports `import types` and `import numpy as np`.
def batch_by_size(
    indices, num_tokens_fn, max_tokens=None, max_sentences=None,
    required_batch_size_multiple=1,
):
    """Yield mini-batches of indices bucketed by size.

    Batches may contain sequences of different lengths.

    Args:
        indices (List[int]): ordered list of dataset indices
        num_tokens_fn (callable): function that returns the number of tokens at a given index
        max_tokens (int, optional): max number of tokens in each batch (default: None).
        max_sentences (int, optional): max number of sentences in each batch (default: None).
        required_batch_size_multiple (int, optional): require batch size to be a multiple of N (default: 1).
    """
    try:
        from fairseq.data.data_utils_fast import batch_by_size_fast
    except ImportError:
        raise ImportError(
            'Please build Cython components with: `pip install --editable .` '
            'or `python setup.py build_ext --inplace`'
        )
    max_tokens = max_tokens if max_tokens is not None else -1
    max_sentences = max_sentences if max_sentences is not None else -1
    bsz_mult = required_batch_size_multiple
    if isinstance(indices, types.GeneratorType):
        indices = np.fromiter(indices, dtype=np.int64, count=-1)
    return batch_by_size_fast(indices, num_tokens_fn, max_tokens, max_sentences, bsz_mult)
```
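For context, here is a rough sketch of how I understand this function is meant to be called once the Cython extension is built; the toy dataset and `num_tokens_fn` below are made-up placeholders for illustration, not fairseq code:

```python
import numpy as np
from fairseq.data.data_utils import batch_by_size

# Made-up toy data: each "example" is just a list of token ids.
toy_dataset = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]

def num_tokens_fn(index):
    # Number of tokens in the example at `index`.
    return len(toy_dataset[index])

# Indices ordered by length, as a dataset's ordered_indices() typically returns them.
indices = np.array(sorted(range(len(toy_dataset)), key=num_tokens_fn), dtype=np.int64)

# With the compiled batch_by_size_fast extension available, this groups the
# indices into batches whose token budget respects max_tokens (arbitrary here).
batches = batch_by_size(indices, num_tokens_fn, max_tokens=6)
print(batches)
```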
The reported error is as follows:
```
2020-05-15 01:02:23 | INFO | fairseq_cli.train | model default-captioning-arch, criterion LabelSmoothedCrossEntropyCriterion
2020-05-15 01:02:23 | INFO | fairseq_cli.train | num. model params: 45776896 (num. trained: 45776896)
2020-05-15 01:02:24 | INFO | fairseq_cli.train | training on 4 GPUs
2020-05-15 01:02:24 | INFO | fairseq_cli.train | max tokens per GPU = 4096 and max sentences per GPU = None
2020-05-15 01:02:24 | INFO | fairseq.trainer | no existing checkpoint found .checkpoints/checkpoint_last.pt
2020-05-15 01:02:24 | INFO | fairseq.trainer | loading train data for epoch 1
2020-05-15 01:02:24 | INFO | fairseq.data.data_utils | loaded 566747 examples from: output/train-captions.en
```

Everything is fine before this point; then the traceback follows:

```
Traceback (most recent call last):
  File "/home/c/Cpt/fair/main.py", line 37, in <module>
    train()
  File "/home/c/Cpt/fair/main.py", line 33, in train
    cli_main()
  File "/home/c/Cpt/fair/fairseq_cli/train.py", line 355, in cli_main
    nprocs=args.distributed_world_size,
  File "/home/c/miniconda3/envs/fa/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/c/miniconda3/envs/fa/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/home/c/miniconda3/envs/fa/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/c/Cpt/fair/fairseq/data/data_utils.py", line 220, in batch_by_size
    from fairseq.data.data_utils_fast import batch_by_size_fast
ModuleNotFoundError: No module named 'fairseq.data.data_utils_fast'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/c/miniconda3/envs/fa/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/home/c/Cpt/fair/fairseq_cli/train.py", line 324, in distributed_main
    main(args, init_distributed=True)
  File "/home/c/Cpt/fair/fairseq_cli/train.py", line 104, in main
    extra_state, epoch_itr = checkpoint_utils.load_checkpoint(args, trainer)
  File "/home/c/Cpt/fair/fairseq/checkpoint_utils.py", line 157, in load_checkpoint
    epoch=1, load_dataset=True, **passthrough_args
  File "/home/c/Cpt/fair/fairseq/trainer.py", line 296, in get_train_iterator
    epoch=epoch
  File "/home/c/Cpt/fair/fairseq/tasks/fairseq_task.py", line 181, in get_batch_iterator
    required_batch_size_multiple=required_batch_size_multiple,
  File "/home/c/Cpt/fair/fairseq/data/data_utils.py", line 223, in batch_by_size
    'Please build Cython components with: `pip install --editable .`'
ImportError: Please build Cython components with: `pip install --editable .` or `python setup.py build_ext --inplace`
```
What have I tried?

- Installing Cython following tips from all kinds of sources (building from source, pip, conda) -> did not work.
- Found that `data/data_utils_fast.pyx` is a Cython file and tried building it, which also failed :(

Similar to issue 1376.
You cloned fairseq master right? Then you should also run `python setup.py build_ext --inplace` from the root fairseq directory to build the Cython components.
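For anyone who lands here later, a quick way to confirm the extension actually got built is to import it directly (a minimal check, assuming you run it in the same environment and checkout you train with):

```python
# Minimal sanity check (assumption: run in the same environment / checkout used
# for training). If the Cython component was built, this import succeeds and
# __file__ points at the compiled extension (.so / .pyd) instead of raising
# ModuleNotFoundError.
from fairseq.data import data_utils_fast

print(data_utils_fast.__file__)
```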
Hi Myle, yes, it works perfectly now! I appreciate your response!