This repository has been archived by the owner on Jun 22, 2022. It is now read-only.

Evaluating loss function: TypeError: can only concatenate tuple (not "dict") to tuple #81

Closed
mxbi opened this issue Aug 15, 2018 · 2 comments

mxbi commented Aug 15, 2018

I am running the latest master branch (offline). When training starts, the code crashes while calling forward() on the model and evaluating the loss function:

neptune: Executing in Offline Mode.
2018-08-15 18-34-40 google-ai-odt >>> training
2018-08-15 18-35-03 google-ai-odt >>> Training on a reduced class subset: ['Person', 'Car', 'Dress', 'Footwear']
2018-08-15 18:35:05 steppy >>> initializing Step label_encoder...
2018-08-15 18:35:05 steppy >>> initializing experiment directories under experiments
2018-08-15 18:35:05 steppy >>> done: initializing experiment directories
2018-08-15 18:35:05 steppy >>> Step label_encoder initialized
2018-08-15 18:35:05 steppy >>> initializing Step loader...
2018-08-15 18:35:05 steppy >>> initializing experiment directories under experiments
2018-08-15 18:35:05 steppy >>> done: initializing experiment directories
2018-08-15 18:35:05 steppy >>> Step loader initialized
neptune: Executing in Offline Mode.
2018-08-15 18:35:07 steppy >>> initializing Step retinanet...
2018-08-15 18:35:07 steppy >>> initializing experiment directories under experiments
2018-08-15 18:35:07 steppy >>> done: initializing experiment directories
2018-08-15 18:35:07 steppy >>> Step retinanet initialized
2018-08-15 18:35:07 steppy >>> cleaning cache...
2018-08-15 18:35:07 steppy >>> cleaning cache done
2018-08-15 18:35:07 steppy >>> Step label_encoder, adapting inputs...
2018-08-15 18:35:07 steppy >>> Step label_encoder, fitting and transforming...
2018-08-15 18:35:10 steppy >>> Step label_encoder, persisting transformer to the experiments/transformers/label_encoder
2018-08-15 18:35:10 steppy >>> Step loader, adapting inputs...
2018-08-15 18:35:10 steppy >>> Step loader, transforming...
2018-08-15 18:35:10 steppy >>> Step retinanet, unpacking inputs...
2018-08-15 18:35:10 steppy >>> Step retinanet, fitting and transforming...
2018-08-15 18:35:13 steppy >>> starting training...
2018-08-15 18:35:13 steppy >>> initial lr: 1e-05
2018-08-15 18:35:13 steppy >>> epoch 0 ...
2018-08-15 18:35:13 steppy >>> epoch 0 batch 0 ...
Traceback (most recent call last):
  File "main.py", line 78, in <module>
    main()
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 16, in train
    pipeline_manager.train(pipeline_name, dev_mode)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/pipeline_manager.py", line 21, in train
    train(pipeline_name, dev_mode)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/pipeline_manager.py", line 85, in train
    pipeline.fit_transform(data)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/steppy_dev/base.py", line 280, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/steppy_dev/base.py", line 390, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/steppy/base.py", line 605, in fit_transform
    self.fit(*args, **kwargs)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/models.py", line 32, in fit
    metrics = self._fit_loop(data)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/models.py", line 63, in _fit_loop
    batch_loss = loss_function(outputs_batch, target) * weight
  File "/home/m09170/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/parallel.py", line 137, in forward
    outputs = _criterion_parallel_apply(replicas, inputs, targets, kwargs)
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/parallel.py", line 192, in _criterion_parallel_apply
    raise output
  File "/media/nvme1/kaggle-openimages/src/open-solution-googleai-object-detection/src/parallel.py", line 167, in _worker
    output = module(*(input + target), **kwargs)
TypeError: can only concatenate tuple (not "dict") to tuple

It looks like the "target" variable passed to the loss function is supposed to be a tuple, but it is a dictionary instead. I have to admit I'm not sure what exactly is causing this, but I wanted to see if you have any immediate ideas before I spend time going through the code line by line. The execution command is just: python main.py -- train --pipeline_name retinanet, and the whole config has been filled out with (supposedly) the correct files.
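For reference, a minimal standalone sketch (the values here are illustrative, not taken from the repo) reproduces the same TypeError that the `module(*(input + target), **kwargs)` line in parallel.py hits when `target` is a dict rather than a tuple:

```python
# Illustrative repro of the failure mode: the _worker in parallel.py
# does `input + target`, which assumes both operands are tuples.
inputs = (1, 2)             # stand-in for the per-replica input tuple
target = {'boxes': [3]}     # a dict target triggers the crash

try:
    args = inputs + target  # same operation as `input + target`
except TypeError as err:
    print(err)              # can only concatenate tuple (not "dict") to tuple

# If the target were a tuple, the concatenation would succeed:
args = inputs + tuple(target.values())
print(args)                 # (1, 2, [3])
```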

Thanks!

i008 (Collaborator) commented Aug 15, 2018

In the neptune config file, you can change batch_size_inference to the same value you have set for batch_size_train while training, but you will need to change it back to 1 when using the evaluate or predict pipelines.
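As a sketch of that change (key names taken from the comment above; the actual layout of the neptune config file may differ):

```yaml
# neptune config (sketch -- exact keys/structure may differ in the repo)
parameters:
  batch_size_train: 16      # whatever batch size you train with
  batch_size_inference: 16  # match batch_size_train while training...
  # ...and set batch_size_inference back to 1 before running the
  # evaluate or predict pipelines.
```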

@jakubczakon jakubczakon self-assigned this Aug 15, 2018
jakubczakon (Contributor)

@mxbi I answered on Kaggle, but what @i008 is saying is pretty much it. Whether you are running on a single GPU or multiple GPUs is another question.
