Skip to content
This repository has been archived by the owner on Nov 13, 2023. It is now read-only.

Bug report: BaseLogger callback raises a ValueError when a 'list' is passed #1

Open
krishna-esrlabs opened this issue Jan 10, 2020 · 3 comments

Comments

@krishna-esrlabs
Copy link

krishna-esrlabs commented Jan 10, 2020

Actual Behavior

The 'on_iteration_end' function of the BaseLogger callback gets optional loss and metric parameters. It should print the values to the terminal in a pretty tabular format. It works fine as long as the loss/metric values are numbers/None. It fails if it gets a list of losses or metrics.

Stacktrace

  File "/nas/krishna/ext_repo/nucleus7/nucleus7/utils/log_utils.py", line 57, in wrapper       
    result = function(self, *args, **inputs)                                                   
  File "/nas/krishna/ext_repo/nucleus7/nucleus7/coordinator/callback.py", line 172, in __call__
    return self.on_iteration_end(**inputs)                                                     
  File "/nas/krishna/ext_repo/ncgenes7/ncgenes7/callbacks/generic.py", line 75, in on_iteration_end                                                                                           
    _get_printable_names_and_values(loss, metric))                                             
  File "/nas/krishna/ext_repo/ncgenes7/ncgenes7/callbacks/generic.py", line 143, in _get_printable_names_and_values                                                                           
    [(n, _get_value_printable(v)) for (n, v) in zip(names, values)                             
ValueError: not enough values to unpack (expected 2, got 0)  

Steps to reproduce the problem

  • Define a simple network with two loss functions whose outputs must be the same name,
    i.e. if the first loss function outputs a key loss_x, second loss must output loss_x
  • use the base logger with the default config
  • train the network, and the ValueError is raised

Observation

I extracted the BaseLogger functions and made a stand alone script to test different entries.
The problem is _get_value_printable function returns None if the list is passed. So in a situation when both loss and metric is None, the Zip function simply returns an empty tuple and then unpack error is raised.

import numbers
import numpy as np

def _get_value_printable(value):
    if isinstance(value, numbers.Number):
        return value
    if (isinstance(value, np.ndarray)
            and value.ndim == 1 and value.shape[0] == 1):
        return value[0]
    if (isinstance(value, np.ndarray)
            and value.ndim == 0):
        return value
    return None


def _get_printable_names_and_values(loss, metric):
    loss = loss or {}
    metric = metric or {}
    if not isinstance(loss, dict):
        loss = {'loss': loss}
    if not isinstance(metric, dict):
        metric = {'metric': metric}
    loss_names = sorted(loss)
    metric_names = sorted(metric)
    loss = [loss[name] for name in loss_names]
    metric = [metric[name] for name in metric_names]
    names = loss_names + metric_names
    values = loss + metric
    if not names:
        printable_names, printable_values = [], []
    else:
        printable_names, printable_values = zip(*(
            [(n, _get_value_printable(v)) for (n, v) in zip(names, values)
             if _get_value_printable(v) is not None]))
    return printable_names, printable_values


if __name__ == '__main__':
    print(_get_printable_names_and_values(1.9552441, 10))
    print(_get_printable_names_and_values(1.9552441, None))
    print(_get_printable_names_and_values([1.9552441, 2.7354164], 10))
    print(_get_printable_names_and_values(1.9552441, [10, 20]))
    print(_get_printable_names_and_values([1.9552441, 2.7354164], None))
OUTPUT
------------
(('loss', 'metric'), (1.9552441, 10))
(('loss',), (1.9552441,))
(('metric',), (10,))
(('loss',), (1.9552441,))
Traceback (most recent call last):
  File "test_script.py", line 43, in <module>
    print(_get_printable_names_and_values([1.9552441, 2.7354164], None))
  File "test_script.py", line 33, in _get_printable_names_and_values
    [(n, _get_value_printable(v)) for (n, v) in zip(names, values)
ValueError: not enough values to unpack (expected 2, got 0)

Expected Behavior

There are several ways to fix the bug, the question is what should we do when there is a list of losses?
Should the BaseLogger print all the list or an average of the list? Same question for metric as well.

Another question is, why is it getting a list of losses in the first place when it is supposed to get the total_loss which is a single value? This happens only when I have two different loss functions that outputs the same key loss_x and loss_x. If I change one of them to loss_y, the BaseLogger gets only the total_loss which is a single value.
@OleksandrVorobiov may be this is also another bug coming from Nucleus7, what do you think?

@OleksandrVorobiov
Copy link
Contributor

@krishna-esrlabs nucleus7 works as it should - if there are multiple values with same key, it collapses it to the list.
The problem is in BaseLogger - it should raise an error if values other than single values are passed. I will check it.

@krishna-esrlabs
Copy link
Author

@OleksandrVorobiov I am working on the solution. Should the BaseLogger raise an error if there is non-single values in both loss and metric or is it only for the loss and the metric is simply ignored?

@OleksandrVorobiov
Copy link
Contributor

I would go for an error if not valid value is provided.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants