
fastai efficientdet fails on learn.validate() with AttributeError: 'NoneType' object has no attribute 'shape' #638

Closed
potipot opened this issue Feb 11, 2021 · 11 comments · Fixed by #660
Labels
bug Something isn't working

Comments


potipot commented Feb 11, 2021

🐛 Bug

When trying to simply validate metrics for an efficientdet model with fastai, learn.validate() fails with:

KeyError: 'image_id'
AttributeError: 'NoneType' object has no attribute 'shape'

It fails when trying to read the batch size automatically, in AvgLoss.accumulate via find_bs:

class AvgLoss(Metric):
    "Average the losses taking into account potential different batch sizes"
    def reset(self):           self.total,self.count = 0.,0
    def accumulate(self, learn):
        bs = find_bs(learn.yb)
        self.total += learn.to_detach(learn.loss.mean())*bs
        self.count += bs
    @property
    def value(self): return self.total/self.count if self.count != 0 else None
    @property
    def name(self):  return "loss"

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://colab.research.google.com/drive/1i4aXYu4wIKA7eLUK86GwTm7lq7zku_oF?usp=sharing
@potipot potipot added the bug Something isn't working label Feb 11, 2021

potipot commented Feb 11, 2021

I found that applying the following patch:

from fastcore.basics import patch
from fastai.learner import AvgLoss
from fastai.torch_core import find_bs

@patch
def accumulate(self:AvgLoss, learn):
    #bs = find_bs(learn.yb)
    bs = find_bs(learn.xb)
    self.total += learn.to_detach(learn.loss.mean())*bs
    self.count += bs

fixes the issue. I'm not sure, though, how we should store that patch in icevision. Any suggestions?


potipot commented Feb 15, 2021


ganesh3 commented Feb 15, 2021

I too had this issue, which got solved with this patch and online help from @potipot. Thanks a lot. Hopefully, this is pushed as a PR.


potipot commented Feb 23, 2021

@lgvaz this is still an issue. Just wanted to ask for your opinion on where to put this fastai patch?


lgvaz commented Feb 26, 2021

@potipot I get access denied on the Colab notebook, can you change the permissions?


lgvaz commented Mar 4, 2021

Wow, this error is really subtle, especially why it was not happening before.

The error was introduced by #630, but the bug was there since the beginning. To understand it, we first have to look at how find_bs works:

def find_bs(b):
    "Recursively search the batch size of `b`."
    return item_find(b).shape[0]
    
def item_find(x, idx=0):
    "Recursively takes the `idx`-th element of `x`"
    if is_listy(x): return item_find(x[idx])
    if isinstance(x,dict):
        key = list(x.keys())[idx] if isinstance(idx, int) else idx
        return item_find(x[key])
    return x

In our case the x passed to item_find is a dictionary (the effdet target). Note that find_bs calls item_find with idx=0, so key = list(x.keys())[idx] takes the first key of the dict.

And this is what #630 changed. Before, the first key in the dict was bbox:
[screenshot: target dict with bbox as the first key]

Calling find_bs then returns .shape[0] of the first prediction, which in this case is 3. That is incorrect: the actual batch size is 16. So this was a silent bug.

After #630, the first key became img_size:
[screenshot: target dict with img_size as the first key]

Now find_bs is called on None, hence the error.


The solution is to do what @potipot proposed here, but without patching. Instead, we should do the same thing we do for the torchvision models when creating the learner (models/torchvision/fastai/learner.py):

    class RCNNAvgLoss(fastai.AvgLoss):
        def accumulate(self, learn):
            bs = len(learn.yb)
            self.total += fastai.to_detach(learn.loss.mean()) * bs
            self.count += bs

    recorder = [cb for cb in learn.cbs if isinstance(cb, fastai.Recorder)][0]
    recorder.loss = RCNNAvgLoss()

We also have to check if the same happens for the mmdet models.
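Whichever structure the batch size is read from, the underlying bookkeeping is the same: weight each batch's mean loss by its size and divide by the total count at the end. A standalone sketch of that accumulation (AvgLossSketch is a hypothetical illustration class, not fastai's AvgLoss):

```python
class AvgLossSketch:
    """Weighted running average of per-batch mean losses,
    mirroring fastai's AvgLoss but standalone for illustration."""
    def __init__(self):
        self.total, self.count = 0.0, 0

    def accumulate(self, mean_loss, bs):
        # Weight the batch's mean loss by the batch size so that
        # a smaller final batch does not skew the epoch average.
        self.total += mean_loss * bs
        self.count += bs

    @property
    def value(self):
        return self.total / self.count if self.count else None

avg = AvgLossSketch()
avg.accumulate(0.5, bs=16)  # full batch
avg.accumulate(0.9, bs=4)   # smaller last batch
print(avg.value)  # (0.5*16 + 0.9*4) / 20 = 0.58
```

This also shows why a wrong bs (3 instead of 16) corrupts the reported loss without raising: the weights are simply wrong, which is the "silent bug" described above.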

@potipot would you like to do a PR for this one?


potipot commented Mar 5, 2021

@lgvaz I'm not sure what the idea behind find_bs is, because in the test cases it is always 1. It seems like accumulate operates on the current batch element, while the actual batch_size is 2. Copying the solution from RCNNAvgLoss works, but now I'm wondering if this isn't in fact some other issue.


potipot commented Mar 5, 2021

In the case of rcnn_learner the results are actually different. I don't know why this is necessary in the case of rcnn, but this is what I get:


In[18]: len(learn.yb), find_bs(learn.yb), learn.dls.bs
Out[18]: (1, 2, 2)


lgvaz commented Mar 5, 2021

Wait, I got a bit confused, what solution actually gives the correct batch size?

For rcnn, yb is a list of dictionaries, so len(yb) will give the number of samples in the current batch. This is different for effdet, where yb is a single dictionary. The implementation would therefore be a bit different for the two cases.
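A minimal illustration of that difference, with placeholder strings standing in for the real tensors (the key names are illustrative, not necessarily icevision's exact ones):

```python
# rcnn-style target: a list with one dict per sample,
# so len() is the number of samples in the batch.
rcnn_yb = [{"boxes": "boxes_0", "labels": "labels_0"},
           {"boxes": "boxes_1", "labels": "labels_1"}]
print(len(rcnn_yb))  # 2 -- the batch size

# effdet-style target: one dict of batched tensors,
# so len() is just the number of keys, not the batch size.
effdet_yb = {"bbox": "batched_boxes", "cls": "batched_cls", "img_size": None}
print(len(effdet_yb))  # 3 -- number of keys, regardless of batch size
```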


potipot commented Mar 5, 2021

The problem is, I'm not sure what this is trying to do; I wasn't able to get values other than 1 there. I guess the idea is to take an average of the loss and weight it by the batch size. We could actually ignore this and just put a constant 1, so no weighting would happen.

When training via tests with bs=2, using both RCNN and EfficientDet, I was always getting a single element (not a batch) from learn.yb.


potipot commented Mar 6, 2021

The problem is that the test dataset (fridge_ds) has only 1 element in the validation set, so I cannot tell which length corresponds to the actual batch size: it is always 1. :D I will increase the validation set to, say, 3 elements and test again.
