Problems in training images with 5 or more bounding boxes #94

jeasung-pf opened this issue Apr 24, 2020 · 1 comment

Hello.

I am training on the FSNS dataset with images containing 5 or more bounding boxes. When your model calculates the loss, a preconfigured list of weights is multiplied with the loss of each bounding box, and this raises an index-out-of-bounds error as soon as the loss for the fifth bounding box is computed. Below is the code block in question.

        loss_weights = [1, 1.25, 2, 1.25]  # note: only four weights, so index 4 (a fifth box) is out of range
        for i, (predictions, grid, labels) in enumerate(zip(batch_predictions, F.separate(grids, axis=0), F.separate(t, axis=1)), start=1):
            with cuda.get_device_from_array(getattr(predictions, 'data', predictions[0].data)):
                # adapt ctc weight depending on current prediction position and labels
                # if all labels are blank, we want this weight to be full weight!
                print("{}".format(i - 1))
                overall_loss_weight = loss_weights[i - 1]
                loss = self.calc_actual_loss(predictions, grid, labels)
                # label_lengths = self.get_label_lengths(labels)

                for sub_grid in F.separate(grid, axis=1):
                    width, height = self.get_bbox_side_lengths(sub_grid)
                    loss += self.area_loss_factor * self.calc_area_loss(width, height)
                    loss += self.aspect_ratio_loss_factor * self.calc_aspect_ratio_loss(width, height)
                    loss += self.calc_direction_loss(sub_grid)
                    loss += self.calc_height_loss(height)
                loss *= overall_loss_weight
                losses.append(loss)

Does this mean that the decoding sequence of the third box is usually longer or more complex than that of the other boxes in the data? If so, did you run experiments with equally weighted bounding boxes?
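
For what it's worth, a minimal workaround I am considering is to fall back to a neutral weight for boxes beyond the fourth (just a sketch; the fallback value of 1.0 is my assumption, not taken from your code):

    loss_weights = [1, 1.25, 2, 1.25]

    def get_loss_weight(box_index, weights=loss_weights, default=1.0):
        # Fall back to a neutral weight once we run past the
        # preconfigured list, so a fifth or sixth box no longer
        # raises an IndexError.
        return weights[box_index] if box_index < len(weights) else default

    # box indices 0..5 for an image with six bounding boxes
    print([get_loss_weight(i) for i in range(6)])
    # -> [1, 1.25, 2, 1.25, 1.0, 1.0]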

Bartzi (Owner) commented Apr 24, 2020

That is roughly right.
I added these loss_weights to encourage the network to correct errors in the third word, since the dataset mainly consists of images with three words. This was just a performance tweak, and everything should work fine with equally weighted losses.
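
If you want to train with more than four boxes, one option is to derive equal weights from the number of predicted boxes instead of hard-coding four values (untested sketch):

    # Untested sketch: one neutral weight per predicted box,
    # mirroring len(batch_predictions) from the snippet above.
    num_boxes = 6
    loss_weights = [1] * num_boxes  # [1, 1, 1, 1, 1, 1]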
