
Error: maximum box coordinate value is too large #1754

Closed
ericj974 opened this issue Jun 24, 2017 · 46 comments
Labels
stat:awaiting response Waiting on input from the contributor type:support

Comments

@ericj974
Contributor

ericj974 commented Jun 24, 2017

System information

  • Have I written custom code: No
  • OS Platform and Distribution: XUbuntu 16.04
  • TensorFlow installed from: binary
  • TensorFlow version: 1.1.0
  • CUDA/cuDNN version: CUDA Version 8.0.61 / cuDNN 5.1

During training (rfcn architecture), the following error appears after some random steps:

InvalidArgumentError (see above for traceback): assertion failed: [maximum box coordinate value is larger than 1.01: ] [1.0111111]
This is thrown by box_list_ops.to_normalized_coordinates after a failed assertion.

I have to restart from the latest checkpoint to continue training; however, the same type of error can be thrown again.

Log can be found here: https://gist.github.com/ericj974/f270855bf6368509c74c05e94b6cb7b8
Config file here: https://gist.github.com/ericj974/8af390e1841b4f9be463b70573dc17d1
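For anyone debugging this, a quick sanity check over the annotations can rule out out-of-range ground truth boxes before training. Below is a minimal sketch, assuming Pascal-VOC-style XML annotations (the function name is mine; adapt the field names to your own format):

```python
# Hypothetical sanity check for Pascal-VOC-style XML annotations: flags any
# bounding box whose coordinates fall outside the stated image width/height.
import xml.etree.ElementTree as ET

def find_invalid_boxes(xml_text):
    """Return a list of (object_name, box) tuples for out-of-range boxes."""
    root = ET.fromstring(xml_text)
    width = int(root.find('size/width').text)
    height = int(root.find('size/height').text)
    bad = []
    for obj in root.findall('object'):
        name = obj.find('name').text
        box = obj.find('bndbox')
        xmin = float(box.find('xmin').text)
        ymin = float(box.find('ymin').text)
        xmax = float(box.find('xmax').text)
        ymax = float(box.find('ymax').text)
        # Any coordinate past the image border will normalize to > 1.0
        # and trip the to_normalized_coordinates assertion during training.
        if xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            bad.append((name, (xmin, ymin, xmax, ymax)))
    return bad
```

Running this over every annotation file before generating the TFRecord should surface any box that would later trip the assertion.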

@schesho

schesho commented Jun 28, 2017

Hi, have you found what caused this error?
I have the same error here.

Update: fixed.
In my case I had mixed up the ground truth regions (height and width were swapped).

@ali01

ali01 commented Jun 30, 2017

Could you please verify that your ground truth regions are set correctly?

@ali01 ali01 added stat:awaiting response Waiting on input from the contributor type:support labels Jun 30, 2017
@ericj974
Contributor Author

ericj974 commented Jul 4, 2017

Hi, something similar on my side: after looking twice at my ground truth regions, some of them had box coordinates greater than the image width or height.

@ericj974 ericj974 closed this as completed Jul 4, 2017
@FPerezHernandez92

I have a similar error:
https://pastebin.com/3KrSaEAF
but I checked my bbox files and they are all OK. Also, I deleted the files and replaced them with others, and the error continues. What can I do?

@TKassis

TKassis commented Sep 9, 2017

I'm having the same error when I use any of the following:
faster_rcnn_inception_resnet_v2_atrous_coco
rfcn_resnet101_coco

But NOT when I use:
ssd_inception_v2_coco
ssd_mobilenet_v1_coco

My training images are a mixture of 300x300 and 450x450 pixels. I don't believe any of my bounding boxes are outside the image coordinates. Even if that were the case, why would the last two models work but not the ResNet-based ones?

@slsd123

slsd123 commented Sep 11, 2017

I'm having the same issue when I train either of the faster_rcnn models or the rfcn_resnet model, but not with either of the ssd models.

@TKassis

TKassis commented Sep 11, 2017

Let me know if you find a solution. I've been struggling with this for 3 days now and I have no clue why this is happening.

@yinggo

yinggo commented Oct 26, 2017

Has anyone solved this problem? I also cannot use faster_rcnn_inception_resnet_v2_atrous_coco
or rfcn_resnet101_coco.
The error is:

2017-10-25 21:23:15.296232: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: assertion failed: [maximum box coordinate value is larger than 1.01: ] [1.0140625]
[[Node: ToAbsoluteCoordinates/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/cpu:0"](ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch/_167, ToAbsoluteCoordinates/Assert/AssertGuard/Assert/data_0, ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch_1/_169)]]

@yinggo

yinggo commented Oct 29, 2017

Actually, I just ignore this error by changing ./core/box_list_ops.py:

max_assert = tf.Assert(tf.greater_equal(1.1, box_maximum),
                       ['maximum box coordinate value is larger than 1.1: ', box_maximum])

i.e. changing 1.01 to 1.1, since my error's max value is 1.014.
This lets the code keep running, and the result does not seem to be affected.

@madi

madi commented Feb 6, 2018

I have the same problem; it would be useful to return the name of the image that fails when this error is raised.

@CARASO

CARASO commented Mar 13, 2018

try check_range=False

@kirk86

kirk86 commented Mar 27, 2018

@yinggo @CARASO thanks for the valuable info. But after making the appropriate changes, I now get the following error:

InvalidArgumentError (see above for traceback): assertion failed: []
Condition x == y did not hold element-wise:] [x (Loss/BoxClassifierLoss/assert_equal_2/x:0) = ] [0] 
[y (Loss/BoxClassifierLoss/assert_equal_2/y:0) = ] [16]
 [[Node: Loss/BoxClassifierLoss/assert_equal_2/Assert/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_INT32, DT_STRING, DT_INT32], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Loss/BoxClassifierLoss/assert_equal_2/All/_2891, Loss/BoxClassifierLoss/assert_equal_1/Assert/Assert/data_0, Loss/BoxClassifierLoss/assert_equal_1/Assert/Assert/data_1, Loss/BoxClassifierLoss/assert_equal_2/Assert/Assert/data_2, Loss/BoxClassifierLoss/assert_equal_2/x/_2893, Loss/BoxClassifierLoss/assert_equal_2/Assert/Assert/data_4, Loss/BoxClassifierLoss/ones_1/packed/_99)]]

Any suggestions or hints?

@CARASO

CARASO commented Mar 27, 2018

@kirk86 Yes, I also found that with check_range = False there are still other errors. For now I can only bisect the dataset to troubleshoot it. I hope other friends have better solutions.

@kirk86

kirk86 commented Mar 27, 2018

@CARASO Hi, sorry for bothering you; I was just wondering, did you also try what @yinggo suggested in addition? Did you still get the same error?

@CARASO

CARASO commented Mar 27, 2018

@kirk86 Yes, I tried his method, but it didn't help me. Then I used the suggestion I gave you. I think the two of us are going in the same direction: we are both ignoring this out-of-bounds error. But I found this direction to be wrong, because it only makes the error erupt later. At the moment I have reduced my 4000+ image dataset to only 600+, and I am able to train normally. Next I will try to incrementally add data back and track down the problematic examples. There is no better plan at this time; I hope you or other friends can tell me when you have better suggestions.

@asturm0

asturm0 commented Apr 9, 2018

Hey guys,
I had the same error and found a solution that worked for me.
For my training data I used several classes from the ImageNet dataset and wrote a script that also downloaded the bounding boxes. Then I used some scripts from this repo to convert the XML files to a CSV (xml_to_csv.py), and after that turned it into a record file (generate_tfrecord.py). Apparently there is some wrong data in the ImageNet annotations, or another issue with them. I wrote a little script to load every image I had in my CSV file and check it against the height/width info from the XML file. In my case there were a couple of wrongly annotated images, and after removing them from my CSV, the training was successful. In case it helps anyone, I uploaded my image-checking script here. You will probably have a different workflow, but I just wanted to highlight that you should double-check your annotation data if you have downloaded it from somewhere.

Cheers
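The check described above can be sketched roughly like this (assuming a CSV with filename/width/height columns as produced by a typical xml_to_csv.py; the helper name is hypothetical):

```python
# Hypothetical cross-check of annotated image sizes against the real files.
# Each row should be a dict with 'filename', 'width', 'height' keys (e.g.
# from csv.DictReader over the output of xml_to_csv.py).
def mismatched_rows(rows, get_actual_size):
    """Return filenames whose annotated size disagrees with the real image.

    get_actual_size: callable filename -> (width, height). With Pillow this
    could be: lambda f: Image.open(f).size
    """
    bad = []
    for row in rows:
        w, h = get_actual_size(row['filename'])
        if (int(row['width']), int(row['height'])) != (w, h):
            bad.append(row['filename'])
    return bad
```

Rows flagged by this helper are the ones to drop from the CSV before regenerating the TFRecord.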

@CARASO

CARASO commented Apr 10, 2018

@asturm0 Yes, I also used similar methods to filter the dataset, but unfortunately the filtered dataset still had exceptions. I had to manually delete more than 1,500 photos. At present, 12,000+ steps have been trained and no abnormalities found. I am sharing my dataset-filtering script here; I hope it is useful to others, and perhaps someone can give a more complete solution.

Cheers

@tvkpz

tvkpz commented Apr 18, 2018

This error also arises when the bounding boxes are too small compared to the size of the image. I deleted all boxes that are less than 1/16th of the image size and the training works fine. Has anyone tried this before? Is there a specific proportion (instead of 1/16th) we can use that is tied to how Faster R-CNN is implemented?

@CARASO

CARASO commented Apr 18, 2018

@tvkpz Do you mean the area? Or the length or width?

@tvkpz

tvkpz commented Apr 23, 2018

area

@CARASO

CARASO commented Apr 23, 2018

if object_area <= (width * height) / (16 * 16):
    raise Exception('object too small Error')

After filtering the dataset with the check above, I have trained over 450,000 steps with no exceptions.

@tvkpz

tvkpz commented Apr 23, 2018

The size of the area depends on some of the parameters (anchor size, etc.) of the Faster R-CNN algorithm. I'm wondering if anyone knows the connection; I have not yet looked into it.

@rmekdma

rmekdma commented May 1, 2018

In my case, a few images reported different width and height in OpenCV and PIL (w and h were swapped for some reason).
I added these lines to create_tf_record.py:

width, height = image.size
if int(data['size']['width']) != width or int(data['size']['height']) != height:
    continue

@mape1082

mape1082 commented May 1, 2018

Thanks @asturm0. Your script helped me detect that one of my images had a wrong annotation (a box fully outside the image; I had used labelImg).

@tamisalex

My issue was that for some of my images, the height listed in my Pascal XML was actually the width and vice versa. This may be because I used an out-of-date VATIC build. Hopefully this helps someone.

@Sibozhu

Sibozhu commented Sep 22, 2018

try check_range=False

Dude, life saver

@metromark

metromark commented Sep 30, 2018

try check_range=False

Dude, life saver

@Sibozhu Where did you change this?

@metromark

I have the same problem, it would be useful to return the name of the image that fails when it raises this error

@madi how do you print the image name? I tried scouring for the function that prints the image name but couldn't find it in the repo. Any help would be appreciated!

@schliffen

schliffen commented Oct 29, 2018

I have the same error: InvalidArgumentError (see above for traceback): assertion failed: [maximum box coordinate value is larger than 1.100000: ] [1.1015625]
with "mobilenet_v2_1.0_128". It occurs after saving checkpoints for the first time.
It is not because the ground truth is wrong; it seems the algorithm just does not learn well at first.

@anujeet98

@CARASO thanks, your script helped me identify the mistake.

@anujeet98

Use this size checker; the annotation boxes should not be too small:
https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10/blob/master/sizeChecker.py

@LyonOconner

try check_range=False

Which file?

@lightqsvip

/software/models/research/object_detection/core/box_list_ops.py

@codexponent

It seems that the training data has some bad annotations.
Rather than changing the core file as suggested by @yinggo, I think it would be better to fix the bbox coordinates, if possible, while preparing the dataset.
Example: if the normalized values come out as 1.04 or 1.08, just reduce xmin, xmax, ymin, and ymax by 1.5 or 2 and then normalize by width and height.
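A rough sketch of that dataset-side fix (clamping into the image bounds rather than the exact subtraction suggested above; the function name is mine):

```python
# Hypothetical dataset-side fix: clamp each box into the image before
# normalizing, so normalized coordinates never exceed 1.0 and the
# to_normalized_coordinates assertion cannot fire.
def clamp_and_normalize(xmin, ymin, xmax, ymax, width, height):
    """Clamp pixel coordinates into [0, width] x [0, height], then
    return the box normalized to [0, 1]."""
    xmin = min(max(xmin, 0), width)
    xmax = min(max(xmax, 0), width)
    ymin = min(max(ymin, 0), height)
    ymax = min(max(ymax, 0), height)
    return (xmin / width, ymin / height, xmax / width, ymax / height)

# e.g. a box spilling 10 px past a 300x300 image:
# clamp_and_normalize(-5, 0, 310, 150, 300, 300) -> (0.0, 0.0, 1.0, 0.5)
```

Applying this while writing the TFRecord keeps the core object_detection files untouched, unlike loosening the assert.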

@TakumiWzy

try check_range=False

Oh, thanks! It seems to have worked! :)
image

@natalhabenatti

try check_range=False

Oh, thanks! It seems to have worked! :)
image

Where did you put this? I didn't find this function in my scripts :(

@teffanymae

try check_range=False

Omg! You saved my life. T_T

THANK YOU VERY MUCH!!!

@D4n1aLLL

I have the same problem, it would be useful to return the name of the image that fails when it raises this error

@madi how do you print the image name? I tried scouring for the function that prints the image name but couldn't find it in the repo. Any help would be appreciated!

Where do you get the file name?

@git-hamza

try check_range=False

Oh, thanks! It seems to have worked! :)
image

Where did you put this? I didn't find this function in my scripts :(
#1754 (comment)

@pytholic

Hey guys,
I had the same error and found a solution that worked for me. […] You should double-check your annotation data if you have downloaded it from somewhere.

Thank you for this mate!!!
This works guys!

@CARASO

CARASO commented Jan 23, 2020

try check_range=False

Omg! You saved my life. T_T

THANK YOU VERY MUCH!!!

Welcome🤣

@sainisanjay

try check_range=False

@CARASO do you think that putting check_range=False will degrade the performance?

@iamrishab

Before creating the TF record file, I changed my ground truth like this, which solved my problem.
Now the minimum is (2, 2) instead of (0, 0), and the maximum is (w-2, h-2). You can use 1 instead of 2 and it should also work.

h, w = img.shape[:2]
xmin = max(2, x1)
ymin = max(2, y1)
xmax = min(w-2, x2)
ymax = min(h-2, y2)

@MISSIVA20

I wanted to know: when I restart training the model, will that cause a problem? I have not been able to resolve the issue; I get the same error every time.

@AlvaroCavalcante

I had this problem and solved it as @rmekdma suggested, correcting the image resolution when the image is read. It turns out that some images have their width and height swapped because of the file's EXIF metadata. To prevent this, in my generate_tfrecord.py script I changed how the image is opened, from this:

image = Image.open(encoded_jpg_io)

to this:

from PIL import Image, ImageOps
image = ImageOps.exif_transpose(Image.open(encoded_jpg_io))

And it solved the problem.

@iaverypadberg

try check_range=False

@CARASO do you think that putting check_range=False will degrade the performance?

If there are a lot of instances where the bounding box is in the wrong spot, then yeah, performance will definitely suffer. If it's just one image, it shouldn't be too much of an issue.
