Object Detection API: size of input images issues #1876
Comments
Hi @chenyuZha - these resolutions are problematic because we keep an entire queue of images in memory, not just the batch you are currently training on. Note, though, that in SSD we typically resize immediately to 300x300. Given this, it makes sense to just resize your input images to be smaller. |
@jch1 |
Hi @jch1,
I did some experiments and that is true, so I changed the resize API to resize_image_with_crop_or_pad(). My question is: did you take resize distortion into consideration during the development of the Object Detection API? With thanks and regards! |
@yeephycho you should first resize the image while keeping the original aspect ratio, e.g. to 300px on the larger side, and then pad it to a square with resize_image_with_crop_or_pad(). You can find code for aspect-ratio-preserving resize with TF ops in the magenta repository, where neural style transfer is done. |
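A minimal sketch of that approach with TF 1.x image ops (the function names follow the comment above; this is an illustration, not code from the thread):

```python
import tensorflow as tf

def resize_and_pad(image, target_size=300):
    # Scale the longer side down to target_size, preserving aspect ratio.
    shape = tf.shape(image)
    height = tf.cast(shape[0], tf.float32)
    width = tf.cast(shape[1], tf.float32)
    scale = tf.cast(target_size, tf.float32) / tf.maximum(height, width)
    new_height = tf.cast(tf.round(height * scale), tf.int32)
    new_width = tf.cast(tf.round(width * scale), tf.int32)
    resized = tf.image.resize_images(image, tf.stack([new_height, new_width]))
    # Pad (or crop by a pixel if rounding overshoots) to a square output.
    return tf.image.resize_image_with_crop_or_pad(resized, target_size, target_size)
```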
Hi @yeephycho, Thanks!
|
Hi @scotthong,
With regards! |
Hi all, Maybe someone can help me with this issue: Regards, |
Hi @yeephycho @scotthong , Regards, |
Hi @syndec |
Hello @yeephycho and @syndec, thank you for your precious time on my question. |
I am facing similar issues but only when I use the ssd_random_crop data augmentation option in the pipeline config. My total loss converges much lower when I don't use the crop preprocessing. Also, I'm finding that the position of the object in the image has a huge impact on score - the detector behaves like it has blind spots. My training images are 1920x1200 and I'm using the fixed_shape_resizer to 300x300. My training runs have about 100 images and it converges to a total loss of about 0.26 after 10K steps. When I add the ssd_random_crop the total loss is around 2.5 even after 20K steps. I've tried variations with random_crop_image as well with no luck. |
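For reference, a minimal sketch of how that augmentation option is typically written in the train_config section of pipeline.config (illustrative, not the poster's actual config):

```
train_config {
  # ... other training settings ...
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
```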
Hello @funkysandman, would you mind sharing what counts as "good data" for SSD-InceptionV2 training in your opinion? Because SSD-InceptionV2 has limitations detecting small objects, the total loss is close to 1.0 after 200k iterations even after training with this kind of data, and I am still working on the root cause. P.S. Thank you. |
Make sure your pipeline.config is matched perfectly to the model you are using; I've accidentally run training with the wrong config file before. Maybe share your pipeline.config file? Also, the detection program must be tailored to the specific model, and the images must be normalized properly for the detector, or you may get weird results. |
Hello @funkysandman , |
Question: say all the train and test images from the raw data are of size (512, 7000). Can anyone who works on the Object Detection API tell me if it is OK to leave the image_resizer set to 300x300? Or should I change it to the following to match the dimensions of my images? I'm not sure what the purpose of that piece of code is. Currently training a model with the setting as: |
Hello @cmbowyer13, if you use SSD from the model zoo, you need to either resize your images to 300x300 before feeding them to the model or let the model do it for you. Good luck! |
I don't know what you're saying, @willSapgreen. Why can't I do as I am doing, or why can't the model accept arbitrarily sized images? |
You can provide any size image for detection. I would not resize it so small that you cannot see the features you're after (this depends on what your objects are). During training the images are resized to 300x300. If the images you are using to train your model lose their features at that resolution, you can increase it to something like 600x600 in the pipeline.config file. This has an effect on training time as more memory is needed. Resizing images at detection time can help speed things up. |
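A minimal sketch of what bumping the resizer in pipeline.config might look like for an SSD model (illustrative values, not the poster's actual config):

```
model {
  ssd {
    # ... other model settings ...
    image_resizer {
      fixed_shape_resizer {
        height: 600
        width: 600
      }
    }
  }
}
```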
But what does resizing to 300x300 in the config file do to all my images during training, which are of size (512x7000)? You're saying it's fine to leave it as 300x300 and then detect on any size image I want, as long as it contains the same classes of objects? I will also need to run detection on matrices of raw data in real time. Are there any guidelines for doing real-time object detection with our trained object detection model? Any good links or references are appreciated. Also, do you know the best way to convert the model for use from C++ once it has been created and performs as I like? |
@cmbowyer13 I would guess you can leave it at 512x7000 for training, as long as you have the memory and GPU to handle it. As long as the model loss converges it should be OK. When it comes to using the trained model, I believe you can give it pictures in their original size; you can resize them if it is too slow. |
Hi. I have a huge set of training images that range in size from 300x300 to 1184x1410 (different sizes). I'm training a Faster R-CNN model with the pipeline configuration file resizing images to 600x1024. My question is: is it better to resize all the images to 600x1024 before training, using the preprocessing script found in
Also, will this resizing affect the bounding boxes in the annotations, or will they be resized accordingly? Thanks |
Are you sure it resizes to 600x1024? I thought it was resizing so that the longest side is no smaller than 600 and no bigger than 1024. In the pipeline.config it just says min_dimension=600, max_dimension=1024, without being specific to width or height. So your 300x300 would be resized to 600x600 and 1184x1410 would be resized to 864x1024. I could be wrong. |
@funkysandman Thanks for your reply, and yes, you are right it's about the min and max dimension. |
I would leave the images as they are; training should respect the bounding boxes that you specify in the XML. The bounding boxes are converted to percentages, I think, so they're not affected by image resizing. |
Thanks @funkysandman |
I'm also questioning whether it is better to resize prior to creating the bounding boxes, as opposed to using larger images (sometimes much larger), creating the bounding boxes, and then allowing the training to resize the photos to 300x300. In my use case, the photos are also resized in the same manner to 300x300 before being sent through the model for object detection in production. |
I've tested this, and it seems that whatever size I train on is the size it works best on: when I cropped my large pics down to 300px, the model was lousy at detecting in the larger images. I ended up including both original and resized images to ensure better detection; I'm still experimenting. I've also found that reducing the batch size can impact the overall training accuracy. I've tried batch sizes of 5, 10, 12, 20, and 24; 5 seems to work pretty well for my data. |
I am failing to wrap my head around why resizing would be an issue to begin with, unless the object is too small to detect. In all other cases, won't the distortion due to resizing be taken care of, since we would be resizing the test image too? |
The original pics I'm using are approximately 4,000 x 3,000. Drawing bounding boxes on such large images and then training works well, but the training is very slow and requires a lot of memory. If I resize the same images to 300x300 and then draw the bounding boxes, like funkysandman, I found that the detection was terrible. I ended up cropping the images to approximately 800x800 (some smaller, some larger) around my objects and then drew the bounding boxes. The objects I am looking for fill most if not all of the 800x800 cropped pictures. This works well. However, it seems like I need more negative space. I was thinking of including pics with just negative space (no known objects) just to help the training. |
"However it seems like I need more negative space. I was thinking of including pics with just negative space (no known objects) just to help the training." This is something that I was thinking about also, I would like to know the answer to this. |
I am using labelImg to annotate traffic light images for training on TensorFlow with the ssd_inception_coco model. How can I add the color of the traffic lights while creating the XML file, since that option is not available in labelImg? Kindly help.
@bharat77s when you create a rectangle (shortcut: w), a list of labels comes up, and you can just add the name of your label there. So the color of the traffic light can simply be a label. |
But is this the only way? I would have to create thousands of such labels to train my desired model.
…On Thu, 27 Sep 2018, 14:27, MittalNeha wrote:
Alternatively, you can add your label names to data/predefined_classes.txt so that they appear in the list of labels in labelImg. I hope that answers your question.
|
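For illustration, labelImg's data/predefined_classes.txt is a plain list with one label per line, so traffic-light colors could be pre-populated like this (label names are only examples, not from the thread):

```
traffic_light_red
traffic_light_yellow
traffic_light_green
traffic_light_off
```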
I have a query regarding the image size used for training. I am training SSD_inception_v2 on a dataset that contains both the original images (1280x720) and cropped images (640x640). My model does not seem to converge and the losses fluctuate. Should I train the network only on the crops? |
@funkysandman @skhater It is my understanding, from reading the docs and forums, that it resizes the image so the smaller dimension reaches the minimum and then makes sure the other dimension does not exceed the maximum. E.g., keeping aspect ratio, 300x300 would be resized to 600x600 and 1184x1410 would be resized to 600x714. |
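A minimal sketch of that sizing rule (this reflects the understanding described above, not code from the repository):

```python
def keep_aspect_ratio_dims(height, width, min_dimension=600, max_dimension=1024):
    # Scale so the shorter side reaches min_dimension, unless that would push
    # the longer side past max_dimension, in which case scale so the longer
    # side equals max_dimension instead.
    scale = float(min_dimension) / min(height, width)
    if max(height, width) * scale > max_dimension:
        scale = float(max_dimension) / max(height, width)
    return int(round(height * scale)), int(round(width * scale))

# keep_aspect_ratio_dims(300, 300)   -> (600, 600)
# keep_aspect_ratio_dims(1184, 1410) -> (600, 715), roughly the 600x714 quoted above
```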
@giridhar13 I would suggest having your training images as close as possible to the dimensions of the images you plan to run through the model. E.g., if you are going to test or use cropped images, train on cropped images. |
@bharat77s Yes, I would use 4 different labels: red, yellow, green, off. Make sure you edit the label map and config file to reflect the changes. If this presents issues, my approach would be to train a model to pull out the traffic lights and then retrain on the cropped images to detect the color. |
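A minimal sketch of a matching label map (names and IDs are illustrative only, not from the thread):

```
item {
  id: 1
  name: 'red'
}
item {
  id: 2
  name: 'yellow'
}
item {
  id: 3
  name: 'green'
}
item {
  id: 4
  name: 'off'
}
```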
Does anyone know how |
These SSD512 pre-trained models would help you start with larger images, though. |
From all this I conclude that the original images to be labeled must have proportions close to 1:1 (ssd_mobilenet model ...) and their resolutions should be, for example, 900x1000 or 1200x1300, etc.
Hi There, |
Hi everyone, I was just wondering whether the image_resizer param from the config file has an impact only on the training phase or also on the inference phase. By that I mean: once trained, will my model resize each image to 300*300 (for example, if I use SSD) before its first convolution layers, or not? Thanks! |
Hi everyone, I am facing a problem which could have a similar reason: tensorflow/tensorflow#45148 |
Original issue (@chenyuZha):
GPU: GeForce GTX 1080 Ti/PCIe/SSE2 (11 GB)
TensorFlow version: 1.1.0
Python version: 2.7.12
Model checkpoint: ssd_mobilenet_v1_coco_11_06_2017
Context: After reading the Object Detection tutorial, I converted my own image dataset to TFRecord files by creating a *.xml file for each image. As the images in my dataset are very large (about 4000 pixels wide and 2000 pixels high each), the train.tfrecord is about 55 GB.
Issue: When I began training with train.py using the lightest model, ssd_mobilenet_v1_coco_11_06_2017, it crashed with an OOM error after 4-5 steps.
The error message is below:
![125](https://user-images.githubusercontent.com/29950360/27914310-08056dd6-6263-11e7-83ca-8249a3ff0c9a.png)
It seems the OOM error happened when allocating a tensor with shape [1,2969,3546,3].
As my GPU has 11 GB of memory, I don't understand why this causes a problem.
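Following the advice earlier in the thread to shrink inputs before building TFRecords, a minimal sketch assuming Pillow is available (not from the original post):

```python
import os
from PIL import Image

def downscale_images(src_dir, dst_dir, max_side=1024):
    # Shrink every image so its longer side is at most max_side pixels,
    # preserving aspect ratio, before generating the TFRecord files.
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    for name in os.listdir(src_dir):
        if not name.lower().endswith(('.jpg', '.jpeg', '.png')):
            continue
        img = Image.open(os.path.join(src_dir, name))
        scale = float(max_side) / max(img.size)
        if scale < 1.0:
            new_size = (int(img.width * scale), int(img.height * scale))
            img = img.resize(new_size, Image.BILINEAR)
        img.save(os.path.join(dst_dir, name))
```

If the bounding boxes in the *.xml annotations are stored in absolute pixels, they would need to be scaled by the same factor; normalized (percentage) coordinates would be unaffected.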