Training object detection model with multiple training sets #3031
+1, I was planning on building a script for this, but didn't think to ask if it's built-in already.
+1, I want to train a detection model with more than one tfrecord.
@FightForCS I know there is a way to read a series of tfrecords; you can take a look at the examples in the slim folder. It is possible to load a list of file paths using slim.dataset.Dataset, but you may need to rewrite the script you are using.
You can simply assign a list of file paths in the config file. Note that this change may only work when the multiple tfrecord files use the same label_map.
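The original before/after config snippets did not survive in this thread; below is a plausible sketch of the change, assuming the standard TF Object Detection API pipeline.config format (file names are hypothetical):

```
# before: a single tfrecord file
train_input_reader {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/label_map.pbtxt"
}

# after: a list of tfrecord files (all sharing the same label_map)
train_input_reader {
  tf_record_input_reader {
    input_path: ["data/train_A.record", "data/train_B.record"]
  }
  label_map_path: "data/label_map.pbtxt"
}
```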
Closing this issue since I found the answer in the code. The object detection API uses a parallel reader to import your dataset (see the developer's comments in the source). So basically you can define a list of input paths as @byungjae89 mentioned above, or simply provide the input directory; the reader will read the entire folder for you.
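The original snippet is missing here; assuming the input reader expands glob patterns (the directory name below is hypothetical), pointing it at a whole folder would look something like:

```
tf_record_input_reader {
  input_path: "data/train_records/*"
}
```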
Hello @izzrak and @byungjae89 , Thank you for sharing the approach to train multiple tfrecords. However, Do you confirm that the model really trains with those tfrecords, not just the first one? I follow @byungjae89 's approach to add three tfrecords in the config file, and intentionally put the incorrect names for the 2nd and 3rd tfrecords. ( I use the correct name for the 1st tfrecord ) The whole training(200k iteration) completes without any problem. But the error will pop up immediately if I put the incorrect name for the 1st tfrecord. It seems that only the 1st tfrecord is used in training. I am working on visualizing the training input image to confirm my suspicion. Please let me know if you have the similar experience. Thank you. |
If using protoc 3.5, import multiple tfrecord files in the following format:
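The snippet itself did not survive; one plausible form, using protobuf text-format's repeated-field syntax (file names hypothetical), is to list `input_path` once per file:

```
train_input_reader {
  tf_record_input_reader {
    input_path: "data/part_one.record"
    input_path: "data/part_two.record"
  }
}
```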
@willSapgreen, "surprisingly", those
Hi @izzrak, could you specify where you found this documentation, please? Thanks.
If you want to read all the record files under a directory, where the files end with the suffix .record, the input configuration could be written like:
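The snippet is missing from the scrape; assuming glob support in `input_path` (path hypothetical), a suffix match would look like:

```
tf_record_input_reader {
  input_path: "/path/to/records/*.record"
}
```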
I have now generated more tfrecord files and solved the config for loading multiple tfrecords, but I found that it does not help with memory (still OOM). I thought that more tfrecords would reduce memory usage and let me increase the batch size parameter. How should I handle this?
OOM is mainly caused by a large input batch, rather than the number of tfrecords. The tfrecords only provide the data source for training and evaluation. If you don't want to reduce the batch size, add more GPU cards or try a smaller input size.
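A back-of-envelope illustration of the advice above (layer sizes below are hypothetical, not taken from the thread): activation memory grows linearly with batch size and with height×width, and adding more tfrecord files changes neither term, so it cannot fix OOM.

```python
# Activation memory of one float32 feature map, in bytes.
# Memory = batch * H * W * channels * 4; tfrecord count does not appear.
def feature_map_bytes(batch, height, width, channels, bytes_per_elem=4):
    return batch * height * width * channels * bytes_per_elem

full = feature_map_bytes(24, 600, 600, 64)         # batch 24 at 600x600
half_batch = feature_map_bytes(12, 600, 600, 64)   # halving the batch halves memory
small_input = feature_map_bytes(24, 300, 300, 64)  # halving H and W quarters memory
```

This is why the two working levers are batch size and input resolution (or more GPU memory), not the number of input files.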
Does that mean that once the hardware reaches its performance limit, the batch size can no longer be increased, whether I use one tfrecord or multiple tfrecords?
Yep.
OK, so that means splitting into multiple tfrecords to increase batch_size doesn't help.
Try grouped convolution. @gzchenjiajun
gzchenjiajun <notifications@github.com> wrote on Thu, Nov 21, 2019, at 11:02 AM:
… OK, so that means splitting into multiple tfrecords to increase batch_size doesn't help.
Then I have two more questions:
1. Apart from directly upgrading the hardware or using something like half-precision inference, how can I increase the batch_size?
2. Splitting the tfrecord doesn't help, but can the tfrecords be read in batches (read one, finish training on it, discard it, then start the next one)? Would that help with memory?
@CasiaFan <https://github.com/CasiaFan>
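To make the grouped-convolution suggestion concrete (channel counts below are illustrative, not from the thread): with g groups, each output filter only sees c_in/g input channels, cutting the weight count, and hence memory, by a factor of g.

```python
# Weight count of a kxk convolution layer; with `groups`, each filter
# convolves over only c_in // groups input channels.
def conv_weight_count(c_in, c_out, kernel=3, groups=1):
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * kernel * kernel * c_out

standard = conv_weight_count(256, 256)           # plain 3x3 conv
grouped = conv_weight_count(256, 256, groups=8)  # 8x fewer weights
```

The saving applies per layer, so replacing large standard convolutions with grouped ones can free enough memory to raise the batch size somewhat.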
@byungjae89 could you tell me whether you can pass several paths for the train and test images? How do I handle this?
What if another protoc version is used? I don't know which one is required for tensorflow 1.15 gpu, because I haven't set up my environment yet. I'm interested in how you could add several image paths for these different tf.record files. If you have several tf.record files from different data locations, should the images all be in one big train and test folder (all images from all locations), or in the same folder as the tf.record files?
I have gone through all of the above solutions but sadly none of them worked for me, so I'll share a simpler, easier approach. In my case, I had to create a COCO tfrecord file, for which I used the create_coco_tf_record.py file from the tensorflow OD model zoo git repository. It has a function named "_create_tf_record_from_coco_annotations" with a parameter "num_shards", which sets the number of output files. Here is the link to the file: https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_coco_tf_record.py
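For context on what `num_shards` produces (the base file name below is hypothetical): sharded tfrecord outputs typically follow the "-ddddd-of-ddddd" naming convention, which can then be matched in the config with a single glob such as `input_path: "coco_train.record-?????-of-00003"`.

```python
# Sketch of the sharded output file names produced for a given num_shards.
def shard_filenames(base, num_shards):
    return ["%s-%05d-of-%05d" % (base, i, num_shards)
            for i in range(num_shards)]

names = shard_filenames("coco_train.record", 3)
# e.g. 'coco_train.record-00000-of-00003', 'coco_train.record-00001-of-00003', ...
```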
In my opinion your post misses the topic. The discussion is about how to read TFRecord files from several input paths, not how to create the TFRecord differently, as in your case.
I didn't find any description in the documentation showing that I can assign multiple input paths. Is there any method to train a model with two or more datasets without converting them into one big tfrecords file?