How to train the model on Custom dataset? #55

Open
PareshKamble opened this issue Apr 22, 2020 · 32 comments

Comments

@PareshKamble

Hello @ifzhang,
You did impressive work on re-ID.
The results are wonderful.
However, I could not find any option, code, or method to train the model on a custom multi-person dataset.
What is the procedure for training the model on such a dataset?
Thanks in advance!

@ifzhang
Owner

ifzhang commented Apr 22, 2020

For a custom dataset, you need box labels and ID labels. The procedure is similar to the training strategy for MOT15 and MOT20:

  1. Convert the custom dataset to the format of our training data, with the folders 'images' and 'labels_with_ids'.
  2. Generate the label files. The script src/gen_labels_15.py may help you.
  3. Generate the paths of the training images, like the files in src/data, and add your JSON file for training in src/lib/cfg.
  4. Don't forget to change '--data_cfg' in the opts.

Once all the steps are done and the custom dataset looks like our training data, you can train on your own dataset. I'm sorry the procedure is a little complex, but it will work if you take some time.
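
The label-generation step above can be sketched as follows. This is a minimal sketch, not the code from src/gen_labels_15.py. It assumes MOT-style annotation rows (frame, track ID, left, top, width, height, ...) and assumes each output label line has the form `class id x_center y_center width height`, with coordinates normalized by the image size. The function name `mot_row_to_label` and the sample row are hypothetical; check the repository's gen_labels scripts for the exact format.

```python
def mot_row_to_label(row, img_w, img_h):
    """Convert one MOT-style annotation row into a label line.

    Assumed input layout:  frame, track_id, left, top, width, height, ...
    Assumed output layout: class id x_center y_center width height (normalized)
    """
    fields = row.strip().split(",")
    frame = int(fields[0])
    track_id = int(fields[1])
    left, top, w, h = (float(v) for v in fields[2:6])

    # Move from the top-left corner to the box center.
    cx = left + w / 2.0
    cy = top + h / 2.0

    # Class is fixed to 0 here (single-class tracking is an assumption).
    label = "0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}".format(
        track_id, cx / img_w, cy / img_h, w / img_w, h / img_h
    )
    return frame, label


# Hypothetical row: frame 1, track 7, a 50x100 box at (100, 200) in a 1000x500 image.
frame, label = mot_row_to_label("1,7,100,200,50,100,1,1,1.0", img_w=1000, img_h=500)
print(frame, label)
```

Each returned line would be appended to the .txt file that corresponds to its frame number.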

@PareshKamble
Author

Thank you so much, @ifzhang.
I shall try this out.
Kindly help me out if I run into any discrepancy.

@tutu96177

In the public MOT dataset, the annotation has a "conf" field: "this bbox contains the confidence degree of the object. It can be seen that it is not 0-1 in the traditional sense, and the higher the score is, the higher the confidence degree is." How do you make a custom dataset? For example, I want to track cars. How can I build the training dataset, and how can I get the "conf" value? Thanks.

@PareshKamble
Author

Hi @ifzhang
Hope you are doing well.
I followed all the steps you suggested for training FairMOT on a custom dataset.
However, I have 2 doubts:

  1. Do we need to add the *.train files of all the datasets from src/data to the train dictionary of src/lib/cfg/data.json, or should each dataset have a separate *.json file?
  2. In the data.json file, the test_emb and test dictionaries have the path "mot15":"./data/mot15.val". Is it OK to keep it as is, or does it need to be changed to match the datasets included in the corresponding train dictionary?

@ifzhang
Owner

ifzhang commented May 11, 2020

  1. You can choose .train files from src/data and write their names (xxx.train) in src/lib/cfg/data.json. Currently, data.json contains all the training data I use except MOT15 and MOT20.
  2. test_emb and test are not important; we do not use them during training.
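
As a rough illustration of the two answers above, the sketch below writes a .train list and builds a data.json-like dictionary. It is a sketch under assumptions, not the repository's code: it assumes a .train file holds one image path per line relative to a dataset root, and that the config has a "root" entry next to the "train", "test_emb" and "test" dictionaries mentioned in the question. The helper names, the `custom` dataset name, and the directory layout are hypothetical.

```python
import json
import os
import tempfile


def write_train_list(image_dir, root, out_path, exts=(".jpg", ".png")):
    """Write one image path per line, relative to the dataset root (assumed format)."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(image_dir):
        for name in filenames:
            if name.lower().endswith(exts):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                paths.append(rel.replace(os.sep, "/"))
    paths.sort()
    with open(out_path, "w") as f:
        f.write("\n".join(paths) + "\n")
    return paths


def build_data_cfg(root, train_lists):
    """Build a data.json-like dict. Per the reply above, test_emb and test
    are not used during training, so they are left empty here (assumption)."""
    return {"root": root, "train": dict(train_lists), "test_emb": {}, "test": {}}


# Demo on a throwaway directory tree with two placeholder image files.
root = tempfile.mkdtemp()
img_dir = os.path.join(root, "custom", "images", "train", "seq1", "img1")
os.makedirs(img_dir)
for name in ("000002.jpg", "000001.jpg"):
    open(os.path.join(img_dir, name), "w").close()

train_file = os.path.join(root, "custom.train")
paths = write_train_list(os.path.join(root, "custom"), root, train_file)
cfg = build_data_cfg(root, {"custom": "./data/custom.train"})
print(paths)
print(json.dumps(cfg, indent=2))
```

Several datasets can share one JSON file by adding more entries to the "train" dictionary, which matches the first answer above.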

@PareshKamble
Author

@ifzhang Thank you so much! :)

@dongziqi001

dongziqi001 commented May 15, 2020

@ifzhang @PareshKamble
I have another 2 doubts.

  1. How can I change the number of training epochs in this code?
  2. Where can I find the output model when training is finished?

Thank you so much! :)

@PareshKamble
Author

Hi @dongziqi001
Currently, I am training the model.
However, I did not come across this error.

I believe the error is caused by empty tensors being passed to the loss function.
Kindly visit this link for a detailed explanation and a possible solution.

  1. You may change the default num_epochs here.
  2. I assume you are talking about the trained model. I found it at FairMOT/exp/mot/all_dla34/.

@ifzhang please correct me if I am wrong.
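
To check what a training run left behind, a small helper like the one below can list the model files in the experiment folder. It is only a sketch: the `exp/<task>/<exp_id>` layout follows the path quoted above, the `.pth` extension is an assumption based on the pretrained model file used elsewhere in this thread, and `list_checkpoints` and the demo file names are hypothetical.

```python
import glob
import os
import tempfile


def list_checkpoints(repo_root, task="mot", exp_id="all_dla34"):
    """Return the .pth files under <repo_root>/exp/<task>/<exp_id>, sorted by name."""
    pattern = os.path.join(repo_root, "exp", task, exp_id, "*.pth")
    return sorted(glob.glob(pattern))


# Demo with a temporary stand-in for the FairMOT checkout.
# The file names below are placeholders, not the names the trainer writes.
repo_root = tempfile.mkdtemp()
exp_dir = os.path.join(repo_root, "exp", "mot", "all_dla34")
os.makedirs(exp_dir)
for name in ("model_a.pth", "model_b.pth", "notes.txt"):
    open(os.path.join(exp_dir, name), "w").close()

found = [os.path.basename(p) for p in list_checkpoints(repo_root)]
print(found)
```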

@DonaldKam

Hi @PareshKamble, since you trained on a custom dataset, did you reproduce the results in the paper? I did not. The loss was still as high as 3.5 when training finished; the batch size was 2 and the number of epochs was 40.

@PareshKamble
Author

PareshKamble commented May 25, 2020

Hi @DonaldKam, I trained the FairMOT model on my custom dataset for 30 epochs with a batch size of 4. The last loss value I got was 4.29, though the model is performing well. I am going to train it again for more epochs when I have better computing power. Currently, I am working with a single 2080 Ti GPU, which took around 60 hours to finish the training.
I shall update you with my new results later.

@DonaldKam

@PareshKamble Looking forward to your good results.

@alberto139

Are there any tools available to label MOT-type data?

I've used tools like labelimg and labelme, but they don't seem to support object IDs, only classes.

Would it also be possible to track several different classes, such as pedestrian, car, traffic light, etc.?

@PareshKamble
Author

PareshKamble commented Jul 11, 2020

@alberto139 I use the Computer Vision Annotation Tool (CVAT) from Intel's OpenVINO toolkit.
https://github.com/opencv/cvat
It is the best tool I have found to date.
It can also help assign tracking identities to any object over successive frames.

@alberto139

alberto139 commented Jul 12, 2020

Thanks @PareshKamble! OpenVINO looks great!
I have another question regarding the file structure of the training data used to generate the label files.

The README says that the data needs to be:

MOT20
   |——————images
   |        └——————train
   |        └——————test
   └——————labels_with_ids
            └——————train(empty)

What exactly goes into each of those sub-folders?

I downloaded some data from the MOTChallenge just to test it out and the file structure is quite different.

MOT20
   |——————train
   |        └——————MOT20-01
   |                  └—————— det
   |                             └—————— det.txt
   |                  └—————— gt
   |                              └—————— gt.txt
   |                  └—————— img1
   |                               └——————000001.jpg
   |                               └—————— 000002.jpg ...
   |                  └—————— seqinfo.ini 
   |        └——————MOT20-02
   └——————test

@PareshKamble
Author

Dear @alberto139
While downloading MOT20, you might have found two different folders train and test.
The folder structure is all correct, you just need to specify the image path as shown in mot20.train file here.

My folder structure is:

MOT20
   |——————images
   |         └—train
   |              L_______MOT20-01
   |                        L___________det
   |                                     L___________det.txt
   |                        L___________gt
   |                                     L____________gt.txt
   |                        L___________ img1
   |                                     L___________000001.jpg
   |                                     L___________000002.jpg
   |                                     L___________000003.jpg
   |                                     L___________......
   |                        L___________seqinfo.ini
   |               L_______MOT20-02
   |               L_______MOT20-03
   |               L_______MOT20-05
   |          └——test
   |               L_______MOT20-04
   |                               L___________det
   |                                             L___________det.txt
   |                               L___________img1
   |                                             L___________000001.jpg
   |                                             L___________000002.jpg
   |                                             L___________000003.jpg
   |                                             L___________......
   |                               L___________seqinfo.ini
   |               L_______MOT20-06
   |               L_______MOT20-07
   |               L_______MOT20-08
   |
   └——————labels_with_ids
                    └——train(empty)

You may use this file to automatically fill MOT20->labels_with_ids with the object identities over frames. Afterwards, the labels_with_ids folder will be populated as shown below:

MOT20
   |——————images
   |            └——train
   |            └——test
   └——————labels_with_ids
   |             └———train
   |                    L______MOT20-01
   |                            L______img1
   |                                      L______000001.txt
   |                                      L______000002.txt
   |                                      L______000003.txt
   |                                      L______000004.txt
   |                                      L______....
   |                    L______MOT20-02
   |                    L______MOT20-03
   |                    L______MOT20-05

For your own dataset, you need to adapt gen_labels_20.py to your requirements or use CVAT to do that for you.
All the best! :)
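
When adapting a label script to another dataset, two pieces are usually needed per sequence: the image size and the output path of each label file. The sketch below reads the size from seqinfo.ini text and builds a label path that mirrors the populated tree above. It is a sketch under assumptions: the `[Sequence]` keys follow the usual MOTChallenge seqinfo.ini layout, the sample values are made up for illustration, and the helper names are hypothetical rather than taken from gen_labels_20.py.

```python
import configparser

# Sample values for illustration only, not copied from a real sequence.
SEQINFO_TEXT = """[Sequence]
name=MOT20-01
imDir=img1
imWidth=1920
imHeight=1080
imExt=.jpg
"""


def read_sequence_info(seqinfo_text):
    """Parse the fields needed for label generation from seqinfo.ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(seqinfo_text)
    seq = parser["Sequence"]
    return {
        "name": seq["name"],
        "im_dir": seq["imDir"],
        "width": int(seq["imWidth"]),
        "height": int(seq["imHeight"]),
    }


def label_path(dataset_root, split, info, frame):
    """Build <root>/labels_with_ids/<split>/<sequence>/<im_dir>/<frame>.txt."""
    return "/".join([
        dataset_root, "labels_with_ids", split,
        info["name"], info["im_dir"], "{:06d}.txt".format(frame),
    ])


info = read_sequence_info(SEQINFO_TEXT)
path = label_path("MOT20", "train", info, frame=1)
print(info["width"], info["height"], path)
```

The width and height are what the box coordinates would be divided by if the labels are stored in normalized form.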

@alberto139

Thanks again for the help @PareshKamble !!
Got the model training on MOT20 and I'm now preparing my own data using CVAT.
I had to lower the batch size substantially to fit my GPU and change the --K argument in opts.py from 128 to 500, but it's training :D

@alekjedrosz

Hello @ifzhang, thank you for your amazing work! What would be the best way to handle training on videos that contain many frames without any class detections? I saw that you skip label files for frames without detections in MOT15 (sequence KITTI-13, frames 0-12), so I tried to do the same, but that resulted in a file-not-found error. I also tried saving empty label .txt files for these frames, but that resulted in RuntimeError: invalid argument 2: non-empty vector or matrix expected at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:31. Any help would be greatly appreciated, thanks!

@alberto139

Hi @alekjedrosz
I ran into the same problem.
The issue is with the .train file. You want to make sure that it doesn't reference images that don't have a valid .txt file.
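
One way to apply this is to filter the image list before writing the .train file, keeping only images whose label file exists and is non-empty. The sketch below is a minimal illustration, not code from the repository. It assumes the label tree mirrors the image tree with `images` replaced by `labels_with_ids` and the extension replaced by `.txt`; `label_path_for` and `keep_labelled` are hypothetical names.

```python
import os
import tempfile


def label_path_for(image_path):
    """Map an image path to its label path (assumed mirror layout).

    Only the first 'images' in the path is replaced, so paths that contain
    that word earlier (for example in a parent folder name) need more care.
    """
    base, _ext = os.path.splitext(image_path.replace("images", "labels_with_ids", 1))
    return base + ".txt"


def keep_labelled(image_paths, root):
    """Keep images whose label file exists and contains at least one byte."""
    kept = []
    for rel in image_paths:
        label = os.path.join(root, label_path_for(rel))
        if os.path.isfile(label) and os.path.getsize(label) > 0:
            kept.append(rel)
    return kept


# Demo: frame 1 has a label, frame 2 has an empty label file, frame 3 has none.
root = tempfile.mkdtemp()
images = ["custom/images/train/seq1/img1/00000{}.jpg".format(i) for i in (1, 2, 3)]
label_dir = os.path.join(root, "custom", "labels_with_ids", "train", "seq1", "img1")
os.makedirs(label_dir)
with open(os.path.join(label_dir, "000001.txt"), "w") as f:
    f.write("0 7 0.125000 0.500000 0.050000 0.200000\n")
open(os.path.join(label_dir, "000002.txt"), "w").close()

kept = keep_labelled(images, root)
print(kept)
```

This drops both failure cases reported above: missing label files and empty ones.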

@alekjedrosz

Hi @alberto139, thank you for your help! I indeed did not realise this model performs tracking similarly to the two-step methods, hence temporal information does not need to be included. Did you, by any chance, manage to solve the input image resolution problem, when training on a custom dataset? I see (1088, 608) hardcoded in a lot of places, but I'm not sure if this is a resolution which all images are resized to, or if it should be specified on a per-dataset basis.

@alberto139

@alekjedrosz I think you are correct in that all images get resized to (1088, 608). It doesn't really matter what the image sizes of the training set are, since they'll all be resized once they go into the model. I've tried changing the size of the image the model takes as input but I haven't been successful yet.

Please let me know if you solve this.

@alekjedrosz

@alberto139 Thanks for your reply. You can change the input image size at train time in train.py (line 33) - this will be the size that your model expects as input and all images will be resized to this resolution. Then, at inference time, you want to change the opt.img_size value in opts.py, so that line 103 in track.py passes the correct size to LoadImages dataset.

@alberto139

@alekjedrosz Thank you for the suggestion.

I gave it a try but was unable to train with a different image size.
I get the following error:
File "/home/alberto/FairMOT/src/lib/models/networks/pose_dla_dcn.py", line 55, in forward
    out += residual
RuntimeError: The size of tensor a (68) must match the size of tensor b (67) at non-singleton dimension 3

I think this might have to do with the fact that I'm trying to fine-tune a model that expects a different image size.

I tried training with the following command:
python3 train.py mot --exp_id all_dla34 --gpus 0 --batch_size 2 --load_model '../models/ctdet_coco_dla_2x.pth'

Let me know if your training process is different.

@GuHuangAI

If you set the image size to (h, w) such that h/608 == w/1088, it will be fine.

@alberto139

alberto139 commented Aug 6, 2020

Thanks for the reply @GuHuangAI.
That seems to work for image sizes larger than (608, 1088) but not for smaller image sizes, even when keeping the same aspect ratio.

For example, I tried (304, 544) but got a similar error.

UPDATE
I found that the image dimensions need to be multiples of 32. I've tested both training and inference with different image sizes, and they all work.

Got the answer from Zhongdao/Towards-Realtime-MOT#30
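
The size constraint from the update can be checked before starting a run. The sketch below is a small illustration with hypothetical helper names; the value 32 is taken from the update above, not verified against the network code.

```python
def is_valid_input_size(height, width, base=32):
    """True if both dimensions are divisible by base."""
    return height % base == 0 and width % base == 0


def snap_up(value, base=32):
    """Round value up to the nearest multiple of base."""
    return -(-value // base) * base


for h, w in [(608, 1088), (304, 544)]:
    print((h, w), is_valid_input_size(h, w), (snap_up(h), snap_up(w)))
```

With this check, (608, 1088) passes, while (304, 544) fails because 304 is not divisible by 32; rounding up gives (320, 544).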

@GuHuangAI

@alberto139 Thanks for your reply; you are right.

@geyu1998

@PareshKamble
Hi, when I train on a custom dataset, I get this error: IsADirectoryError: [Errno 21] Is a directory: '/FairMOT-master/datas/'. Can you help me solve this problem? Thanks.
I put the datasets in the datas folder.

@PareshKamble
Author

@geyu1998 What is your directory structure?
Mine is, e.g., FairMOT/dataset/MOT20.
The structure inside the MOT20 folder is shown below:

MOT20
   |——————images
   |         └—train
   |              L_______MOT20-01
   |                        L___________det
   |                                     L___________det.txt
   |                        L___________gt
   |                                     L____________gt.txt
   |                        L___________ img1
   |                                     L___________000001.jpg
   |                                     L___________000002.jpg
   |                                     L___________000003.jpg
   |                                     L___________......
   |                        L___________seqinfo.ini
   |               L_______MOT20-02
   |               L_______MOT20-03
   |               L_______MOT20-05
   |          └——test
   |               L_______MOT20-04
   |                               L___________det
   |                                             L___________det.txt
   |                               L___________img1
   |                                             L___________000001.jpg
   |                                             L___________000002.jpg
   |                                             L___________000003.jpg
   |                                             L___________......
   |                               L___________seqinfo.ini
   |               L_______MOT20-06
   |               L_______MOT20-07
   |               L_______MOT20-08
   |
   └——————labels_with_ids
                    └——train(empty)

How about yours?

@geyu1998

@PareshKamble
My structure is the same as crowdhuman, like this:

crowdhuman
   |——————images
   |        └——————train
   |        └——————val
   └——————labels_with_ids
            └——————train(empty)
            └——————val(empty)

Because I don't have ID labels, I made the labels according to gen_labels_crowd.py.

@PareshKamble
Author

@geyu1998 everything looks normal to me.
No clue about the problem, sorry.

@geyu1998

@PareshKamble Thank you again for your response

@ayanasser

Hi,
I've installed CVAT, but I don't know how to make it assign unique IDs over successive frames.
Could you help me with that?

@Town-151

@geyu1998 IsADirectoryError: [Errno 21] Is a directory: '/FairMOT-master/datas/
Excuse me, has your question been resolved? If so, how was it resolved?
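
The thread does not settle the cause of this error. As a generic diagnostic only, the sketch below checks whether every list file named in a data.json-like config resolves to a regular file, since Python raises IsADirectoryError when a directory is opened as a file. The config layout (a "train" dictionary mapping names to list-file paths), the `find_bad_entries` helper, and the demo entries are assumptions for illustration, not a confirmed fix.

```python
import os
import tempfile


def find_bad_entries(cfg, base_dir):
    """Return (name, path, reason) for train entries that are not regular files."""
    problems = []
    for name, rel in cfg.get("train", {}).items():
        path = os.path.normpath(os.path.join(base_dir, rel))
        if os.path.isdir(path):
            problems.append((name, path, "is a directory"))
        elif not os.path.isfile(path):
            problems.append((name, path, "missing"))
    return problems


# Demo: one valid list file and one empty entry that resolves to the folder itself.
base_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(base_dir, "data"))
open(os.path.join(base_dir, "data", "custom.train"), "w").close()

cfg = {"train": {"ok": "data/custom.train", "bad": ""}}
problems = find_bad_entries(cfg, base_dir)
for name, path, reason in problems:
    print(name, path, reason)
```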
