Network is not learning. mAP stays the same and Total Loss does not go below 6 #5
Comments
Hi @gustavz
Other Thoughts
Let me know how you resolve this. Thanks.
Hey Victor,
Thanks for your quick reply!
I will compare the csv files on Monday when I’m back on my machine.
The training I was talking about was done only on the egohands dataset.
So no combining with oxford in this part.
The strange thing is that I got similarly bad results training solely on the Oxford dataset as I did this time on egohands.
But training on all my other datasets runs as expected. That's why I assumed something is wrong with the csv files, since my TF 1.4 setup/config as well as my csv-to-tfrecord script appear to work fine on other sets...
… On 19 Jan 2018, at 18:51, Victor D ***@***.***> wrote:
Hi @gustavz
Zero mAP
This is quite unusual, and I agree with you that an mAP of 0 suggests there is something wrong with the labelling.
Batch Size
This was selected only because of memory issues.
Original CSV files
I am attaching my initial train/test label csv files, which you can compare to what is generated by the egohands script in the repo. I did some code cleanup after I built the original model, so there is a slight possibility something changed.
csvs.zip
Something you could do is verify that training proceeds reasonably using only these tf records before combining them with the Oxford dataset.
I compared your csv files with mine and they are totally different. Here is an example of unequal values (see the rest attached):
MINE:
But the thing is, I did not create the csv files myself, I used your script to create them from the default dataset. How did you modify your code after the commit to GitHub? Could you update it? I would be very thankful!
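A quick way to spot where two generated label CSVs diverge is to sort their rows and diff them. A minimal sketch (the file names and the two-way diff are illustrative, not part of the repo):

```python
# Hypothetical sketch: compare two label CSVs row by row after sorting,
# so mere ordering differences don't look like real mismatches.
import csv

def load_rows(path):
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)          # skip the header row
        return header, sorted(reader)  # sort rows so order doesn't matter

def diff_csvs(path_a, path_b, max_report=5):
    _, rows_a = load_rows(path_a)
    _, rows_b = load_rows(path_b)
    only_a = [r for r in rows_a if r not in rows_b]
    only_b = [r for r in rows_b if r not in rows_a]
    for r in only_a[:max_report]:
        print("only in", path_a, ":", r)
    for r in only_b[:max_report]:
        print("only in", path_b, ":", r)
    return len(only_a), len(only_b)
```

If both counts come back as 0, the two scripts produced the same labels and the problem is elsewhere.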
Hmmm ... I actually did a similar check on Friday before I sent you my original csv's. I generated train and test csv's using the script and compared them, and there were similar values for the same images. Can you retry the process? I suggest you pull a fresh copy of the repo and start from scratch (i.e. in a new folder, where the dataset gets downloaded from scratch). -V.
I did a complete delete and re-clone of your repo. What could be the reason? I am using Python 2.7 and OpenCV 3.3.1, in case that is important.
Hey Victor, EDIT: maybe it is something with the random 10% test dataset creation? EDIT2: OK, the test and train datasets created by your script can't be equal each time you run the code, since you don't use
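For reference, a random train/test split only reproduces across runs if the file list is sorted and the RNG is seeded. A minimal sketch of what that could look like (the function name, seed, and 90/10 ratio are illustrative, not the repo's actual code):

```python
# Hypothetical sketch of a reproducible 90/10 train/test split.
# Without a fixed seed (and a deterministic starting order), every run
# of the dataset script shuffles differently, so two runs can never
# generate identical train/test CSVs.
import random

def split_train_test(filenames, test_fraction=0.1, seed=42):
    files = sorted(filenames)      # deterministic starting order
    rng = random.Random(seed)      # fixed seed -> same shuffle every run
    rng.shuffle(files)
    n_test = int(len(files) * test_fraction)
    return files[n_test:], files[:n_test]  # (train, test)
```

Running this twice on the same file list yields byte-identical splits, which is exactly what comparing two generated CSV sets requires.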
Are you sure you convert the bounding boxes correctly? Also I saw that you have some unused variables in your
I don't know if this was on purpose or whether some value conversion was forgotten. Furthermore, in your readme you write that you split the data into three sets, but actually you only split into two, which is a bit confusing. CLOSE REASON: sorting the image array solved all problems
Hey Victor, I had the same idea to build a realtime-capable hand detector.
First I also tried it with the Oxford dataset, but like you said, it is not really good.
Then i saw your repo and tried the egohands dataset which looks really promising.
But with both datasets my mAP always stays at 0 and the total loss never goes below ~6.
If you look at the weights, you see that they stay around zero, so no learning at all.
Do you have an idea why this could be?
Here is a screenshot of my current training with the exact same config you are using.
(By the way, why do you use a batch size of only 6? Is there any benefit apart from saving memory?)
I would really appreciate it if you could help me with this.
And yeah, your egohands_dataset_clean.py code is really nice 👍
EDIT: Are you sure the csv files you create contain the correct bounding boxes? Could you share the csv files you used to create the tf record files? Thanks!
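For anyone who wants to sanity-check the generated labels before building tf records: degenerate or out-of-range boxes are a common cause of zero mAP. A minimal sketch (assuming the usual filename,width,height,class,xmin,ymin,xmax,ymax column layout; not the repo's actual code):

```python
# Hypothetical sanity check for a label CSV before TFRecord conversion:
# every box should satisfy 0 <= xmin < xmax <= width and
# 0 <= ymin < ymax <= height.
import csv

def check_boxes(csv_path):
    bad = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            w, h = int(row["width"]), int(row["height"])
            xmin, ymin = int(row["xmin"]), int(row["ymin"])
            xmax, ymax = int(row["xmax"]), int(row["ymax"])
            if not (0 <= xmin < xmax <= w and 0 <= ymin < ymax <= h):
                bad.append(row["filename"])
    return bad  # filenames with degenerate or out-of-range boxes
```

An empty return list doesn't prove the boxes sit on the right objects (drawing a few onto the images catches that), but any hits here guarantee broken labels.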