This repository has been archived by the owner on Jul 5, 2021. It is now read-only.

Reproducibility of CamVid results #11

Closed
jeffreylutz opened this issue Feb 20, 2018 · 12 comments

Comments

@jeffreylutz

George,

I'm having trouble reproducing the results for training on CamVid. I tried the following with no luck. I also ran prediction after training and confirmed that the predictions are incorrect.

TRAINING RESULTS:
Validation precision = 0.49989
Validation recall = 0.512134
Validation F1 = 0.50587
Validation IoU = 0.01776

TRAIN:
python main.py --mode train --dataset CamVid --model PSPNet-Res50 --batch_size 100000 --num_epoch 300

PREDICT:
python main.py --mode predict --dataset CamVid --model PSPNet-Res50 --image trash/in.png

@GeorgeSeif
Owner

Hi Jeffrey,

Why is your batch size so huge?

@jeffreylutz
Author

jeffreylutz commented Feb 20, 2018 via email

@GeorgeSeif
Owner

GeorgeSeif commented Feb 21, 2018

Yes those were the exact settings.

Hmmm did you train for the full 300 epochs? And how does the accuracy look? If accuracy is good then it could be the precision, recall, and IoU calculations that are just wrong. How do the images look?

One big thing to note is that the accuracy I have in the README Results section was obtained with an older research version of CamVid that had 12 classes. I haven't fully retrained on this one yet; I will once I get a chance. Something could have gone wrong in the transition to the new dataset, though it seemed to be training fine when I ran it for a few epochs.

@jeffreylutz
Author

jeffreylutz commented Feb 21, 2018 via email

@GeorgeSeif
Owner

Great observation. The original CamVid had dimensions of 360x480 (which I cropped to 352x480 because of downsampling in the networks) and only had 12 classes. So you're correct.

It should be in the git history because I had previously pushed it up here, but here's a link anyway:
https://github.com/alexgkendall/SegNet-Tutorial/tree/master/CamVid

I'll look into this and train it once I have a chance, plus upload a pretrained model. I'm just using my GPU for other things now.

What was the validation accuracy like? And can you upload some of the images here so I can take a look? How many epochs did you train for?

Thanks

@jeffreylutz
Author

jeffreylutz commented Feb 21, 2018 via email

@Spritea
Contributor

Spritea commented Mar 13, 2018

Hi George,
I'm following this issue and trying to reproduce the results for training CamVid. I'm using the version at commit 27b704e, just like Jeff, and I run the same command:
python main.py --mode train --dataset CamVid --model FC-DenseNet103 --batch_size 1 --num_epochs 300
However, every time training reaches epoch 51, the program crashes with the error below:
[screenshot from 2018-03-13 09-24-54]
Any idea will be welcome, thanks!

@GeorgeSeif
Owner

Hi @Spritea

Hmmm, interesting. I've never seen that error before. How many times has it happened?

@Spritea
Contributor

Spritea commented Mar 14, 2018

George,
Twice so far. I'll keep experimenting to see what the problem is; I suspect it may be related to my environment.
Also, I switched to the HEAD version on my own dataset and it works correctly. Your code is really clear!
Thanks anyway~

@GeorgeSeif
Owner

Could be. I'm actually running a few tests right now to add some new features. I'll see if something like that comes up, though it never did before!

@GeorgeSeif
Owner

Hi there,

This issue has been resolved in the latest commit. There were two changes that really fixed things:

-- First of all, we should not do mean image subtraction before the pretrained ResNet. I removed that line for all networks that did it, and this substantially improved the final prediction results.

-- I fixed up the computations of precision, recall, and F1 score to use Scikit Learn's implementation.
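For context, mean image subtraction typically looks like the sketch below; the fix amounts to feeding the pretrained ResNet the raw image instead. The mean values and function names here are illustrative, not the repo's actual code:

```python
import numpy as np

# Per-channel RGB means commonly used for ImageNet-pretrained models (illustrative values)
IMAGENET_MEAN = np.array([123.68, 116.78, 103.94], dtype=np.float32)

def preprocess_old(image):
    # Old behaviour: subtract the ImageNet channel means
    return image.astype(np.float32) - IMAGENET_MEAN

def preprocess_new(image):
    # New behaviour: pass the image through unchanged (as float)
    return image.astype(np.float32)

img = np.full((352, 480, 3), 128, dtype=np.uint8)  # a flat grey CamVid-sized image
print(preprocess_old(img)[0, 0])  # mean-subtracted pixel
print(preprocess_new(img)[0, 0])  # unchanged pixel
```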

Now one major thing to note: In Scikit Learn, one can select the different ways of computing precision, recall, and F1 score. They are:

micro --> Calculate metrics globally by counting the total true positives, false negatives and false positives.

macro --> Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

weighted --> Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

You can select which averaging method you want as a function argument now.
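As a quick illustration of how those averaging modes differ (on a toy label vector, not CamVid data), scikit-learn exposes them through the `average` argument of `precision_recall_fscore_support`:

```python
from sklearn.metrics import precision_recall_fscore_support

# Toy ground truth and predictions for a 3-class problem with imbalanced classes
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]

for avg in ("micro", "macro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg)
    print(f"{avg:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With micro averaging, precision, recall, and F1 all collapse to the same global value; macro and weighted diverge as soon as the classes are imbalanced, which is exactly the situation with CamVid's rare classes.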

Now for the mean IoU there is something related to the above: Most papers actually use the weighted mean IoU, along with the weighted precision, recall, and F1 score. For example, check out the mean IoU on MIT's Scene Parsing Benchmark Repo.

https://github.com/CSAILVision/sceneparsing

The top unweighted mean IoU is 0.3490, whereas the weighted mean IoU is 0.6108, which is very similar to the numbers reported in papers.

This repository currently computes the unweighted mean IoU. I have just tested all of the networks and found that they produce good results, similar to their papers. Perhaps I will add options for how the mean IoU is computed, like Scikit Learn does for the scores above.
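To make the unweighted-vs-weighted distinction concrete, here is a small sketch (with a made-up confusion matrix, not CamVid numbers) that computes per-class IoU and both means:

```python
import numpy as np

def mean_iou(conf, weighted=False):
    """Mean IoU from a confusion matrix (rows = ground truth, cols = predictions)."""
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp          # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp          # truly the class, but predicted as something else
    iou = tp / (tp + fp + fn)
    if weighted:
        support = conf.sum(axis=1)      # number of true pixels per class
        return np.average(iou, weights=support)
    return iou.mean()

# Made-up 3-class confusion matrix: class 0 dominates, class 2 is rare and poorly segmented
conf = np.array([[8, 1, 1],
                 [0, 2, 0],
                 [1, 0, 1]])
print("unweighted:", mean_iou(conf))                  # ~0.576
print("weighted:  ", mean_iou(conf, weighted=True))   # ~0.662
```

Because rare, hard classes drag down the unweighted mean, weighting by support gives a noticeably higher number, matching the gap between the two benchmark figures above.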

Closing this issue as it has been resolved.

Cheers!

@nooriahmed

How would one increase the average mean IoU? Any productive recipe, please?
Running test image 168 / 168
Average test accuracy = 0.7722670918419248
Average per class test accuracies =

Animal = 0.982143
Archway = 0.910714
Bicyclist = 0.795735
Bridge = 0.982143
Building = 0.877125
Car = 0.779497
CartLuggagePram = 0.702381
Child = 0.937878
Column_Pole = 0.340315
Fence = 0.819580
LaneMkgsDriv = 0.414625
LaneMkgsNonDriv = 0.970238
Misc_Text = 0.516968
MotorcycleScooter = 0.976190
OtherMoving = 0.829898
ParkingBlock = 0.885432
Pedestrian = 0.590493
Road = 0.912155
RoadShoulder = 0.949243
Sidewalk = 0.792470
SignSymbol = 0.601385
Sky = 0.931321
SUVPickupTruck = 0.751057
TrafficCone = 0.994048
TrafficLight = 0.692748
Train = 1.000000
Tree = 0.809915
Truck_Bus = 0.884559
Tunnel = 1.000000
VegetationMisc = 0.823739
Void = 0.332217
Wall = 0.691263
Average precision = 0.8014890549886131
Average recall = 0.7722670918419248
Average F1 score = 0.7564816960037837
Average mean IoU score = 0.39641312733923195
Average run time = 0.08028133000646319

