
Some issues about the reproducing results #12

Closed
yux94 opened this issue Jul 23, 2018 · 21 comments

@yux94

yux94 commented Jul 23, 2018

When I tried to reproduce your results with the code, I got less-than-perfect results.
For example, below is the raw test_001 tiff
tif_raw_convert_img
And after the whole training with ResNet18-CRF, I got the following test prob map:
probmap_convert_img
while the ground-truth mask looks like this (since the Camelyon16 organizers no longer provide the test GT in tiff format, I converted the raw test tiff file and xml file to a tiff mask manually with the ASAP software):
npy_mask_convert_img

I followed your test steps, evaluated the average FROC score over the whole test set, and got this:
froc_npy

However, the result is not at all satisfying.

Is there any other trick in your preprocessing, postprocessing, or training process?

Here is the prob map of Test_026:

probmap_convert_img_026

@yil8
Collaborator

yil8 commented Jul 23, 2018

@yux94 Thanks for trying to reproduce my results. I actually feel what you got is already roughly the same as mine. First, I did not use any tricks for preprocessing/postprocessing; everything is within the codebase. There are some important details when you sample the coordinates for training patches, but since I've already provided my sampled coordinates, it doesn't matter anyway. As for testing reproducibility, have you tried using the ckpt I provided within the codebase to generate the probability map before training your own? If you use my ckpt, you should be able to get a probability map of Test_001.tif like this:
screen shot 2018-07-23 at 11 12 26 am
I would highly recommend using cmap='jet' to plot the probability map, as opposed to the black/white colormap in your case, which does not clearly differentiate values around 0.5:

In [1]: import numpy as np

In [2]: from matplotlib import pyplot as plt

In [3]: probs_map = np.load('./Test_001.npy')

In [4]: plt.imshow(probs_map.transpose(), vmin=0, vmax=1, cmap='jet')
Out[4]: <matplotlib.image.AxesImage at 0x7f35e994ca58>

In [5]: plt.show()

And I would subjectively argue this figure matches the ground truth annotation pretty well. If you could reproduce this probability map and the corresponding FROC score of ~0.8 (as already achieved by one user), then at least there should be no problems in the postprocessing steps.

@yux94
Author

yux94 commented Jul 24, 2018

Thank you so much for your generous help and suggestions!
After plotting the prob map with cmap='jet', I got the probability map of Test_001.tif with my ckpt like this:
probmap_convert_img_001_plot

Maybe I should train again and check my whole process, thank you so much!

@yil8
Collaborator

yil8 commented Jul 24, 2018

@yux94 This one does look worse than my result. In addition to trying my ckpt, it would be helpful to also plot your training/validation curves so that I can compare them with mine.
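
For reference, a minimal sketch of plotting such curves, assuming the per-epoch accuracies have already been collected into plain Python lists (acc_train and acc_valid are hypothetical names; adapt to however your training script stores its logs):

    # Minimal sketch: plot per-epoch training/validation accuracy for comparison.
    # acc_train / acc_valid are placeholders standing in for your logged values.
    import matplotlib.pyplot as plt

    acc_train = [0.80, 0.85, 0.88, 0.90, 0.91]  # placeholder values
    acc_valid = [0.78, 0.84, 0.87, 0.89, 0.90]  # placeholder values

    epochs = range(1, len(acc_train) + 1)
    plt.plot(epochs, acc_train, label='train accuracy')
    plt.plot(epochs, acc_valid, label='valid accuracy')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.legend()
    plt.show()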

@yux94
Author

yux94 commented Jul 24, 2018

@yil8 Many thanks!

@yux94
Author

yux94 commented Jul 27, 2018

When I resampled the training patches randomly by myself and trained the network again, I got the prob map of Test_084 like this:
probmap_convert_img_084_plot_resample_jet
And below is the prob map with my previous reproduced ckpt:
probmap_convert_img_084_plot_reproduce_jet
And this is your result:
probmap_convert_img_084_plot_rawbaidu_jet
That's very confusing, since you said that one user has already achieved good performance. I am working on retraining the network again. Besides, it would be very nice if you could provide the detailed process of your sampling with hard mining (#14).
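
For what it's worth, a minimal sketch of what random patch-coordinate sampling can look like, assuming a binary tumor/tissue mask stored as a 2D numpy array; this is only an illustration, not the sampling procedure used in this repository:

    # Minimal sketch: sample random patch center coordinates from a binary mask.
    # mask is a 2D numpy array (1 = region of interest) at some fixed downsample
    # level; scaling the coordinates back to level 0 is up to the caller.
    import numpy as np

    def sample_coords(mask, n_samples, seed=0):
        rng = np.random.RandomState(seed)
        xs, ys = np.where(mask > 0)              # all candidate positions
        idx = rng.choice(len(xs), size=n_samples, replace=True)
        return list(zip(xs[idx], ys[idx]))       # (row, col) pairs in mask space

    # Example (hypothetical file name):
    # tumor_mask = np.load('Tumor_001_mask.npy')
    # coords = sample_coords(tumor_mask, 1000)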

@yil8
Collaborator

yil8 commented Jul 30, 2018

@yux94 When I said other users achieved good performance, I meant they used my provided ckpt and achieved a 0.8+ FROC score. Your last heatmap plot based on my ckpt also looks good, and I guess if you calculate the FROC score, it will probably be around 0.8 as well. For the training part, due to the non-determinism of GPU convolution, it's almost impossible to achieve numerically identical results when retraining. But I would still suggest you plot your training curve, so that I can get some rough idea. I'm currently traveling on a business trip, and will try to find some time to implement the hard-negative sampling part once I'm back in the US.
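
For readers looking for the general idea only: hard-negative mining usually means re-sampling normal-tissue coordinates where the current model's probability map is wrongly high. A rough sketch under that assumption (not the author's actual implementation):

    # Rough sketch of hard-negative mining: pick normal-tissue coordinates where
    # the current model predicts a high tumor probability (i.e. false positives).
    # Assumes probs_map and normal_mask are aligned 2D numpy arrays at the same level.
    import numpy as np

    def hard_negative_coords(probs_map, normal_mask, threshold=0.5, n_samples=1000, seed=0):
        rng = np.random.RandomState(seed)
        candidates = (probs_map > threshold) & (normal_mask > 0)   # false positives
        xs, ys = np.where(candidates)
        if len(xs) == 0:
            return []
        idx = rng.choice(len(xs), size=min(n_samples, len(xs)), replace=False)
        return list(zip(xs[idx], ys[idx]))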

@yux94
Author

yux94 commented Jul 30, 2018

Thank you so much for your patient and timely reply.
This is my training curve with 20 epochs. Should I train with more epochs until the curve is stable?
getimage

@yux94
Author

yux94 commented Jul 30, 2018

getimage 1
And this is my resampling training curve with 20 epochs.

@yil8
Collaborator

yil8 commented Jul 31, 2018

@yux94 Your first curve looks very similar to mine, which converges to ~0.92 valid accuracy. I guess your second curve does not include hard negative examples, and thus converges to higher accuracy. For the curve with hard negative examples, did you train your model using exactly the same config/command I provided in the README?

@yux94
Author

yux94 commented Aug 3, 2018

Yes... pretty sure. I will check again, many thanks!

@yil8
Collaborator

yil8 commented Aug 3, 2018

@yux94 sorry I couldn't help more on the training side. BTW, what's your FROC score for each case?

@yux94
Author

yux94 commented Aug 28, 2018

Sorry for bothering you again. We have tried using the ckpt you provided within the codebase to generate the probability maps, and the final FROC score is not satisfying either.

FPs per slide   0.25    0.5     1       2       4       8       Avg
NCRF Model      0.5265  0.6106  0.6681  0.7257  0.7743  0.8053  0.6851
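
(As a sanity check, the Avg column of the Camelyon16 FROC score is simply the mean sensitivity over the six false-positive rates:)

    # Average FROC score = mean sensitivity at the 6 predefined FP rates.
    sens = [0.5265, 0.6106, 0.6681, 0.7257, 0.7743, 0.8053]
    print(sum(sens) / len(sens))   # ~0.6851, matching the Avg column above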

So we checked the probability maps. First, we generated coordinates of the detected tumor regions with nms.py. Next, we picked out the normal cases (48 out of 129), whose detections are all false positives, and drew the histogram below.
getimage 3

According to this histogram, a good FROC score might be achieved only if the threshold is set to ~0.9.
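
A minimal sketch of how such a histogram can be drawn, assuming nms.py has written one CSV per test case with the detection probability in the first column (the directory name here is hypothetical, and the column index/delimiter should be adjusted to the actual output format):

    # Sketch: histogram of detection probabilities on normal (tumor-free) slides,
    # where every detection is by definition a false positive.
    import csv
    import glob
    import matplotlib.pyplot as plt

    fp_probs = []
    for path in glob.glob('./coords_normal/Test_*.csv'):   # hypothetical directory
        with open(path) as f:
            for row in csv.reader(f):
                fp_probs.append(float(row[0]))

    plt.hist(fp_probs, bins=50, range=(0, 1))
    plt.xlabel('detection probability (false positives on normal slides)')
    plt.ylabel('count')
    plt.show()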

@yux94
Author

yux94 commented Aug 28, 2018

Sorry, Test_049 and Test_114 were not excluded, which is why I got the bad result.

@yux94 yux94 closed this as completed Aug 28, 2018
@yil8
Collaborator

yil8 commented Aug 28, 2018

@yux94 Did you obtain a 0.80+ FROC score after excluding Test_049 and Test_114?

@Hukongtao

@yux94 "I transferred the raw tiff test file and xml file to the tiff mask with the ASAP software manually". How did you do that?I need the test GT,too. I just used the ASAP look the tif.But I don't know how to produce the GT.

@yux94
Author

yux94 commented Sep 8, 2018

@yil8 Yes, I got a 0.80+ FROC score with your provided ckpt after excluding Test_049 and Test_114, but the result from my own retrained model is still not satisfying.

@yux94
Author

yux94 commented Sep 8, 2018

@Hukongtao First, open the .tif file with ASAP. Next, load the .xml file and save it as a .araw file. Then open the .tif and .araw files together and save the result (if I remember correctly). Here is another solution of mine using cv2.fillPoly: https://github.com/yux94/Pathology/blob/master/bin/xml2mask_2.py
Maybe the second method is more convenient.
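
For readers who cannot use ASAP, a rough sketch of the cv2.fillPoly approach, assuming an ASAP-style XML layout (Annotation elements containing Coordinate elements with X/Y attributes at level 0); this is an illustration only, not the linked script, and exclusion/hole annotations are not handled:

    # Rough sketch: rasterize Camelyon16 XML annotations into a binary mask.
    # mask_shape is (height, width) of the target mask; downsample converts
    # level-0 coordinates to the mask's resolution.
    import xml.etree.ElementTree as ET
    import numpy as np
    import cv2

    def xml_to_mask(xml_path, mask_shape, downsample=1.0):
        root = ET.parse(xml_path).getroot()
        mask = np.zeros(mask_shape, dtype=np.uint8)
        for annotation in root.iter('Annotation'):
            polygon = []
            for coord in annotation.iter('Coordinate'):
                x = float(coord.get('X')) / downsample
                y = float(coord.get('Y')) / downsample
                polygon.append([int(x), int(y)])
            if polygon:
                pts = np.array(polygon, dtype=np.int32).reshape(-1, 1, 2)
                cv2.fillPoly(mask, [pts], 1)
        return mask

    # Example (hypothetical shapes/values):
    # mask = xml_to_mask('Test_001.xml', mask_shape=(height, width), downsample=32)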

@Hukongtao

@yux94 OK, the code works. When I convert the generated result into a black-and-white image, the image is very large but the foreground is only a very small patch, which is different from what you showed above.

@Hukongtao

@yux94 Would you mind leaving your QQ or WeChat? There are still some questions I'd like to ask you. Or you can add my QQ: 1821141394.

@yux94
Author

yux94 commented Sep 8, 2018

Excuse me, but I have one more question. Did you first train the ResNet18 and then fine-tune the model with the CRF?

@yil8
Collaborator

yil8 commented Sep 8, 2018

@yux94 Not quite sure what exactly you mean by your "reproduced result is not satisfying either". An FROC of 0.8+ is pretty good as far as I know. Do you have some specific examples? I trained ResNet18 together with the CRF from scratch, without fine-tuning.
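
To illustrate what "trained together from scratch" means in practice, a highly simplified PyTorch sketch of joint optimization over a backbone plus a CRF-style refinement layer; this is not the actual NCRF code, and the module/class names here are made up:

    # Simplified sketch of end-to-end training: one optimizer covers both the
    # ResNet18 backbone and the CRF-style layer, so neither part is pre-trained
    # or fine-tuned separately.
    import torch
    import torch.nn as nn
    import torchvision

    class PatchGridClassifier(nn.Module):
        def __init__(self, grid_size=3):
            super().__init__()
            resnet = torchvision.models.resnet18(pretrained=False)
            resnet.fc = nn.Linear(resnet.fc.in_features, 1)
            self.backbone = resnet                     # per-patch logit
            # A single learnable pairwise weight standing in for a real CRF layer.
            self.pairwise = nn.Parameter(torch.zeros(1))
            self.grid_size = grid_size

        def forward(self, patches):                    # patches: (B, G, 3, H, W)
            b, g = patches.shape[:2]
            logits = self.backbone(patches.flatten(0, 1)).view(b, g)
            # Crude "CRF-like" smoothing: mix each patch logit with the grid mean.
            return logits + self.pairwise * logits.mean(dim=1, keepdim=True)

    model = PatchGridClassifier()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.BCEWithLogitsLoss()
    # loss = criterion(model(batch_patches), batch_labels); loss.backward(); optimizer.step()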
