Unable to reproduce the 71+% accuracy of EVE-Image model. #7

HearyShen · 2020-04-23T09:50:38Z

Since the paper's code has not been released yet, I tried implementing the EVE-Image architecture as described in your paper.

However, it is proving difficult to reproduce the 71.16% top-1 accuracy reported in your paper.

The standard implementation reaches no more than 68% top-1 accuracy (with residual connections, dropout, and LayerNorm optimizations);
replacing the SDP attention with a Transformer Encoder Layer gives higher accuracy, but still no more than 69%.
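
For readers following along, here is a minimal sketch of what "SDP attention with a residual connection and LayerNorm" could look like. It is not the paper's implementation and not the code used for the numbers above. The function names (`sdp_attention`, `attention_block`) are hypothetical, the LayerNorm has no learned scale or shift, and dropout is left out because it only matters during training.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(vec, eps=1e-5):
    """LayerNorm without learned scale/shift (illustration only)."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def sdp_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def attention_block(x):
    """Self-attention, then a residual connection, then LayerNorm."""
    attended = sdp_attention(x, x, x)
    return [layer_norm([a + b for a, b in zip(row, att)])
            for row, att in zip(x, attended)]
```

Calling `attention_block(x)` on a list of equal-length feature vectors returns a list of the same shape. A full Transformer Encoder Layer adds learned projections, multiple heads, and a feed-forward sublayer on top of this.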

Would you please release the paper's code?

farleylai · 2020-05-16T06:56:33Z

Hi,

We are currently short of hands to maintain a past intern's work and cannot guarantee a release date. Nonetheless, there are follow-up works listed on the leaderboard, based on a similar model and even on transformers, that achieve better results. Those should serve as the SOTA baselines for your research on our dataset.

Good luck.

furukawayuan-Yao · 2020-11-10T09:02:28Z

Could you please share how many epochs you trained the model for? The paper suggests 100 epochs as the maximum, but the model I reproduced converges in far fewer epochs. 😥

farleylai · 2020-11-16T01:48:31Z

@furukawayuan-Yao
100 epochs are just for reference.
We have seen converged results between 3x and 7x epochs (that is, somewhere from the 30s to the 70s).
However, this highly depends on the architecture and implementation.
While we cannot comment on what to expect from your model without the details, you may refer to recent SOTA baselines on the leaderboard.
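
Since the right number of epochs depends on the model, one common way to decide when a run has converged is patience-based early stopping on validation accuracy. The sketch below is only an illustration of that idea, not the procedure used for the paper. The function name `best_epoch` and the `patience` default are assumptions.

```python
def best_epoch(val_acc, patience=5):
    """Return (epoch_index, accuracy) of the best epoch seen.

    Scanning stops once `patience` consecutive epochs have failed to
    improve on the best validation accuracy so far.
    """
    best_idx, best = -1, float("-inf")
    for epoch, acc in enumerate(val_acc):
        if acc > best:
            best_idx, best = epoch, acc
        elif epoch - best_idx >= patience:
            break  # no improvement for `patience` epochs
    return best_idx, best
```

Here `val_acc` is a list of per-epoch validation accuracies. The returned index is zero-based, so add one to get the epoch count.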

HearyShen · 2021-01-19T15:38:01Z

FYI, by extracting the image feature map from the raw image and bilinearly interpolating it to a fixed size, I recently got 70.86% test accuracy in the same experimental setting described in your paper, which is close to the 71.16% reported there.
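
To make the interpolation step concrete, here is a rough sketch of bilinear resizing of a single-channel feature map to a fixed size. It is not the code behind the 70.86% result. The function name `bilinear_resize` is hypothetical, and the "align corners" sampling convention is an assumption (other conventions give slightly different values). A real pipeline would apply the same idea per channel with a tensor library.

```python
def bilinear_resize(fmap, out_h, out_w):
    """Resize a 2D feature map (list of rows) to out_h x out_w.

    Assumes the "align corners" convention: the four output corners
    map exactly onto the four input corners.
    """
    in_h, in_w = len(fmap), len(fmap[0])
    # Step between sampled input coordinates per output pixel.
    sy = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    sx = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for i in range(out_h):
        y = i * sy
        y0 = min(int(y), in_h - 1)   # row above the sample point
        y1 = min(y0 + 1, in_h - 1)   # row below, clamped at the edge
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * sx
            x0 = min(int(x), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate horizontally on both rows, then vertically.
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out
```

Resizing every feature map to the same `out_h` x `out_w` gives the attention module a fixed number of spatial positions regardless of the input image size.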

farleylai · 2021-01-21T00:49:02Z

@HearyShen Glad to hear that!
Some difference within a reasonable range is to be expected, given the feature extractor and object detector used.
One may feed the detection boxes to a fine-tuned feature extractor or simply train everything end-to-end.
Either is fine as long as you evaluate the related work in the same way for a fair comparison.
Good luck.
