Can you elaborate the difference between your work and cfnet #5

Closed
gongbudaizhe opened this issue Oct 8, 2017 · 1 comment

Comments

@gongbudaizhe

Hi,

It seems that your work is closely related to CFNet, yet your performance is much better: 0.624 vs. 0.568 on OTB100. Can you elaborate on what makes such a big difference, given that CFNet also uses VID as its training set and an exponential-decay learning rate schedule?

Thanks

@foolwood
Owner

foolwood commented Oct 9, 2017

First of all, I should acknowledge that CFNet is the first official publication (CVPR 2017) of an end-to-end learning framework for correlation filters (CF).

Compared to CFNet, which is pre-trained end-to-end on the same training dataset, DCFNet achieves a relative gain of 9.8% in AUC because it extracts features without resolution loss and carries out CF-based appearance modeling and tracking consistently in the frequency domain.

Without Resolution Loss

The feature extractor of our DCFNet never reduces resolution (stride = 1).
Put simply, this may look like a minor difference in network architecture design.
However, I think it is a very important factor for visual tracking: if there is no border effect, a DCF operation on dense features can be interpreted as an approximation of continuous convolution.
Besides, I have done a lot of experiments on network architecture and resolution. From these experiments, we observe that decreasing the spatial resolution of the features causes a large reduction in AUC (ranking by AUC: 33 < 63 < 125 < 169).
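
To make the stride argument concrete, here is a minimal sketch of how quickly spatial resolution shrinks once layers use a stride greater than 1. It uses the standard output-size formula for convolution and pooling layers. The input size and both layer stacks are hypothetical, picked only for illustration; they are not the actual DCFNet or CFNet architectures.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of one conv/pool layer (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_size(input_size, layers):
    """Apply conv_output_size for each (kernel, stride, padding) tuple in order."""
    size = input_size
    for kernel, stride, padding in layers:
        size = conv_output_size(size, kernel, stride, padding)
    return size


INPUT_SIZE = 125  # hypothetical input crop size, for illustration only

# Hypothetical stride-1 extractor: two 3x3 convs, no padding.
dense_layers = [(3, 1, 0), (3, 1, 0)]
# Hypothetical strided extractor: 11x11 conv with stride 2, then 3x3 pooling with stride 2.
strided_layers = [(11, 2, 0), (3, 2, 0)]

dense = feature_size(INPUT_SIZE, dense_layers)      # 125 -> 123 -> 121
strided = feature_size(INPUT_SIZE, strided_layers)  # 125 -> 58 -> 28
print(dense, strided)
```

With stride 1 the feature map stays close to the input size, while two stride-2 layers already cut it to less than a quarter per side, so the correlation filter has far fewer samples to work with.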

Consistently in the frequency domain

CFNet is an improved version of SiamFC.
The filter learned by CFNet is cropped to a small size (17x17) for spatial-domain (non-Fourier) correlation, which strongly harms performance.
So far, I have not seen the CFNet source code. I guess the main reason for the crop operation is to stay consistent with SiamFC.
(Just imagine) Even if the training image and the test image are the same image, the cropped filter may produce a bad response. For a normal CF (not SRDCF), there is no guarantee that the center part of the filter is more effective than the rest.
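
The thought experiment above can be tried on a toy 1-D example. The following is a rough sketch under my own assumptions, not the DCFNet or CFNet code: it trains a closed-form ridge-regression correlation filter in the frequency domain, then zeroes every spatial tap outside a small central window and evaluates both filters on the training signal itself. The signal, the sizes `N`, `SIGMA`, `LAM`, and all helper names are hypothetical, and a naive DFT stands in for an FFT library to keep the sketch dependency-free.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def train_filter(x, y, lam):
    """Closed-form ridge-regression filter, expressed in the frequency domain."""
    X, Y = dft(x), dft(y)
    return [Xk.conjugate() * Yk / (abs(Xk) ** 2 + lam) for Xk, Yk in zip(X, Y)]


def respond(W, z):
    """Response of the frequency-domain filter W on signal z."""
    Z = dft(z)
    return [v.real for v in dft([Wk * Zk for Wk, Zk in zip(W, Z)], inverse=True)]


def crop_filter(W, half_width):
    """Zero every spatial tap farther than half_width (circularly) from index 0."""
    w = dft(W, inverse=True)
    n = len(w)
    kept = [w[t] if min(t, n - t) <= half_width else 0 for t in range(n)]
    return dft(kept)


def sq_error(r, y):
    """Sum of squared differences between a response and the label."""
    return sum((a - b) ** 2 for a, b in zip(r, y))


N, SIGMA, LAM = 32, 2.0, 1e-2  # hypothetical toy settings
# Hypothetical 1-D "feature" signal; any non-trivial signal would do.
x = [math.sin(0.7 * t) + 0.5 * math.cos(1.9 * t + 0.3) for t in range(N)]
# Gaussian label peaked at index 0 (circular distance).
y = [math.exp(-min(t, N - t) ** 2 / (2 * SIGMA ** 2)) for t in range(N)]

W_full = train_filter(x, y, LAM)
W_crop = crop_filter(W_full, half_width=2)  # keep only 5 of 32 taps

r_full = respond(W_full, x)  # test signal == training signal
r_crop = respond(W_crop, x)

err_full = sq_error(r_full, y)
err_crop = sq_error(r_crop, y)
print(f"full filter error:    {err_full:.4f}")
print(f"cropped filter error: {err_crop:.4f}")
```

In this setup the full filter minimizes the regularized squared error, and cropping only removes taps, so the cropped filter's error against the label can never be lower than the full filter's, even on the very signal it was trained on. How large the gap is depends on the signal and the crop size.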

In general, CFNet is a very good paper with perfect proofs and experimental controls.
