Training speed slower than Non-local attention #93

noobliang · 2021-01-11T02:17:11Z

I found that adding a Non-local attention module costs only a little extra time, about 0.4s per iteration. But if I add a CC-attention module with R=1, the training time is about 0.7s per iteration, and 1.0s with R=2. This does not match the description in your paper, and I don't know why. Can anyone explain it?

speedinghzl · 2021-01-11T05:09:42Z

@noobliang Thanks for your attention. The slower speed compared with non-local attention comes from the inefficient implementation. In terms of computation cost and memory usage, CCNet still has the advantages mentioned in the paper. We look forward to a more efficient implementation.
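
To illustrate the computation and memory point, here is a rough sketch that counts attention-weight entries only. It assumes the usual formulation: non-local attention relates every position to every other position, (H×W)×(H×W) weights, while criss-cross attention relates each position to its row and column, (H×W)×(H+W−1) weights per pass, repeated R times. The function names and the 97×97 feature-map size are illustrative assumptions, not code from this repository, and the counts say nothing about wall-clock time, which depends on how the kernels are implemented.

```python
def nonlocal_attention_entries(h: int, w: int) -> int:
    """Attention weights for full non-local attention on an h x w map."""
    n = h * w
    # Every position attends to every position.
    return n * n


def criss_cross_attention_entries(h: int, w: int, recurrence: int = 1) -> int:
    """Attention weights for criss-cross attention, summed over R passes."""
    # Each position attends to its own row and column: h + w - 1 positions.
    return recurrence * h * w * (h + w - 1)


if __name__ == "__main__":
    h, w = 97, 97  # assumed feature-map size, for illustration only
    full = nonlocal_attention_entries(h, w)
    for r in (1, 2):
        cc = criss_cross_attention_entries(h, w, recurrence=r)
        print(f"R={r}: criss-cross {cc} vs non-local {full} "
              f"(ratio {full / cc:.1f}x)")
```

A smaller count here does not guarantee a faster iteration: gathering rows and columns is less GPU-friendly than one large dense matrix multiply, which is consistent with the timings reported above.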

noobliang · 2021-01-11T06:20:53Z

Well, thank you for your reply. I found it almost useless for my segmentation network; it may be because the images contain too few positive samples.

noobliang closed this as completed Jan 11, 2021
