This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
It seems the current implementation of the concat operator is based on mshadow. When concat has multiple NDArray inputs on GPU, it launches a kernel many times. TensorFlow has a custom kernel for its concat operator that launches only once.
Is there any plan to optimize this?
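To make the request concrete, here is a minimal CPU-side sketch in plain Python of the two strategies, assuming 2-D row-major inputs concatenated along the column axis. It is not the mshadow or TensorFlow code; the function names `concat_per_input` and `concat_single_pass`, and the `launches` counter that stands in for GPU kernel launches, are hypothetical. The single-pass version precomputes column offsets so that each output element (one per simulated thread) can look up its source input with a binary search.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_per_input(inputs, rows):
    """Baseline: one copy pass per input (stands in for one kernel launch each).

    Each input is a flat row-major list with `rows` rows (rows > 0 assumed).
    """
    widths = [len(x) // rows for x in inputs]
    total = sum(widths)
    out = [None] * (rows * total)
    launches = 0
    col_start = 0
    for x, w in zip(inputs, widths):
        launches += 1  # a separate pass over this input only
        for r in range(rows):
            for c in range(w):
                out[r * total + col_start + c] = x[r * w + c]
        col_start += w
    return out, launches


def concat_single_pass(inputs, rows):
    """Single pass: every output element locates its own source input."""
    widths = [len(x) // rows for x in inputs]
    # offsets[i] is the first output column that belongs to input i.
    offsets = [0] + list(accumulate(widths))
    total = offsets[-1]
    out = [None] * (rows * total)
    launches = 1  # one pass covers all inputs
    for idx in range(rows * total):  # each idx plays the role of one GPU thread
        r, c = divmod(idx, total)
        i = bisect_right(offsets, c) - 1  # last input whose offset is <= c
        out[idx] = inputs[i][r * widths[i] + (c - offsets[i])]
    return out, launches


a = [1, 2, 3, 4]               # 2 x 2
b = [5, 6]                     # 2 x 1
c = [7, 8, 9, 10, 11, 12]      # 2 x 3
result, launches = concat_single_pass([a, b, c], rows=2)
```

Both functions produce the same output; the difference is that the baseline makes one pass per input, while the single-pass version pays a small per-element search in exchange for a single pass, which is where a real GPU kernel would save on launch overhead.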
intoraw changed the title from "concat operator implementation" to "concat operator optimization" on Aug 10, 2017
This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks!
Also, do please check out our forum (and Chinese version) for general "how-to" questions.