This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

concat operator optimization #7407

Closed
intoraw opened this issue Aug 10, 2017 · 1 comment

Comments


intoraw commented Aug 10, 2017

The current implementation of the concat operator appears to be based on mshadow. When concat takes multiple NDArray inputs on the GPU, it launches a kernel many times. TensorFlow has a custom kernel for its concat operator that needs only a single kernel launch.
Is there any plan to optimize this?
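
To make the request concrete, here is a rough sketch of the single-launch idea in plain Python. It is not MXNet or TensorFlow code, and it does not claim to match how either library implements concat. The names `concat_single_pass` and `offsets` are hypothetical, and the loop over `tid` only stands in for the threads of one GPU kernel: each "thread" owns one output element and looks up which input it comes from in a prefix-sum offset table.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_single_pass(inputs):
    """Concatenate 1-D sequences with one flat loop over output indices.

    Hypothetical illustration: the loop body is what a single GPU thread
    would do inside one kernel launch covering the whole output.
    """
    # offsets[i] is the output position where inputs[i] begins;
    # offsets[-1] is the total output length.
    offsets = [0] + list(accumulate(len(x) for x in inputs))
    total = offsets[-1]
    out = [None] * total

    for tid in range(total):  # one iteration per simulated thread
        # Binary search for the input segment that owns this output index.
        src = bisect_right(offsets, tid) - 1
        out[tid] = inputs[src][tid - offsets[src]]
    return out


if __name__ == "__main__":
    print(concat_single_pass([[1, 2], [], [3, 4, 5]]))
```

With an approach along these lines, the number of launches would not depend on the number of inputs. The trade-off is that the offset table has to be made available to the kernel and each thread pays for a small search.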

@intoraw intoraw changed the title concat operator implementation concat operator optimization Aug 10, 2017
Member

szha commented Nov 9, 2017

This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen it if this is still an active issue. Thanks!
Also, please check out our forum (and its Chinese version) for general "how-to" questions.

@szha szha closed this as completed Nov 9, 2017