concat operator optimization #7407

intoraw · 2017-08-10T01:30:42Z

It seems the current implementation of the concat operator is based on mshadow. When concat has multiple NDArray inputs on GPU, it launches a kernel many times. TensorFlow has a custom kernel for its concat operator that launches only once.
Is there any plan to optimize this?
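
To make the request concrete, here is a minimal sketch of the index mapping a single-launch concat could use. It is plain Python standing in for a GPU kernel, not the actual MXNet, mshadow, or TensorFlow code. The function name `concat_single_pass`, the flat row-major layout, and concatenation along the last axis of 2-D inputs are all assumptions made for illustration. Each loop iteration plays the role of one thread: it locates its source input with a binary search over precomputed column offsets, so all inputs are handled in one pass.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_single_pass(inputs, rows):
    """Concatenate 2-D row-major buffers along the column axis in one pass.

    inputs: list of flat lists, each holding rows * width_i elements.
    Returns (flat output buffer, total number of columns).
    Hypothetical helper for illustration only.
    """
    widths = [len(buf) // rows for buf in inputs]
    # col_offsets[i] is the first output column that belongs to input i.
    col_offsets = [0] + list(accumulate(widths))
    total_cols = col_offsets[-1]
    out = [None] * (rows * total_cols)

    # Each iteration stands in for one GPU thread of a single kernel launch.
    for tid in range(rows * total_cols):
        row, col = divmod(tid, total_cols)
        # Binary search for the input that owns this output column.
        src = bisect_right(col_offsets, col) - 1
        local_col = col - col_offsets[src]
        out[tid] = inputs[src][row * widths[src] + local_col]
    return out, total_cols


a = [1, 2, 3, 4]            # 2 x 2
b = [5, 6, 7, 8, 9, 10]     # 2 x 3
result, cols = concat_single_pass([a, b], rows=2)
print(cols, result)
```

On a real GPU the offsets table would need to be available to the kernel (for example as a small device array), which is the part the per-input approach avoids at the cost of extra launches.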

szha · 2017-11-09T12:26:26Z

This issue is closed due to lack of activity in the last 90 days. Feel free to ping me to reopen if this is still an active issue. Thanks!
Also, do please check out our forum (and Chinese version) for general "how-to" questions.

intoraw changed the title ~~concat operator implementation~~ concat operator optimization Aug 10, 2017

szha closed this as completed Nov 9, 2017
