Reduce the number of concatenations for a 10% inference time reduction #1093
Instead of concatenating token representations at the token level (row approach) and then building the sentence tensor, we make the process lazy: we retrieve each token representation before concatenation, reorganize the representations into nested lists, and perform the concatenation per column (all word embeddings are concatenated in one operation, then all LM embeddings in another), after which the columns (each column being one kind of representation) are concatenated together.
The idea is that since there are many more tokens per sentence than kinds of representation per token, far fewer concatenation operations are performed overall; a sketch of both approaches follows.
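To illustrate (this is a minimal sketch, not the actual PR diff; tensor names and dimensions are made up), here are the row approach and the column approach side by side in PyTorch, producing identical results:

```python
import torch

n_tokens = 128           # tokens in a sentence (hypothetical)
word_dim, lm_dim = 100, 2048  # hypothetical embedding sizes

# One embedding vector per token, per representation type
word_embs = [torch.randn(word_dim) for _ in range(n_tokens)]
lm_embs = [torch.randn(lm_dim) for _ in range(n_tokens)]

# Row approach: one torch.cat per token -> n_tokens concat operations
rows = [torch.cat([w, l]) for w, l in zip(word_embs, lm_embs)]
row_tensor = torch.stack(rows)

# Column approach: one stack per representation type, then a single
# concat of the columns -> a handful of ops regardless of n_tokens
word_col = torch.stack(word_embs)   # (n_tokens, word_dim)
lm_col = torch.stack(lm_embs)       # (n_tokens, lm_dim)
col_tensor = torch.cat([word_col, lm_col], dim=1)

assert torch.equal(row_tensor, col_tensor)
```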
On CoNLL-2003, inference time goes from 40s to 36s.
GPU utilization stays above 70% the whole time (when it reaches 100% there may be some improvements remaining, but the main bottleneck will be the model itself).
Let me know if your measurements match :-)
FWIW, on a French dataset it's a 20% improvement; I have stopped trying to guess why.
N.B.: I downloaded CoNLL-2003 from https://github.com/synalp/NER/tree/master/corpus/CoNLL-2003