
[Converter] TFLite size is larger than expected #320

Open
Juelianqvq opened this issue May 29, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@Juelianqvq
Contributor

Juelianqvq commented May 29, 2024

debug.zip

@Juelianqvq changed the title from [Converter] TFLite size is much bigger than ONNX to [Converter] TFLite size is larger than expected on May 29, 2024
@peterjc123 added the enhancement (New feature or request) label and removed the bug (Something isn't working) label on May 29, 2024
@peterjc123
Collaborator

peterjc123 commented May 30, 2024

I inspected the model you posted; its weight buffers total only 36 KB. So the only way to lower the model size is to rewrite the GRU operation using tfl.UnidirectionalGRU or subgraph-related ops like tfl.While and tfl.Call. Implementing separated_rnn_gate_calc=False for GRU may also help.

@Juelianqvq
Contributor Author

Juelianqvq commented May 30, 2024

> I inspected the model you posted; its weight buffers total only 36 KB. So the only way to lower the model size is to rewrite the GRU operation using tfl.UnidirectionalGRU or subgraph-related ops like tfl.While and tfl.Call. Implementing separated_rnn_gate_calc=False for GRU may also help.

  • In contrast to LSTM, since nt relies on rt, it seems unlikely that GRU's weights can be put together, because you need to split rt from the output (maybe I'm wrong). Will it take effect? I'm worried about the performance of optimizing this part.
  • It seems that developing tfl.UnidirectionalGRU as a custom op requires another compilation, which means the new integrated op cannot be executed on other people's PCs. That is a heavy burden for me, since we only use TFLite as an intermediate format.
  • With the above concerns, what is your suggestion, and what do you think is the fastest way? Looking forward to your reply.
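For reference, the dependency mentioned in the first bullet comes from the standard GRU cell equations: the candidate state nt multiplies rt elementwise against the hidden-side projection, so the hidden-side FC for n cannot be fused with the r/z FCs the way LSTM's four gates can. A minimal NumPy sketch (weight names and shapes are illustrative, not TinyNeuralNetwork's actual code; biases omitted for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W_ir, W_iz, W_in, W_hr, W_hz, W_hn):
    """One GRU time step (biases omitted)."""
    rt = sigmoid(x @ W_ir + h @ W_hr)          # reset gate
    zt = sigmoid(x @ W_iz + h @ W_hz)          # update gate
    # nt needs rt *before* combining with the hidden-side projection,
    # so FC_hn cannot be fused with FC_hr/FC_hz into one big FC + split.
    nt = np.tanh(x @ W_in + rt * (h @ W_hn))   # candidate state
    return (1.0 - zt) * nt + zt * h            # new hidden state
```

The r/z gates themselves have no such cross-dependency, which is why fusing only those two (as discussed below) is still possible.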

@peterjc123
Collaborator

peterjc123 commented May 30, 2024

> In contrast to LSTM, since nt relies on rt, it seems unlikely that GRU's weights can be put together, because you need to split rt from the output (maybe I'm wrong). Will it take effect? I'm worried about the performance of optimizing this part.

# separated_rnn_gate_calc=False
rzt_left = FC_i{r,z}(x)
rzt_right = FC_h{r,z}(h)
rzt_sum = rzt_left + rzt_right
rzt = sigmoid(rzt_sum)
rt, zt = split(rzt, 2)

# separated_rnn_gate_calc=True
rt_left = FC_ir(x)
rt_right = FC_hr(h)
rt_sum = rt_left + rt_right
rt = sigmoid(rt_sum)
zt_left = FC_iz(x)
zt_right = FC_hz(h)
zt_sum = zt_left + zt_right
zt = sigmoid(zt_sum)

So it will be optimized from 8 ops (10 tensors) to 5 ops (8 tensors) for each time step.
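The fused-gate variant can be checked numerically: one FC over concatenated r/z weights followed by a split gives the same result as two separate FC paths, because sigmoid is elementwise. A sketch under the assumption of NumPy with no biases (names are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
x, h = rng.standard_normal((1, 4)), rng.standard_normal((1, 8))
W_ir, W_iz = rng.standard_normal((2, 4, 8))
W_hr, W_hz = rng.standard_normal((2, 8, 8))

# separated_rnn_gate_calc=True: 8 ops (4 FC, 2 add, 2 sigmoid)
rt = sigmoid(x @ W_ir + h @ W_hr)
zt = sigmoid(x @ W_iz + h @ W_hz)

# separated_rnn_gate_calc=False: 5 ops (2 FC over concatenated
# weights, 1 add, 1 sigmoid, 1 split)
W_irz = np.concatenate([W_ir, W_iz], axis=1)   # (4, 16)
W_hrz = np.concatenate([W_hr, W_hz], axis=1)   # (8, 16)
rzt = sigmoid(x @ W_irz + h @ W_hrz)
rt2, zt2 = np.split(rzt, 2, axis=1)

assert np.allclose(rt, rt2) and np.allclose(zt, zt2)
```

The concatenation is along the output dimension, so the fused FC computes exactly the two separate projections side by side.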

> It seems that developing tfl.UnidirectionalGRU as a custom op requires another compilation, which means the new integrated op cannot be executed on other people's PCs. That is a heavy burden for me, since we only use TFLite as an intermediate format.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/core/kernels/register.cc

Unfortunately, it is a custom op as of now (May 30, 2024).

> With the above concerns, what is your suggestion, and what do you think is the fastest way? Looking forward to your reply.

I don't know; it depends on your needs. If a target model size of around 80-100 KB is acceptable, I guess separated_rnn_gate_calc=False should be enough. But if you want something smaller than that, then subgraph-related approaches are your only hope.

@Juelianqvq
Contributor Author

> So it will be optimized from 8 ops (10 tensors) to 5 ops (8 tensors) for each time step.

Taking a glance at the previous implementation of AtenGRUOperator, there is definitely room (6 FCs) for further optimization (4 FCs or even 2 FCs). And implementing the separated_rnn_gate_calc=False option sounds very easy for me, because I have implemented it before (though it failed to pass the bidirectional test).

> I don't know; it depends on your needs. If a target model size of around 80-100 KB is acceptable, I guess separated_rnn_gate_calc=False should be enough. But if you want something smaller than that, then subgraph-related approaches are your only hope.

I'm quite interested in translating such challenging builtin operators. Given my current ability, may I ask what the procedure for supporting tfl.While would be? I have no answer now due to my limited understanding.

@peterjc123
Collaborator

> I'm quite interested in translating such challenging builtin operators. Given my current ability, may I ask what the procedure for supporting tfl.While would be? I have no answer now due to my limited understanding.

Well, it is doable and not hard at all if only GRU is involved, but a better design takes some time.
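To illustrate the control-flow structure a tfl.While-based lowering would produce, here is a minimal NumPy sketch in the cond/body shape that tfl.While uses: the loop state is (t, h), the condition subgraph tests t < seq_len, and the body subgraph runs one GRU step. This is only an illustration of the subgraph decomposition, not TinyNeuralNetwork's actual implementation; weight layout and names are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_while(xs, h, Wi, Wh):
    """Run a GRU over xs with an explicit cond/body loop, mirroring
    tfl.While's two subgraphs. Wi: (input, 3*hidden), Wh: (hidden, 3*hidden),
    with gates packed in [r, z, n] order; biases omitted."""
    hidden = h.shape[-1]
    t, seq_len = 0, xs.shape[0]

    def cond(t, h):                      # tfl.While condition subgraph
        return t < seq_len

    def body(t, h):                      # tfl.While body subgraph: one GRU step
        x = xs[t]
        gi, gh = x @ Wi, h @ Wh
        rt = sigmoid(gi[:hidden] + gh[:hidden])
        zt = sigmoid(gi[hidden:2 * hidden] + gh[hidden:2 * hidden])
        nt = np.tanh(gi[2 * hidden:] + rt * gh[2 * hidden:])
        return t + 1, (1.0 - zt) * nt + zt * h

    while cond(t, h):
        t, h = body(t, h)
    return h
```

The per-step body is the same graph regardless of sequence length, which is why the while-loop form keeps the model size constant instead of unrolling one copy of the GRU ops per time step.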

@Juelianqvq
Contributor Author

Juelianqvq commented May 30, 2024

> Well, it is doable and not hard at all if only GRU is involved, but a better design takes some time.

Is one week enough? I'm available 24/7 as long as it can be implemented incrementally, lol. You can focus on your own work first; any instruction is helpful to me whenever you are free.

P.S. Not only the float structure but also quantized GRU with tfl.While can be supported.

@peterjc123
Collaborator

> Is one week enough? I'm available 24/7 as long as it can be implemented incrementally, lol. You can focus on your own work first; any instruction is helpful to me whenever you are free.

Well, I cannot guarantee that. But I can do QA and guide you throughout the process.

> P.S. Not only the float structure but also quantized GRU with tfl.While can be supported.

That is just copy-and-paste, I think. The main difficulty is adding a new subgraph for the GRU operation, so don't worry about it.
