
[onert-micro] how to support on-device training on model with GRU ? #13365

Open
chunseoklee opened this issue Jul 9, 2024 · 6 comments
Labels
FEATURE_REQUEST A formal request for a new or advanced feature. type/discussion We need discussion. Discussion itself can help. Even without conclusions!

Comments

@chunseoklee
Contributor

GRU operation in circle can be defined in two ways. During conversion from Keras, it may be converted into:

  • multiple subgraphs built from primitive operations, or
  • a single "Custom" GRU operation as in onert-micro.

IMHO, onert-micro is not ready to handle training on multi-subgraph models.
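
(If it helps the discussion: one way to check which of the two forms the converter actually emitted is to dump the operator list of the converted model. tf.lite.experimental.Analyzer is available in recent TF releases; the model path below is just a placeholder.)

import tensorflow as tf

# Prints the subgraph(s) and operator list of a converted model. A decomposed
# GRU shows up as many primitive ops (often a WHILE over extra subgraphs),
# while a fused one appears as a single custom operator.
tf.lite.experimental.Analyzer.analyze(model_path="GRU.tflite")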

@chunseoklee chunseoklee added FEATURE_REQUEST A formal request for a new or advanced feature. type/discussion We need discussion. Discussion itself can help. Even without conclusions! labels Jul 9, 2024
@chunseoklee
Contributor Author

@BalyshevArtem PTAL

@BalyshevArtem
Contributor

  • Single "Custom" GRU operation as in (onert-micro)

I think it is better to use the custom GRU. It will also have better latency and memory consumption. And in my opinion it is easier to support (maybe I am wrong).
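
For context, a fused GRU kernel only has to implement (and, for training, backpropagate through) one small recurrence per time step. A minimal numpy sketch of one step, assuming the Keras gate ordering [z, r, h] and the default reset_after=True convention, with the input and recurrent biases folded into one bias for brevity (all names here are illustrative):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, U, b):
    # x_t: (batch, input_dim), h_prev: (batch, units)
    # W: (input_dim, 3*units), U: (units, 3*units), b: (3*units,)
    units = h_prev.shape[-1]
    xw = x_t @ W + b   # input contribution to all three gates
    hu = h_prev @ U    # recurrent contribution to all three gates
    z = sigmoid(xw[:, :units] + hu[:, :units])                # update gate
    r = sigmoid(xw[:, units:2*units] + hu[:, units:2*units])  # reset gate
    h_cand = np.tanh(xw[:, 2*units:] + r * hu[:, 2*units:])   # candidate state
    return z * h_prev + (1.0 - z) * h_cand                    # new hidden state

Iterating this over all time steps with a single hidden-state tensor is what keeps the fused op's memory footprint small compared with materializing the looped or unrolled subgraphs.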

@hseok-oh hseok-oh changed the title [onert-miro] how to support on-device training on model with GRU ? [onert-micro] how to support on-device training on model with GRU ? Jul 10, 2024
@chunseoklee
Contributor Author

chunseoklee commented Aug 7, 2024

Here are a reference GRU model and a fused GRU model produced by #13602:

gru_fused.zip

The tflite model is generated by the following code:

  import tensorflow as tf
  from tensorflow import keras
  from tensorflow.keras import regularizers
  import numpy as np

  adapt_data = np.array([[0., 7., 4. , 0.5],
                         [2., 9., 6. , -0.5],
                         [0., 7., 4. , -0.5],
                         [2., 9., 6. , 0.5]], dtype='float32')
  normalization_layer = tf.keras.layers.Normalization(axis=-1)
  normalization_layer.adapt(adapt_data)
  classes = 4
  activation = 'tanh'
  model = tf.keras.models.Sequential([
      tf.keras.Input(shape=(10,4)),
      normalization_layer,
      tf.keras.layers.GRU(units=20, activation=activation, use_bias=True, bias_initializer="ones"),
      tf.keras.layers.Dense(classes, activation='softmax')
  ])

  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))

  model.summary()

  run_model = tf.function(lambda x: model(x))

  # This is important, let's fix the input size.
  BATCH_SIZE = 1
  X = 10
  Y = 4
  concrete_func = run_model.get_concrete_function(
      tf.TensorSpec([BATCH_SIZE, X,Y], model.inputs[0].dtype))

  # model directory.
  MODEL_DIR = "keras_model"
  model.save(MODEL_DIR, save_format="tf", signatures=concrete_func)

  converter = tf.lite.TFLiteConverter.from_saved_model(MODEL_DIR)
  converter.experimental_new_converter = True
  converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                         ]
  #converter.optimizations = [tf.lite.Optimize.DEFAULT]
  converted_model = converter.convert()
  save_to = "GRU.tflite"
  if save_to is not None:
      with open(save_to, 'wb') as tf_lite_file:
          tf_lite_file.write(converted_model)

and then apply #13625.
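
As a quick sanity check, the converted model can be run with the TFLite interpreter; the random input below just matches the [BATCH_SIZE, X, Y] = [1, 10, 4] shape fixed above:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="GRU.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(1, 10, 4).astype(np.float32)  # [BATCH_SIZE, X, Y]
interpreter.set_tensor(inp['index'], x)
interpreter.invoke()
print(interpreter.get_tensor(out['index']))  # shape (1, 4): softmax probabilities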

@chunseoklee
Contributor Author

Let's try to train the GRU operation with the model at #13365 (comment).

@chunseoklee
Contributor Author

chunseoklee commented Aug 13, 2024

@BalyshevArtem
Contributor

BalyshevArtem commented Aug 21, 2024

Training result

Here is the training result for #13737.

Model obtained from:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import regularizers
import numpy as np

classes = 4
activation = 'tanh'
model = tf.keras.models.Sequential([
  tf.keras.Input(shape=(60,3)),
  tf.keras.layers.GRU(units=60, activation=activation),
  tf.keras.layers.Dense(classes, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
model.summary()
run_model = tf.function(lambda x: model(x))

# This is important, let's fix the input size.
BATCH_SIZE = 1
X = 60
Y = 3
concrete_func = run_model.get_concrete_function(
  tf.TensorSpec([BATCH_SIZE, X,Y], model.inputs[0].dtype))

# model directory.
MODEL_DIR = "keras_model"
model.save(MODEL_DIR, save_format="tf", signatures=concrete_func)

converter = tf.lite.TFLiteConverter.from_saved_model(MODEL_DIR)
converter.experimental_new_converter = True
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                     ]
#converter.optimizations = [tf.lite.Optimize.DEFAULT]
converted_model = converter.convert()
save_to = "gru_stick.tflite"
if save_to is not None:
  with open(save_to, 'wb') as tf_lite_file:
      tf_lite_file.write(converted_model)

The training data is the data for the targeted model. In this experiment, 1000 random samples from the original training data were used for training and 150 for testing. The task is a classification task; I used cross entropy as the loss and accuracy as the metric.
To verify that the GRU layer is actually learning, we first train only the last FullyConnected layer of the initial model, and then train both the FullyConnected layer and the GRU layer.
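
For reference, the same two-stage procedure in Keras would look roughly like this (a minimal sketch reusing the model defined above; the random arrays are only stand-ins for the real dataset):

import numpy as np
import tensorflow as tf

# Stand-in data matching the shapes in this experiment
# (1000 train / 150 test samples, 4 classes).
x_train = np.random.rand(1000, 60, 3).astype(np.float32)
y_train = np.random.randint(0, 4, size=(1000,))
x_test = np.random.rand(150, 60, 3).astype(np.float32)
y_test = np.random.randint(0, 4, size=(150,))

# Stage 1: freeze everything except the last Dense (FullyConnected) layer.
for layer in model.layers:
    layer.trainable = False
model.layers[-1].trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))

# Stage 2: also unfreeze the GRU layer and continue training.
model.layers[0].trainable = True  # the GRU layer in the Sequential model
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))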

Initial values:

Test Average ACCURACY = 0.34
Test Average CROSS ENTROPY = 2.54871

Train only last (FullyConnected) layer:

Test Average ACCURACY = 0.61
Test Average CROSS ENTROPY = 0.898501

Train last FullyConnected + GRU:

Test Average ACCURACY = 0.72
Test Average CROSS ENTROPY = 0.751191

Thus, the GRU layer is indeed being trained and helps achieve better results on this task.
