Support nlp inference and on-device tokenization #17

Merged
merged 5 commits into main from nlp on Jan 4, 2024
Conversation

@jchalatu (Contributor) commented on Dec 8, 2023

This PR primarily adds support in the SavedModelRunner for text/string-based inputs during NLP model inference. It also adds a new class, NetworkTokenizer, to device_modules, which allows users to make network calls to a device with the SentencePiece CLI installed.
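
To give a sense of the intended workflow, here is a minimal, hypothetical usage sketch; the import path, model directory, and exact run() signature are assumptions for illustration, not taken from the nengo-edge documentation:

# Hypothetical usage sketch: run an NLP model directly on raw strings.
# The import path, model directory, and run() signature are assumptions.
from nengo_edge import SavedModelRunner

runner = SavedModelRunner("path/to/exported/model")
outputs = runner.run(["turn on the lights", "what time is it"])
print(outputs)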


github-actions bot commented Dec 8, 2023

Coverage report

The coverage rate went from 100% to 100% ➡️

100% of new lines are covered.

Diff coverage details:

- nengo_edge/network_runner.py: 100% of new lines are covered (100% of the complete file).
- nengo_edge/version.py: 100% of new lines are covered (100% of the complete file).
- nengo_edge/saved_model_runner.py: 100% of new lines are covered (100% of the complete file).
- nengo_edge/device_modules/network_tokenizer.py: 100% of new lines are covered (100% of the complete file).

@drasmuss (Member) left a comment:

Looks good, just minor comments.

# Process string inputs
assert self.model_params["type"] == "nlp"
assert self.tokenizer is not None
return self._run_model(self.tokenizer.tokenize(inputs))
@drasmuss (Member):

Would this work if we did inputs = self.tokenizer.tokenize(inputs).numpy()? It would be nice to reuse the same _run_model call, so that as much of the pipeline as possible is shared.
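
A rough sketch of the shared pipeline being suggested here; the attribute and method names are taken from the diff above, and the rest is assumed rather than the final implementation:

# Sketch only: tokenize string inputs up front so every model type goes
# through the same _run_model call. Assumes tokenize() returns a tensor
# (e.g. a tf.RaggedTensor) that supports .numpy().
def run(self, inputs):
    if self.model_params["type"] == "nlp":
        assert self.tokenizer is not None
        inputs = self.tokenizer.tokenize(inputs).numpy()
    return self._run_model(inputs)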

@drasmuss (Member):

We should do inputs.lower() here too, since that's what we're currently assuming in the tokenizer training.

@jchalatu (Contributor, author):

I had the pipeline shared at one point, but I'll have to test it again because I'm pretty sure something wasn't working.

Also, I have adjusted our model/tokenizer training pipeline to more closely match how we have done it elsewhere (in past projects). We can discuss that further, since I'm not sure which approach is more appropriate for the vocabulary sizes we are considering.

]

ragged_in0 = [inputs[0]] # type: ignore
ragged_in1 = [inputs[1]] # type: ignore
@drasmuss (Member):

What was it complaining about with these type errors?

@jchalatu (Contributor, author):

I think it was because ragged_in0/1 are already defined in the asr block above, so it was a re-assignment problem, if I recall correctly.
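
For reference, a self-contained sketch of the kind of re-assignment mypy flags; the variable names are illustrative, not the actual runner code:

# Illustrative only: once mypy infers a type for a name from its first
# assignment, re-binding it to a different type later needs either a
# type: ignore comment or a fresh variable name.
import numpy as np

def prepare(inputs: list, model_type: str):
    ragged_in0 = np.asarray(inputs[0])  # mypy infers np.ndarray here
    if model_type == "nlp":
        ragged_in0 = [inputs[0]]  # error: incompatible types in assignment
    return ragged_in0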

Parameters
----------
input_text: str
Input string to be tokenized.
@drasmuss (Member):

detokenize takes a batch of tokens, but tokenize just takes a single string. We should probably make them consistent. I think we could rename decode_ids to detokenize (supporting non-batched inputs) and move the batch loop into .run (same as for tokenize). That way tokenize/detokenize both operate on single inputs, and run handles the batching.
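
A minimal sketch of the structure being proposed, using a toy whitespace tokenizer rather than the actual SentencePiece-backed implementation:

# Toy sketch of the proposed shape: tokenize/detokenize handle a single
# input, and run() owns the batch loop. Not the real nengo_edge tokenizer.
import numpy as np

class TokenizerSketch:
    def __init__(self, vocab):
        self.vocab = vocab
        self.ids = {tok: i for i, tok in enumerate(vocab)}

    def tokenize(self, text: str) -> np.ndarray:
        # one string in, one array of token ids out
        return np.array([self.ids[w] for w in text.lower().split()])

    def detokenize(self, token_ids: np.ndarray) -> str:
        # one array of token ids in, one string out
        return " ".join(self.vocab[i] for i in token_ids)

    def run(self, inputs):
        # batching lives here, not in tokenize/detokenize
        return [self.detokenize(self.tokenize(s)) for s in inputs]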

# Detokenize asr outputs using greedy decoding
# Detokenize asr outputs by applying greedy decoding, removing
# blank tokens and merging repeats before feeding values into the
# sentencepiece tokenizer
@drasmuss (Member):

Do you remember why we didn't use keras.backend.ctc_decode here? It seems like we could, and it would take care of this for us.

@jchalatu (Contributor, author) commented on Dec 18, 2023:

Yeah, that's fair; I just forgot about that function when I was making this fixup. I think we also initially forgot to include it in this step, and our original test didn't include blanks/repeats (or at least they didn't affect the outcome of the test).

)
output = subprocess.run(
cmd.split(),
input=input_text,
@drasmuss (Member):

Should do an input_text.lower() here too.
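
For concreteness, a hedged sketch of that kind of call; the spm_encode flags and tokenizer model path here are illustrative, not the exact command the NetworkTokenizer builds:

# Illustrative only: pipe lowercased text into the SentencePiece CLI encoder.
# The spm_encode flags and tokenizer.model path are assumptions.
import subprocess

input_text = "Hello NengoEdge"
cmd = "spm_encode --model=tokenizer.model --output_format=id"
output = subprocess.run(
    cmd.split(),
    input=input_text.lower(),  # lowercase to match the tokenizer training
    capture_output=True,
    text=True,
    check=True,
)
token_ids = [int(t) for t in output.stdout.split()]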

@jchalatu (Contributor, author) left a comment:

I'm fine with not including these changes if they don't seem that important. Looks good overall, though.

outputs = np.array(_outputs)
# Apply CTC decoding
outputs = tf.keras.backend.ctc_decode(
tf.nn.softmax(ragged.to_masked(outputs)),
@jchalatu (Contributor, author):

I'm not sure we actually need the softmax here, despite the model outputting logits. The internal argmax in Keras's ctc_decode should work the same with or without it. It's probably fine to leave it, though; it's not a big difference.

@drasmuss (Member):

I thought that too, but for some reason it doesn't. If you just pass in the logits you get a very different and incorrect decoding (this is the bug that Lisa was running into a while back). I never really investigated why, though; my guess is that somewhere in there they assume the values sum to 1, or something like that.
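
A small standalone sketch of the point above, using toy data rather than real model outputs; my understanding (an assumption worth double-checking) is that the Keras implementation takes the log of y_pred internally, so it expects probabilities rather than raw logits:

# Toy sketch: pass softmax-normalized probabilities into ctc_decode.
# Shapes and data are made up for illustration.
import numpy as np
import tensorflow as tf

batch, steps, vocab = 2, 8, 6  # last vocab index is the CTC blank
logits = np.random.randn(batch, steps, vocab).astype(np.float32)
seq_lengths = np.full((batch,), steps)

decoded, _log_probs = tf.keras.backend.ctc_decode(
    tf.nn.softmax(logits),  # normalize before decoding
    input_length=seq_lengths,
    greedy=True,  # greedy decoding drops blanks and merges repeats
)
token_ids = decoded[0].numpy()  # (batch, max_decoded_len), padded with -1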

@@ -73,8 +73,8 @@ def test_runner_ragged(
_, tokenizer_path = new_tokenizer(tmp_path)

if model_type == "asr":
assert isinstance(pipeline.post[0], TokenizerDesc)
pipeline.post[0].tokenizer_file = tokenizer_path
assert isinstance(pipeline.post[1], TokenizerDesc)
@jchalatu (Contributor, author):

Can we reference these with -1 indexing, since we know the tokenizer will always be at the end of a pipeline? That way we avoid having to open a PR to change the indexing if the other layers change.
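
Something like the following, assuming the tokenizer stays the last entry in pipeline.post (names as in the test above):

# Sketch of the suggested change: index from the end so the test keeps
# working if earlier pipeline layers are added or removed.
assert isinstance(pipeline.post[-1], TokenizerDesc)
pipeline.post[-1].tokenizer_file = tokenizer_path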

@drasmuss merged commit 7feec8b into main on Jan 4, 2024 (1 check failed).
@drasmuss deleted the nlp branch on January 4, 2024 at 16:50.