In [1]:
#@title License information { display-mode: "form" }
#@markdown Copyright 2024 The MediaPipe Authors.
#@markdown Licensed under the Apache License, Version 2.0 (the "License");
#@markdown
#@markdown you may not use this file except in compliance with the License.
#@markdown You may obtain a copy of the License at
#@markdown
#@markdown https://www.apache.org/licenses/LICENSE-2.0
#@markdown
#@markdown Unless required by applicable law or agreed to in writing, software
#@markdown distributed under the License is distributed on an "AS IS" BASIS,
#@markdown WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#@markdown See the License for the specific language governing permissions and
#@markdown limitations under the License.

In [2]:
#@title Setup { display-mode: "form" }
import ipywidgets as widgets
from IPython.display import display
from google.colab import files
install_out = widgets.Output()
display(install_out)
with install_out:
  !pip install mediapipe
  from mediapipe.tasks.python.genai import bundler

install_out.clear_output()
with install_out:
  print("Setup done.")


# Why do we need task bundles?

Running a text generation pipeline end to end takes more than the core transformer model. The input text must be prepared in the format the model expects, the model must be run autoregressively, and a token must be sampled at each iteration. So once a model has been converted to `tflite` format (the model's graph and parameters), it must be packaged with additional metadata before it can be executed end to end.
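
The three steps named above (input preparation, autoregressive execution, per-step sampling) can be sketched with a toy stand-in for the model. This is only an illustration, not the MediaPipe pipeline: `generate`, `toy_model`, and the token ids are hypothetical.

```python
import random


def generate(next_token_probs, prompt_ids, bos_id, eos_id, max_new_tokens=16, seed=0):
    """Toy autoregressive loop: prepend BOS, sample one token per step, stop at EOS."""
    rng = random.Random(seed)
    ids = [bos_id] + list(prompt_ids)          # input preparation: add the start token
    for _ in range(max_new_tokens):
        probs = next_token_probs(ids)          # run the "model" on everything so far
        tokens, weights = zip(*sorted(probs.items()))
        next_id = rng.choices(tokens, weights=weights, k=1)[0]  # sampling step
        if next_id == eos_id:                  # the stop token ends generation
            break
        ids.append(next_id)
    return ids


def toy_model(ids):
    """Hypothetical stand-in for a model: counts upward, then emits EOS (id 2)."""
    last = ids[-1]
    return {2: 1.0} if last >= 5 else {last + 1: 1.0}


print(generate(toy_model, prompt_ids=[3], bos_id=1, eos_id=2))
```

The start and stop tokens in this loop play the same role as the `start_token` and `stop_tokens` fields that a task bundle records alongside the model.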

Task bundler offers a practical approach to creating such bundles. The example below illustrates how a task bundle can be generated for a converted Gemma model.

In [5]:
tflite_model="distilbert-qa.tflite" # @param {type:"string"}
tokenizer_model="PATH/tokenizer.model" # @param {type:"string"}
start_token="<bos>" # @param {type:"string"}
stop_token="<eos>" # @param {type:"string"}
output_filename="PATH/gemma.task" # @param {type:"string"}
enable_bytes_to_unicode_mapping=False # @param ["False", "True"] {type:"raw"}

config = bundler.BundleConfig(
    tflite_model=tflite_model,
    tokenizer_model=tokenizer_model,
    start_token=start_token,
    stop_tokens=[stop_token],
    output_filename=output_filename,
    enable_bytes_to_unicode_mapping=enable_bytes_to_unicode_mapping,
)
bundler.create_bundle(config)

ValueError: Failed to load tokenizer model from PATH/tokenizer.model. Please ensure you are passing a valid SentencePiece model.

**Notes:**

* The current task pipeline only supports SentencePiece tokenizer models.
* Certain models (e.g., phi-2) use a bytes-to-unicode mapping. Set the `enable_bytes_to_unicode_mapping` flag accordingly. This information, along with `start_token` and `stop_tokens`, is often provided with the model artifacts.
* The generated output bundle file must end with `.task`. If it does not, `create_bundle` automatically adds the extension. Do not remove or change it.
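
The `ValueError` above comes from the placeholder `PATH/tokenizer.model`, so it can help to check the inputs before calling `create_bundle`. The helpers below are a hypothetical sketch and not part of the bundler API; `normalize_task_filename` only mirrors the extension rule described in the notes, assuming the extension is simply appended.

```python
from pathlib import Path


def normalize_task_filename(output_filename: str) -> str:
    """Return the name with a `.task` suffix appended if it is missing."""
    if output_filename.endswith(".task"):
        return output_filename
    return output_filename + ".task"


def missing_inputs(*paths: str) -> list:
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]


# Placeholder paths such as "PATH/tokenizer.model" are reported as missing.
print(missing_inputs("PATH/tokenizer.model"))
print(normalize_task_filename("PATH/gemma"))
```

Running `missing_inputs(tflite_model, tokenizer_model)` before building the `BundleConfig` surfaces unreplaced placeholders early, without waiting for the tokenizer load to fail.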

In [None]:
#@title Download the Bundle { display-mode: "form" }
#@markdown Run this cell to download the generated `.task` file.
files.download(output_filename)

In [6]:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")
tokenizer.save_pretrained("bert_tokenizer/")


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


('bert_tokenizer/tokenizer_config.json',
 'bert_tokenizer/special_tokens_map.json',
 'bert_tokenizer/vocab.txt',
 'bert_tokenizer/added_tokens.json',
 'bert_tokenizer/tokenizer.json')

In [7]:
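# build_fake_corpus.py
# Write one vocabulary entry per line so the file can serve as a training corpus.
with open("bert_tokenizer/vocab.txt", "r", encoding="utf-8") as fin, \
        open("fake_corpus.txt", "w", encoding="utf-8") as fout:
    for line in fin:
        token = line.strip()
        # Skip blank lines and bracketed entries such as [PAD], [CLS], [unused0].
        if token and not token.startswith("["):
            # Drop the WordPiece continuation prefix "##".
            fout.write(token.replace("##", "") + "\n")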


In [8]:
!pip install sentencepiece

# Train SentencePiece tokenizer
!spm_train \
  --input=fake_corpus.txt \
  --model_prefix=distilbert_tokenizer \
  --vocab_size=30522 \
  --character_coverage=1.0 \
  --model_type=word

/bin/bash: line 1: spm_train: command not found
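
The `command not found` error is consistent with how the `sentencepiece` pip package is distributed: it provides the Python bindings, not the `spm_train` command-line tool. One way to avoid the CLI is to train through the Python API instead. The sketch below assumes `SentencePieceTrainer.train` accepts keyword arguments matching the CLI flags; the helper names `spm_train_kwargs` and `train_tokenizer` are hypothetical.

```python
def spm_train_kwargs(corpus: str, prefix: str, vocab_size: int) -> dict:
    """Build trainer options that mirror the spm_train flags used in this notebook."""
    return {
        "input": corpus,
        "model_prefix": prefix,
        "vocab_size": vocab_size,
        "character_coverage": 1.0,
        "model_type": "word",
    }


def train_tokenizer(corpus: str, prefix: str, vocab_size: int) -> str:
    """Train a SentencePiece model and return the path of the resulting .model file."""
    import sentencepiece as spm  # imported lazily so the helper above works without it

    spm.SentencePieceTrainer.train(**spm_train_kwargs(corpus, prefix, vocab_size))
    return prefix + ".model"


print(spm_train_kwargs("fake_corpus.txt", "distilbert_tokenizer", 30522))
```

Calling `train_tokenizer("fake_corpus.txt", "distilbert_tokenizer", vocab_size)` should write `distilbert_tokenizer.model` and `distilbert_tokenizer.vocab` to the working directory, the same files the CLI would produce.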


In [None]:
!find / -name "spm_train" 2>/dev/null

In [10]:
!apt-get update && apt-get install -y cmake build-essential pkg-config libgoogle-perftools-dev
!git clone https://github.com/google/sentencepiece.git
%cd sentencepiece
!mkdir build && cd build && cmake .. && make -j $(nproc) && make install && ldconfig
%cd /content

Hit:1 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:2 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Hit:3 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Get:4 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Hit:5 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Get:6 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Get:7 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:8 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease [18.1 kB]
Hit:9 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:10 https://ppa.launchpadcontent.net/ubuntugis

In [11]:
!spm_train --version

sentencepiece 0.2.1


In [13]:
!spm_train \
      --input=fake_corpus.txt \
      --model_prefix=distilbert_tokenizer \
      --vocab_size=26598 \
      --character_coverage=1.0 \
      --model_type=word

sentencepiece_trainer.cc(78) LOG(INFO) Starts training with : 
trainer_spec {
  input: fake_corpus.txt
  input_format: 
  model_prefix: distilbert_tokenizer
  model_type: WORD
  vocab_size: 26598
  self_test_sample_size: 0
  character_coverage: 1
  input_sentence_size: 0
  shuffle_input_sentence: 1
  seed_sentencepiece_size: 1000000
  shrinking_factor: 0.75
  max_sentence_length: 4192
  num_threads: 16
  num_sub_iterations: 2
  max_sentencepiece_length: 16
  split_by_unicode_script: 1
  split_by_number: 1
  split_by_whitespace: 1
  split_digits: 0
  pretokenization_delimiter: 
  treat_whitespace_as_suffix: 0
  allow_whitespace_only_pieces: 0
  required_chars: 
  byte_fallback: 0
  vocabulary_output_piece_score: 1
  train_extremely_large_corpus: 0
  seed_sentencepieces_file: 
  hard_vocab_limit: 1
  use_all_vocab: 0
  unk_id: 0
  bos_id: 1
  eos_id: 2
  pad_id: -1
  unk_piece: <unk>
  bos_piece: <s>
  eos_piece: </s>
  pad_piece: <pad>
  unk_surface:  ⁇ 
  enable_differential_privacy: 0

In [2]:
import tensorflow as tf
from transformers import TFAutoModelForQuestionAnswering

# Load the DistilBERT QA checkpoint as a TensorFlow model and save it to disk.
model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-squad")
model.save("distilbert_tf_model")

# Convert the saved model to a TFLite flatbuffer (returned as bytes).
converter = tf.lite.TFLiteConverter.from_saved_model("distilbert_tf_model")
tflite_model = converter.convert()

# Save the model to a file
with open("distilbert-qa.tflite", "wb") as f:
    f.write(tflite_model)

Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`

All PyTorch model weights were used when initializing TFDistilBertForQuestionAnswering.

All the weights of TFDistilBertForQuestionAnswering were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForQuestionAnswering for predictions without further training.


In [2]:
#!pip install --upgrade --force-reinstall mediapipe-model-maker==0.2.1.4
!pip install --upgrade  --user mediapipe-model-maker==0.2.1.4
!python3 -m mediapipe_model_maker.genai.task_converter \
  --tflite_model distilbert-qa.tflite \
  --tokenizer_model distilbert_tokenizer.model \
  --output_file distilbert_qa.task

2025-04-11 16:51:43.499445: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-04-11 16:51:43.499565: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-04-11 16:51:43.503000: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-11 16:51:43.520378: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

TensorFlow Addons (TFA) has ended development and in

In [3]:
pip show mediapipe-model-maker

Name: mediapipe-model-maker
Version: 0.2.1.4
Summary: MediaPipe Model Maker is a simple, low-code solution for customizing on-device ML models
Home-page: https://github.com/google/mediapipe/tree/master/mediapipe/model_maker
Author: The MediaPipe Authors
Author-email: mediapipe@google.com
License: Apache 2.0
Location: /usr/local/lib/python3.11/dist-packages
Requires: absl-py, mediapipe, numpy, opencv-python, tensorflow, tensorflow-addons, tensorflow-datasets, tensorflow-hub, tensorflow-model-optimization, tensorflow-text, tf-models-official
Required-by: 


In [6]:
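# Note: pip reads "tensorflow text" as two packages, `tensorflow` and `text`, hence the error below.
# The TensorFlow Text package is published on PyPI as `tensorflow-text`.
!pip install tensorflow text
!pip install tensorflow_hub

ERROR: Could not find a version that satisfies the requirement text (from versions: none)
ERROR: No matching distribution found for text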


In [7]:
import tensorflow as tf; print(tf.__version__)

2.15.1
