From 18b33fb9cce2c23dee61e8ca222b2ea2d6e23780 Mon Sep 17 00:00:00 2001
From: Mengwei Liu
Date: Wed, 9 Apr 2025 12:00:32 -0700
Subject: [PATCH] [doc] Fix tokenizer-related documentation

`extension.llm.tokenizer.tokenizer` -> `pytorch_tokenizers.tools.llama2c.convert`
---
 .../android/LlamaDemo/docs/delegates/qualcomm_README.md | 2 +-
 examples/models/llama2/README.md                        | 2 +-
 examples/models/phi-3-mini/README.md                    | 2 +-
 examples/qualcomm/oss_scripts/llama/README.md           | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md b/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md
index f6952df97ad..e8bc1dc1d26 100644
--- a/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md
+++ b/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md
@@ -135,7 +135,7 @@ You may also wonder what the "--metadata" flag is doing. This flag helps export
 
 Convert tokenizer for Llama 2
 ```
-python -m extension.llm.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
+python -m pytorch_tokenizers.tools.llama2c.convert -t tokenizer.model -o tokenizer.bin
 ```
 Rename tokenizer for Llama 3 with command: `mv tokenizer.model tokenizer.bin`. We are updating the demo app to support tokenizer in original format directly.
 
diff --git a/examples/models/llama2/README.md b/examples/models/llama2/README.md
index 92ddbf74d94..615ad3948fc 100644
--- a/examples/models/llama2/README.md
+++ b/examples/models/llama2/README.md
@@ -41,7 +41,7 @@ You can export and run the original Llama 2 7B model.
    ```
 4. Create tokenizer.bin.
    ```
-   python -m extension.llm.tokenizer.tokenizer -t <tokenizer.model> -o tokenizer.bin
+   python -m pytorch_tokenizers.tools.llama2c.convert -t <tokenizer.model> -o tokenizer.bin
    ```
 
    Pass the converted `tokenizer.bin` file instead of `tokenizer.model` for subsequent steps.
diff --git a/examples/models/phi-3-mini/README.md b/examples/models/phi-3-mini/README.md
index ba878d42a3f..f52f2a3a06d 100644
--- a/examples/models/phi-3-mini/README.md
+++ b/examples/models/phi-3-mini/README.md
@@ -13,7 +13,7 @@ pip uninstall -y transformers ; pip install transformers==4.44.2
 ```
 cd executorch
 wget -O tokenizer.model "https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/resolve/main/tokenizer.model?download=true"
-python -m extension.llm.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
+python -m pytorch_tokenizers.tools.llama2c.convert -t tokenizer.model -o tokenizer.bin
 ```
 2. Export the model. This step will take a few minutes to finish.
 ```
diff --git a/examples/qualcomm/oss_scripts/llama/README.md b/examples/qualcomm/oss_scripts/llama/README.md
index cd468eebb26..9b6ec9574eb 100644
--- a/examples/qualcomm/oss_scripts/llama/README.md
+++ b/examples/qualcomm/oss_scripts/llama/README.md
@@ -41,7 +41,7 @@ wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
 wget "https://raw.githubusercontent.com/karpathy/llama2.c/master/tokenizer.model"
 
 # tokenizer.bin:
-python -m extension.llm.tokenizer.tokenizer -t tokenizer.model -o tokenizer.bin
+python -m pytorch_tokenizers.tools.llama2c.convert -t tokenizer.model -o tokenizer.bin
 
 # params.json:
 echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json