Sample Question - Error when Converting Phi 3.5 to QNN #2200

@pkbullock

Description

I am running sample: https://github.com/microsoft/olive-recipes/tree/main/microsoft-Phi-3.5-mini-instruct/QNN

I am running this on a Snapdragon device (Lenovo Slim 7x) to convert the model so it will run on the NPU. I have checked that SentencePiece and tiktoken are installed.

Am I missing something here? I am relatively new to Olive, so I am just trying to get a model working.
Do I need to log in to Hugging Face and download the source model first, and then run this?
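One quick sanity check before re-running the recipe is to confirm the tokenizer files are actually present in the model directory (the Hub cache snapshot, or Olive's intermediate output directory). This is only a sketch; the file list below is an assumption about what a Llama-style slow-to-fast tokenizer conversion needs, not Olive's authoritative requirements:

```python
from pathlib import Path

# Files a Llama-style slow->fast tokenizer conversion typically needs.
# (Assumed list; check the model repo for the authoritative set.)
TOKENIZER_FILES = ("tokenizer.model", "tokenizer.json", "tokenizer_config.json")

def missing_tokenizer_files(model_dir: str) -> list[str]:
    """Return the tokenizer files absent from model_dir."""
    d = Path(model_dir)
    return [name for name in TOKENIZER_FILES if not (d / name).is_file()]

if __name__ == "__main__":
    # Point this at the Hub cache snapshot or Olive's output directory.
    print(missing_tokenizer_files("."))
```

If `tokenizer.model` shows up as missing, the slow-to-fast converter has no SentencePiece file to work from, which matches the error below.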

I am getting the following error:

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 30.03it/s]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Traceback (most recent call last):
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\convert_slow_tokenizer.py", line 1737, in convert_slow_tokenizer
).converted()
^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\convert_slow_tokenizer.py", line 1631, in converted
tokenizer = self.tokenizer()
^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\convert_slow_tokenizer.py", line 1624, in tokenizer
vocab_scores, merges = self.extract_vocab_merges_from_model(self.vocab_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\convert_slow_tokenizer.py", line 1600, in extract_vocab_merges_from_model
bpe_ranks = load_tiktoken_bpe(tiktoken_url)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\tiktoken\load.py", line 162, in load_tiktoken_bpe
contents = read_file_cached(tiktoken_bpe_file, expected_hash)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\tiktoken\load.py", line 52, in read_file_cached
cache_key = hashlib.sha1(blobpath.encode()).hexdigest()
^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'encode'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\PaulBullock\miniconda3\envs\model\Scripts\olive.exe\__main__.py", line 6, in <module>
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\cli\launcher.py", line 66, in main
service.run()
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\cli\run.py", line 59, in run
workflow_output = olive_run(
^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\workflows\run\run.py", line 178, in run
return run_engine(package_config, run_config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\workflows\run\run.py", line 139, in run_engine
return engine.run(
^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\engine\engine.py", line 224, in run
run_result = self.run_accelerator(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\engine\engine.py", line 306, in run_accelerator
output_footprint = self._run_no_search(input_model_config, input_model_id, accelerator_spec, output_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\engine\engine.py", line 350, in _run_no_search
should_prune, signal, model_ids = self._run_passes(input_model_config, input_model_id, accelerator_spec)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\engine\engine.py", line 641, in _run_passes
model_config, model_id = self._run_pass(
^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\engine\engine.py", line 736, in _run_pass
output_model_config = host.run_pass(p, input_model_config, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\systems\local.py", line 45, in run_pass
output_model = the_pass.run(model, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\passes\olive_pass.py", line 242, in run
output_model = self._run_for_config(model, self.config, output_model_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\passes\pytorch\rotate.py", line 253, in _run_for_config
model.save_metadata(output_model_path)
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\model\handler\mixin\hf.py", line 107, in save_metadata
get_tokenizer(output_dir)
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\common\hf\utils.py", line 205, in get_tokenizer
tokenizer = from_pretrained(AutoTokenizer, model_name_or_path, "tokenizer", **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\olive\common\hf\utils.py", line 94, in from_pretrained
return cls.from_pretrained(get_pretrained_name_or_path(model_name_or_path, mlflow_dir), **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\models\auto\tokenization_auto.py", line 1069, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\tokenization_utils_base.py", line 2014, in from_pretrained
return cls._from_pretrained(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\tokenization_utils_base.py", line 2260, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\models\llama\tokenization_llama_fast.py", line 154, in __init__
super().__init__(
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\tokenization_utils_fast.py", line 139, in __init__
fast_tokenizer = convert_slow_tokenizer(self, from_tiktoken=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PaulBullock\miniconda3\envs\model\Lib\site-packages\transformers\convert_slow_tokenizer.py", line 1739, in convert_slow_tokenizer
raise ValueError(
ValueError: Converting from SentencePiece and Tiktoken failed, if a converter for SentencePiece is available, provide a model path with a SentencePiece tokenizer.model file.Currently available slow->fast converters: ['AlbertTokenizer', 'BartTokenizer', 'BarthezTokenizer', 'BertTokenizer', 'BigBirdTokenizer', 'BlenderbotTokenizer', 'CamembertTokenizer', 'CLIPTokenizer', 'CodeGenTokenizer', 'ConvBertTokenizer', 'DebertaTokenizer', 'DebertaV2Tokenizer', 'DistilBertTokenizer', 'DPRReaderTokenizer', 'DPRQuestionEncoderTokenizer', 'DPRContextEncoderTokenizer', 'ElectraTokenizer', 'FNetTokenizer', 'FunnelTokenizer', 'GPT2Tokenizer', 'HerbertTokenizer', 'LayoutLMTokenizer', 'LayoutLMv2Tokenizer', 'LayoutLMv3Tokenizer', 'LayoutXLMTokenizer', 'LongformerTokenizer', 'LEDTokenizer', 'LxmertTokenizer', 'MarkupLMTokenizer', 'MBartTokenizer', 'MBart50Tokenizer', 'MPNetTokenizer', 'MobileBertTokenizer', 'MvpTokenizer', 'NllbTokenizer', 'OpenAIGPTTokenizer', 'PegasusTokenizer', 'Qwen2Tokenizer', 'RealmTokenizer', 'ReformerTokenizer', 'RemBertTokenizer', 'RetriBertTokenizer', 'RobertaTokenizer', 'RoFormerTokenizer', 'SeamlessM4TTokenizer', 'SqueezeBertTokenizer', 'T5Tokenizer', 'UdopTokenizer', 'WhisperTokenizer', 'XLMRobertaTokenizer', 'XLNetTokenizer', 'SplinterTokenizer', 'XGLMTokenizer', 'LlamaTokenizer', 'CodeLlamaTokenizer', 'GemmaTokenizer', 'Phi3Tokenizer']
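For what it's worth, the innermost frames show `load_tiktoken_bpe` being handed `None` as the vocabulary path: the converter never resolved a `tokenizer.model` (or tiktoken blob), so tiktoken hashes a `None` path and raises before a clearer error can surface. A simplified sketch of that failure mode (not tiktoken's actual code, just the shape of it):

```python
import hashlib

def read_file_cached(blobpath):
    # Simplified stand-in for tiktoken/load.py: the path is hashed to build a
    # cache key *before* anything checks that a path was resolved at all.
    cache_key = hashlib.sha1(blobpath.encode()).hexdigest()
    return cache_key

# When no tokenizer/vocab file was found upstream, blobpath arrives as None:
try:
    read_file_cached(None)
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'encode'
```

So the `AttributeError` is a symptom; the root cause is the missing SentencePiece/tiktoken file named in the final `ValueError`.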
