This is the error I get when running the moondream2 example in Colab:
llama_model_loader: loaded meta data with 19 key-value pairs and 245 tensors from /root/.cache/huggingface/hub/models--vikhyatk--moondream2/snapshots/92d3d73b6fd61ab84d9fe093a9c7fd8c04bf2c0d/./moondream2-text-model-f16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = phi2
llama_model_loader: - kv 1: general.name str = moondream2
llama_model_loader: - kv 2: phi2.context_length u32 = 2048
llama_model_loader: - kv 3: phi2.embedding_length u32 = 2048
llama_model_loader: - kv 4: phi2.feed_forward_length u32 = 8192
llama_model_loader: - kv 5: phi2.block_count u32 = 24
llama_model_loader: - kv 6: phi2.attention.head_count u32 = 32
llama_model_loader: - kv 7: phi2.attention.head_count_kv u32 = 32
llama_model_loader: - kv 8: phi2.attention.layer_norm_epsilon f32 = 0.000010
llama_model_loader: - kv 9: phi2.rope.dimension_count u32 = 32
llama_model_loader: - kv 10: general.file_type u32 = 1
llama_model_loader: - kv 11: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 12: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 13: tokenizer.ggml.tokens arr[str,51200] = ["!", """, "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 14: tokenizer.ggml.token_type arr[i32,51200] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 15: tokenizer.ggml.merges arr[str,50000] = ["Ġ t", "Ġ a", "h e", "i n", "r e",...
llama_model_loader: - kv 16: tokenizer.ggml.bos_token_id u32 = 50256
llama_model_loader: - kv 17: tokenizer.ggml.eos_token_id u32 = 50256
llama_model_loader: - kv 18: tokenizer.ggml.unknown_token_id u32 = 50256
llama_model_loader: - type f32: 147 tensors
llama_model_loader: - type f16: 98 tensors
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:
llm_load_vocab: ************************************
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ************************************
llm_load_vocab:
llama_model_load: error loading model: error loading model vocabulary: Failed to process regex
llama_load_model_from_file: failed to load model
ValueError Traceback (most recent call last)
in <cell line: 9>()
7 )
8
----> 9 llm = Llama.from_pretrained(
10 repo_id="vikhyatk/moondream2",
11 filename="text-model",
/usr/local/lib/python3.10/dist-packages/llama_cpp/_internals.py in __init__(self, path_model, params, verbose)
53
54 if self.model is None:
---> 55 raise ValueError(f"Failed to load model from file: {path_model}")
56
57 def free_model():
ValueError: Failed to load model from file: /root/.cache/huggingface/hub/models--vikhyatk--moondream2/snapshots/92d3d73b6fd61ab84d9fe093a9c7fd8c04bf2c0d/./moondream2-text-model-f16.gguf
The code I ran is the example provided in this repo's README:
from llama_cpp import Llama
from llama_cpp.llama_chat_format import MoondreamChatHandler

chat_handler = MoondreamChatHandler.from_pretrained(
    repo_id="vikhyatk/moondream2",
    filename="mmproj",
)

llm = Llama.from_pretrained(
    repo_id="vikhyatk/moondream2",
    filename="text-model",
    chat_handler=chat_handler,
    n_ctx=2048,  # n_ctx should be increased to accommodate the image embedding
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}},
            ],
        }
    ]
)

print(response["choices"][0]["text"])
Please fix the code in the README.
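Separate from the load failure above, the last line of the README example also looks wrong to me: `create_chat_completion` returns an OpenAI-style chat dict, where the reply sits under `"message"` → `"content"` rather than under `"text"` (the `"text"` key belongs to plain, non-chat completions). A minimal sketch of reading the reply, using a hypothetical response dict (the content string is made up, since the model never loads for me):

```python
# Hypothetical response dict shaped like llama-cpp-python's
# create_chat_completion output (OpenAI chat-completion format).
response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "A wooden boardwalk crossing a grassy wetland.",
            },
        }
    ]
}

# Chat completions nest the reply under "message" -> "content";
# response["choices"][0]["text"] would raise a KeyError here.
reply = response["choices"][0]["message"]["content"]
print(reply)
```

If this is right, the README's final line should probably be `print(response["choices"][0]["message"]["content"])`.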