KeyError: 'shortest_edge' when loading Kosmos-2 model from local files #30522

Closed
2 of 4 tasks
Charizhardt opened this issue Apr 27, 2024 · 2 comments · Fixed by #30567

Charizhardt commented Apr 27, 2024

System Info

  • transformers version: 4.40.1
  • Platform: Windows-10-10.0.22631-SP0
  • Python version: 3.10.14
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.2
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.0.1+cu118 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help?

@amyeroberts
@NielsRogge

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Step 1: Import required libraries

from transformers import pipeline
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

model_path = "./models/transformers/"

Step 2: Download and save model to local directory

model_name = "microsoft/kosmos-2-patch14-224"

model = AutoModelForVision2Seq.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

model.save_pretrained(model_path)
processor.save_pretrained(model_path)

Step 3: Test if model works

prompt = "<grounding>An image of"
image = Image.open('./images/snowman.png')

inputs = processor(text=prompt, images=image, return_tensors="pt")

generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    image_embeds=None,
    image_embeds_position_mask=inputs["image_embeds_position_mask"],
    use_cache=True,
    max_new_tokens=128,
)

generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Specify `cleanup_and_extract=False` in order to see the raw model generation.
processed_text = processor.post_process_generation(generated_text, cleanup_and_extract=False)

print(processed_text)
# `<grounding> An image of<phrase> a snowman</phrase><object><patch_index_0044><patch_index_0863></object> warming himself by<phrase> a fire</phrase><object><patch_index_0005><patch_index_0911></object>.`

Step 4: Load model from local directory and test if it works

model = AutoModelForVision2Seq.from_pretrained(model_path, local_files_only=True)
print("-----------  model loaded from local dir ------------")
processor = AutoProcessor.from_pretrained(model_path, local_files_only=True)
print("-----------  processor loaded from local dir ------------")

generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    image_embeds=None,
    image_embeds_position_mask=inputs["image_embeds_position_mask"],
    use_cache=True,
    max_new_tokens=128,
)

generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Specify `cleanup_and_extract=False` in order to see the raw model generation.
processed_text = processor.post_process_generation(generated_text, cleanup_and_extract=False)

print(processed_text)

Expected behavior:

Step 4 should load the model from the local directory and output the same processed_text as step 3.

Actual behavior:

When executing the last step, a KeyError is thrown:

Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00,  1.16it/s]
-----------  model loaded from local dir ------------
Traceback (most recent call last):
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-5db51f16f851>", line 3, in <module>
    processor = AutoProcessor.from_pretrained(model_path, local_files_only=True)
  File "C:\Users\user\anaconda3\envs\kosmos2\lib\site-packages\transformers\models\auto\processing_auto.py", line 314, in from_pretrained
    return processor_class.from_pretrained(
  File "C:\Users\user\anaconda3\envs\kosmos2\lib\site-packages\transformers\processing_utils.py", line 465, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\user\anaconda3\envs\kosmos2\lib\site-packages\transformers\processing_utils.py", line 511, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
  File "C:\Users\user\anaconda3\envs\kosmos2\lib\site-packages\transformers\image_processing_utils.py", line 207, in from_pretrained
    return cls.from_dict(image_processor_dict, **kwargs)
  File "C:\Users\user\anaconda3\envs\kosmos2\lib\site-packages\transformers\image_processing_utils.py", line 413, in from_dict
    image_processor = cls(**image_processor_dict)
  File "C:\Users\user\anaconda3\envs\kosmos2\lib\site-packages\transformers\models\clip\image_processing_clip.py", line 145, in __init__
    self.size = {"height": size["shortest_edge"], "width": size["shortest_edge"]}
KeyError: 'shortest_edge'
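
For reference, a minimal sketch of the trigger inferred from the traceback above (not a verified standalone repro): the saved config stores "size" as height/width, but with use_square_size enabled the CLIPImageProcessor constructor looks up size["shortest_edge"].

from transformers import CLIPImageProcessor

# Inferred from the traceback: when use_square_size=True, __init__ reads
# size["shortest_edge"], but the locally saved config serializes size as
# {"height": ..., "width": ...}, so the lookup fails.
CLIPImageProcessor(size={"height": 224, "width": 224}, use_square_size=True)
# KeyError: 'shortest_edge'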

This issue may relate to: #27690


preprocessor_config.json from ./models/transformers/:

{
  "_valid_processor_keys": [
    "images",
    "do_resize",
    "size",
    "resample",
    "do_center_crop",
    "crop_size",
    "do_rescale",
    "rescale_factor",
    "do_normalize",
    "image_mean",
    "image_std",
    "do_convert_rgb",
    "return_tensors",
    "data_format",
    "input_data_format"
  ],
  "crop_size": {
    "height": 224,
    "width": 224
  },
  "do_center_crop": true,
  "do_convert_rgb": true,
  "do_normalize": true,
  "do_rescale": true,
  "do_resize": true,
  "image_mean": [
    0.48145466,
    0.4578275,
    0.40821073
  ],
  "image_processor_type": "CLIPImageProcessor",
  "image_std": [
    0.26862954,
    0.26130258,
    0.27577711
  ],
  "processor_class": "Kosmos2Processor",
  "resample": 3,
  "rescale_factor": 0.00392156862745098,
  "size": {
    "height": 224,
    "width": 224
  },
  "use_square_size": true
}
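
A possible local workaround (an assumption on my part, not verified): rewrite the "size" entry in the saved preprocessor_config.json back to the shortest_edge form the constructor expects before loading the processor from the local directory.

import json

# Hypothetical workaround: patch the locally saved config so "size" uses the
# shortest_edge form instead of height/width.
config_file = "./models/transformers/preprocessor_config.json"
with open(config_file) as f:
    config = json.load(f)
config["size"] = {"shortest_edge": 224}
with open(config_file, "w") as f:
    json.dump(config, f, indent=2)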
amyeroberts (Collaborator)

cc @ydshieh

ydshieh self-assigned this Apr 29, 2024
ydshieh (Collaborator) commented Apr 30, 2024

Hi @Charizhardt

Thank you for reporting this issue. I confirmed it is reproducible:

from transformers import AutoProcessor

model_path = "./models/transformers/"
model_name = "microsoft/kosmos-2-patch14-224"

processor = AutoProcessor.from_pretrained(model_name)
processor.save_pretrained(model_path)
processor = AutoProcessor.from_pretrained(model_path, local_files_only=True)
print("-----------  processor loaded from local dir ------------")

With model_name = "openai/clip-vit-large-patch14" there is no such issue.
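
A sketch of that comparison (local path is illustrative):

from transformers import AutoProcessor

# The same save/reload round-trip with plain CLIP succeeds, presumably because
# its config keeps "size" in the shortest_edge form and does not set
# use_square_size.
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")
processor.save_pretrained("./models/clip/")
processor = AutoProcessor.from_pretrained("./models/clip/", local_files_only=True)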
