
Shieldgemma2 #36678

Merged
merged 14 commits into from
Mar 20, 2025

Conversation

RyanMullins
Contributor

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions github-actions bot marked this pull request as draft March 12, 2025 15:43

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@RyanMullins RyanMullins marked this pull request as ready for review March 12, 2025 18:22
Collaborator

@ArthurZucker ArthurZucker left a comment


Thanks, small nits but good to go otherwise!

@ain-soph
Contributor

ain-soph commented Mar 17, 2025

@RyanMullins Submit a PR for this branch as bug fixes.
RyanMullins#2

@ain-soph
Contributor

There is still a logging issue that outputs incorrect info:

from PIL import Image
import requests
from transformers import AutoProcessor, ShieldGemma2ForImageClassification

model_id = "google/shieldgemma-2-4b-it"
model = ShieldGemma2ForImageClassification.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"
image = Image.open(requests.get(url, stream=True).raw)

custom_policies = {
    "key_a": "description_a",
    "key_b": "description_b",
}

inputs = processor(
    images=[image],
    custom_policies=custom_policies,
    policies=["dangerous", "key_b"],
    return_tensors="pt",
).to(model.device)

output = model(**inputs)
print(output.probabilities)
$ python test.py
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:06<00:00,  3.12s/it]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.50, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Keyword argument `custom_policies` is not a valid argument for this processor and will be ignored.
Keyword argument `policies` is not a valid argument for this processor and will be ignored.
tensor([[5.3613e-11, 1.0000e+00],
        [3.7379e-09, 1.0000e+00]], device='cuda:1', grad_fn=<ToCopyBackward0>)

As you can see above, the arguments `custom_policies` and `policies` are already functioning (the output tensor now has only 2 classes), but we still get:

Keyword argument `custom_policies` is not a valid argument for this processor and will be ignored.
Keyword argument `policies` is not a valid argument for this processor and will be ignored.
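The warning comes from kwarg filtering, not from the arguments being dropped. The following is a minimal sketch (not the actual transformers implementation) of that mechanism: a processor only forwards kwargs it has declared and warns about the rest, so registering `policies`/`custom_policies` in an explicit kwargs class (as the later `ShieldGemma2ProcessorKwargs` fix does) silences the message while the arguments keep working.

```python
# Hypothetical sketch of processor kwarg filtering; names are illustrative.
ACCEPTED_KWARGS = {"images", "text", "return_tensors"}

def split_kwargs(accepted, **kwargs):
    """Split kwargs into (forwarded, ignored) the way a processor might."""
    forwarded = {k: v for k, v in kwargs.items() if k in accepted}
    ignored = sorted(k for k in kwargs if k not in accepted)
    for k in ignored:
        print(f"Keyword argument `{k}` is not a valid argument for this processor and will be ignored.")
    return forwarded, ignored

# Before the fix: policy-related kwargs trigger the warning.
_, ignored = split_kwargs(ACCEPTED_KWARGS, policies=["dangerous"], return_tensors="pt")
assert ignored == ["policies"]

# After declaring them explicitly: no warning.
_, ignored = split_kwargs(ACCEPTED_KWARGS | {"policies", "custom_policies"},
                          policies=["dangerous"], return_tensors="pt")
assert ignored == []
```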

@ain-soph
Contributor

ain-soph commented Mar 17, 2025

Also, the output tensor is currently confusing:

  • What does each dimension of the output tensor shape mean?
  • What does each number mean, and which harmful category does it map to? I assume the second column is redundant, since it seems to be 1 - first_column.

It would be nice to have an explanation of this in the README / API docs.
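For illustration, here is a hedged sketch of one plausible reading of the tensor, assuming each row corresponds to one (image, policy) pair in request order and the two columns are the model's Yes/No probabilities (Yes meaning the image violates that policy). Plain lists stand in for the torch tensor; the exact semantics are what the docstrings added later in this PR pin down.

```python
# Values copied from the output shown earlier in this thread; the Yes/No
# column interpretation is an assumption, not confirmed library behavior.
probabilities = [
    [5.3613e-11, 1.0000e+00],  # row for policy "dangerous"
    [3.7379e-09, 1.0000e+00],  # row for policy "key_b"
]
policies = ["dangerous", "key_b"]

for policy, (p_yes, p_no) in zip(policies, probabilities):
    verdict = "violates" if p_yes > p_no else "does not violate"
    print(f"{policy}: P(yes)={p_yes:.2e} -> image {verdict} the policy")

# If the two columns are softmaxed over just the Yes/No tokens, the second
# column indeed carries no extra information: p_no == 1 - p_yes up to rounding.
```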

@RyanMullins
Contributor Author

@ain-soph thanks for the contribs. For some reason I can't reply to your comments, but I've added docstrings to address your questions about the shape of the output tensors, and I added an explicit ShieldGemma2ProcessorKwargs to address the logs you saw.

@ghunkins

When running the default example, an IndexError is raised. Passing in custom_policies fixes that.

Example

from PIL import Image
import requests
from transformers import AutoProcessor, ShieldGemma2ForImageClassification

model_id = "google/shieldgemma-2-4b-it"
model = ShieldGemma2ForImageClassification.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=[image], return_tensors="pt").to(model.device)

output = model(**inputs)
print(output.probabilities)

Stack Trace

Loading checkpoint shards: 100% 2/2 [00:06<00:00,  2.86s/it]
WARNING:accelerate.big_modeling:Some parameters are on the meta device because they were offloaded to the cpu.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-26-290947fc3325> in <cell line: 0>()
     10 image = Image.open(requests.get(url, stream=True).raw)
     11 
---> 12 inputs = processor(images=[image], return_tensors="pt").to(model.device)
     13 
     14 output = model(**inputs)

1 frames
/usr/local/lib/python3.11/dist-packages/transformers/processing_utils.py in apply_chat_template(self, conversation, chat_template, **kwargs)
   1326 
   1327         if isinstance(conversation, (list, tuple)) and (
-> 1328             isinstance(conversation[0], (list, tuple)) or hasattr(conversation[0], "content")
   1329         ):
   1330             is_batched = True

IndexError: list index out of range
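The traceback shows `apply_chat_template` indexing `conversation[0]` on an empty list. A minimal sketch of that failure mode, with a hypothetical length guard (the actual fix in this PR was different: shipping default policies so the list is never empty):

```python
# Simplified reconstruction of the batching check from the traceback above;
# the length guard is an illustrative addition, not the shipped fix.
def is_batched(conversation):
    return isinstance(conversation, (list, tuple)) and len(conversation) > 0 and (
        isinstance(conversation[0], (list, tuple)) or hasattr(conversation[0], "content")
    )

# Without `len(conversation) > 0`, conversation[0] raises IndexError on [].
assert is_batched([]) is False
assert is_batched([["user msg"]]) is True  # batched: a list of lists
```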

Collaborator

@ArthurZucker ArthurZucker left a comment


LGTM! Thanks for iterating!

@RyanMullins
Contributor Author

@ghunkins thanks for the bug report!

The problem was that the files on the Hub were out of date and the processor.json was missing the policy_definitions property. I was using updated files locally, which is why I didn't see this until I repro'd your example in Colab. The files are now updated on the Hub.

A suggestion from @ArthurZucker incidentally provided a local solution to the problem as well, by adding a default set of policies. So this should be fully fixed now and ready for merging.
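The default-policies fallback described above can be sketched as follows. This is a hypothetical illustration, not the library's code: the constant and function names are invented, and the point is only the resolution order (defaults, then custom overrides, then an explicit selection) that keeps the prompt list from ever being empty.

```python
# Illustrative defaults; the real policy texts live in the model's processor config.
DEFAULT_POLICY_DEFINITIONS = {
    "dangerous": "Content that facilitates serious harm.",
    "sexual": "Sexually explicit content.",
    "violence": "Gory or violent content.",
}

def resolve_policies(policy_definitions=None, custom_policies=None, policies=None):
    """Merge defaults, config, and call-site policies; never return an empty set."""
    defs = dict(policy_definitions or DEFAULT_POLICY_DEFINITIONS)
    defs.update(custom_policies or {})
    selected = policies or list(defs)
    return {name: defs[name] for name in selected}

# A missing processor config no longer yields an empty prompt list.
assert resolve_policies() == DEFAULT_POLICY_DEFINITIONS
assert list(resolve_policies(policies=["dangerous"])) == ["dangerous"]
```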

@ghunkins

Brilliant, thanks for the amazing work on this @RyanMullins !

Collaborator

@ArthurZucker ArthurZucker left a comment


@RyanMullins let's just fix the CIs! 🤗

@ArthurZucker
Collaborator

Do you need help on my side for the last remaining tests?

@RyanMullins RyanMullins force-pushed the shieldgemma2 branch 3 times, most recently from 03efe5e to 36135cf Compare March 20, 2025 13:54
@ArthurZucker ArthurZucker merged commit 487dab1 into huggingface:main Mar 20, 2025
19 of 21 checks passed
@ain-soph
Contributor

nit: I just tested, and there is the same kind of warning for `policy_definitions`:
Some kwargs in processor config are unused and will not have any effect: policy_definitions.
