Hi, it seems that when I deploy the model:
```python
huggingface_model = HuggingFaceModel(
    model_data=model_s3_uri,
    role=role,
    transformers_version="4.49.0",
    pytorch_version="2.6.0",
    py_version="py312",
)

predictor = huggingface_model.deploy(
    instance_type="ml.g5.48xlarge",
    initial_instance_count=1,
    endpoint_name="gemma-27b-inference",
    container_startup_health_check_timeout=900,
)

response = predictor.predict({
    "inputs": "what can i do?"
})
print(response)
```
I get this error:

```
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation:
Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "The checkpoint you are trying to load has model type gemma3_text but
  Transformers does not recognize this architecture. This could be because of an issue
  with the checkpoint, or because your version of Transformers is out of date.\n\nYou
  can update Transformers with the command pip install --upgrade transformers."
}
```
Now, I know HuggingFaceModel doesn't support anything above transformers 4.49.0; if I set transformers_version="4.50.0" it raises an error telling me to use a supported version. The problem is that Gemma 3 is not available in 4.49, so how do I fix this? I have the trained model in my bucket, I just can't deploy it because of the transformers version. Is there a way to override the container that HuggingFaceModel uses so it runs a newer transformers?
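One possible way to override the container (a sketch, not verified against every SDK version): HuggingFaceModel also accepts an explicit image_uri, which bypasses the transformers_version/pytorch_version-to-image lookup entirely, so you can point at a newer Deep Learning Container whose transformers build knows gemma3. The ECR URI below is a placeholder assumption; look up the real image for your region and framework in the AWS Deep Learning Containers list:

```python
# Placeholder ECR URI -- replace with an actual HF DLC image for your
# region/framework from the AWS Deep Learning Containers list.
image_uri = (
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
    "huggingface-pytorch-inference:<newer-tag-for-your-region>"
)

# When image_uri is given, transformers_version/pytorch_version/py_version
# are not needed -- the image itself determines the runtime versions.
model_kwargs = dict(
    model_data="s3://my-bucket/gemma3/model.tar.gz",  # your model_s3_uri
    role="<execution-role-arn>",                      # your role
    image_uri=image_uri,
)
# huggingface_model = HuggingFaceModel(**model_kwargs)  # requires the sagemaker SDK
```

The deploy() call stays the same as above; only the model construction changes.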
I tried installing the Gemma 3 preview branch locally, but that doesn't help inside SageMaker, because I can't point transformers_version at this branch and the container installs its own transformers:

```
pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3
```
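An alternative that keeps the stock 4.49 container: the SageMaker Hugging Face inference toolkit installs a code/requirements.txt found inside your model archive at endpoint startup, so you can repack model.tar.gz with a requirements.txt that pins the Gemma 3 branch above. A minimal sketch, assuming your trained model artifacts are in a local directory (function and path names here are illustrative):

```python
import tarfile
from pathlib import Path

def repack_with_requirements(model_dir: str, out_tar: str) -> None:
    """Repack a local model directory into model.tar.gz with a
    code/requirements.txt that the HF inference container installs
    at startup."""
    root = Path(model_dir)
    code_dir = root / "code"
    code_dir.mkdir(exist_ok=True)
    # Pin the Gemma 3 preview branch (or transformers>=4.50.0 once the
    # release supports it) so the container upgrades itself on boot.
    (code_dir / "requirements.txt").write_text(
        "git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3\n"
    )
    with tarfile.open(out_tar, "w:gz") as tar:
        # Add each top-level item so archive paths are relative
        # (config.json, model weights, code/, ...), as SageMaker expects.
        for item in root.iterdir():
            tar.add(item, arcname=item.name)
```

Upload the resulting model.tar.gz back to S3 and pass its URI as model_data; note the container startup will be slower since it installs transformers from source.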