@hengtaoguo hengtaoguo commented Nov 20, 2025

Description

Tests

Tested with a decode forward pass:

python -m MaxText.decode MaxText/configs/base.yml \
  model_name=gemma3-4b \
  tokenizer_type=huggingface \
  tokenizer_path=google/gemma-3-4b-it \
  load_parameters_path=gs://maxtext-gemma/unified/gemma3/4b/unscanned/2025-08-09-01-17/0/items \
  per_device_batch_size=1 \
  run_name=ht_test \
  max_prefill_predict_length=272 \
  max_target_length=372 \
  steps=1 \
  async_checkpointing=false \
  scan_layers=false \
  use_multimodal=true \
  prompt=\'Describe\ image\ \<start_of_image\>\' \
  image_path=\'/home/hengtaoguo_google_com/projects/maxtext/src/MaxText/test_assets/test_image.jpg\' \
  attention=\'dot_product\' \
  hf_access_token=xxx
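
The backslash escaping in the `prompt=` and `attention=` arguments above can be hard to read. As a rough illustration only, the sketch below uses Python's standard `shlex` module, which approximates POSIX shell tokenization, to show the single argument each fragment resolves to. The helper name `posix_tokens` is hypothetical, and how MaxText interprets the value after it arrives is not covered here.

```python
import shlex

# Fragments copied verbatim from the test command above.
prompt_arg = r"prompt=\'Describe\ image\ \<start_of_image\>\'"
attention_arg = r"attention=\'dot_product\'"


def posix_tokens(fragment: str) -> list[str]:
    """Split a command-line fragment using POSIX-style quoting rules.

    Hypothetical helper for illustration; shlex approximates a shell,
    it does not run one.
    """
    return shlex.split(fragment, posix=True)


# A backslash outside quotes makes the next character literal, so the
# escaped spaces do not split the prompt into several arguments and the
# escaped single quotes reach the program as part of the value.
prompt_tokens = posix_tokens(prompt_arg)
attention_tokens = posix_tokens(attention_arg)

print(prompt_tokens)     # ["prompt='Describe image <start_of_image>'"]
print(attention_tokens)  # ["attention='dot_product'"]
```

Under these assumptions, each fragment reaches the program as one argument whose value still carries the literal single quotes.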

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@hengtaoguo hengtaoguo marked this pull request as ready for review November 20, 2025 23:58
@hengtaoguo hengtaoguo changed the title from "Update attention for Gemma3 ViT" to "Update attention layer for Gemma3 ViT" Nov 21, 2025

@richjames0 richjames0 left a comment

lgtm

@copybara-service copybara-service bot merged commit b53bf3b into main Nov 25, 2025
38 checks passed
@copybara-service copybara-service bot deleted the hengtaoguo-kvcache branch November 25, 2025 19:44