
feat(lazy_vlm): implement normal prefill and decode logic #424

Merged
chenghuaWang merged 2 commits into UbiquitousLearning:v2 from chenghuaWang:v2 on Sep 4, 2025


@chenghuaWang
Collaborator

feat(lazy_vlm): implement normal prefill and decode logic

  • Add forward_normal_prefill and forward_normal_decode functions to Qwen2_5VLAttention, Qwen2_5VLDecoder, and Qwen2_5VLText modules
  • Implement causal mask creation and initialization
  • Update hidden states and position embeddings for normal prefill and decode
  • Modify KV cache update logic
  • Refactor main function to use new normal prefill and decode logic
  • Update run.py to enable debug build
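
For readers unfamiliar with the prefill/decode split, the sketch below illustrates the general idea behind the bullets above: prefill processes the whole prompt with a lower-triangular causal mask and fills the KV cache, while decode appends one token at a time and attends to everything cached so far. It is a minimal illustration in plain Python, not the code from this PR. The names `make_causal_mask`, `KVCache`, `prefill`, and `decode` are hypothetical, and the actual `forward_normal_prefill` and `forward_normal_decode` functions operate on tensors and may differ in detail.

```python
# Illustrative sketch only: every name below is hypothetical and is not
# taken from this PR or from the project's API.

NEG_INF = float("-inf")


def make_causal_mask(q_len, kv_len):
    """Return a q_len x kv_len additive attention mask.

    Query i sits at absolute position (kv_len - q_len + i). It may attend
    to every key at or before that position (0.0); later keys get -inf.
    """
    offset = kv_len - q_len
    return [
        [0.0 if k <= offset + q else NEG_INF for k in range(kv_len)]
        for q in range(q_len)
    ]


class KVCache:
    """Append-only cache of key/value entries for a single layer."""

    def __init__(self):
        self.keys = []
        self.values = []

    def update(self, new_keys, new_values):
        # Append the new entries and return the full cached sequence.
        self.keys.extend(new_keys)
        self.values.extend(new_values)
        return self.keys, self.values


def prefill(cache, prompt_keys, prompt_values):
    """Cache the whole prompt and build its square causal mask."""
    keys, _ = cache.update(prompt_keys, prompt_values)
    return make_causal_mask(len(prompt_keys), len(keys))


def decode(cache, key, value):
    """Cache one new token; its single mask row covers all cached keys."""
    keys, _ = cache.update([key], [value])
    return make_causal_mask(1, len(keys))


cache = KVCache()
prefill_mask = prefill(cache, ["k0", "k1", "k2"], ["v0", "v1", "v2"])
decode_mask = decode(cache, "k3", "v3")
```

In this sketch the decode mask is a single row of zeros, because the newest token is allowed to see every cached position. That is why a decode step can often skip explicit masking; whether this PR does so is not stated in the description.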

- Update submodule versions:
  - fmt: removed branch specification
  - googletest: removed branch specification
  - benchmark: removed branch specification
  - pybind11: removed branch specification
  - kleidiai: removed branch specification
  - cccl: removed branch specification and updated to v3.0.2
  - cutlass: removed branch specification and updated to v4.1.0
- Fix lazy VLM cache issue by updating hidden state cache correctly
- Improve makeWindowIndex function formatting
- Add TODO markers for lazy VLM decoding
@chenghuaWang chenghuaWang merged commit df4b32a into UbiquitousLearning:v2 Sep 4, 2025