Remove initial_state from gated_deltanet chunk_fwd_h problem #110

Merged
msaroufim merged 1 commit into gpu-mode:main from yf225:simplify_chunk_fwd_h on Mar 6, 2026
Conversation

@yf225 (Contributor) commented Mar 6, 2026

initial_state is an inference-only feature (used for multi-turn or streaming generation) and is not used during training. This PR simplifies the problem to always start from a zero state, matching the typical training workload.
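
To illustrate what starting from zeros means for a chunked state recurrence, here is a minimal sketch in plain Python. It is not the problem's reference kernel: the function name `chunk_states`, the per-chunk scalar `decay`, and the plain outer-product update are simplifying assumptions standing in for the real gated delta rule. It only shows that omitting `initial_state` is equivalent to passing an all-zero state.

```python
def chunk_states(k, v, decay, chunk_size, initial_state=None):
    """Hypothetical, simplified chunked recurrence (not the actual kernel).

    k: list of T key vectors (each of length K)
    v: list of T value vectors (each of length V)
    decay: one scalar decay factor per chunk (an assumption for illustration)
    Returns (state at the start of each chunk, final state).
    """
    K, V = len(k[0]), len(v[0])
    if initial_state is None:
        # Training-style workload: the state always starts from zeros.
        h = [[0.0] * V for _ in range(K)]
    else:
        # Inference-style workload: resume from a carried-over state.
        h = [row[:] for row in initial_state]

    states = []
    for c, start in enumerate(range(0, len(k), chunk_size)):
        states.append([row[:] for row in h])  # state seen by chunk c
        # Decay the old state, then accumulate k_t (outer) v_t over the chunk.
        h = [[decay[c] * x for x in row] for row in h]
        for t in range(start, min(start + chunk_size, len(k))):
            for i in range(K):
                for j in range(V):
                    h[i][j] += k[t][i] * v[t][j]
    return states, h


k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
v = [[1.0], [2.0], [3.0], [4.0]]
decay = [0.5, 0.5]

states, final = chunk_states(k, v, decay, chunk_size=2)
zeros = [[0.0], [0.0]]
states_explicit, final_explicit = chunk_states(
    k, v, decay, chunk_size=2, initial_state=zeros
)
```

In this sketch, dropping the `initial_state` argument removes one input and one branch while leaving the zero-start path unchanged, which is the simplification the PR describes.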

@msaroufim msaroufim merged commit 6c7120f into gpu-mode:main Mar 6, 2026