But token_seq*_ comes from x_tensor that was Z-scored using the entire window (lookback_window + predict_window + 1).
This makes each token_in position depend on statistics that include future points in the same window, which can cause distribution mismatch vs. real-time inference.
So, I opened a PR with the fix and a small comment tying the change back to the paper rationale: #83
What I changed (minimal fix):
Add code comments that make the relevant passages of the paper easier to locate.
Compute Z-score using only the lookback segment.
Start training from the lookback boundary (predict future steps only), to better simulate deployment.
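For illustration, here is a minimal NumPy sketch of the lookback-only normalization idea. It is not the code from the PR: the function name `normalize_with_lookback_stats`, the `eps` value, and the `(time, features)` array layout are assumptions made for this example, and NumPy stands in for whatever tensor library the repository uses.

```python
import numpy as np


def normalize_with_lookback_stats(window, lookback_window, eps=1e-5):
    """Z-score a (time, features) window using lookback-only statistics.

    Hypothetical helper for illustration; not part of the repository.
    The mean and std come from the first `lookback_window` rows only,
    so the normalized lookback rows never depend on future values.
    """
    window = np.asarray(window, dtype=np.float64)
    lookback = window[:lookback_window]
    mean = lookback.mean(axis=0)
    std = lookback.std(axis=0)
    # The same lookback statistics are applied to the whole window,
    # including the future rows that serve as prediction targets.
    return (window - mean) / (std + eps)


# Sanity check: perturbing the future rows must not change the
# normalized lookback rows.
rng = np.random.default_rng(0)
lookback_window, predict_window = 8, 4
window = rng.normal(size=(lookback_window + predict_window + 1, 3))

original = normalize_with_lookback_stats(window, lookback_window)

perturbed_window = window.copy()
perturbed_window[lookback_window:] += 100.0
perturbed = normalize_with_lookback_stats(perturbed_window, lookback_window)

lookback_unchanged = np.allclose(
    original[:lookback_window], perturbed[:lookback_window]
)
```

With full-window statistics the same perturbation would shift the mean and std, and therefore every normalized lookback value. That shift is the leakage described above.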
If you’d like, I can follow up with a fully causal sliding/online normalization variant as well.