**Is your feature request related to a problem? Please describe.**
Context parallelism to scale HSTU training.

**Describe the solution you'd like**
TBD

**Describe alternatives you've considered**
TBD

**Additional context**
See [Context Parallelism](https://docs.nvidia.com/megatron-core/developer-guide/latest/api-guide/context_parallel.html)