Add unit tests for Multi-Head Latent Attention (MLA).
Source: custom_ops/gpu_ops/ — look for multi_head_latent_attention
Registration: custom_ops/gpu_ops/cpp_extensions.cc
Test file: tests/operators/test_multi_head_latent_attention.py
MLA is used in DeepSeek-style models. Compare the op's output against a reference implementation built from standard attention plus a low-rank KV projection. Cover multiple head configurations, sequence lengths, and KV compression ratios.
Branch: task/053-multi-head-latent-attention-test
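To anchor the reference comparison, here is a minimal NumPy sketch of the equivalence the tests can exploit: computing attention through the compressed latent cache versus folding the down/up projections into full-rank K/V weights and running standard attention. Both paths are mathematically identical by matrix associativity, so they should match to numerical precision. All names here (`mla_via_latent`, `mla_via_fused`, the weight shapes, no RoPE or causal mask) are illustrative assumptions, not the actual kernel's API; the real test should call the registered `multi_head_latent_attention` op and compare against a reference like this.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sdpa(q, k, v):
    """Plain scaled dot-product attention. q, k, v: (seq, num_heads, head_dim)."""
    seq, num_heads, head_dim = q.shape
    scale = 1.0 / np.sqrt(head_dim)
    out = np.empty_like(q)
    for h in range(num_heads):
        scores = softmax((q[:, h] @ k[:, h].T) * scale, axis=-1)
        out[:, h] = scores @ v[:, h]
    return out


def mla_via_latent(x, W_q, W_dkv, W_uk, W_uv, num_heads):
    """MLA path: compress x into a low-rank latent, then up-project K/V per head."""
    seq, _ = x.shape
    head_dim = W_q.shape[1] // num_heads
    q = (x @ W_q).reshape(seq, num_heads, head_dim)
    c_kv = x @ W_dkv  # (seq, d_c): the compressed KV cache
    k = (c_kv @ W_uk).reshape(seq, num_heads, head_dim)
    v = (c_kv @ W_uv).reshape(seq, num_heads, head_dim)
    return sdpa(q, k, v).reshape(seq, -1)


def mla_via_fused(x, W_q, W_dkv, W_uk, W_uv, num_heads):
    """Reference path: fold down/up projections into full-rank K/V weights,
    then run standard attention. Identical to the latent path by associativity."""
    seq, _ = x.shape
    head_dim = W_q.shape[1] // num_heads
    q = (x @ W_q).reshape(seq, num_heads, head_dim)
    k = (x @ (W_dkv @ W_uk)).reshape(seq, num_heads, head_dim)
    v = (x @ (W_dkv @ W_uv)).reshape(seq, num_heads, head_dim)
    return sdpa(q, k, v).reshape(seq, -1)
```

In the real test this pair can be swept over `num_heads`, `seq`, and the latent width `d_c` (the compression ratio is `d_model / d_c`) with `np.testing.assert_allclose` at a tolerance appropriate to the kernel's dtype.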