* [ ] Support both source and destination offsets in `NetIbQp::stage_send()`
* [ ] Offsets of importing/exporting tensors are not properly handled
* [x] Use Kahan sum for layernorm (#159)
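
For reference, the Kahan sum item can be illustrated with a minimal host-side sketch. This is not the code from #159: the function names `kahan_sum` and `layernorm` are hypothetical, the example runs on the CPU over a single row, and it omits the learned scale and bias. It only shows how compensated summation could be applied to the mean and variance reductions of a layer normalization.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Kahan (compensated) summation: `c` carries the low-order bits lost in
// each floating-point addition and feeds them back into the next term.
// Hypothetical helper, not part of the project's API.
float kahan_sum(const std::vector<float>& x) {
    float sum = 0.0f;
    float c = 0.0f;  // running compensation
    for (float v : x) {
        float y = v - c;    // apply the correction from the previous step
        float t = sum + y;  // low-order bits of y may be lost here
        c = (t - sum) - y;  // recover what was lost (negated)
        sum = t;
    }
    return sum;
}

// Hypothetical layer normalization over one row, without scale or bias.
// Both reductions (mean and variance) use kahan_sum.
std::vector<float> layernorm(const std::vector<float>& x, float eps = 1e-5f) {
    assert(!x.empty());
    const float n = static_cast<float>(x.size());
    const float mean = kahan_sum(x) / n;

    // Squared deviations from the mean, reduced with compensated summation.
    std::vector<float> sq(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const float d = x[i] - mean;
        sq[i] = d * d;
    }
    const float var = kahan_sum(sq) / n;

    const float inv_std = 1.0f / std::sqrt(var + eps);
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = (x[i] - mean) * inv_std;
    }
    return out;
}
```

Calling `layernorm({1.0f, 2.0f, 3.0f, 4.0f})` returns a row with approximately zero mean and unit variance. Note that aggressive floating-point flags such as `-ffast-math` may allow the compiler to simplify the compensation term away, so a sketch like this assumes strict floating-point semantics.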