
Why is torch.nn.Linear split into Transpose and Gemm layers by torch.onnx.export()? #3257

Closed
imai-lm opened this issue Oct 24, 2017 · 16 comments
Labels
module: onnx Related to torch.onnx

Comments

imai-lm commented Oct 24, 2017

I looked into the output of torch.onnx.export() and found that every layer declared as torch.nn.Linear() was split into two layers: a Transpose followed by a Gemm. I think this is redundant, because the ONNX Gemm operator has a transB attribute, which transposes the second argument.
Why not use that attribute and translate the layer to a single Gemm?
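
For concreteness, a minimal hand-built sketch of what a single-Gemm translation could look like, using the onnx.helper API (the tensor names and tiny shapes here are made up for illustration):

import onnx
from onnx import helper, TensorProto

# torch.nn.Linear computes y = x @ W^T + b.  ONNX Gemm can absorb the
# transpose of W via its transB attribute, so one node is enough.
gemm = helper.make_node(
    "Gemm", inputs=["x", "W", "b"], outputs=["y"],
    alpha=1.0, beta=1.0, transB=1,
)

# A tiny graph just to show the node checks out (in_features=4, out_features=3).
graph = helper.make_graph(
    [gemm], "linear_as_single_gemm",
    inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [10, 4])],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [10, 3])],
    initializer=[
        helper.make_tensor("W", TensorProto.FLOAT, [3, 4], [0.0] * 12),
        helper.make_tensor("b", TensorProto.FLOAT, [3], [0.0] * 3),
    ],
)
onnx.checker.check_model(helper.make_model(graph))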

apaszke (Contributor) commented Oct 24, 2017

cc: @ezyang @houseroad

ezyang added the module: onnx label on Oct 24, 2017
ezyang (Contributor) commented Oct 24, 2017

Yes, this is a known issue. To fix this we'll need to introduce a little optimization pass that fuses transposes into Gemm operators.
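
Roughly, such a fusion could look like the sketch below. This is written against the onnx Python API rather than the exporter's internals, so the function name and details are purely illustrative, and Gemm attributes other than alpha/beta/transA/transB are not carried over:

import onnx
from onnx import helper


def fuse_transpose_into_gemm(model: onnx.ModelProto) -> onnx.ModelProto:
    # Rewrite Gemm(x, Transpose(W, perm=[1, 0]), b) as Gemm(x, W, b) with transB flipped.
    g = model.graph
    producer = {out: n for n in g.node for out in n.output}

    def attr(node, name, default=None):
        for a in node.attribute:
            if a.name == name:
                return helper.get_attribute_value(a)
        return default

    rewritten = []
    for node in g.node:
        if node.op_type == "Gemm" and len(node.input) >= 2:
            t = producer.get(node.input[1])
            if t is not None and t.op_type == "Transpose" and list(attr(t, "perm") or []) == [1, 0]:
                # Read B straight from the Transpose's input and flip transB instead.
                rewritten.append(helper.make_node(
                    "Gemm",
                    [node.input[0], t.input[0]] + list(node.input[2:]),
                    list(node.output),
                    alpha=float(attr(node, "alpha", 1.0)),
                    beta=float(attr(node, "beta", 1.0)),
                    transA=int(attr(node, "transA", 0)),
                    transB=1 - int(attr(node, "transB", 0)),
                ))
                continue
        rewritten.append(node)

    # Drop a Transpose only once nothing in the graph reads its output any more,
    # so a weight that is transposed for several consumers is left alone.
    still_used = {i for n in rewritten for i in n.input} | {o.name for o in g.output}
    kept = [n for n in rewritten
            if not (n.op_type == "Transpose" and not any(o in still_used for o in n.output))]

    del g.node[:]
    g.node.extend(kept)
    return model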

imai-lm (Author) commented Oct 26, 2017

I have another question: why don't you use the FC op instead of Gemm? Is it because FC is experimental?

imai-lm (Author) commented Oct 26, 2017

I found onnx/onnx#18, where FC was used. I know Gemm was not yet implemented in ONNX at that time, but why was the exporter changed later?

fmassa (Member) commented Oct 26, 2017

I think the plan is (was?) to remove FC onnx/onnx#47 (comment)

imai-lm (Author) commented Oct 26, 2017

@fmassa Oh, I see. Thanks.

imai-lm (Author) commented Oct 27, 2017

Now I've found there's another op named "expand" between the Transpose and the Gemm... What is this?

ezyang (Contributor) commented Oct 29, 2017

It's a bug. Please try #3325

imai-lm (Author) commented Oct 30, 2017

@ezyang Sorry, but I couldn't find the branch from your PR, ezyang:pr/expand-opt, in your repo...

imai-lm (Author) commented Oct 30, 2017

@ezyang I have tried it, but the resulting graph now has Expand instead of expand... Can that be pruned away?

ezyang (Contributor) commented Oct 30, 2017

@imai-lm So, IIUC, your exported ONNX model had Expand in it (i.e., it didn't error on expand)? That shouldn't happen. Did you recompile the C++ bits? If so, I'll try to repro tomorrow.

imai-lm (Author) commented Oct 30, 2017

@ezyang Yes, my exported ONNX model had Expand ops in it, and there was no error. I tried it in different environments (Ubuntu and macOS), reinstalling each time, and got the same result. What I ran was the following:

from torch.autograd import Variable
import torch.onnx
import torchvision

# Dummy input with batch size 10 (N, C, H, W)
dummy_input = Variable(torch.randn(10, 3, 224, 224))
model = torchvision.models.alexnet(pretrained=True)
# Export to ONNX; verbose=True prints the traced graph
torch.onnx.export(model, dummy_input, "alexnet.proto", verbose=True)

The resulting model was:

graph(%1 : Float(10, 3, 224, 224)
      %2 : Float(64, 3, 11, 11)
      %3 : Float(64)
      %4 : Float(192, 64, 5, 5)
      %5 : Float(192)
      %6 : Float(384, 192, 3, 3)
      %7 : Float(384)
      %8 : Float(256, 384, 3, 3)
      %9 : Float(256)
      %10 : Float(256, 256, 3, 3)
      %11 : Float(256)
      %12 : Float(4096, 9216)
      %13 : Float(4096)
      %14 : Float(4096, 4096)
      %15 : Float(4096)
      %16 : Float(1000, 4096)
      %17 : Float(1000)) {
  %19 : UNKNOWN_TYPE = Conv[kernel_shape=[11, 11], strides=[4, 4], pads=[2, 2, 2, 2], dilations=[1, 1], group=1](%1, %2), uses = [[%20.i0]];
  %20 : Float(10, 64, 55, 55) = Add[broadcast=1, axis=1](%19, %3), uses = [%21.i0];
  %21 : Float(10, 64, 55, 55) = Relu(%20), uses = [%22.i0];
  %22 : Float(10, 64, 27, 27) = MaxPool[dilations=[1, 1], kernel_shape=[3, 3], pads=[0, 0], strides=[2, 2]](%21), uses = [%23.i0];
  %24 : UNKNOWN_TYPE = Conv[kernel_shape=[5, 5], strides=[1, 1], pads=[2, 2, 2, 2], dilations=[1, 1], group=1](%22, %4), uses = [[%25.i0]];
  %25 : Float(10, 192, 27, 27) = Add[broadcast=1, axis=1](%24, %5), uses = [%26.i0];
  %26 : Float(10, 192, 27, 27) = Relu(%25), uses = [%27.i0];
  %27 : Float(10, 192, 13, 13) = MaxPool[dilations=[1, 1], kernel_shape=[3, 3], pads=[0, 0], strides=[2, 2]](%26), uses = [%28.i0];
  %29 : UNKNOWN_TYPE = Conv[kernel_shape=[3, 3], strides=[1, 1], pads=[1, 1, 1, 1], dilations=[1, 1], group=1](%27, %6), uses = [[%30.i0]];
  %30 : Float(10, 384, 13, 13) = Add[broadcast=1, axis=1](%29, %7), uses = [%31.i0];
  %31 : Float(10, 384, 13, 13) = Relu(%30), uses = [%32.i0];
  %33 : UNKNOWN_TYPE = Conv[kernel_shape=[3, 3], strides=[1, 1], pads=[1, 1, 1, 1], dilations=[1, 1], group=1](%31, %8), uses = [[%34.i0]];
  %34 : Float(10, 256, 13, 13) = Add[broadcast=1, axis=1](%33, %9), uses = [%35.i0];
  %35 : Float(10, 256, 13, 13) = Relu(%34), uses = [%36.i0];
  %37 : UNKNOWN_TYPE = Conv[kernel_shape=[3, 3], strides=[1, 1], pads=[1, 1, 1, 1], dilations=[1, 1], group=1](%35, %10), uses = [[%38.i0]];
  %38 : Float(10, 256, 13, 13) = Add[broadcast=1, axis=1](%37, %11), uses = [%39.i0];
  %39 : Float(10, 256, 13, 13) = Relu(%38), uses = [%40.i0];
  %40 : Float(10, 256, 6, 6) = MaxPool[dilations=[1, 1], kernel_shape=[3, 3], pads=[0, 0], strides=[2, 2]](%39), uses = [%41.i0];
  %41 : Float(10, 9216) = Reshape[shape=[10, 9216]](%40), uses = [%42.i0];
  %43 : Float(10, 9216), %44 : UNKNOWN_TYPE = Dropout[is_test=1, ratio=0.5](%41), uses = [[%47.i0], []];
  %45 : Float(9216!, 4096!) = Transpose[perm=[1, 0]](%12), uses = [%47.i1];
  %46 : Float(10!, 4096) = Expand[shape=[10, 4096]](%13), uses = [%47.i2];
  %47 : Float(10, 4096) = Gemm[alpha=1, beta=1](%43, %45, %46), uses = [%48.i0];
  %48 : Float(10, 4096) = Relu(%47), uses = [%49.i0];
  %50 : Float(10, 4096), %51 : UNKNOWN_TYPE = Dropout[is_test=1, ratio=0.5](%48), uses = [[%54.i0], []];
  %52 : Float(4096!, 4096!) = Transpose[perm=[1, 0]](%14), uses = [%54.i1];
  %53 : Float(10!, 4096) = Expand[shape=[10, 4096]](%15), uses = [%54.i2];
  %54 : Float(10, 4096) = Gemm[alpha=1, beta=1](%50, %52, %53), uses = [%55.i0];
  %55 : Float(10, 4096) = Relu(%54), uses = [%58.i0];
  %56 : Float(4096!, 1000!) = Transpose[perm=[1, 0]](%16), uses = [%58.i1];
  %57 : Float(10!, 1000) = Expand[shape=[10, 1000]](%17), uses = [%58.i2];
  %58 : Float(10, 1000) = Gemm[alpha=1, beta=1](%55, %56, %57), uses = [%0.i0];
  return (%58);
}

ezyang (Contributor) commented Oct 30, 2017

Reproduced. Looking into it.

ezyang (Contributor) commented Oct 30, 2017

@imai-lm I just realized what's happening: there is some more post-processing after the verbose print, so what you see is not what you actually get. If you look at the actual ONNX file produced, it will not have any Expand calls. We'll fix the printing problem shortly.

imai-lm (Author) commented Oct 30, 2017

@ezyang I confirmed it with the following code. Indeed, there were no Expand ops in the file. Thanks.

import onnx

# Load the exported file and print the ops that actually ended up in it
model = onnx.load('alexnet.proto')
print(onnx.helper.printable_graph(model.graph))
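
As an extra sanity check on the same file (a quick sketch, assuming the alexnet.proto exported above), one can also assert that no Expand nodes survived:

import onnx

model = onnx.load('alexnet.proto')
# After the exporter's post-processing, the saved file should contain no Expand nodes.
assert all(node.op_type != 'Expand' for node in model.graph.node)
print(sorted({node.op_type for node in model.graph.node}))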

soumith added this to JIT/ATen/ONNX in Issue Categories on Dec 1, 2017
ezyang closed this as completed on Dec 28, 2017
soumith removed this from JIT/ATen/ONNX in Issue Categories on Feb 20, 2018