I have found a symbolic Jacobian assembly method that builds J directly and gives a speed-up of about 1.4× with rsparse (1.55× with KLU) on PEGASE 9241. I expected this to be a minor improvement before implementing it, but the gain seems worth having. For reference, a simpler optimization, an element-wise dSbus_dV, is also provided. It gives a decent speed-up on its own but is outperformed by direct J assembly.
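
To make the element-wise dSbus_dV idea concrete, here is a minimal sketch, not the code from this change. It assumes the usual MATPOWER-style polar formulas, `dS/dVm[i,j] = V_i conj(Y_ij V_j/|V_j|) + δ_ij conj(I_i) V_i/|V_i|` and `dS/dVa[i,j] = j V_i conj(δ_ij I_i - Y_ij V_j)`, and that every bus has a structural diagonal entry in Ybus. The names `C64`, `Csc` and `dsbus_dv_elementwise` are hypothetical and exist only for this illustration.

```rust
use std::ops::{Add, Mul, Sub};

/// Minimal complex number so the sketch needs no external crate (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct C64 {
    pub re: f64,
    pub im: f64,
}

impl C64 {
    pub const ZERO: C64 = C64 { re: 0.0, im: 0.0 };
    pub fn new(re: f64, im: f64) -> Self {
        C64 { re, im }
    }
    pub fn conj(self) -> Self {
        C64::new(self.re, -self.im)
    }
    pub fn abs(self) -> f64 {
        self.re.hypot(self.im)
    }
}

impl Add for C64 {
    type Output = C64;
    fn add(self, o: C64) -> C64 {
        C64::new(self.re + o.re, self.im + o.im)
    }
}

impl Sub for C64 {
    type Output = C64;
    fn sub(self, o: C64) -> C64 {
        C64::new(self.re - o.re, self.im - o.im)
    }
}

impl Mul for C64 {
    type Output = C64;
    fn mul(self, o: C64) -> C64 {
        C64::new(self.re * o.re - self.im * o.im, self.re * o.im + self.im * o.re)
    }
}

/// Ybus in compressed sparse column form (hypothetical layout).
pub struct Csc {
    pub col_ptr: Vec<usize>,
    pub row_idx: Vec<usize>,
    pub values: Vec<C64>,
}

/// Returns (dS/dVm, dS/dVa) as value arrays that share Ybus's sparsity pattern.
/// No intermediate sparse matrices or sparse products are formed.
pub fn dsbus_dv_elementwise(ybus: &Csc, v: &[C64]) -> (Vec<C64>, Vec<C64>) {
    let n = v.len();

    // Ibus = Ybus * V, accumulated column by column.
    let mut ibus = vec![C64::ZERO; n];
    for j in 0..n {
        for k in ybus.col_ptr[j]..ybus.col_ptr[j + 1] {
            let i = ybus.row_idx[k];
            ibus[i] = ibus[i] + ybus.values[k] * v[j];
        }
    }

    let nnz = ybus.values.len();
    let mut ds_dvm = vec![C64::ZERO; nnz];
    let mut ds_dva = vec![C64::ZERO; nnz];
    let imag_unit = C64::new(0.0, 1.0);

    for j in 0..n {
        let mag = v[j].abs();
        let vnorm_j = C64::new(v[j].re / mag, v[j].im / mag);
        for k in ybus.col_ptr[j]..ybus.col_ptr[j + 1] {
            let i = ybus.row_idx[k];
            let y = ybus.values[k];
            let mut dvm = v[i] * (y * vnorm_j).conj();
            let mut inner = C64::ZERO - y * v[j];
            if i == j {
                // Diagonal slots also receive the bus-current terms.
                dvm = dvm + ibus[i].conj() * vnorm_j;
                inner = inner + ibus[i];
            }
            ds_dvm[k] = dvm;
            ds_dva[k] = imag_unit * v[i] * inner.conj();
        }
    }
    (ds_dvm, ds_dva)
}
```

Because both outputs reuse the Ybus pattern, the only allocations per call are three dense vectors, which could be hoisted out and reused across Newton iterations.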

Benchmark results (10 loops, PEGASE 9241, release mode, Intel 10700K, 5 iterations):

| Variant | rsparse | KLU |
|---|---|---|
| no opt (original CSC path) | 216.92 ms | 184.72 ms |
| half opt (element-wise dSbus_dV) | 175.50 ms (1.24×) | 139.11 ms (1.33×) |
| opt (fill_jacobian_ultimate) | 152.34 ms (1.42×) | 119.19 ms (1.55×) |
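
As a rough picture of what "symbolic" assembly means for the `opt` row, the sketch below splits the work into a one-time planning pass and a per-iteration value fill. It is a simplified illustration under my own assumptions, not the actual `fill_jacobian_ultimate`: the names `Src`, `JacPlan`, `plan_jacobian` and `fill_jacobian_values` are hypothetical, complex values are passed as `(re, im)` tuples, the block layout is the usual `[[Re dS/dVa, Re dS/dVm], [Im dS/dVa, Im dS/dVm]]` restricted to the `pvpq` and `pq` buses, and row indices within a column are not sorted.

```rust
/// Which derivative and which part a Jacobian slot reads (hypothetical encoding).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Src {
    ReDva,
    ImDva,
    ReDvm,
    ImDvm,
}

/// Result of the symbolic pass: J's CSC pattern plus, for each slot,
/// the Ybus nonzero it comes from and the quantity to copy.
pub struct JacPlan {
    pub col_ptr: Vec<usize>,
    pub row_idx: Vec<usize>,
    pub src_k: Vec<usize>,
    pub src: Vec<Src>,
}

/// Symbolic pass, run once per network topology.
pub fn plan_jacobian(
    y_col_ptr: &[usize],
    y_row_idx: &[usize],
    pvpq: &[usize],
    pq: &[usize],
) -> JacPlan {
    let n = y_col_ptr.len() - 1;

    // Position of each bus inside pvpq / pq, or None if it is not a member.
    let mut pvpq_pos: Vec<Option<usize>> = vec![None; n];
    for (p, &b) in pvpq.iter().enumerate() {
        pvpq_pos[b] = Some(p);
    }
    let mut pq_pos: Vec<Option<usize>> = vec![None; n];
    for (p, &b) in pq.iter().enumerate() {
        pq_pos[b] = Some(p);
    }

    let npvpq = pvpq.len();
    let mut plan = JacPlan {
        col_ptr: vec![0],
        row_idx: Vec::new(),
        src_k: Vec::new(),
        src: Vec::new(),
    };

    // J columns: angle unknowns (pvpq buses) first, then magnitude unknowns (pq buses).
    let cols = pvpq
        .iter()
        .map(|&b| (b, true))
        .chain(pq.iter().map(|&b| (b, false)));

    for (j, is_angle) in cols {
        // Upper block rows: real part, restricted to pvpq buses.
        for k in y_col_ptr[j]..y_col_ptr[j + 1] {
            if let Some(r) = pvpq_pos[y_row_idx[k]] {
                plan.row_idx.push(r);
                plan.src_k.push(k);
                plan.src.push(if is_angle { Src::ReDva } else { Src::ReDvm });
            }
        }
        // Lower block rows: imaginary part, restricted to pq buses.
        for k in y_col_ptr[j]..y_col_ptr[j + 1] {
            if let Some(r) = pq_pos[y_row_idx[k]] {
                plan.row_idx.push(npvpq + r);
                plan.src_k.push(k);
                plan.src.push(if is_angle { Src::ImDva } else { Src::ImDvm });
            }
        }
        plan.col_ptr.push(plan.row_idx.len());
    }
    plan
}

/// Numeric pass, run every Newton iteration: a single linear scatter into J's values.
pub fn fill_jacobian_values(
    plan: &JacPlan,
    ds_dva: &[(f64, f64)],
    ds_dvm: &[(f64, f64)],
    values: &mut [f64],
) {
    for (slot, (&k, &s)) in plan.src_k.iter().zip(plan.src.iter()).enumerate() {
        values[slot] = match s {
            Src::ReDva => ds_dva[k].0,
            Src::ImDva => ds_dva[k].1,
            Src::ReDvm => ds_dvm[k].0,
            Src::ImDvm => ds_dvm[k].1,
        };
    }
}
```

With a plan like this, the per-iteration cost is one pass over J's nonzeros with no index slicing or sparse block concatenation, which is the kind of work the original CSC path repeats on every iteration.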