Tweak how we use spans #42

airbreather · 2019-03-08T14:23:01Z

Base implementations now have virtual methods that can be overridden as appropriate to provide optimized versions for any of the following layouts:
- [x0, y0, z0, x1, y1, z1, ..., xn, yn, zn]
- [x0, y0, x1, y1, ..., xn, yn, z0, z1, ..., zn]
- [x0, x1, ..., xn, y0, y1, ..., yn, z0, z1, ..., zn]

All built-in algorithms now only override the one-by-one variants. The point of allowing subclasses to override for spans (and allowing different layouts) is to allow implementations that use SIMD and/or auxiliary DLLs for their implementations. There ought not to be a particularly significant advantage to using spans the way we were doing before.

- Base implementations now have virtual methods that can be overridden as appropriate to provide optimized versions for any of the following layouts: - [x0, y0, z0, x1, y1, z1, ..., xn, yn, zn] - [x0, y0, x1, y1, ..., xn, yn, z0, z1, ..., zn] - [x0, x1, ..., xn, y0, y1, ..., yn, z0, z1, ..., zn] All built-in algorithms now only override the one-by-one variants. The point of allowing subclasses to override for spans (and allowing different layouts) is to allow implementations that use SIMD and/or auxiliary DLLs for their implementations. There ought not to be a **particularly** significant advantage to using spans the way we were doing before.

airbreather · 2019-03-08T14:25:15Z

When looking at this, make sure to hide whitespace-only changes:

…y bit cheaper on the crusty ol' .NET Framework with basically no extra development effort

FObermaier

We should come up with a performance test to help find a conclusion on this

airbreather · 2019-03-08T16:23:18Z

We should come up with a performance test to help find a conclusion on this

Perhaps, but it's not just performance (especially not performance of ProjNet4GeoAPI itself) that prompted my reply, but a combination of how easy it is for dependents to use it, how easy it is to implement new projections, and what I'm going to call "ideal performance".

"Ideal performance"...

I don't know if there's a term for this already, but when I say "ideal performance", I mean the best performance that the library could possibly achieve (without breaking changes) when the caller provides the system with all the information that they have, without having to circumvent too many abstractions.

In other words, if someone has a PackedDoubleCoordinateSequence with Dimension = 2 and Measures = 0 and they want to transform it in-place, then they can do:

MapProjection theProjection = GetTheProjection();
PackedDoubleCoordinateSequence seq = GetTheSequence();
Span<XY> xys = MemoryMarshal.Cast<double, XY>(seq.GetRawCoordinates());
theProjection.Transform(xys, xys);

and then we have the option (if we choose to optimize this path) to call out to a native DLL that implements the routine as efficiently as the platform allows... but we probably won't, because the built-in is probably good enough.

If someone has a coordinate sequence that stores the points laid out SoA-style (because, perhaps, they do other SIMD operations), then they should be able to do:

MapProjection theProjection = GetTheProjection();
SOACoordinateSequence seq = GetTheSequence();
Span<double> xs = seq.GetXs();
Span<double> ys = seq.GetYs();
theProjection.Transform(xs, ys, xs, ys);

and then we have the option (again, if we choose to optimize it) to call out to a different native DLL (or use vector intrinsics coming in .NET Core 3.0 😁) to transform multiple points at a time... but again, we probably won't, because the built-in is still probably good enough.

BUT... suppose (purely hypothetically, of course) that someone has been passing around MapProjection instances throughout their application, and they just so happen to find that there could be some throughput benefits by dispatching to a different implementation upon seeing a particular set of parameter values.

This different implementation might have pre-computed some of the calculations that become invariant after doing so, rewritten the logic flow to maximize ILP (including, perhaps, unrolling loops), and even perhaps used advanced SIMD intrinsics that somehow only Intel's C++ compiler even bothers to offer (though who would be crazy enough to do that?).
Such a hypothetical individual might very much appreciate the opportunity to override a method on MapProjection that accepts an input in a compatible shape and uses the optimized code path when it sees that it can, falling back to the not-as-optimized version in all other cases.

This kind of thing is what I'm referring to when I talk about performance (which is why the comment was originally logged in GeoAPI since IMathTransform is an even better place for this)... basically giving people:

something that's usable out-of-the-box and probably fast enough most of the time and
the tools that they would need in order to selectively eliminate bottlenecks

airbreather · 2019-03-08T17:16:12Z

we have the option (if we choose to optimize this path) to call out to a native DLL that implements the routine as efficiently as the platform allows

heh... it wouldn't even be just to optimize performance: I think all of the "bulk" methods I added can translate to efficient calls to PROJ.4's proj_trans_generic function if we wanted to make a P/Invoke wrapper around that to expand the list of projections we support 🤓.

FObermaier · 2019-03-11T15:31:42Z

@airbreather, I've confirmed that your approach does not imply decrease of performance so I'm happy with persuing your approach.

Let's clean it up:

What do you think needs to be added to GeoAPI / GeoAPI.CoordinateSystems.

airbreather · 2019-03-11T15:43:40Z

What do you think needs to be added to GeoAPI / GeoAPI.CoordinateSystems.

That depends on which of NetTopologySuite/GeoAPI#67 or NetTopologySuite/GeoAPI#68 (or neither) we choose to do for 2.0... my ideal would be along the lines of NetTopologySuite/GeoAPI#68: replace IMathTransform with an abstract base class that does the same thing I did for MathTransform in here, namely, one abstract "one-by-one" method and lots of virtual methods calling it.

FObermaier · 2019-03-11T16:03:33Z

SequenceCoordinateConverterBase should be changed to SequenceTransformerBase so that it has access to the MathTransform object and can decide on its own which of the Transform overloads suits its data structure best.

airbreather · 2019-03-11T16:26:00Z

SequenceCoordinateConverterBase should be changed to SequenceTransformerBase so that it has access to the MathTransform object and can decide on its own which of the Transform overloads suits its data structure best.

Ahh, yeah, I was thinking about that type too... my thought on it was that it should have something like (edit: "it" = the SequenceCoordinateConverterBase):

public virtual SomeNewEnum GetIdealLayout(ICoordinateSequence coordinateSequence)
{
    return SomeNewEnum.DoesntMatter;

    // DoesntMatter: we pick based on what's most efficient for us (or arbitrarily)
    // other defined values tell us which of the variants is most efficient for the sequence
}

And then expose methods to copy to/from the different layouts.

The idea being that it will work for all different possible sequences, but almost all sequences will naturally copy to/from a particular layout more efficiently than any other (because the coordinates are already in that order to begin with, so it will very often be possible to use the optimized Span<T> copy methods), and so we will want to know which of the different variants would be best for each sequence.

Edit edit: but yeah, you're right that it should also take into account the math transform object so that it can also consider which method(s) actually are optimized. Heh.

* Tweak how we use spans. - Base implementations now have virtual methods that can be overridden as appropriate to provide optimized versions for any of the following layouts: - [x0, y0, z0, x1, y1, z1, ..., xn, yn, zn] - [x0, y0, x1, y1, ..., xn, yn, z0, z1, ..., zn] - [x0, x1, ..., xn, y0, y1, ..., yn, z0, z1, ..., zn] All built-in algorithms now only override the one-by-one variants. The point of allowing subclasses to override for spans (and allowing different layouts) is to allow implementations that use SIMD and/or auxiliary DLLs for their implementations. There ought not to be a **particularly** significant advantage to using spans the way we were doing before. * Bleh, this was supposed to use the arrays, not the spans, to be a tiny bit cheaper on the crusty ol' .NET Framework with basically no extra development effort

airbreather changed the base branch from develop to perf/Transform_with_Span March 8, 2019 14:23

airbreather mentioned this pull request Mar 8, 2019

[Enhancement] Performance-Friendly IMathTransform Alternative NetTopologySuite/GeoAPI#32

Open

airbreather requested a review from FObermaier March 8, 2019 14:26

Bleh, this was supposed to use the arrays, not the spans, to be a tin…

3d34270

…y bit cheaper on the crusty ol' .NET Framework with basically no extra development effort

FObermaier reviewed Mar 8, 2019

View reviewed changes

Merge branch 'perf/Transform_with_Span' into perf/span-tweaks

aff1f00

FObermaier merged commit 22d5cc8 into perf/Transform_with_Span Mar 11, 2019

FObermaier added this to Done in ProjNet V2 Mar 11, 2019

FObermaier mentioned this pull request Mar 11, 2019

Settle on MathTransform.Transform method signature #43

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tweak how we use spans #42

Tweak how we use spans #42

airbreather commented Mar 8, 2019

airbreather commented Mar 8, 2019

FObermaier left a comment

airbreather commented Mar 8, 2019

airbreather commented Mar 8, 2019

FObermaier commented Mar 11, 2019

airbreather commented Mar 11, 2019

FObermaier commented Mar 11, 2019

airbreather commented Mar 11, 2019 •

edited

Loading

Tweak how we use spans #42

Tweak how we use spans #42

Conversation

airbreather commented Mar 8, 2019

airbreather commented Mar 8, 2019

FObermaier left a comment

Choose a reason for hiding this comment

airbreather commented Mar 8, 2019

"Ideal performance"...

airbreather commented Mar 8, 2019

FObermaier commented Mar 11, 2019

airbreather commented Mar 11, 2019

FObermaier commented Mar 11, 2019

airbreather commented Mar 11, 2019 • edited Loading

airbreather commented Mar 11, 2019 •

edited

Loading