
Analysis re-reworking #34

Merged

Conversation

philipbelesky
Collaborator

Hey, this is just a tracking PR for some of the ongoing changes that cut across #30 #33 #13; will post highlights as I go and signal when a full review is ready. As of now:

  • Analysis.cs was getting quite bloated so it has been split into two files: one for the base classes and one for the specific analysis options
  • Coloring now has a separate per-pixel loop, because the initial slope/elevation coloring needs to read neighboring pixels that have not yet been assigned while the point cloud is first being built
  • There is a hook for the MeshGeometryAnalysis class of analyses to output geometry after the point cloud/mesh items have been made; e.g. for the water plane or contouring.
  • The prototype slope analysis (using just 4 neighbors) takes about 50ms in total. The function itself can surely be micro-optimized, but I suspect a better overall method will be needed. The calculate-along-edges approach should be tried, and the coloring loop in general also looks quite amenable to multi-threading

Keeping this in means we would need to keep it updated with the equivalent methods in the component, which is unnecessary churn. Better to use these methods ad hoc and keep them out of git
…loses mariuszhermansdorfer#29

This includes earlier optimisations that were not present (e.g. the color table as a dictionary) and reworks the way analysis is applied so that mariuszhermansdorfer#33 can be implemented. Future tweaks will be benchmarked and sent in PRs
…r levels; contours) + split up Analysis.cs

Analysis.cs was getting a little large; the new file contains just the implementations of each analysis option
for (int i = 0; i < neighbours.Count; i++)
{
    double rise = vertex.Z - neighbours[i].Z;
    double run = Math.Sqrt(Math.Pow(vertex.X - neighbours[i].X, 2) + Math.Pow(vertex.Y - neighbours[i].Y, 2));
Owner

@mariuszhermansdorfer Sep 22, 2019


Since we are going with the D4 approach, half of this equation is always going to be zero: in the N-S direction deltaX = 0, and for E-W deltaY = 0.
I think this whole line could then be reduced to a simple subtraction, especially given that you are checking whether the neighbors exist in the main component anyway.

if (i >= trimmedWidth)
    neighbourPixels.Add(pointCloud[i - trimmedWidth]); // North neighbour
if ((i + 1) % trimmedWidth != 0)
    neighbourPixels.Add(pointCloud[i + 1]); // East neighbour
if (i < trimmedWidth * (trimmedHeight - 1))
    neighbourPixels.Add(pointCloud[i + trimmedWidth]); // South neighbour
if (i % trimmedWidth != 0)
    neighbourPixels.Add(pointCloud[i - 1]); // West neighbour
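A sketch of the proposed simplification (illustrative names only, not code from the PR): with D4 neighbours the run is always the fixed grid spacing, so each slope term collapses to a subtraction and a multiply by a precomputed reciprocal.

```csharp
// Illustrative sketch; `gridSpacing` is an assumed constant,
// not a value taken from the PR.
const double gridSpacing = 1.0;              // fixed pixel pitch in the XY plane
const double inverseRun = 1.0 / gridSpacing; // precompute the division once

double SlopeToNeighbour(double vertexZ, double neighbourZ)
{
    double rise = vertexZ - neighbourZ;      // the whole Sqrt/Pow line reduces to this
    return rise * inverseRun;
}
```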

@mariuszhermansdorfer
Owner

Looking forward to seeing this implemented!
50ms sounds like a relatively big overhead for what we are doing. Essentially it is the same logic as in the GaussianBlurProcessor: iterate over the entire array of pixels and define a value for each of them based on its neighbors. Blurring takes 1-2ms; we should be able to achieve a similar execution time for slope as well.

Maybe @Sergio0694 has some ideas on how to approach this.

@Sergio0694
Collaborator

Hey @mariuszhermansdorfer - I'd be happy to help but I'm not 100% sure about what you're trying to do here. A couple small tweaks right off the bat, from what I'm seeing:

  • Replace foreach loops with for loops (saves creating an iterator)
  • Replace double variables with float, and use MathF instead of Math, if you don't need the extra 64-bit precision

Other than that, what kind of effect are you trying to apply here? If it's a 2D convolution with some kernel that is linearly separable, then yeah, you can replicate my gaussian blur effect by using two 1D convolutions instead of one. This gets the computation cost down from O(n^2) to O(n) in the kernel size.
If instead you're doing some other additional work on the surrounding pixels, it might be trickier to optimize.

@mariuszhermansdorfer
Owner

Thanks @Sergio0694! Sorry for not providing a clearer explanation of what we are trying to accomplish. The goal is to visualize slope of the terrains created in the sandbox. To achieve that, we need to calculate a value for each pixel and store it in an array. This is then passed to our meshing function as vertexColors information.
Slope is defined as rise/run: rise is the difference in elevation, and run is the distance in the XY plane between points. Since we operate on a defined grid of pixels, run is constant (three constants, to be precise: one for the X direction, one for the Y direction, and one for the diagonals). Since each pixel can have up to 8 neighbors, we calculate an average of up to 8 individual comparisons and assign this value to the analyzed pixel.
This logic has to be repeated for every single one of our 217088 pixels.
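The per-pixel operation described above could be sketched as follows (names are illustrative, and taking the absolute rise is an assumption here, not necessarily what the repository does):

```csharp
using System;

// Sketch of the per-pixel averaging described above. `runs[k]` holds the
// constant XY distance for neighbour direction k (one of the three
// precomputed constants: X, Y, or diagonal).
double AverageSlope(double centerZ, double[] neighbourZs, double[] runs)
{
    double sum = 0;
    for (int k = 0; k < neighbourZs.Length; k++)     // up to 8 comparisons
        sum += Math.Abs(centerZ - neighbourZs[k]) / runs[k];
    return sum / neighbourZs.Length;                 // average slope for this pixel
}
```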

Does this sound like a good candidate for linear separation?

@Sergio0694
Collaborator

Sergio0694 commented Sep 22, 2019

I'm sorry, I still don't really understand the exact operation you're describing.

  1. What is the type of the source image? Is it just an array of numeric values (eg. int), or is it something like an array of RGB vectors?
  2. "rise is equal to the difference in elevation" - elevation between what?

Anyway, if the total area being processed for a single pixel is just the 3x3 square centered in that pixel, even if you could come up with a linearly separable kernel to represent that operation it likely wouldn't bring a performance increase, as the kernel is so small already.
What I suggest is to:

  1. Parallelize the loop on a per-row basis, just like I did in my gaussian blur
  2. Unroll the loop and make all the possible values constants.

As for point 2, as an example: imagine that for each pixel you were just summing its value and the one of the 2 following pixels, if within bounds (completely made up example, bear with me).

Classic implementation (with no optimizations):

const int width = 640, height = 380;
int[] image = new int[width * height];

for (int y = 0; y < height; y++)
{
    for (int x = 0; x < width; x++)
    {
        int sum = 0;
        for (int i = 0; i < 3; i++)
        {
            if (x + i == width) break;
            sum += image[y * width + x + i];
        }

        image[y * width + x] = sum;
    }
}

Unrolled implementation (with no other optimizations):

for (int y = 0; y < height; y++)
{
    for (int x = 0; x < width; x++)
    {
        int sum = image[y * width + x];
        if (x + 1 < width) sum += image[y * width + x + 1];
        if (x + 2 < width) sum += image[y * width + x + 2];

        image[y * width + x] = sum;
    }
}

Note though that the second is more of a micro-optimization, so I'd suggest just making the code run in parallel; that might very well be enough for you already. Following up from the previous example, the parallelized version could look something like this:

Parallel.For(0, height, y =>
{
    for (int x = 0; x < width; x++)
    {
        int sum = image[y * width + x];
        if (x + 1 < width) sum += image[y * width + x + 1];
        if (x + 2 < width) sum += image[y * width + x + 2];

        image[y * width + x] = sum;
    }
});

Hope this helps!

P.S.: bonus example if you really want to squeeze out as much performance as possible:

Parallel.For(0, height, y =>
{
    ref int row = ref image[y * width];
    for (int x = 0; x < width; x++)
    {
        int sum = 0;
        if (x + 1 < width) sum += Unsafe.Add(ref row, x + 1);
        if (x + 2 < width) sum += Unsafe.Add(ref row, x + 2);

        Unsafe.Add(ref row, x) += sum;
    }
});

But again, first just make the code run in parallel 👍

@mariuszhermansdorfer
Owner

mariuszhermansdorfer commented Sep 22, 2019

Thanks, @Sergio0694! To answer your questions:

  1. The source image is a double[], which is a result of the Gaussian blurring function you wrote for us earlier.
  2. By elevation I meant the actual values of each element in an array. So rise is a difference between two values in the above double[]. It really is as simple as that.

Following your suggestion I wrote a very verbose slopeCalculator. It first takes care of the edge cases for individual pixels in the corners of the image (NW, NE, SW, SE), rows (first and last) and columns (first and last) and then iterates over the main array. Results are pretty satisfying, but also quite surprising.
@philipbelesky, I went with a D8 approach just to see how much of an overhead it would cause.

Sequential for loop - 2ms

Parallel for loop - 14 ms

I suspect it is because the logic needs to read pixel values from other rows, which may be locked by other threads writing to them.

Any ideas on how to optimize it from there? Maybe some of your unsafe mojo would help :)

@Sergio0694
Collaborator

Sergio0694 commented Sep 22, 2019

@mariuszhermansdorfer So, by quickly looking at your code, a few suggestions come to mind:

  1. Absolutely swap your double for float values. The loss in precision shouldn't really be noticeable at all, but it should speed up calculations by a fair bit. I mean to replace all double values, so both in the gaussian blur processor and in this new code you wrote.
  2. After you're done with the previous point, remember to replace all your Math calls with MathF calls (same exact methods, just working on float values and not double)
  3. Especially if this code you wrote is being called repeatedly (possibly multiple times a second), do this change in this line:
// Replace this
ushort[] slopeValues = new ushort[width * height];

// With this
ushort[] slopeValues = ArrayPool<ushort>.Shared.Rent(width * height);

ArrayPool<T> (see this blog post for more details, and the docs page here) is a high-performance pool of arrays, which makes it so that you can reuse arrays instead of forcing the runtime to allocate a new one every time. Just be wary of a couple caveats:

  • When renting arrays, you'll have to manually return them to the pool when you're done, you can't just throw them away as usual. To do so, just use ArrayPool<T>.Shared.Return(T[] array).
  • The length of the returned array might be greater than the requested length, so make sure to avoid iterating on the Length property of your array (since you're always iterating with your previous width and height values, that should be taken care of already)
  4. At this line you're accessing a variable from an outer scope, within a parallel loop. Since you only need a temp variable within that parallel loop code, avoid doing that and just declare another local variable for that method block, and use it from there.

After you're done with all of these I might also do a pass replacing those array accesses with Unsafe.Add accesses, but that's a micro-optimization which might not be that necessary here.
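Point 4 in code form (a sketch with made-up names and sizes, not the PR's actual fix): a temporary captured from the enclosing scope is shared by every thread, while declaring it inside the lambda keeps it per-iteration.

```csharp
using System.Threading.Tasks;

int width = 640, height = 380;               // illustrative dimensions
var image = new float[width * height];
var slopeValues = new float[width * height];
for (int i = 0; i < image.Length; i++) image[i] = i;

// float sum;  // BAD: one shared variable captured by every thread

Parallel.For(0, height, y =>
{
    for (int x = 0; x < width; x++)
    {
        float sum = image[y * width + x];    // GOOD: local, so each thread has its own
        slopeValues[y * width + x] = sum;
    }
});
```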

Hope this helps!

@mariuszhermansdorfer
Owner

@Sergio0694, you're awesome! Point number 4 made an enormous difference! Localizing the variable took it down to less than 1 ms. Most satisfying :)

I will implement the rest of your suggestions tomorrow to make it even snappier. I have two questions, though:

  1. MathF is part of the System namespace, but it isn't recognized. Am I missing a reference? Also, since this is not the first time I've run into this issue, where can I find this information in the future? It seems to be missing from the docs.
  2. When would I return the rented array? Would you do this after returning the function result?
    return slopeValues;

@Sergio0694
Collaborator

@mariuszhermansdorfer Ahahah thanks, happy to help! 😊
It's great that you got such a performance improvement just by changing that variable ahahah

As for your questions:

  1. My bad, I just double checked the docs and apparently MathF is not available for .NET Framework, but just for .NET Core. Unless you move your whole project to .NET Core 3.0 (which would also be faster in general, though you'd have to test to see if all the packages you're using support it), unfortunately you won't be able to use that API.
  2. No, you can't return it from that method, otherwise the caller would receive an invalid array. The way you'd structure that is the following:
// Somewhere in your code when you need this
var slopes = SlopeCalculator.CalculateSlope(...);

// Do stuff with slopes...

ArrayPool<double>.Shared.Return(slopes);

That is, return that array when you're sure you won't be using it again. For instance, if you're getting that array within some other method, which is using it to do some other work and then just discarding it (i.e. not returning it or saving it somewhere else), you could just return that array to the pool at the end of that method.

philipbelesky referenced this pull request in philipbelesky/SandWorm Sep 22, 2019
@philipbelesky
Collaborator Author

philipbelesky commented Sep 23, 2019

Thanks also for your help @Sergio0694! I think we are stuck on .NET 4.5 as this is what our host environment provides, although it is worth testing if we can get around that.

@mariuszhermansdorfer the new CalculateSlope function shifts the coloring process outside of the per-pixel loop and operates with the entire point cloud as the input rather than on a per-point basis. To integrate this approach with the existing analysis classes, it looks like GetPixelIndexForAnalysis should be replaced with a method along the lines of GetPixelColorsForAnalysis that can accommodate this 'all at once' approach.

We have a few different branches floating around that should probably be resolved first before pursuing that option. Is it OK to merge this branch in? I can then merge/integrate the new setup component, and then rework the analysis classes to integrate the method from performance/slope_parallel?

@mariuszhermansdorfer
Owner

@philipbelesky, yes, please go ahead with the merging. I wasn't pushing any other commits to the master just to make sure I don't make your life harder with integrating the analysis part.

@mariuszhermansdorfer
Owner

Thanks for clarifying things, @Sergio0694. As far as I can understand, we are stuck with .NET 4.5 unless we explicitly make our users install newer versions of the framework.

I must admit I don't understand this explanation fully, but it seems we can't use .NET Core for this particular project.

# Conflicts:
#	SandWorm/Analysis.cs
#	SandWorm/SandWormComponent.cs
@Sergio0694
Collaborator

No problem! As for .NET Core 3.0, if these changes have already brought the time it takes to perform this section of the pipeline down to below 1ms I'd say you can just forget about that then.
You might just want to throw in that ArrayPool<T> optimization just if you want to really go all the way, but if you're at 1ms already that might not even be noticeable anyway.

@mariuszhermansdorfer
Owner

Is this one ready to be merged into master @philipbelesky? Feel free to do it when you think we're good to go.

@philipbelesky philipbelesky merged commit 13dc46e into mariuszhermansdorfer:master Sep 24, 2019
@philipbelesky
Collaborator Author

Great, done. I’ll have some time later this week to integrate the new structure of the slope algorithm and will try to replicate it for aspect at the same time.
