Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2.0.x] Use 'float' instead of 'double' maths #11178

Merged
merged 4 commits into from Jul 6, 2018

Conversation

@ejtagle
Copy link
Contributor

ejtagle commented Jul 1, 2018

  1. Replace 'double' with 'float'. Under AVR both types map to the same (float). On 32bit CPUs, they don't. The extra precision double costs more time, and is unnecessary. (On FPU enabled 32bit-boards, double is not accelerated but float is.)
  2. Remove Quake III fast inverse square root. Profiling shows it is faster to simply call sqrtf (because it's implemented in assembler). The same profiling showed that if multiplication is 1x, division is 2x, square root is 4x, so it makes sense to replace them.
  3. Replace several divisions by multiplications by the reciprocal, when the divisor can be reused (in vector normalization).
  4. Optimize Delta forward kinematics formulae (please, @thinkyhead , check if the optimizations are OK, just to be sure... I don't have a delta to test them)
  5. Replace some sqrt by division.
@p3p

This comment has been minimized.

Copy link
Member

p3p commented Jul 2, 2018

A quick perusal shows there are still a fair few double literals in areas you modified, there are a lot everywhere else in Marlin but I thought I'd mention it as you updated those parts, I had the choice between disabling double literals in the compiler or going through Marlin and changing them all... I disabled double literals.

@ejtagle

This comment has been minimized.

Copy link
Contributor Author

ejtagle commented Jul 2, 2018

@p3p : I know there are still doubles. Specifically in the serial.print() routines. I didn´t remove them because that would break inheritance from the Stream class.

But, those functions aren´t used, so i hope the linker is able to remove them.

If there is consensus, i can also remove all those functions 👍

@p3p

This comment has been minimized.

Copy link
Member

p3p commented Jul 2, 2018

Sorry, I said constant I meant literal, there are many double literals everywhere in Marlin not sure why, well.. its probably because it didn't matter with AVR.

@ejtagle

This comment has been minimized.

Copy link
Contributor Author

ejtagle commented Jul 2, 2018

Yes, i agree. Codebase is full of double literal values. Technically, that could force promotion to double of some calculations...

@thinkyhead

This comment has been minimized.

Copy link
Member

thinkyhead commented Jul 2, 2018

probably because it didn't matter with AVR.

Basically, and converting all the values to float by appending f is a lot of work.

Some MCUs might actually prefer double if that's the native type that they use in their FPUs. Maybe we need a decimal type which refers to the format preferred by the FPU, if there is one.

@thinkyhead

This comment has been minimized.

Copy link
Member

thinkyhead commented Jul 2, 2018

Some of these changes look like they'd be worthwhile for bugfix-1.1.x too.

@ejtagle

This comment has been minimized.

Copy link
Contributor Author

ejtagle commented Jul 2, 2018

@thinkyhead : There is no embedded MPU supporting hardware double precision floating point (except STM CortexM7, but we don´t support those boards). The other exception would be Raspberry Pi or a PC

There are options to force GCC to use floats for everything, -fsingle-precision-constant . That would probably be the best choice, but Arduino does not allow to specify that compilation option...

What i am mostly interested in, is the optimizations to the Delta kinematics... Probably 10% faster o maybe more...

#undef M_PI
#define M_PI 3.14159265358979323846f

#define RADIANS(d) ((d)*float(M_PI)/180.0f)

This comment has been minimized.

Copy link
@thinkyhead

thinkyhead Jul 3, 2018

Member

Since M_PI is now defined with f on the end, casting it might be redundant.

This comment has been minimized.

Copy link
@ejtagle

ejtagle Jul 3, 2018

Author Contributor

Theoretically, yes

@@ -1252,7 +1252,7 @@
// last half of the mesh (when every unprobed mesh point is one index
// from a probed location).

d1 = HYPOT(i - k, j - l) + (1.0 / ((millis() % 47) + 13));
d1 = HYPOT(i - k, j - l) + RECIPROCAL((millis() % 47) + 13);

This comment has been minimized.

Copy link
@thinkyhead

thinkyhead Jul 3, 2018

Member

RECIPROCAL is safer, but also a little slower. If we know the parameter won't ever be 0.0 then it's better to just use 1.0 / x.

This comment has been minimized.

Copy link
@ejtagle

ejtagle Jul 3, 2018

Author Contributor

You are right here. That is why a review is always welcome !! 👍

@@ -185,7 +185,7 @@ static float std_dev_points(float z_pt[NPP + 1], const bool _0p_cal, const bool
S2 += sq(z_pt[rad]);
N++;
}
return round(SQRT(S2 / N) * 1000.0) / 1000.0 + 0.00001;
return LROUND(SQRT(S2 / N) * 1000.0) / 1000.0 + 0.00001;

This comment has been minimized.

Copy link
@thinkyhead

thinkyhead Jul 3, 2018

Member

Is LROUND actually better here, since the return type is float?

This comment has been minimized.

Copy link
@ejtagle

ejtagle Jul 3, 2018

Author Contributor

On AVR is does not matter. On ARM, round would force a conversion from double->float, and the parameter of round() is double, so it also forces a conversion from float->double

float Xnew = (delta_diagonal_rod_2_tower[A_AXIS] - delta_diagonal_rod_2_tower[B_AXIS] + sq(d)) / (d * 2),
Ynew = ((delta_diagonal_rod_2_tower[A_AXIS] - delta_diagonal_rod_2_tower[C_AXIS] + HYPOT2(i, j)) / 2 - i * Xnew) / j,
float Xnew = (delta_diagonal_rod_2_tower[A_AXIS] - delta_diagonal_rod_2_tower[B_AXIS] + d2) * inv_d * 0.5,
Ynew = ((delta_diagonal_rod_2_tower[A_AXIS] - delta_diagonal_rod_2_tower[C_AXIS] + sq(i) + j2) * 0.5 - i * Xnew) * inv_j,

This comment has been minimized.

Copy link
@thinkyhead

thinkyhead Jul 3, 2018

Member

Ah, I see. These would definitely be faster.

This comment has been minimized.

Copy link
@ejtagle

ejtagle Jul 3, 2018

Author Contributor

👍

@ejtagle

This comment has been minimized.

Copy link
Contributor Author

ejtagle commented Jul 3, 2018

(btw: the compilation is failing because the success variable is not defined ... 👍 🌊 ... )

@thinkyhead

This comment has been minimized.

Copy link
Member

thinkyhead commented Jul 4, 2018

Marlin loves me.
Yes I know.
Travis CI
told me so.

@thinkyhead thinkyhead changed the title [2.0.x] Misc fixes and improvements [2.0.x] Use 'float' instead of 'double' maths Jul 4, 2018
@thinkyhead

This comment has been minimized.

Copy link
Member

thinkyhead commented Jul 5, 2018

The delta optimizations seem good to me!

I've heard that it's more optimal on AVR to use const float & arguments to functions rather than passing a float because a pointer is smaller than a float on this architecture. Since this doesn't help with processors that have 32-bit addresses, maybe we should use const float & only AVR and simply float on others. Or maybe the compiler can be made to optimize these parameters automatically.

@ejtagle

This comment has been minimized.

Copy link
Contributor Author

ejtagle commented Jul 5, 2018

@thinkyhead : And they gave you good advice. On AVR pointers are 16bit, requiring 2 registers to be passed by reference. floats are 32bits, so passing them by reference uses 2 registers, and passing them by value uses 4 registers. So passing by reference is 2x faster...
But, dereferencing the pointer (or reference) to read the float value adds as much time as passing the float itself, so, it depends... If the function will be inlined, there is no difference... If a float will be passed several levels below, then passing them by reference is faster...

@thinkyhead thinkyhead merged commit dde009e into MarlinFirmware:bugfix-2.0.x Jul 6, 2018
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
fiveangle added a commit to fiveangle/Marlin that referenced this pull request Jul 9, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
fiveangle added a commit to fiveangle/Marlin that referenced this pull request Jul 9, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
fiveangle added a commit to fiveangle/Marlin that referenced this pull request Jul 9, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
fiveangle added a commit to fiveangle/Marlin that referenced this pull request Jul 9, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
-fix typo in 5DPRINT entry
fiveangle added a commit to fiveangle/Marlin that referenced this pull request Jul 9, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
-fix typo in 5DPRINT entry
fiveangle added a commit to fiveangle/Marlin that referenced this pull request Jul 14, 2018
-normalize `env` and `board` to lowercase naming convention
-make board `name` follow descriptive convention
-implement `-fsingle-precision-constant` compile optimization per MarlinFirmware#11178 (comment)
-fix typo in 5DPRINT entry
thinkyhead added a commit that referenced this pull request Jul 26, 2018
-normalize `env` and `board` to lowercase naming convention.
-make board `name` follow descriptive convention.
-implement `-fsingle-precision-constant` compile optimization per #11178 (comment)
-fix typo in 5DPRINT entry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.