Support for 32-bit fixed point types - fixed, fixed2, fixed3, fixed4 #88

Open
james7132 opened this issue Oct 31, 2019 · 14 comments

@james7132

I know that Burst is promising cross-platform floating point determinism in the future, but that's not only a high bar to cross (it's a problem almost nobody has solved before), it also limits the usability of deterministic computation to Burst compiled jobs.

Having general access to fixed point computations, be it 32 (Q16) or 64 (Q32) bit, could make it much easier to have universally available deterministic non-integer computations.

@unpacklo
Collaborator

unpacklo commented Apr 2, 2020

I agree that having a fixed point format could be useful. The main question is which format to actually support and how to ensure it is performant. Do you have specific use cases in mind for the fixed point formats?

@nxrighthere

nxrighthere commented Apr 6, 2020

My colleagues extensively use Q48.16 for deterministic multiplayer games/simulations. They have tried several formats, and this one has a really good balance of performance and accuracy. For example, Q32.32 is about 2-3x slower at multiplication/division because of implementation constraints.

@nxrighthere

My advice is to go with Q48.16, or make it switchable between Q48.16 and Q32.32, where one can be used for fast multiplication/division and the other for higher precision where it's required. In most cases, Q48.16 gives enough depth.

@unpacklo
Collaborator

unpacklo commented Apr 7, 2020

Is it necessary for you to have the 48 bits of range in the whole portion?

@nxrighthere

nxrighthere commented Apr 8, 2020

Not really, it's primarily used for performance reasons. I asked my colleagues for an actual code example:

Q48.16

public static FP operator *(FP a, FP b) {
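	// Each operand carries FPLut.PRECISION fractional bits (presumably 16 for Q48.16),
	// so the raw product carries twice as many; shift right to get back to the format.
	a.RawValue = (a.RawValue * b.RawValue) >> FPLut.PRECISION;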

	return a;
}

Q32.32

public static FP operator *(FP a, FP b) {
	var al = a.m_rawValue;
	var bl = b.m_rawValue;

	// Split each operand into its low 32 bits (fraction) and high 32 bits (integer part)
	var alo = (ulong)(al & 0x00000000FFFFFFFF);
	var ahi = al >> FRACTIONAL_PLACES;
	var blo = (ulong)(bl & 0x00000000FFFFFFFF);
	var bhi = bl >> FRACTIONAL_PLACES;

	var lolo = alo * blo;
	var lohi = (long)alo * bhi;
	var hilo = ahi * (long)blo;
	var hihi = ahi * bhi;

	var loResult = lolo >> FRACTIONAL_PLACES;
	var midResult1 = lohi;
	var midResult2 = hilo;
	var hiResult = hihi << FRACTIONAL_PLACES;

	bool overflow = false;
	var sum = AddOverflowHelper((long)loResult, midResult1, ref overflow);

	sum = AddOverflowHelper(sum, midResult2, ref overflow);
	sum = AddOverflowHelper(sum, hiResult, ref overflow);

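	// If the operand signs are equal and the result is negative, the multiplication
	// overflowed positively; the reverse holds when the signs differ
	bool opSignsEqual = ((al ^ bl) & MIN_VALUE) == 0;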

	if (opSignsEqual) {
		if (sum < 0 || (overflow && al > 0))
			return MaxValue;
	} else {
		if (sum > 0)
			return MinValue;
	}

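	// If the top 32 bits of hihi (unused in the result) are neither all 0s nor all 1s,
	// the result overflowed
	var topCarry = hihi >> FRACTIONAL_PLACES;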

	if (topCarry != 0 && topCarry != -1) {
		return opSignsEqual ? MaxValue : MinValue;
	}

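	// If the signs differ, both operands have magnitude greater than 1, and the result
	// is greater than the negative operand, there was negative overflow
	if (!opSignsEqual) {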
		long posOp, negOp;

		if (al > bl) {
			posOp = al;
			negOp = bl;
		} else {
			posOp = bl;
			negOp = al;
		}

		if (sum > negOp && negOp < -ONE && posOp > ONE)
			return MinValue;
	}

	return new FP(sum);
}

@nxrighthere

nxrighthere commented Apr 8, 2020

So in practice with Q48.16, the top 32 bits are used as a buffer for multiplication/division. Q32.32 gives you more accuracy in the fractional part, but practically it doesn't carry much benefit and it costs performance.

Q48.16 is, in general, the best compromise in comparison to other formats in terms of LUT sizes, performance, accuracy, and so on.

Note that the actual usable space in this case is -Q15.16 to +Q15.16.
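
For readers following along, here is a minimal sketch of how a Q48.16 type along the lines described above might look. It is not the proprietary library mentioned in this thread: the name `Q48_16`, its members, and the choice of 16 fractional bits for `FractionalBits` (standing in for `FPLut.PRECISION`) are assumptions made for illustration, and it does no overflow checking or saturation.

```csharp
using System;

// Hypothetical Q48.16 value type, for illustration only.
public readonly struct Q48_16
{
    public const int FractionalBits = 16;           // assumed to match FPLut.PRECISION
    public const long One = 1L << FractionalBits;   // raw representation of 1.0

    public readonly long Raw;

    private Q48_16(long raw) { Raw = raw; }

    public static Q48_16 FromRaw(long raw) => new Q48_16(raw);
    public static Q48_16 FromInt(int value) => new Q48_16(value * One);

    // Addition and subtraction work directly on the raw integers.
    public static Q48_16 operator +(Q48_16 a, Q48_16 b) => new Q48_16(a.Raw + b.Raw);
    public static Q48_16 operator -(Q48_16 a, Q48_16 b) => new Q48_16(a.Raw - b.Raw);

    // The raw product carries 32 fractional bits; shifting right by 16 restores Q48.16.
    // The intermediate product only fits in a long while both operands stay roughly
    // within the Q15.16 range, which is why the upper bits act as headroom.
    public static Q48_16 operator *(Q48_16 a, Q48_16 b)
        => new Q48_16((a.Raw * b.Raw) >> FractionalBits);

    // Pre-shift the dividend so the quotient keeps 16 fractional bits.
    public static Q48_16 operator /(Q48_16 a, Q48_16 b)
        => new Q48_16((a.Raw << FractionalBits) / b.Raw);

    // For display or debugging only; never feed this back into the simulation.
    public double ToDouble() => Raw / (double)One;
}
```

As a worked example, `Q48_16.FromRaw(98304)` represents 1.5 and `Q48_16.FromRaw(163840)` represents 2.5; their product has the raw value 245760, which is 3.75. Every step is plain 64-bit integer arithmetic, so the result does not depend on floating-point settings.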

@james7132
Author

james7132 commented Jul 6, 2020

Following up on this, @nxrighthere, is there currently an open source version of this Q48.16 implementation you're talking about? FixedMath.NET seems to use a Q32.32 implementation. My main concern is propagating that loss of precision through things like the repeated matrix multiplication in animation.

@unpacklo: The main use case is deterministic simulations for netcode that relies on consistent simulation across platforms. Paradigms like lockstep require exactly the same simulation to run on multiple machines down to the bit. Other paradigms like rollback netcode are likewise strictly dependent on a deterministic game simulation to minimize bandwidth use. For realtime strategy and fighting games, these are either commonplace or strictly required due to bandwidth or latency constraints.

This would likely be used to create deterministic physics for most games, and for some, like fighting games with rollback netcode, deterministic animation.
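
To make the "down to the bit" requirement concrete, one common way to detect divergence in a lockstep setup is to hash the raw simulation state every tick and compare the hashes across peers. The sketch below is only an illustration under that assumption: `StateChecksum` and its `Compute` method are hypothetical names, the state is assumed to be an array of raw fixed-point values, and FNV-1a is just one simple hash that could be used.

```csharp
using System;

// Hypothetical helper for illustration; not part of Unity.Mathematics or Burst.
public static class StateChecksum
{
    private const ulong OffsetBasis = 14695981039346656037UL; // FNV-1a 64-bit offset basis
    private const ulong Prime = 1099511628211UL;              // FNV-1a 64-bit prime

    // Hashes the raw fixed-point values of the simulation state.
    // Bytes are fed in an explicit least-significant-first order,
    // so the result does not depend on the machine's endianness.
    public static ulong Compute(long[] rawState)
    {
        ulong hash = OffsetBasis;
        foreach (long raw in rawState)
        {
            ulong bits = unchecked((ulong)raw);
            for (int i = 0; i < 8; i++)
            {
                hash ^= (bits >> (8 * i)) & 0xFF;
                hash = unchecked(hash * Prime);
            }
        }
        return hash;
    }
}
```

If two machines report different checksums for the same tick, their simulations have diverged. With fixed-point state this can only come from different inputs or a logic bug, not from the compiler or CPU.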

I understand Burst was promising deterministic floating point computation, but that's currently not available, and will be heavily restricted to Burst-compiled jobs. If we need to ensure determinism even on the main thread, or lack the toolchain to build Burst compiled games, we'd need to retool our entire game to work in a paradigm that utilizes only Burst compiled jobs for our deterministic simulation.

@nxrighthere

@james7132 The one that I'm talking about is an in-house proprietary library. I doubt that you will find an open-source C# implementation of Q48.16. This format is primarily about performance; those who need fixed-point math usually don't care about that and use what is commonly used.

@unpacklo
Collaborator

unpacklo commented Jul 7, 2020

@james7132 I understand, thanks.

When you say that it would be nice to have it be universally available, you're referring to being able to just start any Unity project and being assured that fixed point types are ready to use? My expectation is that fixed point types would be implemented by companies that require them and those libs would be used internally on their own projects.

@nxrighthere

@unpacklo I think it's more that you have chosen a suboptimal path instead of using the best available tool for the job. Enforcing deterministic cross-platform floating point through compilation means a lot of tradeoffs. It most likely means IEEE 754 compliance, which means trading performance and accuracy by disabling floating-point optimizations. So you might end up in a situation where a particular integer-based fixed-point format will be faster, just because of that.

The second problem is the engineering effort. This feature has been in development for more than two years already.

@nxrighthere

And my personal design preference (as a network programmer): when you work with abstracted types like FP instead of float with enforced (per method?) compilation, you have a type-safe, explicit facility to distinguish the two, so the code doesn't turn into an unmaintainable mess of deterministic and non-deterministic floats. But if this compilation option is available on a per-project basis instead of per method, then you are effectively killing performance where you shouldn't.
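
The type-safety point can be illustrated with a small sketch. Everything here is hypothetical: `SimScalar`, `Simulation.Integrate`, and the 16 fractional bits are assumptions chosen for the example, not an existing API. The idea is that a dedicated fixed-point type offers no implicit conversion from `float`, so any place where a float enters the deterministic code needs a visible cast.

```csharp
using System;

// Hypothetical wrapper type; the name SimScalar and its members are assumptions.
public readonly struct SimScalar
{
    private const int FractionalBits = 16;

    public readonly long Raw;

    private SimScalar(long raw) { Raw = raw; }

    // Integers convert exactly, so an implicit conversion is harmless.
    public static implicit operator SimScalar(int value)
        => new SimScalar((long)value << FractionalBits);

    // Floats must be converted explicitly, which makes every crossing
    // between float code and simulation code visible in the source.
    public static explicit operator SimScalar(float value)
        => new SimScalar((long)(value * (1 << FractionalBits)));

    // Converting back (for example, for rendering) is explicit as well.
    public static explicit operator float(SimScalar value)
        => value.Raw / (float)(1 << FractionalBits);

    public static SimScalar operator +(SimScalar a, SimScalar b)
        => new SimScalar(a.Raw + b.Raw);

    public static SimScalar operator *(SimScalar a, SimScalar b)
        => new SimScalar((a.Raw * b.Raw) >> FractionalBits);
}

public static class Simulation
{
    // Accepts only SimScalar: passing a float without a cast is a compile-time error.
    public static SimScalar Integrate(SimScalar position, SimScalar velocity, SimScalar dt)
        => position + velocity * dt;
}
```

With this shape, `Simulation.Integrate(1.5f, 2f, 0.5f)` does not compile, while `Simulation.Integrate((SimScalar)1.5f, 2, (SimScalar)0.5f)` does and makes the conversion points easy to audit.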

@james7132
Author

james7132 commented Jul 9, 2020

@unpacklo I mean that in the sense that determinism isn't bound to a specific compiler. I understand that Unity.Mathematics is best used in tandem with Burst, but that should not be its only priority. Burst, AFAIK, is only available for C# jobs, and cannot be easily utilized outside of those contexts due to strict invariants about its core assumptions. If this has changed, please correct my understanding. The point of fixed point computation is to have deterministic non-integer computations regardless of compiler, execution context, etc. As @nxrighthere mentioned, it's difficult to tell when a float operation is deterministic, since determinism is bound to the compiler, not to a specific datatype. The same method called by a C# job and a similar operation on the main thread may actually produce different results. If I pull out the results of a C# job compiled with Burst and then run even one non-deterministic operation on them from the main thread, all the effort in keeping it deterministic is lost, whereas with fixed-point I can strictly guarantee that no matter what operation I am doing, regardless of CPU architecture or compiler, the output is exactly the same.

@unpacklo
Collaborator

@james7132 thanks for the clarification, I understand now.

I created an issue in our internal issue tracker so we can evaluate this feature!

@IronWarrior

Reviving this a bit, this repo is an example of 48.16 FP in C#. I have a small library that provides Unity integration (property drawers etc). An example of a game engine using 48.16 is Photon Quantum (for deterministic networking), which has a 2D and 3D physics engine.


4 participants