Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized and added additional constructors to Color #5117

Merged
merged 3 commits into from Aug 21, 2016

Conversation

Projects
None yet
5 participants
@polsys
Copy link
Contributor

commented Aug 16, 2016

Fixes #4824. Added the Color(byte, byte, byte, byte) constructor and made Color(uint) public. Also added some simple unit tests for the constructors and corrected a minor typo.

In the issue, there was some discussion about a separate Color.FromBytes() method, but I opted for another constructor overload, since the parameters must be bytes in either case. I did not add a Color(byte, byte, byte) overload with constant alpha to avoid excessive overloads - this is a more "advanced" overload after all. This can be changed if it makes the API cleaner, of course.

As expected, the new byte overload is clearly (40-60%) faster than the int version in a quick microbenchmark.

polsys added some commits Aug 16, 2016

@polsys

This comment has been minimized.

Copy link
Contributor Author

commented Aug 16, 2016

Implemented the short-circuiting of clamping per the discussion. This improved the int overload performance a lot (almost halved the time), though the byte overload is still slightly (about 20% in my tests) faster. Also added some clamping tests which I had missed.

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

Looks good! 👍
Just a thought while these optimisations are being done - Would it maybe also be worthwhile setting _packedvalue in 1 hit, rather than setting each property individually (which performs additional unnecessary operations)? For example:

public Color(byte r, byte g, byte b, byte alpha)
{
    _packedValue = CalculatePackedValue(r, g, b, alpha);
}

private static uint CalculatePackedValue(byte r, byte g, byte b, byte a)
{
    return r | ((uint)g << 8) | ((uint)b << 16) | ((uint)a << 24);
}
/// <param name="g"></param>
/// <param name="b"></param>
/// <param name="alpha"></param>
public Color(byte r, byte g, byte b, byte alpha)

This comment has been minimized.

Copy link
@tomspilman

tomspilman Aug 17, 2016

Member

So isn't there an issue with both a byte and float constructors in this type?

https://blogs.msdn.microsoft.com/shawnhar/2010/03/31/color-constructors-in-xna-game-studio-4-0/

This comment has been minimized.

Copy link
@polsys

polsys Aug 17, 2016

Author Contributor

Do you mean this quote?

When you pass integer values to a method that has byte and float overloads, C# method resolution rules choose the float version, because int -> float is considered a better conversion than int -> byte.

That is not a problem because we have the int overload, which is chosen by overload resolution since it has no conversions at all. The only way to get the byte or float overload is explicitly having all the parameters as the respective type (either by variable definition or casting).

This comment has been minimized.

Copy link
@tomspilman

tomspilman Aug 17, 2016

Member

Ah... I see what you mean. I wonder why the XNA team didn't just leave the byte ctor when adding the int ctor then.

This comment has been minimized.

Copy link
@KonajuGames

KonajuGames Aug 17, 2016

Contributor

I think it will only use the byte ctor when the parameters are explicitly bytes. Any operation without casts will expand to int.

This comment has been minimized.

Copy link
@tomspilman

tomspilman Aug 17, 2016

Member

You are suggesting that new Color(128,128,128,128) will always call the int version of the constructor. This would mean you would have to either put the values into a temporary or explicitly cast like new Color((byte)128,(byte)128,(byte)128,(byte)128).

If this is correct... and I think it is... should we instead consider Color.FromBytes(byte,byte,byte,byte) for this optimized construction method?


This comment has been minimized.

Copy link
@polsys

polsys Aug 17, 2016

Author Contributor

The Color.FromBytes version has the advantage that it will never fall back to the int implementation, but rather raises a compiler error. On the other hand, it adds additional API surface, and with the other optimizations here the performance advantage might actually be quite small.

This comment has been minimized.

Copy link
@KonajuGames

KonajuGames via email Aug 17, 2016

Contributor
R = r;
G = g;
B = b;
A = alpha;

This comment has been minimized.

Copy link
@tomspilman

tomspilman Aug 17, 2016

Member

I recommend directly packing the value here and avoid the overhead of 4 property assignments that may or may not be inlined depending on platform and build settings.

@tomspilman

This comment has been minimized.

Copy link
Member

commented Aug 17, 2016

@MichaelDePiazzi - I suggest an even more direct approach:

public Color(byte r, byte g, byte b, byte alpha)
{
    _packedValue = r | ((uint)g << 8) | ((uint)b << 16) | ((uint)alpha<< 24);
}

There is little benefit in a shared packing function... it may or may not inline depending on the platform and build settings... and it isn't like we will be changing our packing format anytime soon.

In the end bugs are reduced by the unit tests in this type... sharing more code here just makes it potentially slower for no real gain.

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

@tomspilman While from a clean code point of view this makes me shudder a little, from a performance point of view I have to agree with your points. And I also agree with you about unit tests. Having good test coverage is most beneficial for helping reduce bugs and regressions.

By the way, here are some benchmarks I did to get an idea of how much faster setting the packed value directly is. (Measuring time taken to create 20 million Color's)

  • Original int constructor = 3174ms
  • New optimised int constructor = 1802ms (43% less time, 76% faster)
  • New byte constructor = 1512ms (52% less time, 110% faster)
  • New optimised int constructor using direct set = 1203ms (62% less time, 164% faster)
  • New byte constructor using direct set = 760ms (76% less time, 318% faster)
@tomspilman

This comment has been minimized.

Copy link
Member

commented Aug 17, 2016

While from a clean code point of view this makes me shudder a little

I agree... I hate seeing code duplicated like this very much, but in this sort of type as well as the vector classes performance is first priority. The only reason I can tolerate it is knowing the unit tests will alert us to any bugs.

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

While we're on the subject of optimisations, the Color comparison operators could also potentially be improved.

Current code:

public static bool operator ==(Color a, Color b)
{
    return (a.A == b.A &&
        a.R == b.R &&
        a.G == b.G &&
        a.B == b.B);
}

public static bool operator !=(Color a, Color b)
{
    return !(a == b);
}

My suggestion:

public static bool operator ==(Color a, Color b)
{
    return (a.PackedValue == b.PackedValue);
}

public static bool operator !=(Color a, Color b)
{
    return (a.PackedValue != b.PackedValue);
}

Quick benchmark on the equals operator shows it to be ~150% faster.

@tomspilman

This comment has been minimized.

Copy link
Member

commented Aug 17, 2016

the Color comparison operators could also potentially be improved.

Absolutely! It is dumb of us to not be taking advantage of direct access to _packedValue. We should prefer that over the PackedValue property and certainly the individual component properties.

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

We should prefer that over the PackedValue property

I thought that also... but for some strange reason, it's slower for me when comparing _packedValue directly!? I have no idea why either... 😕

My benchmarks: (Measuring time taken to do 20 million equals comparisons of equal Color's)

  • Original (individual property comparison) = 954ms
  • PackedValue compare = 447ms
  • _packedValue compare = 708ms

Could it be something to do with the [CLSCompliant(false)] attribute on the PackedValue property?

@polsys

This comment has been minimized.

Copy link
Contributor Author

commented Aug 17, 2016

I completely agree, the optimizations are worth duplicating some code. Working on it right now...

@tomspilman

This comment has been minimized.

Copy link
Member

commented Aug 17, 2016

it's slower for me when comparing _packedValue directly!?

I suggest checking the JIT optimized assembly code directly to see why.

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

I suggest checking the JIT optimized assembly code directly to see why.

Hmmm... seems like the property version is being optimised to do a direct compare:

019A65C0  cmp         ecx,edx  
019A65C2  sete        al  
019A65C5  movzx       eax,al  
019A65C8  ret  

Whereas the field version is not for some reason:

019A65E0  push        ebp  
019A65E1  mov         ebp,esp  
019A65E3  sub         esp,8  
019A65E6  mov         dword ptr [ebp-4],ecx  
019A65E9  mov         dword ptr [ebp-8],edx  
019A65EC  mov         eax,dword ptr [ebp-4]  
019A65EF  cmp         eax,dword ptr [ebp-8]  
019A65F2  sete        al  
019A65F5  movzx       eax,al  
019A65F8  mov         esp,ebp  
019A65FA  pop         ebp  
019A65FB  ret  

No idea why...? 😕

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

Update: If I run it in 64-bit mode, it optimises the field version the same as it does for the property version. So for some odd reason, in 32-bit mode the field comparison is not optimised.

@polsys

This comment has been minimized.

Copy link
Contributor Author

commented Aug 17, 2016

After the optimizations, the constructors run about 2-3 times faster. The float and vector constructors just offload work to the int version, and are still a lot faster. The equality operators also benefited nicely. I did use _packedValue for consistency until we have hard data on the JIT behavior - if that is necessary.

I also noted an inconsistency in our FromNonPremultiplied behavior. Added unit tests and changed to match XNA.

@KonajuGames

This comment has been minimized.

Copy link
Contributor

commented Aug 17, 2016

@polsys

This comment has been minimized.

Copy link
Contributor Author

commented Aug 17, 2016

How does that work? The int version takes a range of 0-255 and the float
version takes 0.0-1.0.

Just deduplicating the old code:

public Color(float r, float g, float b)
    : this((int)(r * 255), (int)(g * 255), (int)(b * 255)) 
{
}
@tomspilman

This comment has been minimized.

Copy link
Member

commented Aug 21, 2016

@KonajuGames

This comment has been minimized.

Copy link
Contributor

commented Aug 21, 2016

@MichaelDePiazzi

This comment has been minimized.

Copy link
Contributor

commented Aug 21, 2016

@tomspilman Looks good to me 👍

@tomspilman

This comment has been minimized.

Copy link
Member

commented Aug 21, 2016

Great.... merging!

@tomspilman tomspilman changed the title Add new Color constructors Optimized and added additional constructors to Color Aug 21, 2016

@tomspilman tomspilman merged commit e8fc3cf into MonoGame:develop Aug 21, 2016

7 checks passed

Build Mac, iOS, and Linux Finished TeamCity Build MonoGame :: Build Mac : Running
Details
Build Windows, Web, Android, and OUYA Finished TeamCity Build MonoGame :: Build Windows : Running
Details
Generate Documentation Finished TeamCity Build MonoGame :: Generate Documentation : Running
Details
Package Mac and Linux Finished TeamCity Build MonoGame :: Package Mac and Linux : Running
Details
Package NuGet Finished TeamCity Build MonoGame :: Package NuGet : Running
Details
Package Windows SDK Finished TeamCity Build MonoGame :: Package Windows : Running
Details
Test Windows Finished TeamCity Build MonoGame :: Test Windows : Tests passed: 851, ignored: 7
Details

@tomspilman tomspilman added this to the 3.6 Release milestone Aug 21, 2016

@polsys polsys deleted the polsys:ColorCtor branch Aug 21, 2016

@dellis1972

This comment has been minimized.

Copy link
Contributor

commented on MonoGame.Framework/Color.cs in f4de15d Aug 28, 2016

@polsys this new constructor breaks compatibility with existing XNA code.. It is quite common to have
new Color (0, 255, 0, (byte)alpha)
Its all over the samples. This new constructor now results in an
Error CS0121: The call is ambiguous between the following methods or properties: Microsoft.Xna.Framework.Color.Color(int, int, int, int)' and Microsoft.Xna.Framework.Color.Color(byte, byte, byte, byte)' (CS0121)

Not sure if its something we need to worry about.. but it does me people might have to fix their code when they upgrade....

@tomspilman @KonajuGames

This comment has been minimized.

Copy link
Member

replied Aug 28, 2016

I was worried about stuff like this.

However why would people do new Color (0, 255, 0, (byte)alpha)? Why are they casting alpha to byte? That seems like an unnecessary cast... especially since there was no byte overload in XNA.

Still this is why I felt adding a new static method like Color Color.FromBytes(byte r, byte g, byte b, byte alpha) was safer and equivalent functionality wise.

This comment has been minimized.

Copy link
Contributor

replied Aug 28, 2016

This comment has been minimized.

Copy link
Member

replied Aug 28, 2016

Because those samples originated from XNA 3 when the constructor took bytes.

Ahh... that makes sense then. So this might mostly occur if someone moving XNA 3 code over to MonoGame.

I'm not too worried about that incompatibility then. @dellis1972 @KonajuGames ?

This comment has been minimized.

Copy link
Contributor

replied Aug 29, 2016

I think it should be ok. It's a simple fix otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.