Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply the mask on the image, not tile, convert the image to pixmap, not ... #577

Closed
wants to merge 1 commit into from

Conversation

alphaonex86
Copy link
Contributor

...the tile, greate performance improvement

@alphaonex86
Copy link
Contributor Author

Here it's about 3x of performance on this function

@bjorn
Copy link
Member

bjorn commented Dec 20, 2013

Ah, sorry I hadn't noticed you had already done measurements when commenting on issue #576. What operating system are you using? I'll try it as well.

@bjorn
Copy link
Member

bjorn commented Jan 3, 2014

I've measured the difference on Linux with both Qt 4.8 and 5.2 and with your patch I'm experiencing it to be 15% slower. This basically confirms my suspicion that there should be an additional overhead to this approach as noted at issue #576.

If you really measured a 3x speedup, please tell me on which operating system and if you can provide me with the files you're using to test it. Closing this issue for now.

@bjorn bjorn closed this Jan 3, 2014
@alphaonex86
Copy link
Contributor Author

I use Linux gentoo, gcc 4.7, with kernel 3.10.25, Qt 5.2. It's very strange, because my patch reduce the number of call, and is better because apply on large image, then can be scalled better.
I have use hardware core i5 750, then sse2, sse3, ssse3, sse4. Very strange this difference of performance. Do you use ram disk to prevent disk slow down?
I can provide you my 2 callgrind profile to see the difference.

@bjorn
Copy link
Member

bjorn commented Jan 8, 2014

The number of calls is not very important, what matters is the time the code takes to execute. Also, maybe you need to read again what I pointed out at #576. I'm sure applying the mask to the big image could be a little faster than applying the mask separately to each tile, however using QPixmap::copy instead of QImage::copy is slower because the QPixmap is converted to a QImage for performing this operation each time.

@alphaonex86
Copy link
Contributor Author

The number of calls is important because it imply allocation/desallocation of temporary variable/buffer, and then time.
With my patch on very simple map (include png save and other stuff):
real 0m0.458s
user 0m0.168s
Your code:
real 0m0.567s
user 0m0.268s
When I use QImage to prevent this double conversion mentionned at #576:
real 0m0.557s
user 0m0.246s
You are in right in the logic part, but it's due to the implementation. Into the code, the multiple conversion QImage<->QPixmap is not explained and seam not logic at the first look.
In my case I see well QRasterPlatformPixmap::toImage called, like you say.
https://github.com/alphaonex86/CatchChallenger/tree/master/tools/map2pngGUI used, mesuremente at the second lauch.
My trace: http://files.first-world.info/temp/callgrind.out.tar.xz
x11-libs/pixman-0.32.4 with sse2 ssse3

If you see why we have this difference...

@bjorn
Copy link
Member

bjorn commented Jan 8, 2014

Your callgrind analysis shows that your tilesets are apparently using indexed 8-bit color and both variants of the code are spending most of time converting from indexed 8-bit format to 32-bit color format. It wouldn't be very surprising when this conversion was a little faster when performed on the whole image instead of on each tile separately as the existing code does, so I think that's the reason why you're seeing your code being a little bit faster.

That said, the speed difference seems minimal to me and in the more common case of 32-bit images I measured that it became slower, so I still think it's better to just let the code as it is.

@alphaonex86
Copy link
Contributor Author

I use optimized tilset size by pngquant, then yes, indexed. When I use only the function is very more faster.
Then maybe force the conversion 8-32Bits image at the methode start and just use your code will help? Because I target optimisation what can be included into the mainline and have good performance on 8Bits and 32Bits image. 8Bits image it's optimised image very usefull to don't use lot of server bandwith.

@alphaonex86
Copy link
Contributor Author

Extracted from: https://bugreports.qt-project.org/browse/QTBUG-36031
As the pixmap only contains a QImage internally the QPixmap -> QImage conversion is just an increased refcount. Why is this a problem?

@bjorn
Copy link
Member

bjorn commented Jan 9, 2014

The temporary conversion to QImage and back is indeed very fast, but it is still better avoided when this is trivially possible. Why is this issue so important to you that you would go and ask the Qt developers to optimize this method? Even by your own measurements your application is only about 10% faster with your changes...

@alphaonex86
Copy link
Contributor Author

Because it's key method for render the large map (like google map), and on my datapack explorer:
http://catchchallenger.first-world.info/official-server/datapack-explorer/maps/indoor/1.6a.html
And into my game, I render multiple chuck of map, and that's have greate influance on chuck loading (more than 2x into my game client).
And for other planted project part (already use async + cache when is possible).

@alphaonex86
Copy link
Contributor Author

Then I have add into in my tiled version:
QImage MapReaderPrivate::readImage()
if(image.format()!=QImage::Format_ARGB32 || image.format()!=QImage::Format_RGB32)
image=image.convertToFormat(QImage::Format_ARGB32)
I hop performance improvements (compiling Qt to pass libpng 1.15 -> 1.16)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants