Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster blit_with_alpha() #14448

Merged
merged 7 commits into from Mar 19, 2024
Merged

Faster blit_with_alpha() #14448

merged 7 commits into from Mar 19, 2024

Conversation

HybridDog
Copy link
Contributor

@HybridDog HybridDog commented Mar 9, 2024

The blit_with_alpha function has a noticeable effect on the time it takes to join a game.

To reduce the join times, I replace the blit_with_alpha function with a new one:

  • It does not uses floating-point numbers.
  • It directly operates on the raw pixel data instead of using the comparatively
    slow setPixel and getPixel functions from Irrlicht.
    Only ECF_A8R8G8B8 base images are supported now.
    If the top image does not have the ECF_A8R8G8B8 colour format, it is converted;
    I assume that this happens rarely.
  • There are case distinctions for fully opaque, fully transparent and semi-transparent pixels.
    This empirically increases the performance since the mixing between two semi-transparent happens rarely.
  • The new function no longer has the src_pos argument since it was always the zero vector.
  • The function is only documented once where it is declared.

For backwards compatibility, blit_with_alpha still mixes colours without gamma correction.
blit_with_alpha nonetheless behaves slightly different than before:
If a semi-transparent pixel is drawn on top of another semi-transparent pixel,
the colour is mixed in a way which we can consider to be more correct now.

Related issue: #14322
Roadmap goal: 2.1 Rendering/Graphics improvements (perhaps)

To do

This PR is a Ready for Review.

How to test

  • Add this Lua code to a mod:
local kick_on_join = true
if kick_on_join then
	minetest.register_on_joinplayer(function(player)
		minetest.kick_player(player:get_player_name(),
			"This is for testing purposes. Don't cry.")
	end)
end
  • Start a local Minetest server
  • In a Shell which supports time, join the server with the following command:
time bin/minetest --name your_name --address 127.0.0.1 --go
  • Repeat the above command multiple times with and without the proposed changes and note down the duration values.

With the mods I commonly use, I could observe a speedup from 2.5 s to 2.2 s real time and a speedup from 1.3 s to 1.0 s user CPU time.
The speedup may be different with large textures and/or a different selection of mods.
Joining the game may not be the only situation which is faster with the new blit_with_alpha function.

Texture test nodes can be helpful to test if `blit_with_alpha` works correctly.

The alpha compositing test node covers different cases where pixel colours are mixed with each other.
The test currently fails because `blitPixel` does not work correctly if a semi-transparent colour
is drawn on top of another semi-transparent colour.
The test nodes for the fill texture modifier show if the size and position arguments of the modifer work correctly.
They do not cover special cases such as very large or negative position or size values.
@Zughy Zughy added Roadmap The change matches an item on the current roadmap. Performance Feature ✨ PRs that add or enhance a feature @ Client rendering labels Mar 9, 2024
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
@Desour Desour added Action / change needed Code still needs changes (PR) / more information requested (Issues) and removed Feature ✨ PRs that add or enhance a feature labels Mar 12, 2024
The `blit_with_alpha` function has a noticeable effect on the time it takes to join a game.

To reduce the join times, I replace the `blit_with_alpha` function with a new one:
* It does not uses floating-point numbers.
* It directly operates on the raw pixel data instead of using the comparatively
  slow `setPixel` and `getPixel` functions from Irrlicht.
  Only ECF_A8R8G8B8 base images are supported now.
  If the top image does not have the ECF_A8R8G8B8 colour format, it is converted;
  I assume that this happens rarely.
* There are case distinctions for fully opaque, fully transparent and semi-transparent pixels.
  This empirically increases the performance since the mixing between two semi-transparent happens rarely.
* The new function no longer has the `src_pos` argument since it was always the zero vector.
* The function is only documented once where it is declared.

For backwards compatibility, `blit_with_alpha` still mixes colours without gamma correction.
`blit_with_alpha` nonetheless behaves slightly different than before:
If a semi-transparent pixel is drawn on top of another semi-transparent pixel,
the colour is mixed in a way which we can consider to be more correct now.
@HybridDog
Copy link
Contributor Author

I have tested the performance with different implementation decisions now by collecting the blit_with_alpha inputs and measuring the time it takes to re-execute all the blit_with_alphas.

I tried it 9 times and there were 1687 blit_with_alpha inputs each time.
for f in {1..9}; do bin/minetest --name your_nameaaa --address 127.0.0.1 --go --quiet; done

With the branches the median duration was ca. 0.037 s.
All durations: 0.0353195, 0.0358201, 0.0363832, 0.0364065, 0.0372696, 0.0373504, 0.0381849, 0.0395097, 0.040282
Code: https://github.com/HybridDog/minetest/tree/8cbd75b076bb5d91caed2aaf90121b7753675d13

Without the branches the median was ca. 0.082 s.
All durations: 0.0795838, 0.0796296, 0.0812277, 0.08147, 0.0819993, 0.0820157, 0.082015, 0.0822578, 0.089137
Code: https://github.com/HybridDog/minetest/tree/efbce26

With the branches, a small reordering, and using SColor, the median was ca. 0.026 s.
The small reordering is the movement of the branch src_a == 0 before the branch src_a == 255 || dst_a == 0; without this the median was ca. 0.035 s.
I also tried u32 integers instead of static_casts to u8 and it was noticeable faster with u8 integers.
The code from this case is used in this Pull Request.
All durations: 0.0247669, 0.0249853, 0.0254656, 0.0255048, 0.0257042, 0.026188, 0.026427, 0.0265187, 0.0272248
Code: https://github.com/HybridDog/minetest/tree/96505ee7f

With blit_with_alpha from the master branch, the median was ca. 0.384 s.
All durations: 0.361643, 0.368806, 0.373106, 0.380362, 0.384423, 0.386407, 0.393304, 0.394426, 0.401159
Code: https://github.com/HybridDog/minetest/tree/6c7009384

@HybridDog HybridDog requested a review from sfan5 March 13, 2024 20:45
@Desour Desour removed the Action / change needed Code still needs changes (PR) / more information requested (Issues) label Mar 14, 2024
@Desour Desour self-assigned this Mar 14, 2024
@Desour
Copy link
Member

Desour commented Mar 14, 2024

blit_with_alpha nonetheless behaves slightly different than before:
If a semi-transparent pixel is drawn on top of another semi-transparent pixel,
the colour is mixed in a way which we can consider to be more correct now.

I also noticed how texture overlay is not alpha blending if both are semitransparent.
(This also makes ^ non-associative.)

I've made test mod for this:

code
local function trigger_hud(player)
	local row_height = 0.05
	local base_pos = {x=0.2,y=0.2}
	local ncolumns = 4

	local tex_checker = "[png:"..minetest.encode_base64(minetest.encode_png(ncolumns, 1, {
			{a=255, r=200, g=200, b=200},
			{a=255, r=55, g=55, b=55},
			{a=255, r=200, g=200, b=200},
			{a=255, r=55, g=55, b=55},
		}))

	local tex_a = "[png:"..minetest.encode_base64(minetest.encode_png(ncolumns, 1, {
			{a=255, r=255, g=255, b=255},
			{a=250, r=255, g=255, b=255},
			{a=1, r=0, g=255, b=0},
			{a=0, r=0, g=255, b=0},
		}))

	local tex_b = "[png:"..minetest.encode_base64(minetest.encode_png(ncolumns, 1, {
			{a=8, r=255, g=0, b=0},
			{a=8, r=255, g=0, b=0},
			{a=127, r=255, g=0, b=0},
			{a=127, r=255, g=0, b=0},
		}))

	-- name, texturestring
	local rows = {
		{"test_tmp:checker", tex_checker},
		{"test_tmp:a", tex_a},
		{"test_tmp:b", tex_b},
		{"test_tmp:a_b", tex_a.."^"..tex_b},
		{"test_tmp:a_b_hud", tex_a},
	}

	local function draw_row_at(i, at, z_offset)
		player:hud_add({
			type = "image",
			position = {x=base_pos.x, y=base_pos.y+at*row_height},
			name = rows[i][1],
			scale = {x=-100*row_height*ncolumns, y=-100*row_height},
			text = rows[i][2],
			alignment = {x=1, y=1},
			offset = {x=0, y=0},
			z_index = 1000 + (z_offset or 0),
		})
	end

	for row = 1, #rows do
		draw_row_at(row, row)
	end

	draw_row_at(3, 5, 1)
end

minetest.register_node("test_tmp:node1", {
	description = "Node1",
	tiles = {"default_pine_tree_top.png"},
	groups = { dig_immediate = 2 },
	on_punch = function(_pos, _node, puncher, _pointed_thing)
		if minetest.is_player(puncher) then
			trigger_hud(puncher)
		end
	end,
})

But the alpha test node you added also shows this. In master, there are some texels that are different.

I agree that we can consider this change in behaviour a bugfix though.

Copy link
Member

@Desour Desour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine otherwise.

src/client/texturesource.cpp Show resolved Hide resolved
Copy link
Member

@Desour Desour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, works. 👍

Joined your land server. It was about 1/3 (5 s) faster. And blit_with_alpha is in the flamegraph basically completely gone.

Thanks!

src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
src/client/texturesource.cpp Outdated Show resolved Hide resolved
Avoid braces and auto
More verbose blit_pixel documentation and fix rounding
More verbose blit_pixel documentation and fix rounding
cast max output instead of its arguments
Copy link
Member

@SmallJoker SmallJoker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Visual comparison: (1 color difference)
comparison

This is what I get when mixing the two textures in GIMP (100% layer opacity). The resulting pixel is of alpha=135.
GIMP reproduction

.. that's also roughly what I get in Minetest:
pixel in Minetest

The PR looks more correct, thus I approve.

@sfan5 sfan5 merged commit cda1124 into minetest:master Mar 19, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@ Client rendering Performance Roadmap The change matches an item on the current roadmap. >= Two approvals ✅ ✅
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants