
VideoCommon: add a graphics mod action that allows you to modify the game's base rendering #11300

Merged (12 commits) Aug 22, 2023

Conversation

@iwubcode (Contributor) commented Nov 24, 2022

Graphics mods are pretty basic right now. Disable bloom or stretch a HUD element. Not very exciting! But graphics mods were meant to be so much more.

With this feature, Dolphin gives users direct control over the way games draw by exposing hook points where users can provide their own pixel shaders.

A Basic Example

If you wanted to turn a character red, you'd provide a graphics mod that targets that character's texture. For example, take Prince of Persia: The Forgotten Sands' main character texture. Here's how we might target it with our new graphics mod:

{
	"meta":
	{
		"title": "Color test pipeline",
		"author": "iwubcode"
	},
	"assets": [
		{
			"name": "material_color",
			"data":
			{
				"": "material.json"
			}
		},
		{
			"name": "shader_color",
			"data":
			{
				"metadata": "color.shader.json",
				"shader": "color.glsl"
			}
		}
	],
	"features": [
		{
			"action": "custom_pipeline",
			"action_data": {
				"passes": [
					{
						"pixel_material_asset": "material_color"
					}
				]
			},
			"group": "PrinceColor"
		}
	],
	"groups": [
		{
			"name": "PrinceColor",
			"targets": [
				{
					"texture_filename": "tex1_512x512_m_d0d80f2dcf9ed354_14",
					"type": "draw_started"
				},
				{
					"texture_filename": "tex1_512x512_m_d0d80f2dcf9ed354_14",
					"type": "create_texture"
				}
			]
		}
	]
}

with the color shader metadata (color.shader.json) looking like:

{
	"properties": []
}

and the material JSON (material.json) looking like:

{
	"shader_asset": "shader_color",
	"values": []
}

And finally the actual shader "color.glsl" that the user writes:

vec4 custom_main( in CustomShaderData data )
{
	return vec4(1.0, 0.0, 0.0, 1.0);
}

What's going on here? Dolphin takes this piece of code and uses it for the targeted draw, returning the color red. Astute readers might notice CustomShaderData; more on that later.

If you're curious, here's the result:

pop_tfs

Game lighting

While it's up to the shader creator to decide what to do, the most obvious use case is to modify the game's lighting. Let's talk about that!

Games provide two things for lighting: normals (a direction describing the slope of the surface) and lights. In Dolphin, when Per-Pixel Lighting is turned on, Dolphin will provide the normal to its pixel shader. This normal is then provided to the user through CustomShaderData. So if you wanted to display a game's normals, you'd simply do:

vec4 custom_main( in CustomShaderData data )
{
	return vec4(data.normal, 1);
}

Applied to a handful of Fragile Dreams: Farewell Ruins to the Moon's textures, you'd see:

fragile-normals

Normals are wonderful, but we really want to do something with lighting, and for that we need lights. Dolphin provides all of the scene's lights in CustomShaderData, including each light's type, color, and position. You also get the pixel's position, meaning with a little math you have everything you need to produce lighting output.
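As a sketch of the math involved: with a light's position/color and the pixel's position and normal, a basic Lambert diffuse term falls out. This is illustrative Python (not Dolphin's shader API; the function names are made up):

```python
def normalize(v):
    # Scale a 3-component vector to unit length
    mag = sum(c * c for c in v) ** 0.5
    return tuple(c / mag for c in v)

def lambert_diffuse(pixel_pos, normal, light_pos, light_color):
    # Direction from the shaded pixel toward the light
    to_light = normalize(tuple(l - p for l, p in zip(light_pos, pixel_pos)))
    # Cosine of the angle between surface normal and light direction,
    # clamped so surfaces facing away from the light receive none
    n_dot_l = max(0.0, sum(n * d for n, d in zip(normalize(normal), to_light)))
    return tuple(c * n_dot_l for c in light_color)
```

In GLSL the same math is `max(0.0, dot(normalize(normal), normalize(light_pos - pixel_pos)))` multiplied by the light color.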

Viewing lighting by itself might look like this (Little King's Story):

little king's story light

Game Quirks

Alas. Games don't have to give you normals. Arc Rise Fantasia fails to provide any:

arc rise fantasia - no normals

Some games give you normals but the view direction can modify them. The Last Story is an example of that:

last-story-changing_normals

Some games give you normals but don't actually provide lights on those objects. An example is Rune Factory: Tides of Destiny, which provides normals on all objects but only provides lighting on the characters. In those situations you have the option to bake the lighting directly into the shader:

rf_tod_shader_lighting

I've only tested 20-30 games at the moment, but a majority of them have lighting/normals! Users will have to test their games to see what they can do with them.

Leveraging Textures

We have the game's normals, and we may have lights. But the game's data is rather limited. What we really want is to provide textures to our shaders to give them more information. Luckily, we can!

Adding a texture to a shader requires us to use assets. You might have noticed this in the Prince example above: materials, shaders, and, yes, even textures are all assets!

To leverage a texture, we need to:

  • Create a texture asset that references our texture file
  • Create a shader asset that says we expose a texture sampler in our shader code
  • Create a material that references our texture asset and syncs that with the texture sampler in the shader asset.

Whew!

Showing just the assets and features sections, it might look like this:

    "assets": [
        {
            "name": "material_output_normal",
            "data":
            {
                "": "output_normal.material.json"
            }
        },
        {
            "name": "output_replace_normal",
            "data":
            {
                "metadata": "output_normal.shader.json",
                "shader": "output_normal.glsl"
            }
        },
        {
            "name": "tex1_1024x512_m_8cfb7d28ee78826b_14_normal",
            "data":
            {
                "": "tex1_1024x512_m_8cfb7d28ee78826b_14_normal.png"
            }
        }
    ],
    "features": [
        {
            "action": "custom_pipeline",
            "action_data": {
              "passes":  [
                {
                  "pixel_material_asset": "material_output_normal"
                }
              ]
            },
            "group": "NormalShaderTarget_tex1_1024x512_m_8cfb7d28ee78826b_14"
        }
    ]

where output_normal.material.json would look like:

{
    "shader_asset": "output_replace_normal",
    "values": [
        {
            "type": "texture_asset",
            "code_name": "NORMAL_TEX",
            "value": "tex1_1024x512_m_8cfb7d28ee78826b_14_normal"
        }
    ]
}

and output_normal.shader.json would look like:

{
    "properties": [
        {
            "type": "sampler2d",
            "description": "The normal texture",
            "code_name": "NORMAL_TEX"
        }
    ]
}

Finally, with all that in place, the output_normal.glsl might look like:

vec4 custom_main( in CustomShaderData data )
{
	if (data.texcoord_count == 0)
	{
		vec3 normal = data.normal.xyz;
		return vec4((normal * 0.5f ) + 0.5f, 1.0);
	}
	else
	{
#ifdef NORMAL_TEX_UNIT
		vec4 map_rgb = texture(samp[NORMAL_TEX_UNIT], NORMAL_TEX_COORD);
		vec4 map = map_rgb * 2.0 - 1.0;
		return map;
#else
		vec3 normal = data.normal.xyz;
		return vec4((normal * 0.5f ) + 0.5f, 1.0);
#endif
	}
}

This shader will either display the draw with the normal data (instead of the game's computed color) or will use the normal texture if available.

Where do NORMAL_TEX_UNIT and NORMAL_TEX_COORD come from? In the shader metadata you specified an input sampler named NORMAL_TEX, and Dolphin automatically generates NORMAL_TEX_UNIT and NORMAL_TEX_COORD for you.
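The naming pattern is mechanical: each generated identifier is the property's code_name with a fixed suffix appended. A tiny Python sketch of that convention (the helper itself is hypothetical, not Dolphin code):

```python
def generated_sampler_names(code_name):
    # For each sampler property declared in the shader metadata, Dolphin
    # exposes a texture unit index and a texture coordinate under names
    # derived from the property's code_name (sketch of the convention above).
    return {"unit": code_name + "_UNIT", "coord": code_name + "_COORD"}
```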

You aren't restricted to one texture, however. You can provide as many as you want following the same process. This allows you to do interesting features like physically based rendering (PBR). Here's Mobile Suit Gundam: MS Sensen 0079 with a Metallic/Roughness/Normal map:

gundam-pbr

Parallax occlusion is also possible.

Default

tos_parallax_occulsion_no

With parallax occlusion on (and metal details):

tos_parallax_occulsion

A time value is exposed, so basic animation is possible:

raguna_blink

This all might seem quite complicated, and it is definitely very technical at the moment. Creating the boilerplate data for assets, materials, and shaders is tedious. I've written some scripts to make this simpler, but all of this complexity is expected to be alleviated by some sort of editor. Even setting the assets aside, it will take time for the community to get up to speed writing shaders that can be leveraged.
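A boilerplate-generating script along the lines mentioned above might assemble the material and shader-metadata JSON for one extra texture like this (a hypothetical sketch following the JSON examples in this post, not one of the actual scripts):

```python
def make_texture_binding(texture_asset_name, shader_asset_name, sampler_code_name):
    # Build the two JSON documents needed for one extra texture: a material
    # binding the texture asset to the shader's sampler, and the shader
    # metadata declaring that sampler. The helper itself is hypothetical;
    # the field names follow the JSON examples in this post.
    material = {
        "shader_asset": shader_asset_name,
        "values": [
            {
                "type": "texture_asset",
                "code_name": sampler_code_name,
                "value": texture_asset_name,
            }
        ],
    }
    shader_metadata = {
        "properties": [
            {
                "type": "sampler2d",
                "description": "Extra texture",
                "code_name": sampler_code_name,
            }
        ]
    }
    return material, shader_metadata
```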

Future work

In the future we can support even more features:

  • We can force normals to be generated with either geometry or tessellation shaders. This will allow games without normals (or broken normals) to have full use of the functionality
  • We can allow users to add additional (and dynamic!) lights. This requires knowing the position of elements, which means we need to look at hooking into the VertexLoader
  • Cube map support for static reflections
  • Expose shader compilation threading in a setting.
  • Shader uniforms
  • An editor to make it easier to create the various assets

Go out and make some pretty games!

Turn this:
harvest-moon-no-shader
Into this:
harvest-moon-shader

GRSEAF_2022-12-28_12-34-13

ffcccb

@iwubcode iwubcode marked this pull request as draft November 24, 2022 03:11
@iwubcode iwubcode changed the title VideoCommon: add a graphics mod action that allows you to modify the game's base renderering VideoCommon: add a graphics mod action that allows you to modify the game's base rendering Nov 24, 2022
@GlitchGamingX commented Nov 24, 2022

This is looking really promising. I look forward to it. If you need textures with Material Maps for testing, I am happy to provide some if needed.

@TellowKrinkle (Contributor):

> Finalize the design. I originally opted to allow the texture inputs to be added independently of the game's textures (or custom textures). This allows the textures to be any size but has the downside of competing with the game's texture samplers (we only have 8). We could be smarter about samplers but that's another task. Now I use texture arrays but that forces users to use the same texture size. I need to decide if I want to expose both.

Where does the limitation of 8 textures come from? If it's from the render backends, that could be adjusted.

@iwubcode (Contributor, Author):

> Where does the limitation of 8 textures come from? If it's from the render backends, that could be adjusted.

I think 8 was chosen because that matches the GC/Wii hardware. But yes, I looked at increasing it. 12 worked for my machine but increasing to 16 caused my GPU to spaz out. From what I understand, graphics cards support a very small number of samplers but many textures. Hence my comment about improving our sampler handling. Of course, for my usage, the texture array approach works fine.

@JoshuaMKW (Contributor):

Watching with great interest

@iwubcode iwubcode force-pushed the custom-shaders branch 4 times, most recently from 8617304 to 24a9c42 Compare December 4, 2022 06:24
@iwubcode (Contributor, Author) commented Dec 4, 2022

Ubershader support is implemented. Vulkan black textures are fixed. The code/commits are in a pretty good state. I see one more thing to do there.

Sadly, Vulkan is still being Vulkan: there are validation errors when trying to copy the base texture and load the additional data. No visible impact as far as I can tell, but it should probably still be resolved. I'm not sure yet how to resolve it.

Finally, I have been debugging some flickering I see in Final Fantasy Crystal Chronicles: Crystal Bearers. Turns out, every other frame the game draws the floor with two textures instead of one. In my testing, I used both textures which means the second frame draw actually draws the second texture (when the floor is drawn with the first). While I can simply ignore the second texture in this case, I think it may be beneficial to have some sort of tev stage api. I was planning on doing that in a follow-up PR but I may see if I can implement it here.

@iwubcode (Contributor, Author) commented Dec 21, 2022

I've added a minimal TEV stage API for both the specialized and uber shaders. It allows you to get the input and output of each stage, along with analyzing the input type.

For instance, say you wanted to remove character shading from Tales of Symphonia in order to add your own shading. First you'd isolate just the character texture.

So change this:

tos_default

to this:

tos_no_shading

A shader would look like this:

vec4 custom_main( in CustomShaderData data )
{
	if (data.tev_stage_count == 0)
	{
		return data.final_color;
	}
	else
	{
		vec4 final_color = data.final_color;
		uint texture_set = 0;
		for (uint i = 0; i < data.tev_stage_count; i++)
		{
			// There are 4 color inputs
			for (uint j = 0; j < 4; j++)
			{
				if (data.tev_stages[i].input_color[j].input_type == CUSTOM_SHADER_TEV_STAGE_INPUT_TYPE_TEX && texture_set == 0)
				{
					final_color = float4(data.tev_stages[i].input_color[j].value, 1.0);
					texture_set = 1;
				}
			}
		}

		return final_color;
	}
}

Which would then allow you to apply your own shading. Here's Tales of Symphonia with more realistic lighting:

tos_no_shading_custom_lighting

@iwubcode (Contributor, Author):

The coveted parallax occlusion is also possible. Thanks to @phire for working through this with me:

Default

tos_parallax_occulsion_no

With parallax occlusion on (and metal details):

tos_parallax_occulsion

@iwubcode iwubcode force-pushed the custom-shaders branch 2 times, most recently from 7abd1ef to 62e4669 Compare December 30, 2022 07:30
@iwubcode (Contributor, Author) commented Dec 30, 2022

Calling on the graphics masters.

@K0bin / @TellowKrinkle / @Pokechu22 - would you all be willing to start looking this over? It's mostly ready.

Some things to call out:

  1. All the user shader code is put into the same source shader. This can cause conflicts if users define functions with the same signature and name. In order to avoid that, I wrote a mini GLSL parser that finds all the function names and prefixes/suffixes them with a unique identifier. It's not ideal but it works. I discussed alternatives with @phire and there wasn't really anything (at least nothing usable on all our backends).
  2. Testing shaders is a bit painful at the moment. First of all, every change you make needs a recompile (there aren't any uniforms). Also, that Dolphin pop-up gets frustrating fast. I ran into a similar issue in my post-processing overhaul. Will eventually need to get that into a proper UI...
  3. There is some duplicated code.
    (a) The shader cache class just felt very unwieldy to me, so there is a bit of duplication in the CustomShaderCache. I do think it's pretty clean, however.
    (b) There is texture load code (png/dds) that was pulled out of hires textures. My eventual plan is to make custom textures a graphics mod, but that's not available yet. I hope to tackle this cleanup in a future PR.
  4. There isn't a disk shader cache at the moment. This was intentional initially because I thought it would be weird with how the user can toggle on/off these shaders. But maybe these can be tagged and when a mod is disabled the cache is deleted? Not something that needs to be figured out at the moment but wanted to mention it.
  5. I tried to take a light touch to the shader logic and keep most of the custom code in one place. Obviously the TEV API sort of made that difficult. Also, it'd be nice if we could move more of the shader code into separate functions for better reuse. I also considered mimicking the API for the actual Dolphin code, might make things more readable. Not sure if it'd be slower though...
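The renaming in (1) can be sketched in a few lines. This is a toy Python version of the idea (a real parser has to handle declarations, comments, and overloads much more carefully); custom_main and GLSL built-ins are left alone:

```python
import re

# GLSL built-ins / reserved words we must not rename (a tiny, incomplete set
# for illustration; custom_main is the fixed entry point Dolphin calls).
GLSL_RESERVED = {"custom_main", "return", "if", "for", "while",
                 "float", "int", "vec2", "vec3", "vec4", "texture"}

def suffix_user_functions(glsl_source, unique_id):
    # Toy version of the idea: spot "<type> <name>(" pairs, treat <name> as a
    # user-defined function, and give it a per-mod suffix so two mods that
    # define the same helper can live in one combined source shader.
    pattern = re.compile(r"\b(\w+)\s+(\w+)\s*\(")
    names = {m.group(2) for m in pattern.finditer(glsl_source)
             if m.group(2) not in GLSL_RESERVED}
    for name in names:
        glsl_source = re.sub(rf"\b{name}\b", f"{name}_{unique_id}", glsl_source)
    return glsl_source
```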

There is still an open issue with Vulkan where there is a validation error. The error is:

37:19:042 VideoBackends\Vulkan\VulkanContext.cpp:745 E[Host GPU]: Vulkan debug report: (Validation) Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x19b2096f820, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x19b2096f820[] expects VkImage 0x2b58e5000000749a[] (subresource: aspectMask 0x1 array layer 3, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL.

I do not know how to resolve it. It happens during the point where the game texture data is copied into a new texture so that the additional textures specified by the shader can be added as additional layers.

Final thing I'll say is that the API was meant to be all-encompassing, so users wouldn't use anything in the shader header that we could change. However, it doesn't include the variable samp. Apparently copying samp causes some issues in SPIRV-Cross; I can't recall the error offhand.

EDIT: Here is the user documentation if interested

@iwubcode iwubcode force-pushed the custom-shaders branch 2 times, most recently from e1e94c0 to 86ad7bf Compare December 30, 2022 07:56
@TellowKrinkle (Contributor) left a comment:

It would be nice if you could post an example of a custom shader and the GLSL (uber and non-uber) it gets converted to by the graphics mod, for reviewers who are having trouble reading the shader generation code.

{
for (int i = 0; i < 8; ++i)
{
if ((light_mask & (1 << (i + 8 * j))) != 0)
Contributor:

Do you think it would be more or less confusing to do light_count += popcount(light_mask & (0xff << (8 * j)));?

(Or you could use the for loop to mask channels from light_mask, then just light_count = popcount(light_mask))

for (u32 j = 0; j < NUM_XF_COLOR_CHANNELS; j++)
{
  if ((enablelighting & (1 << j)) == 0)
    light_mask &= ~(0xff << (8 * j));
  if ((enablelighting & (1 << (j + 2))) == 0)
    light_mask &= ~(0xff << (8 * (j + 2)));
}
u32 light_count = std::popcount(light_mask);
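For what it's worth, the two formulations agree; here's a quick Python sanity check of the per-bit loop against the popcount form (illustrative only, not part of the PR):

```python
def light_count_loop(light_mask, j):
    # Mirrors the PR's loop: test channel j's eight light bits one at a time.
    count = 0
    for i in range(8):
        if light_mask & (1 << (i + 8 * j)):
            count += 1
    return count

def light_count_popcount(light_mask, j):
    # Mirrors the suggestion: mask off the channel's byte, count bits at once.
    return bin(light_mask & (0xFF << (8 * j))).count("1")
```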

Member:

Was this comment resolved at some point? It's still marked as open on GitHub.

@iwubcode (Contributor, Author) commented Mar 27, 2023

Whoops, I had seen this but then kept going back and forth on what to do and never commented. I like the conciseness but also felt like the current code was a little clearer and matched some of the logic the emulator already had. Does anyone feel strongly one way or the other?

Source/Core/VideoBackends/Vulkan/VKTexture.cpp (review thread resolved)
Source/Core/VideoCommon/LightingShaderGen.h (review thread resolved)
@iwubcode (Contributor, Author) commented Jan 1, 2023

@TellowKrinkle - great idea.

For the user shader outlined in the opening (the Prince of Persia red shader), I did what you requested. I introduced an error to get it to dump (removed a semicolon), but you should be able to follow it; let me know if you have any questions.

Here is the user shader:

vec4 custom_main( in CustomShaderData data )
{
  return vec4(1.0, 0.0, 0.0, 1.0)
}

Here is the output:

specialized_shader.txt
uber_shader.txt

If you want it, I can get a lighting/texture example as well.

@iwubcode (Contributor, Author) commented Jan 7, 2023

I forgot one of the original reasons I wanted to implement this was to support animation. Well that's now possible:

raguna_blink

The above is a three-frame blink animation (hastily drawn by myself, sorry for the poor quality). Here's a naive way to implement an animated texture:

bool time_between(float time, float start, float end)
{
	return time > start && time <= end;
}

vec4 custom_main( in CustomShaderData data )
{
	if (data.texcoord_count == 0)
	{
		return data.final_color;
	}

	float animation_sec = 3.0; // how long do we want the animation to last before looping
	float pi_times_two = 3.1415926536*2.0;

	// normalize the time between a 0 and 1 range
	float sin_normalized = sin(pi_times_two*(data.time_ms)/(animation_sec * 1000.0))* 0.5 + 0.5;

	// Could multiply by animation time length to
	// make it easier to weight out the individual frames
	float total_time_passed = sin_normalized;

	if (time_between(total_time_passed, 0.980, 0.985))
	{
		return texture(samp[PORTRAIT1_TEX_UNIT], PORTRAIT1_TEX_COORD);
	}
	else if (time_between(total_time_passed, 0.985, 0.995))
	{
		return texture(samp[PORTRAIT2_TEX_UNIT], PORTRAIT2_TEX_COORD);
	}
	else if (time_between(total_time_passed, 0.995, 1.0))
	{
		return texture(samp[PORTRAIT3_TEX_UNIT], PORTRAIT3_TEX_COORD);
	}

	return data.final_color;
}
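The timing logic above can be prototyped outside the shader. Here is the same mapping restated in Python, which may be handy for tuning the frame windows before recompiling the shader (the frame names are placeholders):

```python
import math

def blink_frame(time_ms, animation_sec=3.0):
    # Mirror the shader's timing: map time onto a looping 0..1 value with sin,
    # then pick a frame from narrow windows near the top of the cycle.
    # Python re-statement of the GLSL above, for experimenting with windows.
    sin_normalized = math.sin(2.0 * math.pi * time_ms / (animation_sec * 1000.0)) * 0.5 + 0.5
    if 0.980 < sin_normalized <= 0.985:
        return "PORTRAIT1"
    if 0.985 < sin_normalized <= 0.995:
        return "PORTRAIT2"
    if 0.995 < sin_normalized <= 1.0:
        return "PORTRAIT3"
    return "base"
```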

Otherwise I cleaned up the code a bit more in the CustomShaderAction.

@iwubcode (Contributor, Author) commented Jan 8, 2023

Minor cleanup and fixed a possibility of a crash (null pixel shader in the pipeline when the shader fails to compile).

@K0bin (Contributor) commented Jan 11, 2023

How do I reproduce the Vulkan validation error?

You don't do self-copies, do you? (Copying from one layer of an image to another for example)

Also, is it at all possible to maybe split this PR into smaller ones? It's practically impossible to review.

EDIT: I think the validation error is related to the fact that Dolphin tracks the texture layout but can optionally put the resulting barriers into either the main command buffer or the init command buffer which is executed before the main one. So if you call Texture::Load which uses the InitCommandBuffer after doing literally anything else with the texture, the ImageLayout tracking breaks.

This patch should fix it. Alternatively you could create a new texture.

@Pokechu22 (Contributor) left a comment:

OK, I did a pass over all of the shader-related code (other than UberShaderPixel which I didn't look at in depth - I assume it's basically the same as PixelShaderGen).

One thing I recommend you do is watch a few videos by Jasper, which discuss some more interesting rendering techniques used by Super Mario Galaxy and Wind Waker:

There are also two longer follow-up videos which discuss some other aspects of rendering:

Although these aren't really guides to doing GameCube graphics, they do give examples of more niche situations you might run into (and scenes that are worth experimenting with in games). On the other hand, it probably isn't as important to try to replace the rendering of scenes like these if they already look good.

Source/Core/VideoCommon/ShaderGenCommon.cpp (review threads resolved)
docs/CustomPipelineGraphicsMod.md (review thread resolved)
Source/Core/VideoCommon/PixelShaderGen.cpp (review thread resolved)
Comment on lines 789 to 811
for (u32 i = 0; i < uid_data->genMode_numindstages; ++i)
{
if ((uid_data->nIndirectStagesUsed & (1U << i)) != 0)
{
u32 texcoord = uid_data->GetTevindirefCoord(i);

// Quirk: when the tex coord is not less than the number of tex gens (i.e. the tex coord
// does not exist), then tex coord 0 is used (though sometimes glitchy effects happen on
// console). This affects the Mario portrait in Luigi's Mansion, where the developers forgot
// to set the number of tex gens to 2 (bug 11462).
if (texcoord >= uid_data->genMode_numtexgens)
texcoord = 0;

out->Write("\t{{\n");
out->Write("\t\tint2 fixpoint_uv = int2(custom_data.texcoord[{}].xy", texcoord);
out->Write(" * float2(" I_TEXDIMS "[{}].zw * 128));\n", texcoord);
out->Write("\t\tint2 tempcoord = fixpoint_uv >> " I_INDTEXSCALE "[{}].{};\n", i / 2,
(i & 1) ? "zw" : "xy");
out->Write("\t\tcustom_data.texcoord[{0}] = float3(tempcoord / float2(" I_TEXDIMS
"[{0}].zw * 128), 0);\n",
texcoord);
out->Write("\t}}\n");
}
}
Contributor:

I'm not sure if this is something that makes sense to do. It looks like your goal is to apply the indirect texture scale, but you also need to convert into and then out of fixed point for that. But, more annoyingly, I don't think there's any rule that forces each indirect stage to use a distinct texture coordinate; I'm pretty sure you can use the same texture coordinate for multiple indirect stages (e.g. both of them use texture coordinate 0 for different textures). In particular I'm pretty sure that applies to the Luigi's Mansion portrait (albeit by accident).

It might be better to instead have indtexcoord as a field in custom_data, and store it as something like

custom_data.indtexcoord[ind_num] = custom_data.texcoord[ind_iref] / float2(1 << cindscale[ind_num/2].xy_or_zw)

(i.e. ignore the fixed-point stuff, and replace x >> y with x / (1 << y))

I don't know if that would be enough to replicate indirect texture behavior (I don't think you currently expose the actual indirect texture lookup, or any of the ways it modifies the texture coordinates (both from the texture and from wrapping and the matrix)), though. Maybe it'd be better to just get rid of this entirely for now.
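The x >> y to x / (1 << y) substitution above is exact for values that are multiples of the scale; for other values the integer shift truncates while the division keeps the fraction, which is presumably what you want for floating-point texture coordinates. A small Python illustration:

```python
def scale_shift(x, y):
    # Integer form used by the hardware-style fixed-point path: truncates
    return x >> y

def scale_div(x, y):
    # Floating-point analogue suggested above: keeps the fractional part
    return x / (1 << y)
```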

Contributor Author:

Yeah. Essentially the goal is to get the texture samplers and the texture coordinates used by the game for a given draw call. At the moment draw-call = texture, which of course isn't really right (ex: multiple textures defined for a draw) but it's hard for users to point at a draw call without some sort of editor.

The goal of this block was to grab the texture coordinates, regardless of whether it is an indirect or direct call. This normalizes the data, so the user doesn't have to care (they probably don't).

I'm pretty sure you can use the same texture coordinate for multiple indirect stages (e.g. both of them use texture coordinate 0 for different textures)

Yes, this is fine. Just want the user to say for this texture, give me the sampler and texture coordinates.

Maybe it'd be better to just get rid of this entirely for now.

I'm fine doing that, but I'm still not clear on what the long-term solution would be. If the draw-call = texture mapping is the issue, I can certainly fix that in the future. I vaguely recall that before I had this, lots of textures just weren't mapping right.

Contributor:

(Will address this more fully later - I don't entirely understand what you mean by draw-call = texture, and will need to check more to understand better.)

Yes, this is fine. Just want the user to say for this texture, give me the sampler and texture coordinates.

It's fine in the case of #11300 (comment), but not here where the same texture coordinate will get scaled multiple times.

Contributor:

The goal of this block was to grab the texture coordinates, regardless of whether it is an indirect or direct call. This normalizes the data, so the user doesn't have to care (they probably don't).

Explanation attempt 1

This isn't something that makes sense to do. Based on libogc's GX_SetIndTexCoordScale, the main use of the indirect scale is so that you can use the same texture coordinate for both a regular texture and an indirect texture, but have different sizes for the two.

For each texture coordinate, bpmem.texcoords[i].s.scale_minus_1 gives a scale on the horizontal axis, and t on the vertical axis, which are generally set to the size of the corresponding texture in texels (libogc's GX_SetTexCoordScaleManually says that this happens automatically in most cases, but can be overridden if needed). So for a 256 by 256 texture, you would supply input coordinates between (0, 0) and (1, 1), and then these would be scaled to (0, 0) to (256, 256) when actually sampling the texture. GLSL's texture function doesn't expect this multiplication by the texture size, so we have to divide it by the texture's actual size ahead of time; the texture's actual size is stored in I_TEXDIMS[i].xy while this scale is I_TEXDIMS[i].zw. (Manual texture sampling uses texelFetch which does require multiplication by the texture size, and also manual wrapping; there's an extra complication here about the texture size the game uses being stored in I_TEXDIMS[i].xy, but custom textures possibly having a different size that needs to be used instead, which involves querying the size in GLSL.)

... OK, that's a bit of a confusing explanation. I'm going to try it in a different way:

Games usually specify texture coordinates as UV coordinates, where (0, 0) is the top-left of the texture and (1, 1) is the bottom-right. This includes the GameCube/Wii. Modern graphics APIs have functions such as texture where you can just provide UVs and everything works. This is probably the coordinate system you want to expose for shaders.

But for the GameCube/Wii, sampling textures is done with ST coordinates. For instance, for a 256 by 256 texture, (0, 0) is the top-left and (255, 255) is the bottom-right (... ish, there may be an off-by-one here). I've seen this UV versus ST distinction in graphics textbooks, but Dolphin doesn't really use it currently. There's a scale factor that UV coordinates get multiplied by to convert to ST coordinates. That said, this is mostly hidden from the GX API: unless GX_SetTexCoordScaleManually is used, the scale (in bpmem.texcoords[texcoord].s.scale_minus_1 for s, and a similar value for t, and in shaders I_TEXDIMS[texcoord].zw) will match the one for the texture being used (in AllTexUnits.GetUnit(texmap).texImage0 or I_TEXDIMS[texmap].xy - note the difference in indexing here).

This does pose an issue when using the same texture coordinate for multiple textures, if those textures are of different sizes (for instance one is 256 by 256 and the second is 128 by 128); the scale will only match one of those. If the scale factor is 256, then the 128 by 128 texture will be drawn at half-scale horizontally (as in it'll be drawn twice for every time the bigger texture is drawn, or rather 4 times since this applies on both axes). I don't think there's any mechanism to deal with this for most textures that end up getting drawn to the screen, but there is a mechanism for doing this for indirect textures: GX_SetTexCoordScaleManually, which modifies bpmem.texscale[indstage/2]. This lets you scale down the texture coordinates when sampling a texture for an indirect texture operation (i.e. when you're reading a texture so that you can then add the values read from that texture to another texture coordinate, possibly transformed by a matrix). So, if you have a 256 by 256 main texture, and also a 128 by 128 texture you want to use as an indirect texture, you can set bpmem.texscale[0].ss0 and ts0 to 1, and thus the ST coordinates will be at appropriate scales for both textures.

The key issue here is that this kind of scaling is meaningless for UV coordinates, where the texture's size isn't a factor at all. I'd expect most custom shaders to use texture (and not e.g. texelFetch), so applying the scaling wouldn't make sense. Furthermore, even if ST coordinates were used, it would be more useful to provide ST coordinates on the main texture's scale, since that's what ends up on the screen eventually. (There's also a secondary issue of multiple indirect stages where all of the stages use the same texture coord; if you have two 128 by 128 indirect textures, the current code would divide s and t by 4, instead of 2.)

This is a lot of text, but hopefully it explains what the point of all of these is well enough.

Contributor Author

I just want to say thank you so much for taking the time to write this in-depth explanation. I've read it a couple times and your reasoning makes a lot of sense now. I had heard of ST coordinates but always assumed they were the same as UV coordinates. Maybe this distinction is special to the Cube/Wii?

Regardless, I see now why I don't need the indirect texture uv transformation and only need to capture those texture map ids.

Thanks again for the explanation. I have it all sorted now I think!

Contributor

I tried to dig up more information regarding ST versus UV and can't find anything concrete about the differences between them (either online or in various GC patents or in my graphics textbook), just that both are used by convention for textures. It would be convenient if there was a commonly used set of terminology for the difference between ranging between 0 and 1 versus 0 and width, but it doesn't seem like there actually is one. This doesn't change the actual GameCube/Wii behavior though (which should be as I described above); I'm just clarifying in case you see the terms elsewhere.

Member

The original difference was that UV were surface coordinates (the range is whatever you want), and ST are texture coordinates (texture matrices applied, wrap mode applied to fit within 0-1, sometimes the coordinates are unnormalized), but this hasn't been true in practice for many, many years as hardware advancements happened and as shaders took off.

UV is now the standard, ST died out as it isn't a meaningful unit unless you make GPUs.

Member

> This does pose an issue when using the same texture coordinate for multiple textures, if those textures are of different sizes (for instance one is 256 by 256 and the second is 128 by 128); the scale will only match one of those. If the scale factor is 256, then the 128 by 128 texture will be drawn at half-scale horizontally (as in it'll be drawn twice for every time the bigger texture is drawn, or rather 4 times since this applies on both axes).

The standard way to handle this is to just eat up multiple texgens from the same texture coordinate.

As mentioned, this is secretly handled for you with the GX API -- right before a display list or a draw begins, it sets the texcoord scale registers to be the size of the texture. The GX's per-pixel pipeline is all unnormalized, the per-vertex XF unit generates unnormalized coordinates as one of the last things it does.

This means that indirect texture offsets are unnormalized, too. If I read +128 from an indirect texture, that's going to move +128 pixels in the final unnormalized coordinate no matter the original size. But the IndTexMtx can be used to adjust for this.

> I don't think there's any mechanism to deal with this for most textures that end up getting drawn to the screen, but there is a mechanism for doing this for indirect textures: GX_SetTexCoordScaleManually, which modifies bpmem.texscale[indstage/2]

I think you meant to link GXSetIndTexCoordScale here?

Contributor

Thanks for the info on UV versus ST.

It also sounds like the distinction I wanted was normalized versus unnormalized, where normalized is for ranges of 0 to 1 while unnormalized is for ranges of 0 to width/height. (... or at least can be, I guess, and you'd also have to consider wrapping at some point, but eh.)

> I think you meant to link GXSetIndTexCoordScale here?

Yes, I meant to copy the link to GX_SetIndTexCoordScale from my first abandoned explanation, but I must not have been paying attention and copied the wrong one.

> This means that indirect texture offsets are unnormalized, too. If I read +128 from an indirect texture, that's going to move +128 pixels in the final unnormalized coordinate no matter the original size. But the IndTexMtx can be used to adjust for this.

There's also an indtexbias parameter to GX_SetTevIndirect (which ends up in bpmem.tevind[stage].bias), which will subtract 128 from it in most cases, prior to the matrix being applied. (The exception is when the indirect format is not ITF_8; a non-ITF_8 format is used in the Skyward Sword map (see #9876 and my writeup).)

> The standard way to handle this is to just eat up multiple texgens from the same texture coordinate.

Yeah, that's what I've generally seen. If a game does this, then there's no special handling needed for graphics mods; an array listing the texture coordinate corresponding to each texture will work fine. (PixelShaderGen.cpp gets those generated texture coordinates as an input, so even if the model itself has only one texture coordinate or if the texture coordinates are generated from positions, at this point everything will be fine.)

> The GX's per-pixel pipeline is all unnormalized, the per-vertex XF unit generates unnormalized coordinates as one of the last things it does.

It's worth noting that Dolphin does this multiplication by texture size in the pixel shader. This is probably suboptimal since the texture size won't change, but it does mean that (apart from when per-pixel lighting is enabled) everything in the vertex shader is based on xfmem and everything in the pixel shader is based on bpmem.

YAGCD says that the relevant registers are SU_SSIZE0 and SU_TSIZE0; the "SU" prefix, I think, refers to the setup unit, which sits between the transform unit and the TEV (I think it's responsible for making triangles, although the rasterizer is also listed separately).

Handling the multiplication by the scale in the vertex shader would be nice, but we'd still need to divide by the texture size for GLSL's texture and similar to work properly, both for Dolphin's use and for graphics mods. (And that division couldn't be done in the vertex shader if we want fixed-point math and indirect textures to work.)

Source/Core/VideoCommon/PixelShaderGen.cpp
Source/Core/VideoCommon/VertexManagerBase.cpp
@iwubcode

This comment was marked as outdated.

@iwubcode iwubcode force-pushed the custom-shaders branch 4 times, most recently from ef1c303 to 844e6f5 Compare August 16, 2023 02:53
@iwubcode
Contributor Author

@Pokechu22 - thank you again for the review. I've addressed the points raised. I hope you will find some value in this feature in the future for your debugging. At least I'd imagine this might be useful to visualize certain scenarios. Thanks again!

@iwubcode
Contributor Author

iwubcode commented Aug 17, 2023

Nothing major in this last push, just noticed a lighting shader compiler error (caused by a refactor request) when testing a separate feature. Still ready for re-review.

Contributor

@Pokechu22 Pokechu22 left a comment

Apart from a minor documentation issue, this looks good to me now. I haven't done any kind of testing though.

docs/CustomPipelineGraphicsMod.md
@AdmiralCurtiss AdmiralCurtiss merged commit dbe6a5f into dolphin-emu:master Aug 22, 2023
11 checks passed
@iwubcode
Contributor Author

To @Pokechu22 and the rest of the reviewers that gave their time toward this feature, I just want to say thank you!
