Unpack indices of arbitrary component types #227

shawnhatori · 2023-09-17T23:38:18Z

Since the memcpy only needs the size of the index data, it can use the component size (when equal to the stride) to determine the total number of bytes to copy. This allows it to work for different component types, including the common cgltf_component_type_r_16u.

This also adds a reference comment for the function at the top of the file, consistent with cgltf_accessor_unpack_floats.

zeux · 2023-09-18T10:43:58Z

This doesn't seem right. When the input data is floats, we will now memcpy floats into 32-bit unsigned int output which is not what the caller expects. When the input data is r16u, we will now memcpy 16-bit indices into 32-bit unsigned int output which is also not what the caller expects. This function gives the caller a 32-bit unsigned index buffer while handling format conversions; the memcpy is an optimization for the case where no conversion is required.
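To make the 16-bit case concrete, here is a minimal sketch of the difference between a raw byte copy and a per-element conversion. It is not cgltf code: `copy_raw` and `widen_each` are hypothetical helpers, and the source is assumed to be tightly packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the source bytes as-is, ignoring the
   element type of the destination. */
static void copy_raw(uint32_t* out, const uint16_t* in, size_t count)
{
    assert(out != NULL && in != NULL);
    memcpy(out, in, count * sizeof(uint16_t));
}

/* Hypothetical helper: converts one element at a time, so every
   32-bit slot of the destination receives one index. */
static void widen_each(uint32_t* out, const uint16_t* in, size_t count)
{
    assert(out != NULL && in != NULL);
    for (size_t i = 0; i < count; ++i)
        out[i] = in[i];
}
```

With four 16-bit indices, `copy_raw` writes only 8 of the 16 destination bytes, so two indices share each of the first two 32-bit slots and the remaining slots are never written. `widen_each` produces the four separate 32-bit values a caller of a 32-bit API would expect.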

shawnhatori · 2023-09-18T14:53:36Z

You're right, the change as written was not clear. I'll explain the intention.

What the API currently does: Requires index data to be stored in a uint32_t array. If the component size is smaller than uint32_t, each element is padded to 32-bit boundaries (e.g. for r16u, 2, 1, 0, 3 -> 0x02000000 0x01000000 0x00000000 0x03000000).

What I was expecting: Takes in a void* for an index array and an expected component_size (i.e. the data type of the index array). If the component_size of the accessor is larger than expected, returns 0. If it is smaller than expected, pads to the expected size (this replicates the existing padding behavior, but for any larger type). Otherwise they're equal, so it memcpys everything (the fast path) (e.g. for r16u, 2, 1, 0, 3 -> 0x0200 0x0100 0x0000 0x0300).
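The three cases described above can be sketched roughly as follows. This is an illustration under assumptions, not the function from the PR: `unpack_indices_sketch` and its parameter order are hypothetical, the source is assumed to be tightly packed and in host byte order, and only 1-, 2- and 4-byte components are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the proposed semantics. Returns the number of
   indices written, or 0 if the source does not fit the output type. */
static size_t unpack_indices_sketch(void* out, size_t out_size,
                                    const void* src, size_t src_size,
                                    size_t count)
{
    assert(out != NULL && src != NULL);

    /* Source components are wider than the output: refuse. */
    if (src_size > out_size)
        return 0;

    /* Same width: nothing to convert, copy everything (fast path). */
    if (src_size == out_size)
    {
        memcpy(out, src, count * src_size);
        return count;
    }

    /* Narrower source: only 1 -> 2, 1 -> 4 and 2 -> 4 are sketched. */
    if ((src_size != 1 && src_size != 2) || (out_size != 2 && out_size != 4))
        return 0;

    const uint8_t* s = (const uint8_t*)src;
    for (size_t i = 0; i < count; ++i)
    {
        uint32_t v;
        if (src_size == 1)
        {
            v = s[i];
        }
        else
        {
            uint16_t t;
            memcpy(&t, s + i * src_size, sizeof(t));
            v = t;
        }

        /* Write a full output element so no bytes are left untouched. */
        if (out_size == 2)
            ((uint16_t*)out)[i] = (uint16_t)v;
        else
            ((uint32_t*)out)[i] = v;
    }
    return count;
}
```

A caller holding 16-bit mesh data would pass 2 for both sizes and hit the fast path; a caller wanting 32-bit output from the same data would pass an output size of 4 and get the widening loop.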

In my use case (and I assume many others), I own and know my mesh data, which all have 16-bit index values so my indices are stored as an array of uint16_t. My graphics API backends are also expecting index data packed as R16U. So I was hoping to have an API function to handle the unpacking boilerplate that respects the data type of the index array.

I'll leave it up to the maintainers to determine if that's a use case worth supporting and, if so, how to support it without a breaking API change. For the sake of discussion, I'll add a temporary new function that aims to handle this case as described above.

shawnhatori · 2023-10-24T23:41:34Z

@zeux Does the most recent code change address the data type issues you identified in your previous comment? I believe it's correct now.

For what it's worth, I've been using this change in my own asset pipeline to support my 16-bit index use case and it has worked great. I did test in a debugger using 32-bit indices and the data layout appears correct (i.e. no breaking change).

If this is a use case worth supporting, the interface obviously needs tweaks. This could possibly be an _ex function.

zeux · 2023-10-25T00:37:10Z

Yeah I think this addresses my concerns; I tested this in gltfpack which uses 32-bit indices and the change works without issues. I think we would need to either add this argument as mandatory to unpack_indices, or call the new function something else. Because unpack_indices was added after the last numbered release, I would recommend just adding a required argument to the function - all callers should be trivial to adjust.

There's still one corner case that I mentioned that isn't handled the same way with this change: namely, when the input is floats, memcpy will reinterpret the bits if the destination is unsigned int. If the component type is passed instead of a component size, the function can do memcpy only when the accessor type matches the expected type. You will note that cgltf_component_read_index does support floats as the input.
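The float corner case can be shown in isolation. The sketch below assumes IEEE 754 binary32 floats, and both helper names are hypothetical; it only contrasts a byte copy with a value conversion and does not reproduce cgltf's reading code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: what a size-based memcpy does when the accessor
   holds floats and the destination is an unsigned int. The bit pattern
   is copied; the numeric value is not converted. */
static uint32_t index_by_memcpy(float f)
{
    uint32_t u;
    assert(sizeof(f) == sizeof(u));
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* Hypothetical helper: a value conversion, which is what a per-element
   read followed by a cast performs. */
static uint32_t index_by_conversion(float f)
{
    return (uint32_t)f;
}
```

For an input of 3.0f, the conversion yields 3 while the byte copy yields the float's encoding, 0x40400000 under the IEEE 754 assumption. Gating the memcpy on matching component types rather than matching sizes avoids this.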

However, thinking about this and looking at the history, I am not sure why we're supporting reading floats as indices in the first place. This was initially introduced all the way back in 1f77128 and then gradually refactored into the current state. I would expect read_index to be used when the glTF spec disallows floating point values, and the only example off the top of my head that's even theoretically plausible is feature IDs in the upcoming EXT_mesh_features, which I would reasonably expect to read as floats.

Maybe @prideout has context here -- is reading floats-as-indices via cgltf_accessor_read_index used in Filament?

zeux · 2023-10-25T00:39:34Z

cgltf.h

 		{
-			*dest = (cgltf_uint)cgltf_component_read_index(element, accessor->component_type);
+			cgltf_size index_data = cgltf_component_read_index(element, accessor->component_type);
+			memcpy(dest, &index_data, index_component_size);


One minor note is that this only works on little-endian platforms. Maybe that's fine though, as we have issues with endian conversion in general, but a comment would be nice here.
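A small sketch of why copying the leading bytes of a wider integer depends on byte order. The helper names are hypothetical; `low_bytes_via_memcpy` mirrors the shape of the `memcpy(dest, &index_data, ...)` line in the diff, with a fixed 2-byte destination for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte first. */
static int is_little_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: copies the first two bytes of a size_t. These are
   the low-order bytes only on a little-endian host. */
static uint16_t low_bytes_via_memcpy(size_t value)
{
    uint16_t out;
    assert(sizeof(out) <= sizeof(value));
    memcpy(&out, &value, sizeof(out));
    return out;
}

/* Hypothetical helper: a narrowing cast, which selects the low-order
   bits regardless of byte order. */
static uint16_t low_bytes_via_cast(size_t value)
{
    return (uint16_t)value;
}
```

On a little-endian host both helpers agree. On a big-endian host the memcpy variant would pick up the high-order bytes instead, which is the portability concern raised above.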

Oh, this should also copy out_component_size, not index_component_size.

Finally, I would need to measure this, but this might significantly regress the 16->32 bit conversion speed, because the memcpy here is very short and variable size so the compiler won't inline it. It might be better to explicitly handle index_component_size=1,2,4 with a switch and a fixed-length write directly.

Sounds good. Do you want to profile it before we merge it?

Yeah I'll need to double check perf. We never need to copy 1 byte indices here since even if out_component_size is 1, this forces us down the memcpy path, so I think this can just check if out_component_size is 2 and write a 2-byte or 4-byte index accordingly.

> Oh, this should also copy out_component_size, not index_component_size.

I'm not sure this is correct. If the output array is uint32_t (out_component_size) but each index is uint16_t (index_component_size), we want to:

Copy just the uint16_t of index data.

Move the index pointer to the next index (+ 16 bits).

Move the array pointer to the next index (+ 32 bits).

This is what the current loop is doing. Let me know if I'm misunderstanding your proposed change.

> I think this can just check if out_component_size is 2 and write a 2-byte or 4-byte index accordingly.

Keep in mind, this else block handles the case where index_component_size < out_component_size. According to the glTF specification, index data can be UNSIGNED_BYTE, UNSIGNED_SHORT, or UNSIGNED_INT. So the cases would be (as you said earlier):

index size 1 to output size 2, 4, 8

index size 2 to output size 4, 8

index size 4 to output size 8

> Finally, I would need to measure this, but this might significantly regress the 16->32 bit conversion speed, because the memcpy here is very short and variable size so the compiler won't inline it. It might be better to explicitly handle index_component_size=1,2,4 with a switch and a fixed-length write directly.

Yeah looking at Godbolt, I think you're right. Switching from the variable copy to the fixed-length one inlines on GCC/Clang/MSVC.
https://godbolt.org/z/6rqYYfvoq

I've done this in the latest change.

At the cost of a little copy-paste, I moved the switch outside of the for loop so you don't pay the cost of the branch every loop.

> Copy just the uint16_t of index data.

This will leave gaps in the target data: for every 4 bytes that the caller expects, you'll be only writing 2. This works without changes in gltfpack (which expects 32-bit output), but that's only because gltfpack uses zero-initialized memory for the output buffer which should not be expected.

Since we have separate loops anyway, I would recommend just writing uint16_t or uint32_t according to the output (critically!) component size. This will probably also mean we'll be able to drop the case 1 as it becomes redundant, and this will actually remove the extra endianness concerns.
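The recommendation above could look roughly like the following. This is a sketch under assumptions rather than the merged code: `read_index_sketch` and `write_indices_sketch` are hypothetical names, the source bytes are assumed to be in host byte order, and only 2- and 4-byte outputs are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a per-element index read. */
static size_t read_index_sketch(const uint8_t* element, size_t size)
{
    switch (size)
    {
    case 1: return element[0];
    case 2: { uint16_t v; memcpy(&v, element, sizeof(v)); return v; }
    case 4: { uint32_t v; memcpy(&v, element, sizeof(v)); return v; }
    default: return 0;
    }
}

/* The switch sits outside the loops, and each loop stores a whole output
   element, so every byte the caller expects is written. */
static void write_indices_sketch(void* out, size_t out_component_size,
                                 const uint8_t* src,
                                 size_t index_component_size,
                                 size_t stride, size_t count)
{
    assert(out != NULL && src != NULL);
    const uint8_t* element = src;

    switch (out_component_size)
    {
    case 2:
        for (size_t i = 0; i < count; ++i, element += stride)
            ((uint16_t*)out)[i] =
                (uint16_t)read_index_sketch(element, index_component_size);
        break;
    case 4:
        for (size_t i = 0; i < count; ++i, element += stride)
            ((uint32_t*)out)[i] =
                (uint32_t)read_index_sketch(element, index_component_size);
        break;
    default:
        break; /* other output sizes are not sketched here */
    }
}
```

Because the stores are typed by the output element, the result does not rely on the output buffer being zero-initialized, and the narrowing or widening is done by value rather than by copying a prefix of bytes.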

Yeah I was thinking to require the user to provide a zero-initialized array. You're right that this isn't necessarily a reasonable expectation.

Ah okay I think I see what you're saying now. I've made this change and tested it for the 16-bit data -> 32-bit array case, works as expected.

prideout · 2023-10-25T01:17:40Z

> Maybe @prideout has context here -- is reading floats-as-indices via cgltf_accessor_read_index used in Filament?

No. If we were to remove that particular case from the switch, I don't think we would adversely affect Filament.

As I recall, I included that case for completeness, since this function was originally conceived as a general purpose conversion utility, which is consistent with a strict interpretation of its docblock at the top of the file.

Note: I am no longer affiliated with Filament or Google.

zeux · 2023-10-25T02:49:21Z

Great, thanks! I submitted #232 to remove these to avoid future confusion and ensure that unpack_indices never encounters valid scenarios where it could decode float -> int.

zeux

Thanks! This looks good to merge.

zeux reviewed Oct 25, 2023

zeux mentioned this pull request Oct 25, 2023

Remove float->int casts from read_integer/read_index #232

Merged

shawnhatori added 2 commits October 25, 2023 15:24

use fixed size memcpy, update function signature

f295b3a

write with respect to output array size

1afdc74

zeux approved these changes Oct 29, 2023

shawnhatori requested a review from jkuhlmann November 2, 2023 17:47

jkuhlmann approved these changes Nov 27, 2023

jkuhlmann merged commit 3209a22 into jkuhlmann:master Nov 27, 2023
3 checks passed
