-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unpack indices of arbitrary component types #227
Conversation
This doesn't seem right. When the input data is floats, we will now memcpy floats into 32-bit unsigned int output which is not what the caller expects. When the input data is r16u, we will now memcpy 16-bit indices into 32-bit unsigned int output which is also not what the caller expects. This function gives the caller a 32-bit unsigned index buffer while handling format conversions; the memcpy is an optimization for the case where no conversion is required. |
You're right, the change as written was not clear. I'll explain the intention. What the API currently does: Requires index data to be stored in a What I was expecting: Takes in a In my use case (and I assume many others), I own and know my mesh data, which all have 16-bit index values so my indices are stored as an array of I'll leave it up to the maintainers to determine if that's a use case worth supporting, and if so, how to without a breaking API change. For the sake of discussion, I'll add a temporary new function that aims to handle this case as described above. |
Since the `memcpy` only needs the size of the index data, it can use the component size (when equal to the stride) to determine the total number of bytes to copy. This allows it to work for different component types, including the common `cgltf_component_type_r_16u`. This also adds a reference comment for the function at the top of the file, consistent with `cgltf_accessor_unpack_floats`.
@zeux Does the most recent code change address the data type issues you identified in your previous comment? I believe it's correct now. For what it's worth, I've been using this change in my own asset pipeline to support my 16-bit index use case and it has worked great. I did test in a debugger using 32-bit indices and the data layout appears correct (i.e. no breaking change). If this is a use case worth supporting, the interface obviously needs tweaks. This could possibly be an |
Yeah I think this addresses my concerns; I tested this in gltfpack which uses 32-bit indices and the change works without issues. I think we would need to either add this argument as mandatory to unpack_indices, or call the new function something else. Because unpack_indices was added after the last numbered release, I would recommend just adding a required argument to the function - all callers should be trivial to adjust. There's still one corner case that I mentioned that isn't handled the same way with this change: namely, when the input is floats, memcpy will reinterpret the bits if the destination is unsigned int. If the component type is passed instead of a component size, the function can do memcpy only when the accessor type matches the expected type. You will note that However, thinking about this and looking at the history, I am not sure why we're supporting reading floats as indices in the first place. This was initially introduced all the way back in 1f77128 and then gradually refactored into the current state. I would expect read_index to be used when glTF spec disallows floating point values, and the only example OTOH that's even theoretically plausible is feature IDs in upcoming EXT_mesh_features, that I would reasonably expect to read as floats. Maybe @prideout has context here -- is reading floats-as-indices via |
cgltf.h
Outdated
{ | ||
*dest = (cgltf_uint)cgltf_component_read_index(element, accessor->component_type); | ||
cgltf_size index_data = cgltf_component_read_index(element, accessor->component_type); | ||
memcpy(dest, &index_data, index_component_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor note is that this only works on little-endian platforms. Maybe that's fine though, as we have issues with endian conversion in general, but a comment would be nice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this should also copy out_component_size
, not index_component_size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, I would need to measure this, but this might significantly regress the 16->32 bit conversion speed, because the memcpy here is very short and variable size so the compiler won't inline it. It might be better to explicitly handle index_component_size=1,2,4 with a switch and a fixed-length write directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Do you want to profile it before we merge it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'll need to double check perf. We never need to copy 1 byte indices here since even if out_component_size is 1, this forces us down the memcpy path, so I think this can just check if out_component_size is 2 and write a 2-byte or 4-byte index accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this should also copy
out_component_size
, notindex_component_size
.
I'm not sure this is correct. If the output array is uint32_t
(out_component_size
) but the each index is uint16_t
(index_component_size
), we want to:
- Copy just the
uint16_t
of index data. - Move the index pointer to the next index (+ 16 bits).
- Move the array pointer to the next index (+ 32 bits).
This is what the current loop is doing. Let me know if I'm misunderstanding your proposed change.
I think this can just check if out_component_size is 2 and write a 2-byte or 4-byte index accordingly.
Keep in mind, this else block handles the case where index_component_size < out_component_size
. According to the glTF specification, index data can be UNSIGNED_BYTE
, UNSIGNED_SHORT
, or UNSIGNED_INT
. So the cases would be (as you said earlier):
- index size 1 to output size 2, 4, 8
- index size 2 to output size 4, 8
- index size 4 to output size 8
Finally, I would need to measure this, but this might significantly regress the 16->32 bit conversion speed, because the memcpy here is very short and variable size so the compiler won't inline it. It might be better to explicitly handle index_component_size=1,2,4 with a switch and a fixed-length write directly.
Yeah looking at Godbolt, I think you're right. Switching from the variable copy to the fixed-length one inlines on GCC/Clang/MSVC.
https://godbolt.org/z/6rqYYfvoq
I've done this in the latest change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the cost of a little copy-paste, I moved the switch outside of the for loop so you don't pay the cost of the branch every loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy just the uint16_t of index data.
This will leave gaps in the target data: for every 4 bytes that the caller expects, you'll be only writing 2. This works without changes in gltfpack (which expects 32-bit output), but that's only because gltfpack uses zero-initialized memory for the output buffer which should not be expected.
Since we have separate loops anyway, I would recommend just writing uint16_t or uint32_t according to the output (critically!) component size. This will probably also mean we'll be able to drop the case 1
as it becomes redundant, and this will actually remove the extra endianness concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was thinking to require the user to provide a zero-initialized array. You're right that this isn't necessarily a reasonable expectation.
Ah okay I think I see what you're saying now. I've made this change and tested it for the 16-bit data -> 32-bit array case, works as expected.
No. If we were to remove that particular case from the switch, I don't think we would adversely affect Filament. As I recall, I included that case for completeness, since this function was originally conceived as a general purpose conversion utility, which is consistent with a strict interpretation of its docblock at the top of the file. Note: I am no longer affiliated with Filament or Google. |
Great, thanks! I submitted #232 to remove these to avoid future confusion and ensure that unpack_indices never encounters valid scenarios where it could decode float -> int. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks good to merge.
Since the
memcpy
only needs the size of the index data, it can use the component size (when equal to the stride) to determine the total number of bytes to copy. This allows it to work for different component types, including the commoncgltf_component_type_r_16u
.This also adds a reference comment for the function at the top of the file, consistent with
cgltf_accessor_unpack_floats
.