Skip to content

Make RD::texture_create take a Span<Span<uint8_t>> to avoid needless allocations.#110901

Open
Ivorforce wants to merge 1 commit intogodotengine:masterfrom
Ivorforce:no-to-byte-array
Open

Make RD::texture_create take a Span<Span<uint8_t>> to avoid needless allocations.#110901
Ivorforce wants to merge 1 commit intogodotengine:masterfrom
Ivorforce:no-to-byte-array

Conversation

@Ivorforce
Copy link
Copy Markdown
Member

@Ivorforce Ivorforce commented Sep 25, 2025

The API RD::texture_create currently necessitates allocation (often multiple) for calls.
However, it only reads from the given arguments, so most of the time, no allocation should be necessary. This PR avoids allocation by using Span for the arguments instead.

In the end, to_byte_array was a code smell, and all of its users ended up copying data unnecessarily. To avoid this pattern in the future, I delete it from LocalVector, and discourage its use internally in Vector.

Note: This is the first use of FixedVector, which is pending a fix: #106997

Caveats

The proposed API is just a bit weird when you think about it. We should discuss if it should be merged in the current state.

For one, some callers currently call Image::get_data() to gather each image's data into some list. If we simply used get_data().span(), the callee would be allowed to copy their data before return, the data would be deallocated after the call, and the Span created from it would be invalid. While this is not currently the case, the current API allows for it. To avoid this situation, I introduce get_data_span() to Image, to force it to return a Span to its own owned data. The new API could be used in other situations where a quick view into the data is needed.

Another weird point is that the new API can potentially cause situations where additional allocations are needed:
If you have a Vector<Vector<uint8_t>> already, for whatever reason, this change will create the need to allocate for a Vector<Span<uint8_t>> instead.
However, this does not happen in the codebase. There are basically just two situations:

  • The caller passes a known number of images (1 to 6), and this can be statically allocated.
  • The caller passes an unknown number of images as e.g. TypedArray<PackedByteArray> (aka Vector<Variant>), and conversions are needed anyway. The conversion to LocalVector<Span<uint8_t>> is cheaper than the previous one to Vector<Vector<uint8_t>>.

For both of these situations, no allocations are needed.

Notes

RenderingDevice::texture_create sometimes adds more textures to the given ones. While it would be possible to implement this without many additional allocations, I just do it the same way the current implementation handles it and copy the given data if need be. It looks a little more complicated, but it's more performant than the previous implementation.

@Ivorforce Ivorforce added this to the 4.x milestone Sep 25, 2025
@Ivorforce Ivorforce requested review from a team as code owners September 25, 2025 16:33
@Ivorforce Ivorforce force-pushed the no-to-byte-array branch 3 times, most recently from ac47f17 to febf9fd Compare September 25, 2025 18:26
src_image_ptr[0].resize(src_mip_size);
memcpy(src_image_ptr[0].ptrw(), r_img->ptr() + src_mip_ofs, src_mip_size);
src_image.resize(src_mip_size);
memcpy(src_image.ptr(), r_img->ptr() + src_mip_ofs, src_mip_size);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, this memcpy/resize here should be avoided and the span should just point to the data in the source image.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, that should totally be possible in this branch!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, on second look, this is inside a for loop which assigns mipmaps separately. Reading directly from the source buffer would only be possible if the data is already in the exact correct format. Trying to figure this out is out of scope for this PR.

Copy link
Copy Markdown
Contributor

@kiroxas kiroxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results in significantly fewer allocations! As you mentioned in the caveats, we’ll need to be more thorough when reviewing code around texture creation to ensure the original image lifetimes are managed correctly. Still, I think the tradeoff is well worth it.


Vector<PackedByteArray> initial_data;
Vector<uint8_t> initial_data_vector;
FixedVector<Span<uint8_t>, 1> initial_data;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a span array directly here ? Span<<Span<uint8_t>> initialData[1] = {};

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a tricky one to answer.
First of all, Span<Span<uint8_t>> initialData[1] is not meaningful in this context. The meaningful types are either Span<Span<uint8_t>> initial_data (i.e. the type of the data passed to texture_create directly) or Span<uint8_t> initial_data[1] (i.e. an actual container that can store the type of data passed to texture_create. Your given type has one too many dimensions.

The second example (Span<uint8_t> initial_data[1]) doesn't work because we need to pass an empty Span sometimes, and a Span with a single Span at other times. Hardcoding to [1] isn't an option.
The first example (Span<Span<uint8_t>>) could work, because we can set the value to both an empty Span, or to a single element Span. But remember that Span cannot actually hold data. So if it is assigned to a single-element Span, the actual element must be stored elsewhere. It could look like this:

Span<uint8_t> initial_data;
Span<Span<uint8_t>> initial_data_container{};
if (...) {
    initial_data = image.span();
    initial_data_container = Span(&initial_data, 1);
}

While this works, it's also a bit contrived for my taste. I have no doubt this will be obtuse black magic to inexperienced developers, and it's highly prone to errors on reformats.
The FixedVector version, in contrast, works like a simple Vector type, and can be understood by anyone who understands simple containers. Therefore, that's the version I prefer.

@clayjohn
Copy link
Copy Markdown
Member

For one, some callers currently call Image::get_data() to gather each image's data into some list. If we simply used get_data().span(), the callee would be allowed to copy their data before return, the data would be deallocated after the call, and the Span created from it would be invalid. While this is not currently the case, the current API allows for it. To avoid this situation, I introduce get_data_span() to Image, to force it to return a Span to its own owned data. The new API could be used in other situations where a quick view into the data is needed.

I don't understand this. Doesn't texture_create always copy the contents of the p_data buffer before returning? Why are we worried about a span no longer being available after the call has returned?

@Ivorforce
Copy link
Copy Markdown
Member Author

Ivorforce commented Oct 21, 2025

I don't understand this. Doesn't texture_create always copy the contents of the p_data buffer before returning? Why are we worried about a span no longer being available after the call has returned?

It does. What I'm worried about is people creating the span() on some line before making the call:

Span<uint8_t> span = p_image.get_data().span();
// p_image.get_data() already destructed, span() potentially pointing to deallocated data
RD::texture_create(... span ...);

The correct version would have to be:

Vector<uint8_t> data = p_image.get_data();
Span<uint8_t> span = data.span();
RD::texture_create(... span ...);
// or just 
RD::texture_create(... image.get_data().span() ...);

While I don't think this would cause issues now, through the current interfaces it would technically be legal for p_image.get_data() to return a fresh vector. So that would be a potential footgun caused by the interface design.

…rray` internally.

Make `RD::texture_create` take a `Span<Span<uint8_t>>` to avoid needless allocations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants