Make `RD::texture_create` take a `Span<Span<uint8_t>>` to avoid needless allocations. by Ivorforce · Pull Request #110901 · godotengine/godot

Ivorforce · 2025-09-25T16:33:38Z

RD::texture_create currently forces callers to allocate, often more than once, just to make the call.
However, it only reads from the given arguments, so most of the time no allocation should be necessary. This PR avoids the allocations by taking Span arguments instead.

In the end, to_byte_array was a code smell, and all of its users ended up copying data unnecessarily. To avoid this pattern in the future, I delete it from LocalVector, and discourage its use internally in Vector.

Note: This is the first use of FixedVector, which is pending a fix: #106997

Caveats

The proposed API is just a bit weird when you think about it. We should discuss if it should be merged in the current state.

For one, some callers currently call Image::get_data() to gather each image's data into some list. If we simply used get_data().span(), the callee would be allowed to copy their data before return, the data would be deallocated after the call, and the Span created from it would be invalid. While this is not currently the case, the current API allows for it. To avoid this situation, I introduce get_data_span() to Image, to force it to return a Span to its own owned data. The new API could be used in other situations where a quick view into the data is needed.
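To make the distinction concrete, here is a minimal sketch in standard C++ rather than Godot's own types. ToyImage, ByteView, and sum_bytes are hypothetical names made up for illustration, and ByteView only approximates what Span does. The point is that a by-value getter may hand out a copy, while a span getter is a view onto storage the image owns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Minimal non-owning view; a hypothetical stand-in for Span<uint8_t>.
struct ByteView {
	const uint8_t *ptr;
	size_t len;
};

// Hypothetical image class; not Godot's Image.
class ToyImage {
	std::vector<uint8_t> data;

public:
	explicit ToyImage(std::vector<uint8_t> p_data) :
			data(std::move(p_data)) {}

	// Returns by value, so the interface permits handing out a fresh copy.
	std::vector<uint8_t> get_data() const { return data; }

	// Returns a view onto the storage owned by the image itself.
	// The view stays valid while the image is alive and its buffer is not reallocated.
	ByteView get_data_span() const { return ByteView{ data.data(), data.size() }; }
};

// Read-only consumer: sums the bytes it is shown without copying them.
uint32_t sum_bytes(ByteView p_view) {
	assert(p_view.ptr != nullptr || p_view.len == 0);
	uint32_t total = 0;
	for (size_t i = 0; i < p_view.len; i++) {
		total += p_view.ptr[i];
	}
	return total;
}
```

In this sketch, a view taken from get_data_span() aliases the image's buffer, whereas the vector returned by get_data() has its own storage, so a view built from that temporary would outlive the data it points to.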

Another weird point is that the new API can potentially cause situations where additional allocations are needed:
If you have a Vector<Vector<uint8_t>> already, for whatever reason, this change will create the need to allocate for a Vector<Span<uint8_t>> instead.
However, this does not happen in the codebase. There are basically just two situations:

The caller passes a known number of images (1 to 6), and this can be statically allocated.
The caller passes an unknown number of images as e.g. TypedArray<PackedByteArray> (aka Vector<Variant>), and conversions are needed anyway. The conversion to LocalVector<Span<uint8_t>> is cheaper than the previous one to Vector<Vector<uint8_t>>.

For both of these situations, no allocations are needed.
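The difference between a known layer count and a nested owning container can be sketched as follows. This is an illustration in standard C++ under assumed names (View, total_bytes, measure_cubemap, measure_layers are all hypothetical), not the code from this PR. The first function keeps its views in a stack array; the second shows the caveat above, where an existing vector of vectors needs a list of views built first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal non-owning view; a hypothetical stand-in for Span<T>.
template <typename T>
struct View {
	const T *ptr;
	size_t len;
};
using ByteView = View<uint8_t>;

// Read-only callee (hypothetical): adds up the sizes of all layers.
size_t total_bytes(View<ByteView> p_layers) {
	assert(p_layers.ptr != nullptr || p_layers.len == 0);
	size_t total = 0;
	for (size_t i = 0; i < p_layers.len; i++) {
		total += p_layers.ptr[i].len;
	}
	return total;
}

// Known layer count: the views live in a stack array, so nothing is heap-allocated here.
size_t measure_cubemap(const std::vector<uint8_t> (&p_faces)[6]) {
	ByteView views[6];
	for (size_t i = 0; i < 6; i++) {
		views[i] = ByteView{ p_faces[i].data(), p_faces[i].size() };
	}
	return total_bytes(View<ByteView>{ views, 6 });
}

// Nested owning container: a list of views has to be built first.
// That costs one allocation for the views, but no byte data is copied.
size_t measure_layers(const std::vector<std::vector<uint8_t>> &p_layers) {
	std::vector<ByteView> views;
	views.reserve(p_layers.size());
	for (const std::vector<uint8_t> &layer : p_layers) {
		views.push_back(ByteView{ layer.data(), layer.size() });
	}
	return total_bytes(View<ByteView>{ views.data(), views.size() });
}
```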

Notes

RenderingDevice::texture_create sometimes adds more textures to the given ones. While it would be possible to implement this without many additional allocations, I just do it the same way the current implementation handles it and copy the given data if need be. It looks a little more complicated, but it's more performant than the previous implementation.

core/templates/local_vector.h

BlueCube3310 · 2025-09-26T08:12:29Z

modules/betsy/image_compress_betsy.cpp

-			src_image_ptr[0].resize(src_mip_size);
-			memcpy(src_image_ptr[0].ptrw(), r_img->ptr() + src_mip_ofs, src_mip_size);
+			src_image.resize(src_mip_size);
+			memcpy(src_image.ptr(), r_img->ptr() + src_mip_ofs, src_mip_size);


If possible, this memcpy/resize here should be avoided and the span should just point to the data in the source image.

Ivorforce · 2025-09-26

Oh right, that should totally be possible in this branch!

Ivorforce · 2025-09-26

Actually, on second look, this is inside a for loop which assigns mipmaps separately. Reading directly from the source buffer would only be possible if the data is already in the exact correct format. Trying to figure this out is out of scope for this PR.
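For reference, a rough sketch of what pointing the span at the source data could look like, assuming the bytes at that offset are already in the layout the consumer expects (which, per the reply above, is not guaranteed here). ByteView and view_mip are hypothetical names, written in standard C++ rather than with Godot's types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal non-owning view; a hypothetical stand-in for Span<uint8_t>.
struct ByteView {
	const uint8_t *ptr;
	size_t len;
};

// Views p_size bytes starting at p_offset inside p_source, without copying.
// Only meaningful if those bytes are already in the layout the consumer expects.
ByteView view_mip(const std::vector<uint8_t> &p_source, size_t p_offset, size_t p_size) {
	assert(p_offset <= p_source.size());
	assert(p_size <= p_source.size() - p_offset);
	return ByteView{ p_source.data() + p_offset, p_size };
}
```

The view replaces the resize and memcpy pair, but it must not outlive p_source.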

kiroxas

This results in significantly fewer allocations! As you mentioned in the caveats, we’ll need to be more thorough when reviewing code around texture creation to ensure the original image lifetimes are managed correctly. Still, I think the tradeoff is well worth it.

kiroxas · 2025-09-27T07:45:56Z

servers/rendering/renderer_rd/effects/fsr2.cpp


-	Vector<PackedByteArray> initial_data;
+	Vector<uint8_t> initial_data_vector;
+	FixedVector<Span<uint8_t>, 1> initial_data;


Why not use a span array directly here? Span<Span<uint8_t>> initialData[1] = {};

Ivorforce · 2025-09-27

That's a tricky one to answer.
First of all, Span<Span<uint8_t>> initialData[1] is not meaningful in this context. The meaningful types are either Span<Span<uint8_t>> initial_data (i.e. the type of the data passed to texture_create directly) or Span<uint8_t> initial_data[1] (i.e. an actual container that can store the type of data passed to texture_create). Your given type has one too many dimensions.

The second example (Span<uint8_t> initial_data[1]) doesn't work because we need to pass an empty Span sometimes, and a Span with a single Span at other times. Hardcoding to [1] isn't an option.
The first example (Span<Span<uint8_t>>) could work, because we can set the value to either an empty Span or a single-element Span. But remember that Span cannot actually hold data. So if it is assigned a single-element Span, the actual element must be stored elsewhere. It could look like this:

Span<uint8_t> initial_data;
Span<Span<uint8_t>> initial_data_container{};
if (...) {
	initial_data = image.span();
	initial_data_container = Span(&initial_data, 1);
}

While this works, it's also a bit contrived for my taste. I have no doubt this will be obtuse black magic to inexperienced developers, and it's highly prone to errors on reformats.
The FixedVector version, in contrast, works like a simple Vector type, and can be understood by anyone who understands simple containers. Therefore, that's the version I prefer.
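As an illustration of why the container version reads more simply, here is a minimal sketch with a hand-rolled fixed-capacity vector. TinyFixedVector, View, total_bytes, and measure_optional are hypothetical names written in standard C++; this is not Godot's FixedVector and makes no claim about its actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal non-owning view; a hypothetical stand-in for Span<T>.
template <typename T>
struct View {
	const T *ptr;
	size_t len;
};
using ByteView = View<uint8_t>;

// Hypothetical fixed-capacity container with inline storage; not Godot's FixedVector.
template <typename T, size_t CAPACITY>
class TinyFixedVector {
	T items[CAPACITY] = {};
	size_t count = 0;

public:
	void push_back(const T &p_item) {
		assert(count < CAPACITY);
		items[count++] = p_item;
	}
	const T *ptr() const { return items; }
	size_t size() const { return count; }
};

// Read-only callee (hypothetical): adds up the sizes of all layers.
size_t total_bytes(View<ByteView> p_layers) {
	size_t total = 0;
	for (size_t i = 0; i < p_layers.len; i++) {
		total += p_layers.ptr[i].len;
	}
	return total;
}

// Passes either zero or one layer, depending on whether initial data exists.
size_t measure_optional(const std::vector<uint8_t> *p_image) {
	TinyFixedVector<ByteView, 1> initial_data;
	if (p_image != nullptr) {
		initial_data.push_back(ByteView{ p_image->data(), p_image->size() });
	}
	return total_bytes(View<ByteView>{ initial_data.ptr(), initial_data.size() });
}
```

The element and its container are one object here, so there is no second variable whose lifetime has to be kept in sync with the outer view.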

clayjohn · 2025-10-21T05:21:41Z

> For one, some callers currently call Image::get_data() to gather each image's data into some list. If we simply used get_data().span(), the callee would be allowed to copy their data before return, the data would be deallocated after the call, and the Span created from it would be invalid. While this is not currently the case, the current API allows for it. To avoid this situation, I introduce get_data_span() to Image, to force it to return a Span to its own owned data. The new API could be used in other situations where a quick view into the data is needed.

I don't understand this. Doesn't texture_create always copy the contents of the p_data buffer before returning? Why are we worried about a span no longer being available after the call has returned?

Ivorforce · 2025-10-21T08:37:37Z

> I don't understand this. Doesn't texture_create always copy the contents of the p_data buffer before returning? Why are we worried about a span no longer being available after the call has returned?

It does. What I'm worried about is people creating the span() on some line before making the call:

Span<uint8_t> span = p_image.get_data().span();
// p_image.get_data() already destructed, span() potentially pointing to deallocated data
RD::texture_create(... span ...);

The correct version would have to be:

Vector<uint8_t> data = p_image.get_data();
Span<uint8_t> span = data.span();
RD::texture_create(... span ...);
// or just
RD::texture_create(... p_image.get_data().span() ...);

While I don't think this would cause issues now, through the current interfaces it would technically be legal for p_image.get_data() to return a fresh vector. So that would be a potential footgun caused by the interface design.

Ivorforce added this to the 4.x milestone Sep 25, 2025

Ivorforce added the enhancement label Sep 25, 2025

Ivorforce requested review from a team as code owners September 25, 2025 16:33

Ivorforce added the discussion, topic:rendering, and performance labels Sep 25, 2025

Ivorforce force-pushed the no-to-byte-array branch 3 times, most recently from ac47f17 to febf9fd, September 25, 2025 18:26

kiroxas reviewed Sep 26, 2025

core/templates/local_vector.h (outdated, resolved)

BlueCube3310 reviewed Sep 26, 2025

Ivorforce force-pushed the no-to-byte-array branch from febf9fd to 6bd3fb2, September 26, 2025 11:17

kiroxas approved these changes Sep 27, 2025

BlueCube3310 mentioned this pull request Oct 3, 2025

Betsy: Convert RGB to RGBA on the GPU for faster compression #110060

Merged

2 tasks

Remove LocalVector::to_byte_array and discourage `Vector::to_byte_a…

405651f

…rray` internally. Make `RD::texture_create` take a `Span<Span<uint8_t>>` to avoid needless allocations.

Ivorforce force-pushed the no-to-byte-array branch from 6bd3fb2 to 405651f, February 13, 2026 11:46

Ivorforce wants to merge 1 commit into godotengine:master from Ivorforce:no-to-byte-array

5 participants
