Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve GDExtension user-friendliness by being more explicit on when return value should be passed initialized #35813

Merged

Conversation

touilleMan
Copy link
Member

@touilleMan touilleMan commented Feb 1, 2020

fix #34264

ptrcall always overwrite the return value buffer with it own return value
data. Hence a non initialized buffer should be provided to ptrcall.

I've tested this with code like this:

godot_string __ret;
memset(&__ret, 0xff, sizeof(__ret));
gdapi10.godot_method_bind_ptrcall(
    __methbind__Object__get_class,
    my_obj_ptr,
    NULL,
    &__ret
)

without the fix this endup in a segfault, with this fix it works fine

I guess we should be double careful before merging this given ptrcall is a strange beast and my knowledge is limited on it ;-)

@touilleMan
Copy link
Member Author

Issue #35609 is also related to this

Considering this code (see touilleMan/godot-python@d0f2245):

godot_object *__ret;  // C doesn't automatically do zero initialization
gdapi10.godot_method_bind_ptrcall(
    __methbind__Object__get_class,  // Object.get_class returns a Reference object
    my_obj_ptr,
    NULL,
    &__ret  // Reference.unreference will be called on __ret as part of it destructor
            // hence causing a segfault given __ret points on nothing
)

@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from bdb714a to 1407ab6 Compare April 13, 2020 07:26
@touilleMan
Copy link
Member Author

I've rebased this branch on the master, should be ready to merge ;-)

@touilleMan
Copy link
Member Author

@vnen can you have a look at this ? 😃

@neikeq
Copy link
Contributor

neikeq commented Mar 16, 2021

Not entirely sure what this does. How would this affect engine code (not GDNative) like the following?:

String ret;
method->ptrcall(ptr, nullptr, &ret);

Ref<Reference> ret;
method->ptrcall(ptr, nullptr, &ret);

This is what the glue for C# uses.

@touilleMan
Copy link
Member Author

@neikeq thanks you for your input, this is a very interesting

How would this affect engine code (not GDNative) like the following?:

The initial String object would be overwritten without proper call to it destructor.

This is "fine" in your example given String constructor doesn't do any heap allocation:

_FORCE_INLINE_ String() {}

and neither does it CowData field:
mutable T *_ptr = nullptr;

It feels however very fragile:

  • if String initialization changes it could lead to hard to track memory leaks
  • it seems very strange that String() doesn't cause memory leak but String("foo") does... (this would bite people trying to save some space in there stack by re-using a string variable in their code)
  • ptrcall doesn't only returns string objects, there could be other objects

So I would say there different expectation between the C and the C++ API here:
in C, developer expects to do:

godot_string ret;
gdapi10.godot_method_bind_ptrcall(
    __methbind__Object__get_class,
    my_obj_ptr,
    NULL,
    &ret
)
[...]
godot_string_destroy(&ret)

in C++, developer expects to do:

String ret;
method->ptrcall(ptr, nullptr, &ret);
[...]

Currently the C API forces you to do the C++ way (which leads to segfault), this PR forces the C++ API to do the C way (which leads to memory leaks as you pointed out ^^)

The solutions I see:

  • Make godot_method_bind_ptrcall behave differently from method->ptrcall. I guess this would require introducing a method->ptrcall_with_uninitialized_ret
  • Add runtime check (disabled for release builds ?) to ensure the provided ret object is correct (I guess the only check we can do is "is the pointed data null initialized ?", this might be a problem for object other than strings)
  • Choose one way of doing and document it well (but ptrcall&gnative documentation are virtually existent, and it very easy to miss a comment lost in the codebase anyway...)

@neikeq
Copy link
Contributor

neikeq commented Apr 18, 2021

  • ptrcall doesn't only returns string objects, there could be other objects

Can confirm this is already the case with Array, which does allocate in the default constructor:

Array::Array() {
_p = memnew(ArrayPrivate);
_p->refcount.init();
}

So this PR would cause leaks with the C# glue:

https://gist.github.com/neikeq/857a621f2878135ce1e2a702cd0a2bb9#file-mono_glue-gen-cpp-L119

@neikeq
Copy link
Contributor

neikeq commented Apr 18, 2021

I agree with the change and I don't have a problem updating the C# glue. I'm committing some changes soon which will make this update easier. However, this is a breaking change so we need opinion from the @godotengine/gdnative team. There's also the concern you mention about someone passing a constructed object like Array which would leak if not destructed; but ptrcall was always a very strict internal feature and you should understand it well before using it.

@aaronfranke aaronfranke requested a review from a team August 25, 2021 01:35
@aaronfranke
Copy link
Member

@touilleMan GDNative has been replaced with GDExtension. Does this problem still exist on the current master? If so, this PR needs to be rebased on the latest master branch.

@briansemrau
Copy link
Contributor

Assuming this PR is still valid, this would introduce a memory leak in GDScript here:

method->ptrcall(base_obj, argptrs, ret_opaque);

The fix should be easy. Just want to make sure it doesn't slip through.

@vnen
Copy link
Member

vnen commented Oct 8, 2021

I don't think this should be done. AFAIK you're expected to pass a constructed value to ptrcall. If the problem is the C API then I think it should be done only for the C API.

@touilleMan
Copy link
Member Author

see my previous comment on C++ vs C expectations #35813 (comment)

I agree that the best way to solve this would be to keep the ptrcall working as a C++ developer expects and introduce a new ptrcall_with_uninitialized_ret woking as a C developer expects

@touilleMan
Copy link
Member Author

touilleMan commented Sep 18, 2022

This PR is totally outdated, but the issue still remains 😄 (tl;dr: well maybe there is only a documentation issue after all)

So here is an attempt to provide an up-to-date view on the issue (hence ping @godotengine/gdextension !):

1 - How things work inside Godot

Godot is a C++ codebase, so all the builtins classes (e.g. Variant, Vector2, String etc.) get their constructor automatically called when a value of their type is created.

So it's fine to have method that get output parameter as reference (i.e. r_ret here):

static void evaluate(const Operator &p_op, const Variant &p_a, const Variant &p_b, Variant &r_ret, bool &r_valid);

and use them this way:

Variant r;
evaluate(OP_NOT_EQUAL, *this, p_variant, r, v);
return r;

And finally in evaluate we have typically something like

r_ret = Variant();

So here r is initialized once by Variant's default constructor, then passed as reference to evaluate which in turn creates it own Variant object (hence calling Variant's constructor) that overwrite r (hence Variant's destructor is called on r just before the overwrite occurs)

(one could argue that r is initialized twice)

2 - How things works with GDExtension

On the other hand, GDExtension is a C API, so the evaluate is exposed with a different signature:

void (*variant_evaluate)(GDNativeVariantOperator p_op, const GDNativeVariantPtr p_a, const GDNativeVariantPtr p_b, GDNativeVariantPtr r_return, GDNativeBool *r_valid);

Being a C API, the user would expect to be able to do:

gd_variant_t return_value;  // Let's pretend gd_variant_t is defined as a C opaque struct of sizeof(Variant)
GDNativeBool valid;
godot_api->variant_evaluate(p_op, p_a, p_b, &return_value, &valid)

This will work fine as long as the compiler does zero initialization, which is not guaranteed in C (unlike for C++).
So basically your code seems to work until you change your compiler or try to make it work on another OS then you get a nasty segfault :'(

This are getting worst when doing ptrcall on Godot classe method given some of them returns a String

godot/core/string/ustring.h

Lines 182 to 184 in 4ab3fdc

class String {
CowData<char32_t> _cowdata;
static const char32_t _null;

There is no destructor for String, meaning the default one is used which will call _cowdata's destructor...

CowData<T>::~CowData() {
_unref(_ptr);
}

...which will lead to a segfault :'(

String is not alone on this:

type has destructor destructor works fine on zero-initialized object
NIL no ✅ n/a
BOOL no ✅ n/a
INT no ✅ n/a
FLOAT no ✅ n/a
VECTOR2 no ✅ n/a
VECTOR2I no ✅ n/a
RECT2 no ✅ n/a
RECT2I no ✅ n/a
VECTOR3 no ✅ n/a
VECTOR3I no ✅ n/a
TRANSFORM2D no ✅ n/a
VECTOR4 no ✅ n/a
VECTOR4I no ✅ n/a
PLANE no ✅ n/a
QUATERNION no ✅ n/a
AABB no ✅ n/a
BASIS no ✅ n/a
TRANSFORM no ✅ n/a
PROJECTION no ✅ n/a
COLOR no ✅ n/a
RID no ✅ n/a
OBJECT (i.e. pointer) no ✅ n/a
VARIANT yes ⚠️ yes
STRING yes ⚠️ ✅ yes (calls CowData's destructor)
STRING_NAME yes ⚠️ yes
NODE_PATH yes ⚠️ yes
CALLABLE yes ⚠️ yes
SIGNAL yes ⚠️ ✅ yes (calls StringName's destructor])
DICTIONARY yes ⚠️ ⚠️ yes (but logs an error)
ARRAY yes ⚠️ ✅ yes
PACKED_BYTE_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_INT32_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_INT64_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_FLOAT32_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_FLOAT64_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_STRING_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_VECTOR2_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_VECTOR3_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)
PACKED_COLOR_ARRAY yes ⚠️ ✅ yes (calls CowData's destructor)

So to summarize, forcing zero-initialization may seem a good rule of thumb to be safe when using GDExtension API, however:

  1. Dictionary still creates an error logs (but we could iron this out easily ^^)
  2. It is an unexpected requirement, so we must add it in the documentation, then people won't read it and will get unexpected and hard-to-track issues :'(

3 - Zero-initialization on C is hard !

On top of that doing zero initialization is suprizingly error prone in C:
The best way to do it is

gd_variant_t a;
memset(&a, 0, sizeof(a));

However I've just written a buggy code because it should be written sizeof a or sizeof(gd_variant_t) (cf. sizeof syntax). Now good luck to debug this issue that occurs only some time on some Godot methods ;-)

Also C++ programmer are often eager to use gd_variant_t a = { 0 }; as zero-init, but in C this only initializes the first element !

Of course it depends on how gd_variant_t is defined... which is left up to the user of GDExtension.

The most obvious way of defining an opaque structure would be to make it include a singe char buffer field:

struct gd_variant_t  {
   char opaque[3];
};

So now given we have only one attribute in the structure we can do this fancy gd_variant_t a = { 0 };...
...and we may get a segfault because the structure may contains padding bytes whose value is undefined.

4 - Bonus: Correctly implementing return value as parameter is tricky !

1 - In Godot

Considering how gdnative_variant_call is implemented:

static void gdnative_variant_call(GDNativeVariantPtr p_self, const GDNativeStringNamePtr p_method, const GDNativeVariantPtr *p_args, const GDNativeInt p_argcount, GDNativeVariantPtr r_return, GDNativeCallError *r_error) {
Variant *self = (Variant *)p_self;
const StringName *method = (const StringName *)p_method;
const Variant **args = (const Variant **)p_args;
Variant ret;
Callable::CallError error;
self->callp(*method, args, p_argcount, ret, error);
memnew_placement(r_return, Variant(ret));

Line 89 we use memnew_placement that doesn't call destructor on r_return before overwritting it content.
This works fine until the caller starts reusing it objects:

gd_variant_t ret;
memset(&ret, 0, sizeof(gd_variant_t));
for (int i = 0; i++; i < 10) {
    // No need to put memset&destructor in the loop given `gdnative_variant_call` takes care of that !
    godot_api->gdnative_variant_call(&self, &method, NULL, 0, &ret, NULL);
}
godot_api->variant_destroy(&ret);

And now we've leaked data :'(

2 - In native extension

GDExtension allows user to register they own native extension, which should provide call&ptrcall function pointers for each methods:

typedef void (*GDNativeExtensionClassMethodCall)();
typedef void (*GDNativeExtensionClassMethodPtrCall)();

static void my_extension_do_something_call(void *method_userdata, GDExtensionClassInstancePtr p_instance, const GDNativeVariantPtr *p_args, const GDNativeInt p_argument_count, GDNativeVariantPtr r_return, GDNativeCallError *r_error) {
    GDNativeCallError error
    // Oups ! we forget to call `godot_api->variant_destroy(r_return)`, now we leak memory !
    godot_api->variant_constructor(GDNATIVE_VARIANT_TYPE_DICTIONARY, r_return, NULL, 0, &error);
    // Continue to populate the dictionary
}
static void my_extension_do_something_ptrcall_v1(void *method_userdata, GDExtensionClassInstancePtr p_instance, const GDNativeTypePtr *p_args, GDNativeTypePtr r_ret) {
    // r_ret is already a valid Dictionary given it has been initialized by the caller, let's use it right away...
    // ...but this is only true when the caller is Godot, if the caller is another native extension r_ret might just be
    // zero-initialized and we will get a segfault when using it :(
}
static void my_extension_do_something_ptrcall_v2(void *method_userdata, GDExtensionClassInstancePtr p_instance, const GDNativeTypePtr *p_args, GDNativeTypePtr r_ret) {
    // Oups ! we forget to call dictionary destructor first ! Dictionary does heap alloc on init so we most likely leak memory !
    godot_dictionary_constructor(r_ret);
    // Continue to populate the dictionary
}

So the rule of thumb for native extension developer is "always start by calling destructor on the return value" which seems
very easy to forget.
And if the extension developer does forget, this has not consequence most of the time. Until the native extension user
starts reusing objects in GDExtension. This looks like a recipe for disaster to me 😄

5 - What can we do about this ?

So to summarize:

  • not initializing return value leads to segfault
  • initialializing return value with constructor leads to memory leak given its complicated to know if the destructor is called

I see three possibilities to fix this, by order of simplicity:

1 - Always use constructor policy

The rule of thumb becomes:

For Variant and builtins that are marked has_destructor: true

  1. always call the constructor on return value before passing it in a call to GDExtension API
  2. always call the destructor on return value before using it in native extension

This is the cheapest solution given we only have to add huge stickers everywhere to tell about the rule of thumb (typically in GDExtension documentation when it will be available and in gdnative_interface.h)

Of course the downside is that it leaves the burden on GDExtension users and native extension developer.
However this might be considered ok given GDExtension is a low-level API and actual game developers are expected to use wrappers as Godot-CPP/Godot-Rust/Godot-Python etc.

2 - Don't do allocation on init

This is a variation of the previous, the idea is to guarantee that all builtins don't do allocation on init. This way not calling the destructor is ok as long as the object has not been used yet.

This has the added benefit of allowing to do initialization with zero-initialization instead of calling the constructor (but as we saw, this is error prone so I'm not sure how much of a benefit this is 😄 )

The main benefit is it remove the necessity for native extension developer to call the destructor on return value.

However removing allocation on init might cause issue with how array/dictionnary behave...

3 - Stick with the C convension

This is the approach I originally implemented in this PR. We would have to make sure the GDExtension API only expose functions
that consider the pointer provided as return value contains unitialized memory.

The trick is, as @vnen said, we have to make sure this C API stays isolated and doesn't leak in the rest of the codebase.

This is obvious enough for most of the GDExtension API given it return value type is a void * or GDNativeTypePtr where Godot internal works with Variant */Vector2 */String * etc.
The tricky part is for the functions pointers returned (GDNativePtrOperatorEvaluator, GDNativePtrBuiltInMethod and GDNativePtrUtilityFunction) that are shared between C and C++.
So we would need two flavours of ptrcall (ptrcall and cptrcall ?), the C one would do default initialization on the return value then call the C++ one (except for NativeExtensionMethodBind that would implement the C++ version by calling into the C version that is provided by the native extension)

The drawback is the increased complexity of the code base and the size of the final binary to have this second flavor of ptrcall :/

4 - Modify GDExtension API to never pass return value by parameter

The issue here comes from the fact we pass the return value as parameter. If we were returning the return value we wouldn't have this issue in the first place !
Of course this means modifying a lot of code in the Godot codebase, and I'm not sure this is even possible (I guess the reason for passing the return value as parameter is it allow to have opaque return value of arbitrary size).
Anyway this "solution" is me being curious of an explanation ;-)

6 - Conclusion

It's 2am, I've spent ~8h on this topic and I end up concluding GDExtension has already implemented the "least worst" solution and there is nothing to do here, talking of wasted time 😭

More seriously I hope this research work will help when writing GDExtension documentation to provide the right guidelines information. I guess we can start by adding a comment about that in gdnative_interface.h, what do you think ?

@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from 1407ab6 to 9b2c844 Compare September 18, 2022 18:01
@touilleMan
Copy link
Member Author

1 - More graph

After a good night of sleep, here is an exaustive view of all the places we use return value as pointer in GDExtension:

name part of GDNativeInterface / or function pointer ? return value type GDExtension specific destructor first called on return value ?
GDNativeVariantFromTypeConstructorFunc function ptr GDNativeVariantPtr yes ⚠️ yes
GDNativeTypeFromVariantConstructorFunc function ptr GDNativeTypePtr yes ⚠️ yes
GDNativePtrOperatorEvaluator function ptr GDNativeTypePtr no ⚠️ yes (by using PtrToArg::encode)
GDNativePtrBuiltInMethod function ptr GDNativeTypePtr no ⚠️ yes (builtins only implemented in C++ inside Godot)
GDNativePtrGetter function ptr GDNativeTypePtr no ⚠️ yes (by using PtrToArg::encode)
GDNativePtrIndexedGetter function ptr GDNativeTypePtr no ⚠️ yes (by using PtrToArg::encode)
GDNativePtrKeyedGetter function ptr GDNativeTypePtr no ⚠️ yes (by using PtrToArg::encode)
GDNativePtrUtilityFunction function ptr GDNativeTypePtr no yes (all utility functions are defined in C++ inside Godot)
GDNativeExtensionClassGet function ptr GDNativeVariantPtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionClassPropertyGetRevert function ptr GDNativeVariantPtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionClassCallVirtual function ptr GDNativeTypePtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionClassMethodCall function ptr GDNativeVariantPtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionClassMethodPtrCall function ptr GDNativeTypePtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionScriptInstanceGet function ptr GDNativeVariantPtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionScriptInstancePropertyGetRevert function ptr GDNativeVariantPtr yes ⚠️ implemented by native extension dev, so maybe
GDNativeExtensionScriptInstanceCall function ptr GDNativeVariantPtr yes ⚠️ implemented by native extension dev, so maybe
variant_new_copy gdnative interface GDNativeVariantPtr yes no
variant_new_nil gdnative interface GDNativeVariantPtr yes no
variant_call gdnative interface GDNativeVariantPtr yes no
variant_call_static gdnative interface GDNativeVariantPtr yes no
variant_evaluate gdnative interface GDNativeVariantPtr yes ⚠️ yes
variant_get gdnative interface GDNativeVariantPtr yes no
variant_get_named gdnative interface GDNativeVariantPtr yes no
variant_get_keyed gdnative interface GDNativeVariantPtr yes no
variant_get_indexed gdnative interface GDNativeVariantPtr yes no
variant_iter_init gdnative interface GDNativeVariantPtr yes ⚠️ yes
variant_iter_next gdnative interface GDNativeVariantPtr yes used as both input&output param
variant_iter_get gdnative interface GDNativeVariantPtr yes no
variant_duplicate gdnative interface GDNativeVariantPtr yes no
variant_stringify gdnative interface GDNativeStringPtr yes no
variant_get_type_name gdnative interface GDNativeStringPtr yes no
variant_get_constant_value gdnative interface GDNativeVariantPtr yes no
string_new_with_latin1_chars gdnative interface GDNativeStringPtr yes no
string_new_with_utf8_chars gdnative interface GDNativeStringPtr yes no
string_new_with_utf16_chars gdnative interface GDNativeStringPtr yes no
string_new_with_utf32_chars gdnative interface GDNativeStringPtr yes no
string_new_with_wide_chars gdnative interface GDNativeStringPtr yes no
string_new_with_latin1_chars_and_len gdnative interface GDNativeStringPtr yes no
string_new_with_utf8_chars_and_len gdnative interface GDNativeStringPtr yes no
string_new_with_utf16_chars_and_len gdnative interface GDNativeStringPtr yes no
string_new_with_utf32_chars_and_len gdnative interface GDNativeStringPtr yes no
string_new_with_wide_chars_and_len gdnative interface GDNativeStringPtr yes no
object_method_bind_call gdnative interface GDNativeVariantPtr yes no
object_method_bind_ptrcall gdnative interface GDNativeTypePtr ⚠️ no ⚠️ yes for Godot methods, maybe for native extension methods
get_library_path gdnative interface GDNativeStringPtr yes ⚠️ yes
GDNativePtrConstructor function ptr GDNativeTypePtr ⚠️ no no

Column "part of GDNativeInterface / or function pointer ?" indicate if this component is accessed from the structure that Godot passed to an extension
(e.g. godot_api->variant_new_nil(&my_var)) vs the function pointer that are returend by this api and must be called later on (e.g. GDNativePtrConstructor *c = godot_api->variant_get_ptr_constructor(GDNATIVE_VARIANT_TYPE_RID, 0); c(&my_rid)).

The "destructor first called on return value" column should be understood as "can this segfault if I pass an uninitialized return value ?".
From a different point of view, this can also be understood as "if I pass a return value that have done heap allocation, will I get a memory leak ?".

The "GDExtenion specific" column indicate the functions that are only implemented for GDExtension (and hence where we are free to change it behavior without impacting the rest of the Godot internals).

A couple of remarks:

  1. All functions that are not GDExtension specific expect to work with initialized return value (well except GDNativePtrConstructor given it is the constructor). This cannot be fixed without duplicating the API between C++ and C style.
  2. All methods in GDNativeInterface except object_method_bind_ptrcall are specific to GDExtension. Hence it would be good to make them all behave consistently (typically having them all expecting uninitialized return value).
  3. object_method_bind_ptrcall is a special case, it expect to work with initialized return
  4. All the native extension function pointer are implemented by 3rd party so there is no guarantee whatsoever here.

The updated conclusion

Provided my study is right, it is currently very hard to tell when return value should be initialized and when it should not.
In practice I guess it's not that bad as long as user follows what C++ does by always initializing it return value (and so never reuse an existing value).

However I think we can make this API much more consistent by doing some small changes:

  • Correct variant_evaluate/variant_iter_init/variant_iter_get/get_library_path to work with unitialize return value. This would make GDNativeInterface much more coherent given it would make all function except object_method_bind_ptrcall work the same way.
  • Add a coment to explain than extension_api.json's has_destructor determines if the type must be initialized to prevent segfault, and on top of that than zero-initialization should not be considered safe and only the actual constructor can be used when has_destructor: true.
  • Introduce GDNativeUninitializedVariantPtr/GDNativeUninitializedTypePtr types. This way GDNativeVariantPtr/GDNativeTypePtr would represent a Variant or builtin type that is guaranteed to be initialized by the caller where GDNativeUninitializedVariantPtr/GDNativeUninitializedTypePtr would represent an uninitialized buffer of the size of a Variant or builtin type.

This way we would have a much more explicit and consistent API ;-)

The update PR \o/

I've updated this PR to implemented the improvement I was talking about.
@godotengine/gdextension What do you think about that ?

@touilleMan touilleMan changed the title ptrcall no longer call destructor on return value buffer before using it Improve GDExtension user-friendliness by being more explicit on when return value should be passed initialized Sep 18, 2022
@touilleMan
Copy link
Member Author

rebased

@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from 606b463 to 984ead4 Compare October 13, 2022 09:07
@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from 984ead4 to 5a42665 Compare December 4, 2022 12:56
@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from 5a42665 to f88ccc5 Compare December 15, 2022 11:41
@YuriSizov YuriSizov modified the milestones: 4.0, 4.1 Feb 16, 2023
@touilleMan
Copy link
Member Author

@YuriSizov I've seen you've marked changed the milestone for this PR to 4.1

I'm okay with that (I've already talked with @akien-mga during FOSDEM 🇧🇪 🍻 about the fact GDExtension in Godot 4.0 will be considered beta and so breaking changes are allowed)

However I'd just want to point out the actual breaking changes this PR contains: the GDExtension code written before this PR should still work but will leak a small amount of memory when calling certain GDExtension API function (e.g. gdextension_string_new_with_utf8_chars, or when doing variant to/from type conversion)

This should be easy to fix for GDExtension code writer (especially given documenting which return value passed as parameter should be pre-initialized or not is precisily the point of this PR 😃 )

@YuriSizov
Copy link
Contributor

Even if it contains breaking changes, at this point we don't include any breaking changes in 4.0. GDExt is expected to be expanded upon in 4.1, so that's okay.

@dsnopek
Copy link
Contributor

dsnopek commented Apr 24, 2023

Discussed at the GDExtension meeting and we think it makes sense.

Just a rebase needed and fixes for the CI errors

Also, @Bromeon will check with the Rust binding and see if this causes any issues

@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch 2 times, most recently from b7dc1e2 to 2796b93 Compare May 13, 2023 17:32
@touilleMan
Copy link
Member Author

@dsnopek @Bromeon I've updated this PR, it's should be ready for merging \o/

@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from 2796b93 to 426f8df Compare May 13, 2023 17:54
Copy link
Contributor

@dsnopek dsnopek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, and approved! (As that's what we agreed at the GDExtension meeting on 2023-04-24)

While I was looking at it again, I noticed that the wording on the comment explaining the new typedefs could be improved, so if you have time to update that, it'd be great. But we could always fix it up afterwards too, I don't necessarily want that to hold up committing this just because of the comment.

@Bromeon It would still be great if could test this with the Rust bindings, if you have time!

And, if I've understood correctly, this will lead to a small number of situations where godot-cpp could leak memory until its updated? Is there there already a follow-up PR to address that?

core/extension/gdextension_interface.h Outdated Show resolved Hide resolved
core/extension/gdextension_interface.h Outdated Show resolved Hide resolved
core/extension/gdextension_interface.h Outdated Show resolved Hide resolved
@Bromeon
Copy link
Contributor

Bromeon commented May 15, 2023

I ran our CI against this commit, however there seem to be changes in gdextension_interface.h that are breaking from a Rust perspective:

  typedef void *GDExtensionVariantPtr;
  typedef const void *GDExtensionConstVariantPtr;
+ typedef void *GDExtensionUninitializedVariantPtr;
  typedef void *GDExtensionStringNamePtr;
  typedef const void *GDExtensionConstStringNamePtr;
+ typedef void *GDExtensionUninitializedStringNamePtr;
  typedef void *GDExtensionStringPtr;
  typedef const void *GDExtensionConstStringPtr;
+ typedef void *GDExtensionUninitializedStringPtr;
  typedef void *GDExtensionObjectPtr;
  typedef const void *GDExtensionConstObjectPtr;
+ typedef void *GDExtensionUninitializedObjectPtr;
  typedef void *GDExtensionTypePtr;
  typedef const void *GDExtensionConstTypePtr;
+ typedef void *GDExtensionUninitializedTypePtr;
  typedef const void *GDExtensionMethodBindPtr;
  typedef int64_t GDExtensionInt;
  typedef uint8_t GDExtensionBool;

The reason is that we don't treat those typedefs as void* but rather opaque pointers, so they have all different types. Since lots of signatures changed, I will need to update our counterparts for this.

@touilleMan
Copy link
Member Author

touilleMan commented May 15, 2023

@dsnopek I've updated my PR with your comments suggestions 😃

@Bromeon

The reason is that we don't treat those typedefs as void* but rather opaque pointers, so they have all different types. Since lots of signatures changed, I will need to update our counterparts for this.

I'd say this is a really good thing to avoid mixing errors ! As a matter of fact I wonder if there is a way to do the same thing in C (this is outside of the scope of this PR of course, but would be an interesting point to bring at next GDExtension meeting 😄 )

Maybe by declaring dummy struct types then only using pointer on them:

// Not the actual types ! You should never build these types directly !
struct __GDExtensionVariantPlaceholder {}
struct __GDExtensionStringPlaceholder {}
...

typedef __GDExtensionVariantPlaceholder*GDExtensionVariantPtr;
typedef __GDExtensionStringPlaceholder*GDExtensionStringPtr;
...

What do you think ?

… in GDExtension API

This commit introduce separate types (e.g. GDNativeStringPtr vs GDNativeUninitializedStringPtr)
depending on if the pointed data is already initialized (C++ style where constructor is alway
called when create a variable even if it is to be passed as return value) or not (C style).
On top of that, small changes has been made to `GDNativeInterface` so that it methods are
consistent on using uninitialized return value.
@touilleMan touilleMan force-pushed the ptrcall-ret-value-no-initialize branch from 426f8df to e785dd9 Compare May 15, 2023 15:15
@akien-mga akien-mga merged commit 313f613 into godotengine:master May 15, 2023
@akien-mga
Copy link
Member

Thanks!

@saki7
Copy link
Contributor

saki7 commented May 16, 2023

To be honest, I was surprised that this PR was merged too quickly.

You can see in the comments above that the committer of Rust binding was showing direct concern about the changes. The author of this PR was asking about how the changes could be handled on Rust-side. The contributor of GDExtension was also kindly asking about the same issue on the contributors chat (link).

I personally don't write Rust, I am just a godot-cpp user, but I believe that the Rust binding is an important part in the GDExtension ecosystem. I don't think that the PR should have been merged before receiving a reply from the participant.

@Bromeon
Copy link
Contributor

Bromeon commented May 16, 2023

Yes, would have been nice to test it -- this has been open for 3 years, I'm sure a couple of days wouldn't have mattered 😉

That said, it's less of direct concern I had, but more that it would have required some work until I can test it. If we face problems as a result of this PR, I will definitely report them here, and I'm optimistic that Godot devs are willing to address them 🙂

@akien-mga
Copy link
Member

akien-mga commented May 16, 2023

I interpreted @Bromeon's comment as highlighting that some work will be needed on the Rust bindings side, but not that anything is wrong per se with this PR. This is the development branch for 4.1, so if a PR merged upstream means some work will be needed downstream, it's not a blocker in general - unless there's reasonable concern that the downstream changes will not be possible.

The further reply from @touilleMan outlined that the way Rust handles this would be a good improvement to make for C too if possible, in a follow up PR. So I took this as a hint that this PR is fine to merge for now, Rust bindings will need to do some extra changes, and further work may be done to unify things further.

If I misinterpreted and merged too fast, I'm sorry. But there's also #76406 which was building upon this and waiting for a merge, and I keep hearing that GDExtension PRs don't get merged fast enough, so...

@saki7
Copy link
Contributor

saki7 commented May 16, 2023

@akien-mga Thanks for spending your time investigating this issue. I truly appreciate that the PR was merged for now, and I have no further argument if @Bromeon has no concern about the future work required on Rust side.

If I misinterpreted and merged too fast, I'm sorry. But there's also #76406 which was building upon this and waiting for a merge, and I keep hearing that GDExtension PRs don't get merged fast enough, so...

The issue regarding the merging process of GDExtension is a broader matter. It's not limited to this specific PR. I can't pretend not to be surprised if some PRs are going to be merged suddenly whereas other PRs which had received enough attentions are kept unhandled. Many PRs related to GDExtension needs more attension, and I believe the real problem is the lack of the maintainers.

Recently on #gdextension channel in Godot Contributors Chat, I have proposed to promote some contributor to maintainer. Doing so will fundamentally improve the situation, and the change will motivate people (like me) who are trying to submit new patches. I've got positive responses from several people (including the people from GDExtension team), but the actual movement has not happened yet.

@aaronfranke
Copy link
Member

aaronfranke commented May 16, 2023

@saki7 There is no need to do such a promotion. Anyone is able to, and welcome to, contribute to the engine. Anyone can review PRs. Regardless of whether or not the review is from a trusted member, additional effort towards PR reviewing is always appreciated. If someone continuously puts in effort, their efforts will be noticed.

If you want things to move forward, convince the people on the GDExtension team to leave PR reviews on GitHub, not just positive responses on RocketChat.

@saki7
Copy link
Contributor

saki7 commented May 16, 2023

@aaronfranke I know that's how OSS works. I'm afraid there's some misunderstanding, and the actual situation is worse than you think. Let me clarify the recent situation of GDExtension team:

  • Many people, including both contributors and non-contributors are submitting PRs to GDExtension.
  • However, there is currently no person in charge of the entire codebase.
    • You can confirm this by looking at the fact that no collaborator has the "star" mark on the member list.
    • Two people are known as the top-level contributors: mux213 and vnen. When I asked question on the Contributors Chat, mux213 answered (link):

      mux213: The problem is that we currently don’t have any such person for GDextension. vnen and myself come close i guess but neither of us has the bandwidth to become actively involved to properly review the work and so it stalls.

    • mux213 voted dsnopek to take the role.
  • Since few months ago, @dsnopek had been organizing GDExtension team meetings multiple times.
    • The attendee consists of top-level contributors in the GDExtension domain.
    • But even when the PR was confirmed on the meeting, it was usual that nobody actually presses the "Merge" button.
    • Even the PRs submitted by GDExtension team member have been stalling for days and weeks.

This is why I'm repeatedly asking for promotion. GDExtension team is also aware of the situation, and people are trying to advance.

You could check the log for #gdextension channel on the Godot Contributors Chat. The relevant conversation mostly happened after May 11, 2023.

This is getting off-topic, so please don't reply directly on this PR. If you need to ask me something, I'm also available on the Contributors Chat.

bors bot added a commit to godot-rust/gdext that referenced this pull request May 24, 2023
284: Compatibility with 4.1 GDExtension API r=Bromeon a=Bromeon

This pull request migrates gdext to the **major GDExtension API change**, as introduced in godotengine/godot#76406.
It also adopts to the new "uninitialized pointer" types, which were added in godotengine/godot#35813.

Doing so renders the CI green again, unblocking other PRs that have been broken by the upstream changes.

---

# Major GDExtension API update

TLDR of Godot's changes: instead of one giant `GDExtensionInterface` struct with all the function pointers as data members, the C API now offers only a single function pointer `get_proc_address`. This one is passed to the entry point, in place of a pointer to the interface. With `get_proc_address`, it is possible to retrieve concrete API functions. In C++, this is done by looking up via name and casting the result to the correct function pointer type:
```cpp
auto variant_call = (GDExtensionInterfaceVariantCall) get_proc_address("variant_call");
variant_call(...);
```

On the Rust side, I incorporated these changes by only modifying the FFI layer (`godot-ffi` crate), so the rest of the project is mostly unaffected. This is done by generating a `GDExtensionInterface` struct very similarly to how it existed before, in `godot-codegen`. As input, we parse the `gdextension_interface.h` header's documentation metadata. This works well so far, but we'll need to make sure with Godot devs that the doc metadata doesn't suddenly change format.

---

# Uninitialized pointer types

Certain FFI functions construct objects "into an uninitialized pointer" (placement new in C++). There used to be quite some confusion regarding _which_ functions do that. As a result, the C API introduced new types such as `GDExtensionUninitializedStringPtr`, which represent the uninitialized counterpart to `GDExtensionStringPtr`.

These typedefs are declared as `void*` in the C API, but we turn them into strongly typed opaque pointers by post-processing the C header. As such, it is impossible to accidentally mix initialized and uninitialized pointer types. I also added a trait `AsUninit` which allows explicit conversion between the two. This may not be necessary in the future, but is helpful until we are sure about correct usage everywhere. I marked some places that may need another look as `// TODO(uninit)`.


---

# Compatibility between Godot 4.0.x and 4.1+

Due to the API changes in GDExtension, extensions compiled under Godot 4.0.x are by default **not compatible** with Godot 4.1+ (which includes current `master` versions of Godot, so also our nightly CI builds).

 From now on, the `.gdextension` interface must contain a new property `compatibility_minimum`:
```toml
[configuration]
entry_symbol = "gdext_rust_init"
compatibility_minimum = 4.0

[libraries]
linux.debug.x86_64 = "res://../../../target/debug/libdodge_the_creeps.so"
...
```
This value must be set to `4.0` if you compile against the old GDExtension API (current gdext uses 4.0.3 by default).
Set it to `4.1` if you use a Godot development version (Cargo feature `custom-godot`).
If you do this wrong, gdext will abort initialization and print a descriptive error message.

## Compat bridge

Since breaking changes can be very disruptive, I built a bridging layer that allows to use existing compiled extensions under Godot v4.1 (or current development versions). Because the Rust code relies on certain features only available in the newer C header (e.g. uninitialized pointers), such changes were backported via dynamic post-processing of `gdextension_interface.h`.

Therefore, you don't need to mess around with `custom-godot`, just keep using gdext as-is and set `compatibility_minimum = 4.0` to run it under newer engine versions. Developing against a specific GDExtension API is still possible via patch, the prebuilt artifacts have been updated for all 4.0.x versions. For example, to use version `4.0`:
```toml
[patch."https://github.com/godot-rust/godot4-prebuilt"]
godot4-prebuilt = { git = "https://github.com//godot-rust/godot4-prebuilt", branch = "4.0"}
```

Once Godot 4.1.0 is released, we will likely make that the default version used by gdext, but I plan to maintain the compat layer as long as this is possible with reasonable effort.

A new CI setup verifies that the integration tests run against 4.0, 4.0.1, 4.0.2, 4.0.3 and nightly Godot versions, at least for Linux.
This will become a challenge as soon as there are breaking API changes (and there is already one upcoming in `Basis::looking_at()`).

---

# Other changes

There is now an experimental `godot::sys::GdextBuild` struct, which provides informations about static (compiled-against) and runtime (loaded-by) Godot versions. This may be extended by other build/runtime metadata and will likely be promoted to an official API in the `godot::init` module.

Several memory leaks that came up during the change to uninitialized pointers were addressed. In many places, we now use `from_sys_init()` instead of `from_sys_init_default()`, avoiding the need to construct throwaway default instances.

In addition to more CI jobs for integration tests, there are now also 2 more linux-memcheck ones, to cover both 4.0.3 and nightly. This is mostly to have extra confidence during the migration phase and may be removed later.

Godot 4.0.3 is now the default version used by gdext.

Co-authored-by: Jan Haller <bromeon@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[GDnative] ptrcall returning godot_string need to initialized