-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TIR][FFI] Allow string in PrimFunc return statement #17103
base: main
Are you sure you want to change the base?
Conversation
include/tvm/runtime/packed_func.h
Outdated
std::swap(value_, other.value_); | ||
std::swap(type_code_, other.type_code_); | ||
std::swap(f_deleter_, other.f_deleter_); | ||
std::swap(f_deleter_arg_, other.f_deleter_arg_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are doing too much special casing for this case. Instead, a simpler approach that might work is to leverage the Object system, which might align with our future directions of better FFI surrounding the object system
We need to be careful with FFI updates. As of now, TIR function mostly relies on destination passing, so not involving too much heap allocations. For the return value in generally, I think we would like to move toward a world where we rely on Object system for most cases, including the case of string return |
Using the Object system requires a C++ compiler, and a C++ runtime. The FFI supports targets that only have a C compiler and a C runtime. The initial implementation in #16836 used the Object system, and failure on C-only backends was caught by CI. See this comment for details. This commit does not introduce any heap allocations that were not already present. Instead, it does the opposite, adding the ability to express a return value that doesn't require a heap allocation. |
Object system is not necessarily tied to c++ from the ABI level (you can view the deleter etc in the layout), the main complexity here is that the return value container is adding more complexity that already is part of object system Seems returning a string is more advanced feature that we still would want to be careful with, so maybe we don't have to strictly enable it in pure C backend. |
The Object base class uses
Operations that are difficult to write for a C backend will also be difficult to write in the LLVM codegen. The main difficulty is due to the current use of a C++ type
I suppose we could strip out |
I will attempt an additional implementation, in which |
The |
Prior to this commit, the `tvm::relax::PatternContext`, `tvm::relay::AnnotatedRegionSet`, and `tvm::relay::CallGraph` classes explicitly defined constructors from `ObjectPtr<Object>`, and `operator->` implementations. However, they did not define `ContainerType`. As a result, any use of `T::ContainerType` (e.g. in `Downcast<PatternContext>(obj)` or `obj.as<PatternContext>()`) would incorrectly use the inherited `ObjectRef::ContainerType`. Since these downcast methods are erroneously comparing against `ObjectRef::ContainerType`, the type checks prior to downcasting were effectively disabled for these classes. This commit updates the `tvm::relax::PatternContext`, `tvm::relay::AnnotatedRegionSet`, and `tvm::relay::CallGraph` classes to use the `TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS` and `TVM_DEFINE_MUTABLE_NOTNULLABLE_OBJECT_REF_METHODS` macros, which provide all required members for the TVM `ObjectRef` interface.
Prior to this PR, a TIR PrimFunc could return an `int64`, `float64`, or void. Returning any other type, even if supported by TVM's FFI, would raise an exception during `MakePackedAPI`. With this PR, string literals can be returned from TIR PrimFuncs. Enabling this TIR functionality requires a change to the TVM FFI. Previously, a `TVMRetValue` holding a string would allocate a `std::string*` and store it to `value_.v_handle`. This return value depends on the C++ runtime, and cannot be constructed on all backends that only require the C API. With this commit, this is instead represented as a `const char*` in `value_.v_str`, matching the representation in `TVMArgValue`. Unlike `TVMArgValue`, since p`TVMRetValue` may own its return value, two new member variables `void (*f_deleter_)(void*)` and `void* f_deleter_arg_`. These are designed to be easy to call from a C API, and are set to a deleter function that will free the allocation (if required), and an argument to be passed into the deleter function. With this new representation, the return value for different use cases would be set as follows: * Return string from C++ PackedFunc. The backing allocation is made using `new std::string(std::move(value))`. This heap allocation is stored in `f_deleter_arg_`, so that `f_deleter_` can then call `delete static_cast<std::string*>(arg)`. Finally, `new_str->data()` is stored as `value.v_str`. * Return pointer to static string from C. A pointer to the static string is stored in `v_str`. Both `f_deleter_` and `f_deleter_arg_` are left with their default value of `nullptr`, indicating that the `TVMRetValue` does not need to free any memory on destruction. This is the return value used by `T.ret(T.StringImm("my_string"))` in a TIR PrimFunc. * Return pointer to heap allocation made in C. Assuming the allocation is made using `malloc()`, a pointer to the heap allocation is stored in both `value_.v_str` and `f_deleter_arg_`. The `f_deleter_` is set to `free`, indicating that `free(f_deleter_arg_)` should be called when the `TVMRetValue` is destructed. This functionality is possibly within the updated `TVMRetValue` API, but is not currently used. This commit adds unit tests for the return of strings from compiled TIR PrimFuncs, when either running the function locally, or over a RPC connection.
Prior to this commit, a C++ `tvm::runtime::String` returned by a PackedFunc would be converted to a `tvm.runtime.String` by the Python FFI. There are two main issues with this handling. 1. Unnecessary duplication of memory. The python-native `str` constructed with `str.__new__` owns its underlying data. The argument passed to `runtime.GetFFIString` also owns its underlying data. Because the `GetFFIString` argument is saved to `val.__tvm_object__`, the python object returned from `String.__from_tvm_object__` keeps both copies alive. 2. Potential dangling references. The C++ `tvm::runtime::String` may have a deleter that is implemented in a dynamically-loaded LLVM or `.so` module. Removing the unnecessary copy and retaining the python-native `str` avoids this issue. This commit updates the Python implementation of `tvm.runtime.String.__from_tvm_object__` to return a python-native `str`.
7978778
to
6c363e9
Compare
And re-implemented so that the codegen produces a |
e2156c4
to
ad47051
Compare
Prior to this PR, a TIR PrimFunc could return an
int64
,float64
, or void. Returning any other type, even if supported by TVM's FFI, would raise an exception duringMakePackedAPI
. With this PR, string literals can be returned from TIR PrimFuncs.Enabling this TIR functionality requires a change to the TVM FFI. Previously, a
TVMRetValue
holding a string would allocate astd::string*
and store it tovalue_.v_handle
. This return value depends on the C++ runtime, and cannot be constructed on all backends that only require the C API. With this commit, this is instead represented as aconst char*
invalue_.v_str
, matching the representation inTVMArgValue
. UnlikeTVMArgValue
, since pTVMRetValue
may own its return value, two new member variablesvoid (*f_deleter_)(void*)
andvoid* f_deleter_arg_
. These are designed to be easy to call from a C API, and are set to a deleter function that will free the allocation (if required), and an argument to be passed into the deleter function.With this new representation, the return value for different use cases would be set as follows:
Return string from C++ PackedFunc. The backing allocation is made using
new std::string(std::move(value))
. This heap allocation is stored inf_deleter_arg_
, so thatf_deleter_
can then calldelete static_cast<std::string*>(arg)
. Finally,new_str->data()
is stored asvalue.v_str
.Return pointer to static string from C. A pointer to the static string is stored in
v_str
. Bothf_deleter_
andf_deleter_arg_
are left with their default value ofnullptr
, indicating that theTVMRetValue
does not need to free any memory on destruction.This is the return value used by
T.ret(T.StringImm("my_string"))
in a TIR PrimFunc.Return pointer to heap allocation made in C. Assuming the allocation is made using
malloc()
, a pointer to the heap allocation is stored in bothvalue_.v_str
andf_deleter_arg_
. Thef_deleter_
is set tofree
, indicating thatfree(f_deleter_arg_)
should be called when theTVMRetValue
is destructed.This functionality is possibly within the updated
TVMRetValue
API, but is not currently used.This commit adds unit tests for the return of strings from compiled TIR PrimFuncs, when either running the function locally, or over a RPC connection.