Check duplicate issues.
Description
RDataFrame needs the types of its input columns in order to process them while executing the computation graph. Most of the RDF API obtains this type information at compile time, which may mean either true ahead-of-time compilation or just-in-time (JIT) compilation via cling. Since ROOT 6.38, the Snapshot method is the first part of the API that can work without any compile-time information about column types, relying only on the std::type_info retrieved from the type name of the column. This generally works, but it assumes that the RTTI of the input column(s) can be retrieved. The logic for this retrieval is in root/tree/dataframe/src/RDFUtils.cxx (line 86 in 9fdfd5c):
    const std::type_info &TypeName2TypeID(const std::string &name)
This left out a particular scenario that used to work: a user-generated, JIT-compiled class used as the type of an input column to the RDF API. In most cases, the interpreter on the user side and the interpreter on the RDF side cooperate implicitly: when an RDF API call such as Define is reached, the type of the column is retrieved by name and the RDF API call itself is JITted, so the RTTI is always accessed through cling. Snapshot lost this implicit interaction. This is visible, for example, in the awkward array integration:
scikit-hep/awkward#3885
Note the error:
No runtime type information is available for column "x" with type name "awkward::ListArray_tzrPHZhHbA"
where awkward::ListArray_tzrPHZhHbA is a C++ type generated and JIT compiled before starting the RDF computation graph.
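To make the failure mode concrete, here is a minimal standalone sketch of a name-to-RTTI lookup. It is not ROOT's implementation of TypeName2TypeID: the TypeRegistry class, its Register and Lookup methods, and the awkward_like::ListArray_generated type are all hypothetical names introduced for illustration, and the error text merely mirrors the message quoted above. The sketch only shows why a lookup keyed on type names fails for a type that nobody told the registry about, and that it succeeds once that type's RTTI is made available.

```cpp
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <vector>

// Hypothetical registry (NOT ROOT code): maps type-name strings to RTTI.
class TypeRegistry {
public:
   TypeRegistry()
   {
      // Types known ahead of time, standing in for whatever a name-based
      // lookup can resolve without help from the interpreter.
      Register<int>("int");
      Register<double>("double");
      Register<std::vector<float>>("std::vector<float>");
   }

   // Makes the RTTI of T retrievable under the given name.
   template <typename T>
   void Register(const std::string &typeName)
   {
      fKnown.emplace(typeName, std::type_index(typeid(T)));
   }

   // Returns the RTTI for typeName, or throws if none was registered.
   std::type_index Lookup(const std::string &colName, const std::string &typeName) const
   {
      auto it = fKnown.find(typeName);
      if (it == fKnown.end())
         throw std::runtime_error("No runtime type information is available for column \"" + colName +
                                  "\" with type name \"" + typeName + "\"");
      return it->second;
   }

private:
   std::unordered_map<std::string, std::type_index> fKnown;
};

// Hypothetical stand-in for a class that is generated and JIT compiled
// on the user side before the computation graph starts.
namespace awkward_like {
struct ListArray_generated {};
} // namespace awkward_like
```

In this sketch, Lookup("x", "awkward_like::ListArray_generated") throws until Register<awkward_like::ListArray_generated>(...) has been called. That is loosely analogous to the missing step described above: the type exists on the user side, but the name-based lookup has no way to reach its RTTI.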
Reproducer
scikit-hep/awkward#3885
ROOT version
6.38 and above
Installation method
Any
Operating system
Any
Additional context
No response