Which part is this question about
The question is about the return type of UnionArray::child, how it compares to StructArray::column, and why they are different.
Describe your question
StructArray::column returns &ArrayRef, a reference to an Arc<dyn Array> owned by the StructArray object. However, its spiritual equivalent in UnionArray, the method UnionArray::child, returns an owned Arc<dyn Array>.
The implementations of both are relatively similar: StructArray::column and UnionArray::child essentially both look up an index in an array. The only difference that I see is that StructArray::column returns the reference to what it finds, whereas UnionArray::child calls clone on that reference to return an owned object.
It makes sense to me to always return a reference. Returning a reference gives the user more flexibility: they can call clone on the reference to get an owned Arc, or they can keep the reference to maintain a connection between the returned column and the struct/union from which it came. Returning an owned Arc, by contrast, always touches the reference counter and pays its overhead, whether or not that is necessary.
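To make the comparison concrete, here is a minimal sketch of the two accessor shapes. It uses stand-in types written for this illustration (`StructLike`, `UnionLike`, `U32Column`, and a local `Array` trait are all hypothetical), not the actual arrow definitions, and only assumes the behaviour described above: one accessor borrows, the other clones the Arc.

```rust
use std::any::Any;
use std::sync::Arc;

// Stand-ins for illustration only; these are NOT the arrow crate's definitions.
trait Array {
    fn as_any(&self) -> &dyn Any;
}
type ArrayRef = Arc<dyn Array>;

struct U32Column(Vec<u32>);
impl Array for U32Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct StructLike {
    columns: Vec<ArrayRef>,
}
impl StructLike {
    // Same shape as StructArray::column: hand out a borrow, no refcount traffic.
    fn column(&self, pos: usize) -> &ArrayRef {
        &self.columns[pos]
    }
}

struct UnionLike {
    children: Vec<ArrayRef>,
}
impl UnionLike {
    // Same shape as UnionArray::child: clone the Arc on every call.
    fn child(&self, pos: usize) -> ArrayRef {
        self.children[pos].clone()
    }
}

// Returns the strong counts seen while holding a borrowed column
// and an owned child, respectively.
fn strong_counts() -> (usize, usize) {
    let s = StructLike {
        columns: vec![Arc::new(U32Column(vec![1, 2, 3])) as ArrayRef],
    };
    let u = UnionLike {
        children: vec![Arc::new(U32Column(vec![4, 5, 6])) as ArrayRef],
    };

    let borrowed = s.column(0); // count is unchanged
    let owned = u.child(0); // count is incremented by the clone
    (Arc::strong_count(borrowed), Arc::strong_count(&owned))
}
```

In this sketch the borrowed path leaves the strong count untouched, while the owned path bumps it on every call; a caller of `column` who does want ownership can still write `s.column(0).clone()`.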
I am curious why this discrepancy exists. I am guessing there is a specific technical reason that I don't understand, in which case I'd love to find out what it is. On the other hand, I am surreptitiously hoping this is an oversight and I might possibly agitate to harmonize the return types :)
I am writing a function that traverses a &'a RecordBatch following some user-defined path and returns a column it finds at the end. The column is meant to be downcast into a specific type (such as UInt32Array).
Since StructArray::column provides a reference with lifetime 'a, I can downcast it with .as_any().downcast_ref() and receive a reference that also has lifetime 'a. However, if the column is found inside a UnionArray, the method UnionArray::child produces an owned object. I can only downcast this object to a reference type, which then lives only as long as the Arc from which it came, that is, only until the end of the function.
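The lifetime problem can be reproduced without arrow at all. The sketch below again uses hypothetical stand-in types (`Container`, `U32Column`, `by_ref`, `by_value` are names made up for this example), with `by_ref` playing the role of StructArray::column and `by_value` the role of UnionArray::child.

```rust
use std::any::Any;
use std::sync::Arc;

// Stand-ins for illustration only; these are NOT the arrow crate's definitions.
trait Array {
    fn as_any(&self) -> &dyn Any;
}
type ArrayRef = Arc<dyn Array>;

struct U32Column(Vec<u32>);
impl Array for U32Column {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct Container {
    children: Vec<ArrayRef>,
}
impl Container {
    // Borrowing accessor, like StructArray::column.
    fn by_ref(&self, pos: usize) -> &ArrayRef {
        &self.children[pos]
    }
    // Owning accessor, like UnionArray::child.
    fn by_value(&self, pos: usize) -> ArrayRef {
        self.children[pos].clone()
    }
}

// Compiles: the downcast reference borrows from `c`, so it can carry 'a.
fn find_u32<'a>(c: &'a Container, pos: usize) -> Option<&'a U32Column> {
    c.by_ref(pos).as_any().downcast_ref::<U32Column>()
}

// Rejected by the borrow checker if uncommented: the returned reference
// would borrow from `child`, a local Arc dropped at the end of the function.
//
// fn find_u32_owned<'a>(c: &'a Container, pos: usize) -> Option<&'a U32Column> {
//     let child = c.by_value(pos);
//     child.as_any().downcast_ref::<U32Column>()
// }

// With the owning accessor, the downcast reference has to be fully
// consumed before the local Arc goes out of scope.
fn first_value(c: &Container, pos: usize) -> Option<u32> {
    let child = c.by_value(pos);
    let col = child.as_any().downcast_ref::<U32Column>()?;
    col.0.first().copied()
}
```

With the owning accessor, the only ways out are to finish all work on the downcast reference inside the function, as `first_value` does, or to return the Arc itself and make the caller repeat the downcast.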
I would support changing to always returning a reference and would happily approve a PR that did so. Historically a number of methods have returned owned references as a quirk of being based on equivalent methods in C++, where returning references is potentially error prone. In Rust I agree this is a code smell we should fix.