It would be nice if a "LargeMap" type existed alongside the "Map" type for parity. Other data types that require offset arrays/buffers, such as String, List, and BinaryArray, already have "large" versions, i.e. LargeString, LargeList, and LargeBinaryArray.
I was more thinking about the future when I created this Jira issue. I don't have a concrete need now, but I can picture a few scenarios in which the size limitation imposed by MapArray's 32-bit offsets cannot be worked around.
Scenario 1:
Suppose you have a ListArray of MapArrays. If one of the maps requires more than int32::max key-value pairs, there is currently no way to represent it. You could try using a ChunkedArray, but you would still need to split the large map across multiple rows in the list.
Scenario 2:
Even if the MapArray is at the top of the object hierarchy, the same problem could potentially arise if a row within the array needs to contain more than int32::max key-value pairs. You could try to use a ChunkedArray to resolve the issue, but the key-value pairs would still be split across multiple rows.
I've seen Parquet files with MAP columns, and I can imagine a situation in which someone has a very large MAP as the top-most data structure or within a nested one. While running into a situation in which they can't use MapArrays to represent their data is probably rare, it's not entirely impossible given int32's size restrictions.
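To make the limit concrete, here is a minimal sketch in plain Python. It does not use Arrow itself; it only models a list-style offsets buffer, which is how a Map is laid out (a list of key/value structs). The helper names `cumulative_offsets` and `offsets_fit` are hypothetical and exist only for this illustration.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def cumulative_offsets(map_sizes):
    """Return list-style offsets: offsets[i + 1] - offsets[i] is the size of map i."""
    offsets = [0]
    for size in map_sizes:
        offsets.append(offsets[-1] + size)
    return offsets


def offsets_fit(map_sizes, fmt):
    """Check whether the offsets can be packed with the given struct code.

    fmt is "i" for signed 32-bit offsets or "q" for signed 64-bit offsets.
    (Hypothetical helper; this models the buffer, it is not Arrow's code.)
    """
    offsets = cumulative_offsets(map_sizes)
    try:
        struct.pack(f"<{len(offsets)}{fmt}", *offsets)
    except struct.error:
        # struct refuses values outside the range of the requested integer type.
        return False
    return True


# A single map with exactly INT32_MAX entries still fits in 32-bit offsets.
print(offsets_fit([INT32_MAX], "i"))
# One more entry anywhere in the same array no longer fits.
print(offsets_fit([INT32_MAX, 1], "i"))
# The same sizes fit once the offsets are 64-bit, as a "large" variant would use.
print(offsets_fit([INT32_MAX, 1], "q"))
```

Because the offsets are cumulative, the bound applies to the total number of key-value pairs in one array, not only to a single row. That is why splitting into chunks helps with the total but not with one oversized map.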
I'd honestly be interested in looking into this myself.
Reporter: Sarah Gilmore / @sgilmore10
Note: This issue was originally created as ARROW-15554. Please see the migration documentation for further details.