Closed
Description
I have a 30 GB Arrow file with 100 batches. The largest batch in the file causes get_batch to fail; all other batches load fine. In 14.1.1 reading that individual batch returns an error. In 15.1.1 the batch loads, but RStudio crashes when the batch is used.
14.1.1
> rbn <- data_rbfr$get_batch(x)
Error in ipc__RecordBatchFileReader_ReadRecordBatch(self, i) :
Invalid: negative malloc size
15.1.1
> rbn <- data_rbfr$get_batch(x)   # works
> df <- as.data.frame(rbn)        # crashes RStudio
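To narrow down which batch is affected, a loop along the following lines may help. This is only a sketch: find_failing_batches is a hypothetical helper, not part of the arrow package, and it assumes the reader exposes get_batch(i) with 0-based indices, as data_rbfr does above. It can only catch errors raised at the R level (such as the "negative malloc size" message), not a hard crash of the session.

```r
# Hypothetical helper (not part of arrow): try each batch in turn and
# collect the indices whose read raises an R-level error.
# Assumes reader$get_batch(i) takes a 0-based index.
find_failing_batches <- function(reader, n_batches) {
  failed <- integer(0)
  for (i in seq_len(n_batches) - 1L) {
    ok <- tryCatch({
      reader$get_batch(i)
      TRUE
    }, error = function(e) {
      # Report the error text, e.g. "Invalid: negative malloc size"
      message("batch ", i, ": ", conditionMessage(e))
      FALSE
    })
    if (!ok) failed <- c(failed, i)
  }
  failed
}
```

With the reader from this report, the call would be find_failing_batches(data_rbfr, 100), since the file has 100 batches.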
Update
I put the data from the failing batch into a separate file. The file size is over 2 GB.
Using 15.1.1, loading this entire file via read_arrow also fails:
ar <- arrow::read_arrow("e:\\temp\\file.arrow")
Error in Table__from_RecordBatchFileReader(batch_reader) :
Invalid: negative malloc size
Reporter: Anthony Abate / @abbotware
Original Issue Attachments:
- image-2019-11-13-16-27-30-641.png
- SingleBatch_String_70000_Rows.ok.rar
- SingleBatch_String_85000_Rows.crash.rar
Note: This issue was originally created as ARROW-7156. Please see the migration documentation for further details.