Hi, I am parsing a parquet file whose schema was generated by parquet-tools, using a Go struct to hold its content. When I try to parse a parquet file with 4905 rows, the following error is thrown:
panic: runtime error: slice bounds out of range [:4905] with capacity 3072
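For context, this panic comes from Go's runtime, not from parquet itself: some internal buffer was allocated with capacity for fewer values than the row group contains, and reslicing past a slice's capacity panics. A minimal sketch that reproduces the same runtime error (the numbers 3072 and 4905 are taken from the message above; this is not the library's actual code):

```go
package main

import "fmt"

func main() {
	defer func() {
		// Recover so we can inspect the panic message instead of crashing.
		fmt.Println("recovered:", recover())
	}()
	buf := make([]byte, 0, 3072) // a buffer sized for fewer values
	_ = buf[:4905]               // reslicing beyond capacity panics at runtime
}
```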
But when I run the same code on a parquet file that has only 5 rows, there is no error (these two parquet files are generated by the same script, so they share the same schema).
So is there a limit on the size of a parquet file?
Besides, when I omit the AvroName field, the first parquet file can also be read successfully (but AvroName is a field of file names just like FileName, so I don't see any difference between them).
Moreover, I have tested several parquet files with different numbers of rows, and they all fail with the same slice bounds out of range error, so I don't think this error is caused by an occasional mistake during generation of a parquet file.
Now I am really confused and would appreciate your help fixing this bug. Thank you in advance!
The schema and the Go struct don't match: OPTIONAL fields should be defined as pointers so they can be nil.
If it still does not work after changing the definition of type Schema, a sample parquet file (ideally with a snippet of your source code) would help troubleshoot.
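As a sketch of the suggested fix (the field names FileName and AvroName come from the issue; the struct tags are omitted because their syntax varies by parquet-go version): an OPTIONAL column maps to a pointer field, so a null cell decodes as nil rather than forcing a value into a buffer sized for fewer entries.

```go
package main

import "fmt"

// Schema is a hypothetical reconstruction of the struct from the issue.
// A REQUIRED column maps to a plain value; an OPTIONAL column maps to a
// pointer, whose zero value nil represents a missing cell.
type Schema struct {
	FileName string  // REQUIRED column
	AvroName *string // OPTIONAL column: may be nil
}

func main() {
	name := "part-0001.avro"
	rows := []Schema{
		{FileName: "a.parquet", AvroName: &name},
		{FileName: "b.parquet", AvroName: nil}, // null cell in the optional column
	}
	for _, r := range rows {
		if r.AvroName != nil {
			fmt.Println(r.FileName, *r.AvroName)
		} else {
			fmt.Println(r.FileName, "<null>")
		}
	}
}
```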
zolstein pushed a commit to zolstein/parquet-go that referenced this issue on Jun 23, 2023:
When a Read is performed after SeekToRow on mergedRowGroups, the rowIndex is
checked against the seek index and advanced until the two match. Previously,
the rowIndex was not advanced in the normal read path, so advancing it on a
later seek mistakenly dropped rows that had not yet been read.
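The bookkeeping described in that commit message can be sketched as follows (a toy reader, not the library's actual mergedRowGroups type): a pending seek is satisfied by skipping rows until rowIndex reaches the target, so if Read forgets to advance rowIndex, a later seek re-skips rows the caller already consumed.

```go
package main

import "fmt"

// mergedReader is a minimal model of the rowIndex/seek-index interaction.
type mergedReader struct {
	rows     []int
	rowIndex int64 // rows consumed so far
	seekIdx  int64 // pending SeekToRow target, -1 when none
}

func (r *mergedReader) SeekToRow(n int64) { r.seekIdx = n }

func (r *mergedReader) Read() (int, bool) {
	// Advance rowIndex until it catches up with a pending seek target.
	for r.seekIdx >= 0 && r.rowIndex < r.seekIdx {
		r.rowIndex++
	}
	r.seekIdx = -1
	if r.rowIndex >= int64(len(r.rows)) {
		return 0, false
	}
	v := r.rows[r.rowIndex]
	// The fix: advance rowIndex in the normal read path too. Without this
	// line, a later SeekToRow would skip rows that were already read.
	r.rowIndex++
	return v, true
}

func main() {
	r := &mergedReader{rows: []int{10, 11, 12, 13}, seekIdx: -1}
	r.SeekToRow(1)
	a, _ := r.Read()
	b, _ := r.Read()
	fmt.Println(a, b) // prints 11 12: rows 1 and 2, with no row dropped
}
```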