Skip to content

ARROW-2369: [Python] Fix reading large Parquet files (> 4 GB)#1866

Closed
pitrou wants to merge 1 commit intoapache:masterfrom
pitrou:ARROW-2369
Closed

ARROW-2369: [Python] Fix reading large Parquet files (> 4 GB)#1866
pitrou wants to merge 1 commit intoapache:masterfrom
pitrou:ARROW-2369

Conversation

@pitrou
Copy link
Copy Markdown
Member

@pitrou pitrou commented Apr 9, 2018

  • Fix PythonFile.seek() for offsets > 4 GB
  • Avoid instantiating a PythonFile in ParquetFile, for efficiency

- Fix PythonFile.seek() for offsets > 4 GB
- Avoid instantiating a PythonFile in ParquetFile, for efficiency
Copy link
Copy Markdown
Member

@xhochy xhochy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this looks good as the test seems to work but even when it's merged, I would appreciate a short explanantion what the bug was in the end.

Status Seek(int64_t position, int whence) {
// whence: 0 for relative to start of file, 2 for end of file
PyObject* result = cpp_PyObject_CallMethod(file_, "seek", "(ii)", position, whence);
PyObject* result = cpp_PyObject_CallMethod(file_, "seek", "(ni)",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem was before that position was casted to an int32?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i takes the given parameter as int while n takes it as Py_ssize_t.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately it is the same format string syntax as documented here: https://docs.python.org/3/c-api/arg.html#building-values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants