[Python][Dataset] Improve ergonomics of FileFragment #24399

Closed
asfimport opened this issue Mar 24, 2020 · 2 comments

Comments

@asfimport

FileFragment can be made more directly useful by adding convenience methods.

For example, a FileFragment could allow the underlying file or buffer to be opened directly:

    def open(self):
        """
        Open a NativeFile of the buffer or file viewed by this fragment.
        """
        cdef:
            CFileSystem* c_filesystem
            shared_ptr[CRandomAccessFile] opened
            NativeFile out = NativeFile()

        buf = self.buffer
        if buf is not None:
            return pa.io.BufferReader(buf)

        with nogil:
            c_filesystem = self.file_fragment.source().filesystem()
            opened = GetResultValue(c_filesystem.OpenInputFile(
                self.file_fragment.source().path()))

        out.set_random_access_file(opened)
        out.is_readable = True
        return out

Additionally, a ParquetFileFragment's metadata could be introspectable:

    @property
    def metadata(self):
        from pyarrow._parquet import ParquetReader
        reader = ParquetReader()
        reader.open(self.open())
        return reader.metadata

Reporter: Ben Kietzman / @bkietz
Assignee: Miles Granger / @milesgranger

PRs and other links:

Note: This issue was originally created as ARROW-8201. Please see the migration documentation for further details.

@asfimport

Antoine Pitrou / @pitrou:
@milesgranger Perhaps you would be interested in finding out whether this issue still applies and, if so, coming up with a PR?

@asfimport

Joris Van den Bossche / @jorisvandenbossche:
Issue resolved by pull request 14301
#14301
