ARROW-1008: [C++] Add abstract stream writer and reader C++ APIs. Give clearer names to IPC reader/writer classes #679

Closed
wants to merge 5 commits into from


wesm
Member

@wesm wesm commented May 12, 2017

The main motivation for this patch was to make StreamReader and StreamWriter abstract, so that other implementations can be created. I would also like to add the option for asynchronous reading and writing.

I also added a CMake option ARROW_NO_DEPRECATED_API for more graceful name deprecations.
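A minimal sketch of how such an option can gate deprecated names, assuming the CMake option ends up as a preprocessor definition of the same name (the patch description does not show this, so treat it as an assumption). The class below is a hypothetical stand-in, not the real Arrow class; `RecordBatchStreamReader` is the new name settled on later in this thread, and `StreamReader` is the old one.

```cpp
#include <string>

// Hypothetical stand-in for the renamed class; not the real Arrow definition.
class RecordBatchStreamReader {
 public:
  std::string name() const { return "RecordBatchStreamReader"; }
};

// Assumption: building with the CMake option ARROW_NO_DEPRECATED_API turned on
// defines a macro of the same name, which hides the old spelling entirely.
#ifndef ARROW_NO_DEPRECATED_API
// Deprecated alias: existing code using the old name keeps compiling.
using StreamReader = RecordBatchStreamReader;
#endif
```

With the macro undefined, `StreamReader` and `RecordBatchStreamReader` are the same type; with it defined, any remaining use of `StreamReader` fails to compile, which makes leftover call sites easy to find.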

@kou do you think these names for the IPC classes are more clear?

@kou
Member

kou commented May 13, 2017

Basically, I like this change! I have some concerns.

Concern 1: Use of "batch", "stream" and "random access" words.

"batch" is used for the following things:

"stream" is used for the following things:

It may be better to use each word for only one meaning.

Concern 2: Consistency.

BatchStreamReader shows "what" is read.

InputStreamReader and BatchFileReader show "where" the data is read from.

I tried to come up with better names, but I haven't found good ones yet... Sorry...

Broken idea:

  • BatchStreamReader -> RecordBatchReader (or RecordBatchesReader?)
    • "record batch" is used only for record batch.
    • "stream" isn't used for "zero or more X"
  • InputStreamReader -> RecordBatchStreamReader
    • Umm... "stream" is used for "stream file format" but it's not clear...
  • BatchFileReader -> RecordBatchXXXReader (RecordBatchSetReader ?)
    • Umm... I wanted to use a word from "random access format" or "batch format" but ...

wesm added 3 commits May 13, 2017 15:59
…tream

reader and writer classes for better clarity
Change-Id: I38b1e570c69af59aac917e96845b7add947d7196
… APIs

Change-Id: I92cc2d5de55f625dee543bd7dc223225fd8f7977
Change-Id: I2b375c917e6004c3ffebf3657491d4756f792951
@wesm
Member Author

wesm commented May 13, 2017

I renamed some things:

  • RecordBatchReader and RecordBatchWriter for the abstract base classes
  • RecordBatchStreamReader/RecordBatchStreamWriter for the synchronous streaming format classes
  • RecordBatchFileReader/RecordBatchFileWriter for the synchronous file format classes
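The naming split in this list can be sketched roughly as follows. This is only an illustration under stated assumptions: `RecordBatch` here is a hypothetical stand-in that carries a row count, the method names (`ReadNext`, `ReadRecordBatch`, `num_record_batches`) and `TotalRows` are chosen for the example, and whether the file-format reader derives from the abstract base is not stated in this thread, so the sketch keeps it separate.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::RecordBatch; it only carries a row count.
struct RecordBatch {
  std::size_t num_rows;
};

// Abstract base class: consumers depend only on this interface.
class RecordBatchReader {
 public:
  virtual ~RecordBatchReader() = default;
  // Returns the next batch, or nullptr once the reader is exhausted.
  virtual std::shared_ptr<RecordBatch> ReadNext() = 0;
};

// Streaming-format sketch: batches can only be consumed in order.
class RecordBatchStreamReader : public RecordBatchReader {
 public:
  explicit RecordBatchStreamReader(std::vector<RecordBatch> batches)
      : batches_(std::move(batches)) {}

  std::shared_ptr<RecordBatch> ReadNext() override {
    if (pos_ >= batches_.size()) return nullptr;
    return std::make_shared<RecordBatch>(batches_[pos_++]);
  }

 private:
  std::vector<RecordBatch> batches_;
  std::size_t pos_ = 0;
};

// File-format sketch: the batch count is known up front and any batch
// can be read by index. Kept separate from the base class (assumption).
class RecordBatchFileReader {
 public:
  explicit RecordBatchFileReader(std::vector<RecordBatch> batches)
      : batches_(std::move(batches)) {}

  std::size_t num_record_batches() const { return batches_.size(); }

  // Returns batch i, or nullptr if i is out of range.
  std::shared_ptr<RecordBatch> ReadRecordBatch(std::size_t i) const {
    if (i >= batches_.size()) return nullptr;
    return std::make_shared<RecordBatch>(batches_[i]);
  }

 private:
  std::vector<RecordBatch> batches_;
};

// Works with any RecordBatchReader implementation, not just the stream one.
std::size_t TotalRows(RecordBatchReader& reader) {
  std::size_t total = 0;
  while (auto batch = reader.ReadNext()) total += batch->num_rows;
  return total;
}
```

Because `TotalRows` takes the abstract `RecordBatchReader`, another implementation (for example an asynchronous one) could be passed in without changing the caller, which is the motivation given for making the base classes abstract.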

I removed the "random access" language since we use "Streaming format" and "File format" in the Arrow format documentation.

Let me know if this sounds good for now.

Change-Id: I0c06793275de0bf96be5602e5c99ae9cf7ad2b14
@kou
Member

kou commented May 14, 2017

This sounds good!

@wesm
Member Author

wesm commented May 14, 2017

+1

@asfgit asfgit closed this in 5739e04 May 14, 2017
@wesm wesm deleted the ARROW-1008 branch May 14, 2017 12:55
jeffknupp pushed a commit to jeffknupp/arrow that referenced this pull request Jun 3, 2017
…e clearer names to IPC reader/writer classes

The main motivation for this patch was to make `StreamReader` and `StreamWriter` abstract, so that other implementations can be created. I would also like to add the option for asynchronous reading and writing.

I also added a CMake option `ARROW_NO_DEPRECATED_API` for more graceful name deprecations.

@kou do you think these names for the IPC classes are more clear?

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes apache#679 from wesm/ARROW-1008 and squashes the following commits:

d7b7c9c [Wes McKinney] Add missing dtors for pimpl pattern
a797ee3 [Wes McKinney] Fix glib
04fa285 [Wes McKinney] Feedback on ipc reader/writer names. Add open_stream/open_file Python APIs
22346d4 [Wes McKinney] Fix unit tests
10837a6 [Wes McKinney] Add abstract stream writer and reader C++ APIs. Rename record batch stream reader and writer classes for better clarity
pcmoritz pushed a commit to pcmoritz/arrow that referenced this pull request Jun 11, 2017