Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-280: [C++] Refactor IPC / memory map IO to use common arrow_io interfaces. Create arrow_ipc leaf library #138

Closed
wants to merge 1 commit into from

Conversation

wesm
Copy link
Member

@wesm wesm commented Sep 18, 2016

Several things here

  • Clean up IO interface class structure to be able to indicate precise characteristics of an implementation
  • Make the IPC reader/writer use more generic interfaces -- writing only needs an output stream, reading only needs a random access reader. This will unblock ARROW-267
  • Create a separate arrow_ipc shared library

@wesm
Copy link
Member Author

wesm commented Sep 18, 2016

cc @emkornfield @xhochy . let me know if you'd like me to put this on gerrit

@wesm wesm force-pushed the ARROW-280 branch 4 times, most recently from 6576e67 to 48d27fb Compare September 18, 2016 17:29
@wesm
Copy link
Member Author

wesm commented Sep 18, 2016

@xhochy this will cause some downstream impacts on the arrow_parquet work ongoing. I'm going to make the necessary fixes here and we can port them over to Parquet once PARQUET-712 is done

@xhochy
Copy link
Member

xhochy commented Sep 18, 2016

@wesm feel free to merge PARQUET-712, should be ready now.

@xhochy
Copy link
Member

xhochy commented Sep 18, 2016

@wesm Also note that I pinned the arrow-git-hash in the dependencies for parquet-cpp, so breakages here won't break parquet-cpp's builds.

* Refactor memory mapped IO interfaces to be in line with other arrow::io
  classes.
* Split arrow_ipc into a leaf library
* Refactor pyarrow and arrow_parquet to suit. Move BufferReader to
  arrow_io. Pyarrow parquet tests currently segfault

Change-Id: I9a8eb95dfb6572810adb3e5199bd09cbbd020bab
@wesm
Copy link
Member Author

wesm commented Sep 18, 2016

That's great. I got things building again but the pyarrow parquet tests currently segfault, we should investigate separate from this patch

@wesm
Copy link
Member Author

wesm commented Sep 18, 2016

OK, the build is passing. I'm going to move on to implementing the file format on top of this.

@wesm
Copy link
Member Author

wesm commented Sep 18, 2016

I'm going to merge this to be able to start tackling the Parquet code migration in parallel with the file format work. Any feedback I will happily incorporate in a follow up patch. +1

@wesm wesm changed the title ARROW-280: Refactor IPC / memory map IO to use common arrow_io interfaces. Create arrow_ipc leaf library ARROW-280: [C++] Refactor IPC / memory map IO to use common arrow_io interfaces. Create arrow_ipc leaf library Sep 18, 2016
@asfgit asfgit closed this in 559b865 Sep 18, 2016
@wesm wesm deleted the ARROW-280 branch September 18, 2016 20:02
@@ -20,6 +20,7 @@

set(ARROW_IO_LINK_LIBS
arrow_shared
dl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be needed, maybe you mixed somewhere a static and a shared build?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using dlopen in libhdfs_shim.cc, I get a linker error without linking libdl

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. I had the problem with missing dlopen symbols in parquet-cpp as there static and shared test libs where mixed. There it would have been only a workaround to link dl but as we explicitly use it here, it's fine.

ADD_ARROW_TEST(hdfs-io-test)
ARROW_TEST_LINK_LIBRARIES(hdfs-io-test
ADD_ARROW_TEST(io-hdfs-test)
ARROW_TEST_LINK_LIBRARIES(io-hdfs-test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's why you need dl. This should probably be conditional on if we build static or shared tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, the issue was actually dlopen, dlsym, etc.

@@ -37,6 +38,7 @@ set(ARROW_IO_TEST_LINK_LIBS
${ARROW_IO_PRIVATE_LINK_LIBS})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need a arrow_io_static here to solve the linking problems in the tests.

)

set(ARROW_IPC_TEST_LINK_LIBS
arrow_ipc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll also need a static lib here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roger. This isn't causing immediate problems -- let's create JIRAs to build static libs for the leaf libraries too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants