feat: use arrow-ipc to communicate between remote server and client #552

Merged 5 commits on Jan 10, 2023

@jiacai2050 (Contributor) commented Jan 9, 2023

Which issue does this PR close?

Closes #

Rationale for this change

In the current implementation, remote service communication between CeresDB instances uses
Avro to serialize RecordBatch. The Arrow ecosystem has a designated serialization format for this: Arrow IPC.

Some benchmarks:

cd benchmarks && cargo run --bin ipc -r
Arrow IPC encoded size:1348496
Arrow IPC encode/decode cost:1ms
Avro encoded size:904634
Avro encode/decode cost:284ms

Bytes encoded with Arrow IPC take more space (1348496/904634 ≈ 1.49x), but encoding and decoding take far less time (1ms vs. 284ms).

After zstd compression, Arrow IPC beats Avro in both size and speed:

Arrow IPC encoded size:780181
Arrow IPC encode/decode cost:8ms
Avro encoded size:904634
Avro encode/decode cost:287ms
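
For reference, the ratios behind these numbers can be checked with a few lines of Rust. This is only a sketch: the constant names and the `ratio` helper are made up for illustration and are not part of the benchmark binary; the values are copied from the output above.

```rust
// Figures copied from the benchmark output above
// (sizes in bytes, costs in milliseconds).
// All names below are illustrative, not taken from the PR.
pub const IPC_RAW_BYTES: u64 = 1_348_496;
pub const IPC_ZSTD_BYTES: u64 = 780_181;
pub const AVRO_BYTES: u64 = 904_634;
pub const IPC_ZSTD_COST_MS: u64 = 8;
pub const AVRO_COST_MS: u64 = 287;

/// Ratio of two measurements as a floating-point number.
pub fn ratio(numerator: u64, denominator: u64) -> f64 {
    numerator as f64 / denominator as f64
}

/// One-line summary of the three comparisons discussed above.
pub fn summary() -> String {
    format!(
        "raw IPC/Avro size: {:.2}, zstd IPC/Avro size: {:.2}, Avro/zstd IPC time: {:.1}x",
        ratio(IPC_RAW_BYTES, AVRO_BYTES),
        ratio(IPC_ZSTD_BYTES, AVRO_BYTES),
        ratio(AVRO_COST_MS, IPC_ZSTD_COST_MS),
    )
}
```

Calling `summary()` shows that uncompressed IPC is about 1.49x the Avro size, while zstd-compressed IPC is smaller than Avro and still roughly 36x faster to encode and decode.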

What changes are included in this PR?

Replace Avro with Arrow IPC for remote service communication.

Are there any user-facing changes?

No

How is this change tested?

New unit test: test_ipc_encode_decode

@jiacai2050 jiacai2050 marked this pull request as ready for review January 9, 2023 14:22
@jiacai2050 jiacai2050 changed the title feat: use arrow-ipc to communicate bewteen remote server and client feat: use arrow-ipc to communicate between remote server and client Jan 9, 2023
@ShiKaiWi (Member) left a comment

LGTM

@Rachelint (Contributor) commented

ENCODE_ROWS_WITH_AVRO may need to be renamed?

@jiacai2050 jiacai2050 force-pushed the feat-arrow-ipc branch 2 times, most recently from a7c4794 to 863be12 Compare January 10, 2023 04:07
@jiacai2050 (Contributor, Author) commented

@Rachelint Thanks for the reminder. I have added a RemoteEngineVersion to unify the version between server and client.
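
To illustrate the idea of a shared version tag, here is a minimal sketch of how a client might check it before decoding a payload. Only the name `RemoteEngineVersion` comes from this PR; the variants, their numbering, the `Response` struct and `check_version` are assumptions made up for the example and may not match the merged code.

```rust
use std::convert::TryFrom;

/// Hypothetical wire-format versions. Only the enum name appears in the PR;
/// the variants and their numeric values are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RemoteEngineVersion {
    Avro = 0,
    ArrowIpcWithZstd = 1,
}

impl RemoteEngineVersion {
    /// Numeric tag that would travel alongside the encoded rows.
    pub fn as_u32(self) -> u32 {
        self as u32
    }
}

impl TryFrom<u32> for RemoteEngineVersion {
    type Error = String;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Self::Avro),
            1 => Ok(Self::ArrowIpcWithZstd),
            other => Err(format!("unsupported remote engine version: {}", other)),
        }
    }
}

/// Hypothetical response envelope: a version tag plus an opaque payload.
pub struct Response {
    pub version: u32,
    pub payload: Vec<u8>,
}

/// Returns the payload only if the sender used the version the receiver expects.
pub fn check_version(
    resp: &Response,
    expected: RemoteEngineVersion,
) -> Result<&[u8], String> {
    let got = RemoteEngineVersion::try_from(resp.version)?;
    if got != expected {
        return Err(format!(
            "version mismatch: expected {:?}, got {:?}",
            expected, got
        ));
    }
    Ok(&resp.payload)
}
```

With a tag like this, a receiver can reject a payload written in an encoding it does not expect instead of trying to decode it as the wrong format.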

@Rachelint (Contributor) left a comment

LGTM

@jiacai2050 jiacai2050 merged commit 1e6d46b into apache:main Jan 10, 2023
@jiacai2050 jiacai2050 deleted the feat-arrow-ipc branch January 10, 2023 07:21
chunshao90 pushed a commit to chunshao90/ceresdb that referenced this pull request May 15, 2023
…pache#552)

* feat: use arrow-ipc to communicate bewteen remote server and client

* add benchmarks

* add zstd for ipc

* add RemoteEngineVersion

* fix message