Member

@kou kou commented Feb 10, 2026

Rationale for this change

This PR focuses on implementing the basic mechanism for dictionary delta message support, so it only adds support for UTF-8 arrays as dictionaries. Other array types will be supported in follow-up tasks.

What changes are included in this PR?

  • Add support for ArrowFormat#slice (not complete yet; it only works partially).
  • If the second record batch includes an updated dictionary (with new entries appended), only the appended entries are sliced off and written as a delta.
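
The delta decision described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code in this PR: `dictionary_update` is a hypothetical helper, dictionaries are modeled as plain Ruby arrays of strings rather than ArrowFormat arrays, and the `is_delta` key mirrors the `isDelta` flag of the IPC dictionary batch message.

```ruby
# Hypothetical helper for illustration; not part of red-arrow-format.
# written: dictionary values already written to the stream
# current: dictionary values used by the next record batch
def dictionary_update(written, current)
  appended_only = current.size >= written.size &&
                  current.first(written.size) == written
  if appended_only
    # Slice off only the entries that were appended.
    appended = current.drop(written.size)
    return nil if appended.empty? # nothing new to write
    {is_delta: true, values: appended}
  else
    # Existing entries changed, so a delta cannot express the update.
    {is_delta: false, values: current}
  end
end

first_batch_dictionary  = ["a", "b"]
second_batch_dictionary = ["a", "b", "c", "d"]
update = dictionary_update(first_batch_dictionary, second_batch_dictionary)
# update[:values] holds only the appended entries: "c" and "d"
```

Returning `nil` for an unchanged dictionary keeps the caller from writing an empty dictionary message; whether the real writer behaves this way is an assumption.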

Are these changes tested?

Yes.

Are there any user-facing changes?

Yes.

@github-actions
⚠️ GitHub issue #49208 has been automatically assigned in GitHub to PR creator.

Collaborator

@hiroyuki-sato hiroyuki-sato left a comment

+1

(This PR contains many changes, so I only tested it on my local machine.)

LD_LIBRARY_PATH=/tmp/local/lib:/path/to/arrow/red-arrow/ext/arrow/ GI_TYPELIB_PATH=/tmp/local/lib/girepository-1.0 rake -I /path/to/arrow/ruby/red-arrow/lib test
cd /path/to/arrow/ruby/red-arrow-format
/path/to/.rbenv/versions/3.4.7/bin/ruby test/run.rb
Loaded suite test
Started
/path/to/arrow/ruby/red-arrow-format/lib/arrow-format/file-reader.rb:40: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
Finished in 1.751186457 seconds.
--------------------------------------------------------------------------------
194 tests, 194 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
--------------------------------------------------------------------------------
110.78 tests/s, 110.78 assertions/s
cd -

    @values = table.value.values
  end
  def file_extension
    "arrows"
Collaborator

Just out of curiosity: are "arrow" and "arrows" used intentionally?

class TestFileWriter < Test::Unit::TestCase
  def file_extension
    "arrow"
  end
end

class TestStreamingWriter < Test::Unit::TestCase
  def file_extension
    "arrows"
  end
end

Member Author

Yes.

https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format

We recommend the “.arrows” file extension for the streaming format although in many cases these streams will not ever be stored as files.
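
As a rough illustration of the naming convention, the sketch below picks an extension from the first bytes of some IPC data. It relies on the IPC file format starting with the magic string "ARROW1", which the streaming format does not have. `suggested_extension` is a hypothetical helper, not part of red-arrow-format, and treating everything without the magic as a stream is a simplifying assumption.

```ruby
# Hypothetical helper for illustration; not part of red-arrow-format.
ARROW_FILE_MAGIC = "ARROW1".b

# Returns "arrow" for data that starts with the IPC file format magic,
# and "arrows" (the recommended streaming extension) otherwise.
def suggested_extension(bytes)
  if bytes.byteslice(0, ARROW_FILE_MAGIC.bytesize) == ARROW_FILE_MAGIC
    "arrow"
  else
    "arrows"
  end
end
```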

@kou kou merged commit bc48921 into apache:main Feb 11, 2026
15 checks passed
@kou kou removed the awaiting committer review label Feb 11, 2026
@kou kou deleted the ruby-dictionary-delta branch February 11, 2026 00:07
@github-actions github-actions bot added the awaiting changes label Feb 11, 2026