@BryanCutler
Member

@BryanCutler BryanCutler commented May 29, 2018

Related to #2079, the DictionaryBatch ArrowBlocks were being accumulated in the base class and used by ArrowFileWriter but not by ArrowStreamWriter. This refactors ArrowWriter to move the lists of ArrowBlocks out of the base class and into ArrowFileWriter only.

Moved the tests that count written ArrowBlocks from the ArrowStreamWriter tests to the ArrowFileWriter tests.

@BryanCutler
Member Author

cc @siddharthteotia

@BryanCutler
Member Author

@icexelloss, please review if you can, thanks!

@icexelloss
Contributor

@BryanCutler does this PR refactor out file format specific logic from ArrowWriter to ArrowFileWriter?


private static final Logger LOGGER = LoggerFactory.getLogger(ArrowFileWriter.class);

private final List<ArrowBlock> dictionaryBlocks = new ArrayList<>();
Contributor

Can you add a comment on the purpose of these two lists here?

Member Author

sure

@BryanCutler
Member Author

> @BryanCutler does this PR refactor out file format specific logic from ArrowWriter to ArrowFileWriter?

Yes, specifically the logic that saves written ArrowBlocks to a list so they can be used in ArrowFileWriter.endInternal.
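
For readers skimming the thread, here is a minimal sketch of the shape of this refactoring. It is not the actual Arrow code: `WriterSketch`, `Block`, `BaseWriter`, `StreamWriterSketch`, `FileWriterSketch`, the `recordBlocks` field name, and the integer "batch length" parameter are all hypothetical stand-ins chosen for illustration. The idea is that the base writer only writes and reports the block it produced, while the file-format subclass is the only one that keeps lists of blocks for later use (for example, when finishing the file).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; every type here is a hypothetical stand-in, not the Arrow API. */
final class WriterSketch {

    /** Stand-in for a written block: where it starts and how long it is. */
    static final class Block {
        final long offset;
        final int length;

        Block(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Base writer: tracks the write position but accumulates no blocks. */
    abstract static class BaseWriter {
        private long position = 0;

        Block writeDictionaryBatch(int length) {
            return advance(length);
        }

        Block writeRecordBatch(int length) {
            return advance(length);
        }

        // Pretend to write `length` bytes and report the block that was written.
        private Block advance(int length) {
            Block block = new Block(position, length);
            position += length;
            return block;
        }
    }

    /** Stream-style writer: inherits the base behavior and stores nothing. */
    static final class StreamWriterSketch extends BaseWriter {
    }

    /** File-style writer: remembers every block so it can be listed at the end. */
    static final class FileWriterSketch extends BaseWriter {
        private final List<Block> dictionaryBlocks = new ArrayList<>();
        private final List<Block> recordBlocks = new ArrayList<>();

        @Override
        Block writeDictionaryBatch(int length) {
            Block block = super.writeDictionaryBatch(length);
            dictionaryBlocks.add(block);
            return block;
        }

        @Override
        Block writeRecordBatch(int length) {
            Block block = super.writeRecordBatch(length);
            recordBlocks.add(block);
            return block;
        }

        List<Block> getDictionaryBlocks() {
            return Collections.unmodifiableList(dictionaryBlocks);
        }

        List<Block> getRecordBlocks() {
            return Collections.unmodifiableList(recordBlocks);
        }
    }

    public static void main(String[] args) {
        FileWriterSketch file = new FileWriterSketch();
        file.writeDictionaryBatch(16);
        file.writeRecordBatch(64);
        System.out.println("dictionary blocks: " + file.getDictionaryBlocks().size());
        System.out.println("record blocks: " + file.getRecordBlocks().size());
    }
}
```

With this layout the stream-style writer never grows a list it does not read, which is the accumulation problem the PR description refers to.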

Contributor

@icexelloss icexelloss left a comment

LGTM.

@icexelloss
Contributor

Maybe it makes more sense to change the PR title to be

"Refactor record batch and dictionary batch accumulating logic from ArrowWriter to ArrowFileWriter"

(at least more understandable to me) but this is minor so your call @BryanCutler

@BryanCutler BryanCutler changed the title from "ARROW-2645: [Java] Fixed ArrowStreamWriter from accumulating Dictionary batch blocks" to "ARROW-2645: [Java] Refactor ArrowWriter to remove all ArrowFileWriter specific logic" on May 31, 2018
@BryanCutler
Member Author

Thanks for reviewing @icexelloss !

@BryanCutler
Member Author

merged to master

pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
… specific logic

Related to apache#2079, the DictionaryBatch `ArrowBlock`s were being accumulated in the base class and used by `ArrowFileWriter` but not `ArrowStreamWriter`. This refactors the `ArrowWriter` to move Lists of ArrowBlocks from the base class to only `ArrowFileWriter`.

Moved tests counting ArrowBlocks written from ArrowStreamWriter tests to ArrowFileWriter.

Author: Bryan Cutler <cutlerb@gmail.com>

Closes apache#2090 from BryanCutler/java-ArrowStreamWriter-accum-DictionaryBlocks-ARROW-2645 and squashes the following commits:

fc7f061 <Bryan Cutler> added comment about saving ArrowBlocks
5e4711a <Bryan Cutler> Moved lists of ArrowBlocks in ArrowWriter base class to ArrowFileWriter where they are used