-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-2645: [Java] Refactor ArrowWriter to remove all ArrowFileWriter specifc logic #2090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-2645: [Java] Refactor ArrowWriter to remove all ArrowFileWriter specifc logic #2090
Conversation
…er where they are used
|
@icexelloss, please review if you can, thanks! |
|
@BryanCutler does this PR refactor out file format specific logic from ArrowWriter to ArrowFileWriter? |
|
|
||
| private static final Logger LOGGER = LoggerFactory.getLogger(ArrowFileWriter.class); | ||
|
|
||
| private final List<ArrowBlock> dictionaryBlocks = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment on the purpose of these two lists here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
Yes, specifically saving |
icexelloss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
Maybe it makes more sense to change the PR title to be "Refactor record batch and dictionary batch accumulating logic from ArrowWriter to ArrowFileWriter" (at least more understandable to me) but this is minor so your call @BryanCutler |
|
Thanks for reviewing @icexelloss ! |
|
merged to master |
… specifc logic Related to apache#2079 , the DictionaryBatch `ArrowBlock`s were being accumulated in the base class and used by `ArrowFileWriter` but not `ArrowStreamWriter`. This refactors the `ArrowWriter` to move Lists of ArrowBlocks from the base class to only `ArrowFileWriter`. Moved tests counting ArrowBlocks written from ArrowStreamWriter tests to ArrowFileWriter. Author: Bryan Cutler <cutlerb@gmail.com> Closes apache#2090 from BryanCutler/java-ArrowStreamWriter-accum-DictionaryBlocks-ARROW-2645 and squashes the following commits: fc7f061 <Bryan Cutler> added comment about saving ArrowBlocks 5e4711a <Bryan Cutler> Moved lists of ArrowBlocks in ArrowWriter base class to ArrowFileWriter where they are used
Related to #2079 , the DictionaryBatch
ArrowBlocks were being accumulated in the base class and used byArrowFileWriterbut notArrowStreamWriter. This refactors theArrowWriterto move Lists of ArrowBlocks from the base class to onlyArrowFileWriter.Moved tests counting ArrowBlocks written from ArrowStreamWriter tests to ArrowFileWriter.