[SPARK-21621][Core] Reset numRecordsWritten after DiskBlockObjectWriter.commitAndGet called #18830
You can see here at L208 that when `revertPartialWritesAndClose` is called, the written records decrease to 0.
@cloud-fan @vanzin Would you mind taking a look? Thanks a lot.
Nice catch, looks good to me.
LGTM
ok to test
Thanks for reviewing. Hi @jiangxb1987, it seems the test wasn't triggered.
ok to test
Test build #80316 has finished for PR 18830 at commit
LGTM, merging to master/2.2
…er.commitAndGet called

## What changes were proposed in this pull request?

We should reset `numRecordsWritten` to zero after `DiskBlockObjectWriter.commitAndGet` is called. When `revertPartialWritesAndClose` is called, we decrease the written records in `ShuffleWriteMetrics`. However, we decreased the written records to zero, which is wrong; we should only subtract the records written after the last `commitAndGet` call.

## How was this patch tested?

Modified existing test.

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: Xianyang Liu <xianyang.liu@intel.com>

Closes #18830 from ConeyLiu/DiskBlockObjectWriter.

(cherry picked from commit 534a063)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Thank you all.
## What changes were proposed in this pull request?

We should reset `numRecordsWritten` to zero after `DiskBlockObjectWriter.commitAndGet` is called. When `revertPartialWritesAndClose` is called, we decrease the written records in `ShuffleWriteMetrics` by `numRecordsWritten`. Because that counter was never reset on commit, the revert decreased the written records all the way to zero, which is wrong: we should only subtract the records written after the last `commitAndGet` call.

## How was this patch tested?

Modified existing test.

Please review http://spark.apache.org/contributing.html before opening a pull request.
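
The counting issue described above can be illustrated with a small model. This is only a sketch, not Spark's code: the real writer is implemented in Scala and also tracks bytes and file positions. The `Metrics` and `Writer` classes, the `reset_on_commit` flag, and the `run` helper are hypothetical names introduced for illustration; the method names merely mirror `commitAndGet` and `revertPartialWritesAndClose`.

```python
class Metrics:
    """Hypothetical stand-in for ShuffleWriteMetrics: tracks only a record count."""

    def __init__(self):
        self.records_written = 0


class Writer:
    """Toy model of the writer's record bookkeeping (no actual I/O)."""

    def __init__(self, metrics, reset_on_commit):
        self.metrics = metrics
        # Assumption: this flag toggles between the behavior before and after the fix.
        self.reset_on_commit = reset_on_commit
        self.num_records_written = 0

    def record_written(self):
        # Each record bumps both the local counter and the shared metrics.
        self.num_records_written += 1
        self.metrics.records_written += 1

    def commit_and_get(self):
        # The fix: committed records must no longer count as "revertible".
        if self.reset_on_commit:
            self.num_records_written = 0

    def revert_partial_writes_and_close(self):
        # Subtract whatever the local counter says is uncommitted.
        self.metrics.records_written -= self.num_records_written
        self.num_records_written = 0


def run(reset_on_commit):
    """Write 3 records, commit, write 2 more, then revert the partial writes."""
    metrics = Metrics()
    writer = Writer(metrics, reset_on_commit)
    for _ in range(3):
        writer.record_written()
    writer.commit_and_get()
    for _ in range(2):
        writer.record_written()
    writer.revert_partial_writes_and_close()
    return metrics.records_written


print(run(reset_on_commit=False))  # committed records are wrongly subtracted too
print(run(reset_on_commit=True))   # only the 2 uncommitted records are subtracted
```

In this model, the run without the reset ends with a record count of 0 even though three records were committed, while the run with the reset keeps those three.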