Fix merge conflicts in get-or-create-metrics PR #3

JoshRosen
cloud-fan and others added 9 commits January 18, 2016 15:10
Right now, the bucket tests are kind of hard to understand. This PR simplifies them and adds more comments.

Author: Wenchen Fan <wenchen@databricks.com>

Closes apache#10813 from cloud-fan/bucket-comment.
…ntegration doc

This PR adds instructions to the Flume integration page on how Python users can get the Flume assembly jar, as the Kafka doc already does.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes apache#10746 from zsxwing/flume-doc.
This reverts commit 591c88c. `lint-java` doesn't work on a machine with a clean Maven cache.
… integration doc

This PR adds instructions to the Kinesis integration page on how Python users can get the Kinesis assembly jar, as the Kafka doc already does.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes apache#10822 from zsxwing/kinesis-doc.
Based on discussions in apache#10801, I'm submitting a pull request to rename ParserDialect to ParserInterface.

Author: Reynold Xin <rxin@databricks.com>

Closes apache#10817 from rxin/SPARK-12889.
Currently SortMergeJoin and BroadcastHashJoin do not support a condition; they need a following Filter for that. The result projection that generates UnsafeRow can be very expensive if the join produces many rows, most of which are then filtered out by the condition.

This PR adds condition support to SortMergeJoin and BroadcastHashJoin, just like the outer joins have.

This could improve the performance of Q72 by 7x (from 120s to 16.5s).
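
To illustrate why evaluating the condition inside the join helps, here is a minimal Python sketch. It is not Spark's implementation: `hash_join`, `project`, and the `fuse_condition` flag are hypothetical names, and `project` is only a stand-in for the costly step that builds an output row. The sketch counts how many rows get projected when the condition is checked before projection versus after it (the join-then-Filter shape).

```python
from collections import defaultdict


def project(left_row, right_row):
    """Hypothetical stand-in for the expensive result projection."""
    project.calls += 1
    return left_row + right_row


project.calls = 0  # counts how many output rows were materialized


def hash_join(left, right, key, condition, fuse_condition):
    """Inner equi-join on `key` with an extra predicate `condition`."""
    # Build side: index the right rows by join key.
    table = defaultdict(list)
    for r in right:
        table[key(r)].append(r)

    out = []
    for l in left:
        for r in table.get(key(l), []):
            if fuse_condition:
                # Condition inside the join: project only surviving pairs.
                if condition(l, r):
                    out.append(project(l, r))
            else:
                # Join followed by a filter: project every match, then drop.
                row = project(l, r)
                if condition(l, r):
                    out.append(row)
    return out
```

With `left = [(1, 10), (1, 20), (2, 30)]`, `right = [(1, 15), (1, 25), (2, 5)]`, a key of `row[0]`, and a condition of `l[1] > r[1]`, both modes return the same two rows, but the unfused mode projects all five key matches while the fused mode projects only the two that pass.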

Author: Davies Liu <davies@databricks.com>

Closes apache#10653 from davies/filter_join.
This is a small step in implementing SPARK-10620, which migrates TaskMetrics to accumulators. This patch is strictly a cleanup patch and introduces no change in functionality. It literally just renames 3 fields for consistency. Today we have:

```
inputMetrics.recordsRead
outputMetrics.bytesWritten
shuffleReadMetrics.localBlocksFetched
...
shuffleWriteMetrics.shuffleRecordsWritten
shuffleWriteMetrics.shuffleBytesWritten
shuffleWriteMetrics.shuffleWriteTime
```

The shuffle write ones are kind of redundant. We can drop the `shuffle` prefix from the method names. I added backward-compatible (but deprecated) methods with the old names.
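
The rename-with-deprecated-alias pattern described above can be sketched as follows. This is a Python illustration only, not the Scala code in the patch: the `ShuffleWriteMetrics` class here, the snake_case attribute names, and the `_deprecated_alias` helper are all assumptions made for the example.

```python
import warnings


def _deprecated_alias(old_name, new_name):
    """Build a read-only property that forwards an old name to a new one."""
    def getter(self):
        warnings.warn(
            f"{old_name} is deprecated; use {new_name}",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(self, new_name)
    return property(getter)


class ShuffleWriteMetrics:
    """Hypothetical stand-in for the metrics class being renamed."""

    def __init__(self, records_written=0, bytes_written=0, write_time=0):
        # New names, with the redundant `shuffle` prefix dropped.
        self.records_written = records_written
        self.bytes_written = bytes_written
        self.write_time = write_time

    # Old names remain readable but emit a DeprecationWarning.
    shuffle_records_written = _deprecated_alias(
        "shuffle_records_written", "records_written")
    shuffle_bytes_written = _deprecated_alias(
        "shuffle_bytes_written", "bytes_written")
    shuffle_write_time = _deprecated_alias(
        "shuffle_write_time", "write_time")
```

Existing callers that read `metrics.shuffle_bytes_written` keep working and see a warning, while new code uses `metrics.bytes_written` directly.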

Parent PR: apache#10717

Author: Andrew Or <andrew@databricks.com>

Closes apache#10811 from andrewor14/rename-things.
andrewor14 pushed a commit that referenced this pull request Jan 19, 2016
Fix merge conflicts in get-or-create-metrics PR
@andrewor14 andrewor14 merged commit bfb3c05 into andrewor14:get-or-create-metrics Jan 19, 2016
@JoshRosen JoshRosen deleted the andrewor14-get-or-create-metrics branch August 29, 2016 19:20