[SPARK-8464][Core][Shuffle] Consider separating aggregator and non-aggregator paths in ExternalSorter #7129

Closed
ilganeli wants to merge 20 commits

Conversation

ilganeli

I've started by separating ExternalSorter into two classes: one that assumes an aggregator is defined and one that does not. Common code is extracted into a parent class.
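For readers unfamiliar with the idea, here is a minimal Scala sketch of the kind of split described above: a parent class holding shared logic, one subclass that combines values with an aggregator, and one that only buffers records. It is not the code from this PR. All class and method names (`SorterBase`, `AggregatingSorter`, `BufferingSorter`, `partitionOf`) are hypothetical, spilling to disk is omitted entirely, and hash partitioning stands in for Spark's `Partitioner`.

```scala
import scala.collection.mutable

// Hypothetical parent class: holds the code common to both paths.
abstract class SorterBase[K, V, C](numPartitions: Int) {
  // Shared helper: non-negative hash partitioning (an assumption for this sketch).
  protected def partitionOf(key: K): Int = {
    val mod = key.hashCode % numPartitions
    if (mod < 0) mod + numPartitions else mod
  }

  def insertAll(records: Iterator[(K, V)]): Unit

  // Records grouped by partition id, as (partitionId, (key, combinedValue)).
  def partitionedIterator: Iterator[(Int, (K, C))]
}

// Aggregator path: values are combined per (partition, key) as they arrive.
class AggregatingSorter[K, V, C](
    numPartitions: Int,
    createCombiner: V => C,
    mergeValue: (C, V) => C)
  extends SorterBase[K, V, C](numPartitions) {

  private val map = mutable.HashMap.empty[(Int, K), C]

  override def insertAll(records: Iterator[(K, V)]): Unit =
    records.foreach { case (k, v) =>
      val id = (partitionOf(k), k)
      map(id) = map.get(id) match {
        case Some(c) => mergeValue(c, v)
        case None => createCombiner(v)
      }
    }

  override def partitionedIterator: Iterator[(Int, (K, C))] =
    map.toSeq.sortBy(_._1._1).iterator.map { case ((p, k), c) => (p, (k, c)) }
}

// Non-aggregator path: records are buffered unchanged, so C is simply V.
class BufferingSorter[K, V](numPartitions: Int)
  extends SorterBase[K, V, V](numPartitions) {

  private val buffer = mutable.ArrayBuffer.empty[(Int, (K, V))]

  override def insertAll(records: Iterator[(K, V)]): Unit =
    records.foreach { case (k, v) => buffer += ((partitionOf(k), (k, v))) }

  // sortBy is stable, so records keep insertion order within a partition.
  override def partitionedIterator: Iterator[(Int, (K, V))] =
    buffer.sortBy(_._1).iterator
}
```

With this shape, a caller would pick `AggregatingSorter` when an aggregator is defined and `BufferingSorter` otherwise, so neither class needs to check an `Option` on every insert.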

Ilya Ganelin added 2 commits June 30, 2015 10:14
…class instances, one which uses Aggregator and one which does not. Next step is to extract out common code
…lasses that implement this interface to abstract away the aggregator usage.
@ilganeli changed the title from "[SPARK-8464][Core][Shuffle][WIP] Consider separating aggregator and non-aggregator paths in ExternalSorter" to "[SPARK-8464][Core][Shuffle] Consider separating aggregator and non-aggregator paths in ExternalSorter" on Jun 30, 2015
@SparkQA

SparkQA commented Jun 30, 2015

Test build #36151 has finished for PR 7129 at commit b74ca30.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int, data: BufferedIterator[((Int, K), C)])

@SparkQA

SparkQA commented Jun 30, 2015

Test build #36139 has finished for PR 7129 at commit d0024ef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 30, 2015

Test build #36153 has finished for PR 7129 at commit 864d603.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int, data: BufferedIterator[((Int, K), C)])

@SparkQA

SparkQA commented Jun 30, 2015

Test build #36154 has finished for PR 7129 at commit 8b3aca5.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@SparkQA

SparkQA commented Jun 30, 2015

Test build #36163 has finished for PR 7129 at commit 72b87e5.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@ilganeli
Author

retest this please

@SparkQA

SparkQA commented Jun 30, 2015

Test build #36168 has finished for PR 7129 at commit 083b2b6.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@SparkQA

SparkQA commented Jun 30, 2015

Test build #36174 timed out for PR 7129 at commit 36db0dc after a configured wait of 175m.

@SparkQA

SparkQA commented Jul 1, 2015

Test build #36253 has finished for PR 7129 at commit 1a8dba5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@ilganeli
Author

ilganeli commented Jul 1, 2015

@JoshRosen Any chance I could get a second pair of eyes on this? I'm wary of changes to ExternalSorter introducing complicated merge conflicts. Would love your help, thanks!

@SparkQA

SparkQA commented Jul 5, 2015

Test build #36548 has finished for PR 7129 at commit 7ecc28d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@SparkQA

SparkQA commented Jul 8, 2015

Test build #36842 has finished for PR 7129 at commit 12c11a7.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@SparkQA

SparkQA commented Jul 8, 2015

Test build #36852 has finished for PR 7129 at commit 1c154c4.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@SparkQA

SparkQA commented Jul 9, 2015

Test build #36855 has finished for PR 7129 at commit 5e5d5e7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@ilganeli
Author

ilganeli commented Jul 9, 2015

retest this please

@SparkQA

SparkQA commented Jul 9, 2015

Test build #36872 has finished for PR 7129 at commit 8f6e327.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ilganeli
Author

ilganeli commented Jul 9, 2015

retest this please

@SparkQA

SparkQA commented Jul 9, 2015

Test build #36888 has finished for PR 7129 at commit 8f6e327.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@ilganeli
Author

@rxin @JoshRosen Could I please get a review of this PR? Thanks!

@ilganeli
Author

@JoshRosen @rxin Hi folks - any chance of getting a review? Thanks!

@SparkQA

SparkQA commented Jul 16, 2015

Test build #37525 has finished for PR 7129 at commit e5768fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,

@SparkQA

SparkQA commented Jul 23, 2015

Test build #38259 has finished for PR 7129 at commit 1eca0e1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ilganeli
Author

@rxin @JoshRosen @davies Could I please get a review for this patch? Thanks.

@JoshRosen
Contributor

Realistically, I think it's unlikely that any of us will have time to review this until next week since we're all busy working on things for the 1.5.0 feature freeze.

@SparkQA

SparkQA commented Jul 31, 2015

Test build #39232 has finished for PR 7129 at commit bf63fce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • protected[this] case class SpilledFile(
    • protected[this] class SpillReader(spill: SpilledFile)
    • protected[this] class IteratorForPartition(partitionId: Int,
    • class MulticlassClassificationEvaluator (override val uid: String)
    • class NaiveBayes(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredictionCol):
    • class NaiveBayesModel(JavaModel):
    • class MulticlassClassificationEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol):
    • abstract class TernaryExpression extends Expression

@marmbrus
Contributor

marmbrus commented Sep 3, 2015

@ilganeli thanks a lot for working on this, but we've decided this probably isn't the right thing to do. Do you mind if we close this issue?

@ilganeli
Author

ilganeli commented Sep 3, 2015

That’s fine, go for it.

@asfgit asfgit closed this in 804a012 Sep 4, 2015