Conversation

@cloud-fan
Contributor

No description provided.

Contributor Author

Java doesn't support declaration-site contravariance, so the contravariant type parameter in TypedColumn makes this typed aggregate API hard to use from the Java side.

cc @marmbrus @rxin
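
To illustrate the variance problem above without depending on Spark, here is a minimal sketch. The `TypedColumn` class below is a hypothetical stand-in, not Spark's class, and `aggStrict` and `aggWildcard` are made-up names. It assumes that a Scala `[-T, U]` parameter list looks like a plain invariant `<T, U>` to Java callers, so the only way to recover contravariance in Java is a use-site `? super T` wildcard in each method signature.

```java
import java.util.function.Function;

public class VarianceSketch {
    // Hypothetical stand-in for a column that consumes a T and produces a U.
    // Java has no declaration-site variance, so T is invariant here.
    public static final class TypedColumn<T, U> {
        private final Function<T, U> fn;

        public TypedColumn(Function<T, U> fn) {
            this.fn = fn;
        }

        public U eval(T input) {
            return fn.apply(input);
        }
    }

    // Invariant signature: only a TypedColumn<String, U> is accepted.
    public static <U> U aggStrict(String row, TypedColumn<String, U> col) {
        return col.eval(row);
    }

    // Use-site contravariance: any column that consumes a supertype of T is accepted.
    public static <T, U> U aggWildcard(T row, TypedColumn<? super T, U> col) {
        return col.eval(row);
    }

    public static void main(String[] args) {
        // A column defined over Object can safely consume a String.
        TypedColumn<Object, Integer> textLength =
            new TypedColumn<>(o -> o.toString().length());

        // aggStrict("spark", textLength);  // would not compile: TypedColumn<Object, Integer>
        //                                  // is not a TypedColumn<String, Integer>

        Integer length = aggWildcard("spark", textLength);
        System.out.println(length);
    }
}
```

In Scala the equivalent of `aggStrict` would accept `textLength` directly because of the `-T` annotation. In Java every call site or signature has to opt in with a wildcard, which is the friction this comment describes.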

Contributor

Yeah, this is unfortunate. I think the only thing we can do is use the DataFrame version of agg and then cast afterwards:

    Dataset<Tuple4<String, Integer, Long, Long>> agged2 = grouped.agg(
            new IntSumOf().toColumn(e.INT(), e.INT()),
            expr("sum(_2)"),
            count("*")).as(e.tuple(e.STRING(), e.INT(), e.LONG(), e.LONG()));

@SparkQA

SparkQA commented Nov 10, 2015

Test build #45511 has finished for PR 9591 at commit 479d8c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable
      * case class Record(data: ByteBuffer, time: Long, promise: Promise[WriteAheadLogRecordHandle])

@SparkQA

SparkQA commented Nov 10, 2015

Test build #45523 has finished for PR 9591 at commit 4e0c686.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

Contributor

We should do something about this. I think that if we rename the object to Encoders, Scala will generate static forwarders and we can just call Encoders.STRING().

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45595 has finished for PR 9591 at commit ae55976.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@cloud-fan
Contributor Author

retest this please.

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45623 has finished for PR 9591 at commit ae55976.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45632 has finished for PR 9591 at commit 3806107.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@cloud-fan
Contributor Author

retest this please

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45635 has finished for PR 9591 at commit 3806107.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@SparkQA

SparkQA commented Nov 11, 2015

Test build #2045 has finished for PR 9591 at commit 3806107.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@SparkQA

SparkQA commented Nov 11, 2015

Test build #2046 has finished for PR 9591 at commit 3806107.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45709 has finished for PR 9591 at commit b6de790.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@cloud-fan
Contributor Author

retest this please.

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45723 has finished for PR 9591 at commit b6de790.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@cloud-fan
Contributor Author

retest this please.

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45740 has finished for PR 9591 at commit b6de790.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

Contributor

Maybe we should just move this to Queryable with docs.

Contributor

Oh sorry, maybe we need to make a GroupQueryable?

Contributor

@rxin I think we need to expose this so that it works for Java users, while still getting nice type inference in Scala.

Contributor Author

We need to provide a tEncoder here, which is not the case for GroupedData.

Contributor

Okay, we still need to add Scaladoc then.

@SparkQA

SparkQA commented Nov 13, 2015

Test build #45821 has finished for PR 9591 at commit 7c9eb43.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * class TypedColumn[-T, U](
      * abstract class Aggregator[-A, B, C] extends Serializable
      * class JavaTrackStateDStream[KeyType, ValueType, StateType, EmittedType](

@SparkQA

SparkQA commented Nov 14, 2015

Test build #45907 has finished for PR 9591 at commit ae752fd.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@cloud-fan
Contributor Author

retest this please.

@SparkQA

SparkQA commented Nov 14, 2015

Test build #45925 has finished for PR 9591 at commit ae752fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      * abstract class Aggregator[-A, B, C] extends Serializable

@marmbrus
Contributor

Thanks! Merging to master and 1.6.

@asfgit closed this in fd14936 on Nov 16, 2015
asfgit pushed a commit that referenced this pull request Nov 16, 2015
Author: Wenchen Fan <wenchen@databricks.com>

Closes #9591 from cloud-fan/agg-test.

(cherry picked from commit fd14936)
Signed-off-by: Michael Armbrust <michael@databricks.com>