
Method not found when creating time series add #53

Closed
geoHeil opened this issue Oct 13, 2018 · 3 comments

geoHeil commented Oct 13, 2018

val tsRdd = TimeSeriesRDD.fromDF(dataFrame = df)(isSorted = true, timeUnit = MILLISECONDS)

throws

scala> val tsRdd = TimeSeriesRDD.fromDF(dataFrame = cellFeed)(isSorted = true, timeUnit = MILLISECONDS)
java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.plans.physical.ClusteredDistribution$.apply$default$2()Lscala/Option;
  at com.twosigma.flint.timeseries.TimeSeriesStore$.isClustered(TimeSeriesStore.scala:149)
  at com.twosigma.flint.timeseries.TimeSeriesStore$.apply(TimeSeriesStore.scala:64)
  at com.twosigma.flint.timeseries.TimeSeriesRDD$.fromDFWithPartInfo(TimeSeriesRDD.scala:509)
  at com.twosigma.flint.timeseries.TimeSeriesRDD$.fromDF(TimeSeriesRDD.scala:304)
  ... 52 elided

on Spark 2.2, when trying to create the initial RDD.
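
For context: ClusteredDistribution$.apply$default$2() is the synthetic method the Scala compiler emits for the default value of a second constructor parameter. Spark 2.3 added such a defaulted parameter to ClusteredDistribution, so a Flint jar compiled against Spark 2.3 fails to link on a Spark 2.2 runtime. A simplified sketch of the signature change, paraphrased from Spark's catalyst partitioning code:

// Spark 2.2.x (simplified): a single constructor parameter
case class ClusteredDistribution(clustering: Seq[Expression]) extends Distribution

// Spark 2.3.x (simplified): a second, defaulted parameter was added; its default
// value is served by the synthetic ClusteredDistribution$.apply$default$2(),
// which is exactly the method missing on a 2.2 runtime
case class ClusteredDistribution(
    clustering: Seq[Expression],
    requiredNumPartitions: Option[Int] = None) extends Distribution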

Minimal reproducible sample:

import spark.implicits._
import com.twosigma.flint.timeseries.TimeSeriesRDD
import scala.concurrent.duration._

val df = Seq((1, 1, 1L), (2, 3, 1L), (1, 4, 2L), (2, 2, 2L)).toDF("id", "value", "time")
val tsRdd = TimeSeriesRDD.fromDF(dataFrame = df)(isSorted = true, timeUnit = MILLISECONDS)

Tested on Spark 2.2 via HDP 2.6.4.


geoHeil commented Oct 14, 2018

Vanilla Spark 2.3.2 works fine; vanilla Spark 2.2.1 also fails with the above exception.


geoHeil commented Oct 14, 2018

With a custom build such as https://github.com/geoHeil/flint/tree/flint-spark-2.2, the basic example works again with Spark 2.2.
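
A minimal sketch of what such a custom build amounts to, assuming the build pins Spark via a single version value (the exact setting names in Flint's build.sbt may differ):

// build.sbt (hypothetical excerpt): repoint the Spark dependency at a 2.2.x
// release and rebuild (e.g. with `sbt assembly`) so Flint is compiled against
// the Spark 2.2 binary API instead of 2.3
val sparkVersion = "2.2.1"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
)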

However, there are two unit test failures:

  • CSV reading (some timestamps did not match, possibly due to a timezone or locale difference; see the sketch after this list):

    - should correctly convert SQL TimestampType with default format *** FAILED *** (237 milliseconds)
    [info]   1199232000000000000 did not equal 1199228400000000000 (CSVSpec.scala:100)

    Note: this CSV test failure also occurs on the regular master branch.

  • Partition preserving: a test failed because head was called on an empty list.
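
The two timestamp values above differ by exactly 3,600,000,000,000 ns, i.e. one hour, which points at a timezone rather than a locale mismatch. A minimal sketch of one way to rule that out, pinning the JVM default timezone before the suite runs (assuming the test parses timestamps via the JVM default zone):

// Sketch: pin the JVM default timezone to UTC so timestamp parsing in the
// CSV tests does not depend on the host machine's zone
import java.util.TimeZone
TimeZone.setDefault(TimeZone.getTimeZone("UTC"))

Equivalently, the test JVM can be started with -Duser.timezone=UTC.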

icexelloss commented Oct 14, 2018 via email

geoHeil closed this as completed Oct 20, 2018.