
spark-redis support for spark 1.6 and 2.0 - as well as dataframes #30

Closed
cfregly opened this issue May 25, 2016 · 7 comments

cfregly commented May 25, 2016

is it just me, or does the spark-redis connector not work with spark 1.6?

assuming it won't work with spark 2.0, but haven't tested.

lastly, i was under the impression that this connector supports the Spark DataFrame API.

i've been trying to integrate Redis into my end-to-end recommendation pipeline (github repo: https://github.com/fluxcapacitor/pipeline/wiki)

but it's crashing all over the place.

any assistance would be appreciated. for now, i'm just using jedis directly, but that's not ideal.

please reach out to me directly @ chris@fregly.com if you'd prefer. we can circle back on this issue once we figure things out.

dvirsky (Contributor) commented May 25, 2016

Chris, can you attach some crash stacks? I'm running the connector with 1.6, as do others AFAIK.

Re DataFrames - it's actually in progress; expect it in a matter of days (there's an sql branch with the ongoing work).

cfregly (Author) commented May 29, 2016

hey @dvirsky !

thanks for the quick response.

i had already removed the Spark-Redis Connector from the PANCAKE STACK before i saw your response, so i had to re-introduce it to get you more info.

so i'm seeing the same compile-time error in both a Spark job and a Zeppelin notebook:

<console>:78: error: value fromRedisKV is not a member of org.apache.spark.SparkContext

it looks like the implicits aren't being attached to the SparkContext. i thought maybe it was just a notebook/zeppelin thing, but it's doing the same in both places.
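
For context, the wildcard import is what enriches SparkContext with these methods via an implicit conversion. Below is a minimal sketch of that pattern, assuming hypothetical names (RedisContextLike, toRedisContextLike) that stand in for the connector's actual internals:

import scala.language.implicitConversions
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object RedisImplicitsSketch {
  // Hypothetical enrichment class standing in for the connector's own.
  class RedisContextLike(sc: SparkContext) {
    def toRedisKV(kvs: RDD[(String, String)]): Unit =
      kvs.foreachPartition { _ => () }        // the real connector would write to Redis here
    def fromRedisKV(keyPattern: String): RDD[(String, String)] =
      sc.emptyRDD[(String, String)]           // the real connector would build a RedisKVRDD here
  }

  // `import com.redislabs.provider.redis._` brings a conversion like this into scope;
  // if the jar on the classpath is an older build that lacks fromRedisKV, the compiler
  // reports "value fromRedisKV is not a member of org.apache.spark.SparkContext".
  implicit def toRedisContextLike(sc: SparkContext): RedisContextLike =
    new RedisContextLike(sc)
}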

more about my env: Scala v2.10.5, Zeppelin v? (custom build from a while back), Spark 1.6.1.

i've got the sbt build.sbt and Zeppelin SPARK_PACKAGES configs set up properly, because the imports are valid and I can actually do writes as follows:

import com.redislabs.provider.redis._

sc.toRedisKV(sc.parallelize(("key1", "val1") :: Nil), ("127.0.0.1", 6379))
sc.toRedisKV(sc.parallelize(("key2", "val2") :: Nil), ("127.0.0.1", 6379))

i could probably debug this further, but there are a lot of letters in PANCAKE that require my attention at the moment! :)

if you help me out, i'll introduce an R and rename it the CRACK STACK - or cheesy equiv.

here are some links to the live zeppelin notebook - as well as the Spark App code - if you want to take a look (no guarantees on the live zeppelin notebook being available... dev/demo server)

and anytime you're ready with that Redis DataFrame code, let me know! :)

-chris

dvirsky (Contributor) commented May 29, 2016

the snippet you've posted looks like it targets an older version of spark-redis. care to try the newest one, 0.2? https://spark-packages.org/package/RedisLabs/spark-redis

cfregly (Author) commented May 29, 2016

sure, I will try here in a little bit.

to be clear, I'll need to build from source as there is no official 0.2 release that has been tagged in github or released to http://dl.bintray.com/spark-packages/maven/RedisLabs/spark-redis/

and the docs still reference 0.1.1.

I'm very bleeding-edge friendly, but just want to make sure I'm not missing something.
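
For reference, the sbt side of that setup typically looks something like the sketch below, using the spark-packages coordinates from the bintray URL above; the resolver name and exact versions are assumptions:

// build.sbt sketch - groupId/artifactId taken from the spark-packages/bintray URL above;
// resolver name and release version are assumptions.
scalaVersion := "2.10.5"

resolvers += "spark-packages" at "http://dl.bintray.com/spark-packages/maven/"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.6.1" % "provided",
  "RedisLabs"        %  "spark-redis" % "0.1.1"
)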

cfregly (Author) commented May 29, 2016

ok, that worked in both the notebook and spark job.

Write:

import com.redislabs.provider.redis._

sc.toRedisKV(sc.parallelize(("key1", "val1") :: Nil))
sc.toRedisKV(sc.parallelize(("key2", "val2") :: Nil))

Read:

val valuesRDD = sc.fromRedisKV("key1")
valuesRDD.collect()

...
valuesRDD: org.apache.spark.rdd.RDD[(String, String)] = RedisKVRDD[7] at RDD at RedisRDD.scala:19
res10: Array[(String, String)] = Array((key1,val1))
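
Note that the explicit ("127.0.0.1", 6379) tuple from the earlier snippet is gone; in this newer version the Redis endpoint is presumably picked up from SparkConf instead. A minimal sketch, assuming the redis.host / redis.port keys documented for later releases:

// Sketch: endpoint configured via SparkConf rather than per call.
// The redis.host / redis.port key names are taken from later spark-redis
// releases and are an assumption for 0.2.
import org.apache.spark.{SparkConf, SparkContext}
import com.redislabs.provider.redis._

val conf = new SparkConf()
  .setAppName("spark-redis-kv-demo")
  .set("redis.host", "127.0.0.1")
  .set("redis.port", "6379")

val sc = new SparkContext(conf)

sc.toRedisKV(sc.parallelize(("key1", "val1") :: Nil))
sc.fromRedisKV("key1").collect().foreach(println)   // expect (key1,val1)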

gkorland (Contributor) commented:

2.3.0 support was released and can be found here:
https://oss.sonatype.org/content/groups/staging/com/redislabs/spark-redis/2.3.0/
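
Going by that staging URL, the dependency coordinates would be roughly as sketched below; the staging resolver should only be needed until the artifact reaches Maven Central:

// build.sbt sketch - groupId/artifactId/version inferred from the staging URL above.
resolvers += "sonatype-staging" at "https://oss.sonatype.org/content/groups/staging/"

libraryDependencies += "com.redislabs" % "spark-redis" % "2.3.0"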

gkorland (Contributor) commented:

As for DataFrames/Datasets support, see #32
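
For anyone landing here later, the DataFrame path in the 2.x line looks roughly like the sketch below; the format string, option names, and spark.redis.* config keys are taken from the 2.x documentation and should be treated as assumptions in the context of #32:

// Sketch of DataFrame read/write with the 2.x spark-redis line.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("spark-redis-dataframe-demo")
  .config("spark.redis.host", "127.0.0.1")   // key names assumed from 2.x docs
  .config("spark.redis.port", "6379")
  .getOrCreate()

import spark.implicits._

val people = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")

// Write the DataFrame as Redis hashes under the "person" keyspace.
people.write
  .format("org.apache.spark.sql.redis")
  .option("table", "person")
  .mode("overwrite")
  .save()

// Read it back as a DataFrame.
val loaded = spark.read
  .format("org.apache.spark.sql.redis")
  .option("table", "person")
  .load()

loaded.show()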
