A library for reading and writing data from and to Redis with Apache Spark

Spark-Redis provides access to all of Redis' data structures - String, Hash, List, Set and Sorted Set - from Spark as RDDs. It also supports reading and writing DataFrames and using Spark SQL syntax.
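
As a rough illustration of the RDD access described above, the sketch below writes a few String keys and reads them back by key pattern. It assumes a Redis server listening on the default localhost:6379, the Spark-Redis jar on the classpath, and the `toRedisKV` / `fromRedisKV` methods that the `com.redislabs.provider.redis._` import adds to SparkContext. The `user:*` key names and the object name are made up for the example, and method names may differ between library versions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.redislabs.provider.redis._ // adds the Redis read/write methods to SparkContext

object SparkRedisSketch {
  def main(args: Array[String]): Unit = {
    // No Redis connection settings are set here, so the library's
    // defaults (assumed: localhost:6379, no password) are used.
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("spark-redis-sketch")
    val sc = new SparkContext(conf)

    // Write an RDD of (key, value) pairs as Redis Strings.
    val pairs = sc.parallelize(Seq(("user:1", "alice"), ("user:2", "bob")))
    sc.toRedisKV(pairs)

    // Read back every String key that matches the pattern as an RDD.
    val users = sc.fromRedisKV("user:*")
    users.collect().foreach { case (k, v) => println(s"$k -> $v") }

    sc.stop()
  }
}
```

The other data structures follow the same read/write pairing; check the branch that matches your Spark version for the exact method names.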

The library can be used with both stand-alone and clustered Redis databases. When used with Redis Cluster, Spark-Redis is aware of the cluster's partitioning scheme and adjusts in response to resharding and node failure events.
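
For readers unfamiliar with the partitioning scheme mentioned above: Redis Cluster assigns every key to one of 16384 hash slots, computed as CRC16 of the key modulo 16384, and each node owns a range of slots. The sketch below reproduces that calculation in plain Scala, including the hash-tag rule (only the text inside the first non-empty `{...}` is hashed). It is not taken from the Spark-Redis source; the object and method names are hypothetical.

```scala
object HashSlotSketch {
  // CRC16-CCITT (XMODEM variant): polynomial 0x1021, initial value 0.
  def crc16(bytes: Array[Byte]): Int = {
    var crc = 0
    for (b <- bytes) {
      crc ^= (b & 0xFF) << 8
      for (_ <- 0 until 8) {
        crc =
          if ((crc & 0x8000) != 0) ((crc << 1) ^ 0x1021) & 0xFFFF
          else (crc << 1) & 0xFFFF
      }
    }
    crc
  }

  // If the key contains a non-empty {...} section, only that part is hashed,
  // which lets related keys be forced into the same slot.
  def hashTag(key: String): String = {
    val open = key.indexOf('{')
    val close = if (open >= 0) key.indexOf('}', open + 1) else -1
    if (close > open + 1) key.substring(open + 1, close) else key
  }

  // Slot number in the range 0..16383.
  def slot(key: String): Int =
    crc16(hashTag(key).getBytes("UTF-8")) % 16384

  def main(args: Array[String]): Unit = {
    println(slot("123456789")) // prints 12739
    // Keys sharing a hash tag land in the same slot: prints true
    println(slot("{user1000}.following") == slot("{user1000}.followers"))
  }
}
```

Resharding moves slots between nodes, which is why a client that tracks the slot-to-node mapping has to refresh it when the cluster topology changes.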

Spark-Redis also provides Spark Streaming support.

Version compatibility and branching

The library has several branches, each corresponding to a different supported Spark version. For example, 'branch-2.3' works with any Spark 2.3.x version. The master branch contains recent development for the next release.

Spark-Redis | Spark | Redis   | Supported Scala Versions
2.3         | 2.3   | >=2.9.0 | 2.11
1.4         | 1.4   |         | 2.10

Known limitations

  • Java, Python and R API bindings are not provided at this time

Additional considerations

This library is a work in progress, so the API may change before the official release.



You're encouraged to contribute to the open source Spark-Redis project. There are two ways you can do so.

Issues

If you encounter an issue while using the Spark-Redis library, please report it in the project's issue tracker.

Pull request

Code contributions to the Spark-Redis project can be made using pull requests. To submit a pull request:

  1. Fork this project.
  2. Make and commit your changes.
  3. Submit your changes as a pull request.