The Official Couchbase Spark Connector
Latest commit 9af686c (Jan 20, 2017) by @daschl: SPARKC-71: Pre-filter loaded dataframe.
Motivation
----------
For some reason, if the dataframe is not filtered down to the selected
columns before it is turned into an RDD and sent back to Spark, the RDD
does not surface all of the schema columns at the user level, which
leads to inconsistent behavior when only a couple of fields are
selected.

This might be a bug in Spark, but I'm not sure at this point (or if
it's just the way it's supposed to work).

Modifications
-------------
Preprocess the dataframe with the selected field columns while the
right schema is in scope, so that by the time it is passed up the stack
everything is already in the right place.

Result
------
Projections work.

Change-Id: Ida7394c0394a459410cf4d643582540422f9fdbd
Reviewed-on: http://review.couchbase.org/72273
Reviewed-by: Michael Nitschinger <michael@nitschinger.at>
Tested-by: Michael Nitschinger <michael@nitschinger.at>

README.md

Couchbase Spark Connector

A library to integrate Couchbase Server with Spark in order to use it as a data source and target in various ways.

Linking

You can link against this library (for Spark 2.0) in your program at the following coordinates:

groupId: com.couchbase.client
artifactId: spark-connector_2.11
version: 2.0.0

If you are using SBT:

libraryDependencies += "com.couchbase.client" %% "spark-connector" % "2.0.0"
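
For Maven-based builds, the same coordinates translate into a standard dependency block (a sketch assembled from the groupId, artifactId, and version listed above):

```xml
<dependency>
  <groupId>com.couchbase.client</groupId>
  <artifactId>spark-connector_2.11</artifactId>
  <version>2.0.0</version>
</dependency>
```

Note that the `_2.11` suffix pins the Scala binary version; the SBT `%%` operator appends it automatically.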

Documentation

The official documentation, including a quickstart guide, can be found here.

Version Compatibility

Each minor release targets a specific Spark version and is branched away once released. Couchbase maintains bugfix releases for these branches where appropriate; see Maven Central or Spark Packages for releases to download.

| Connector | Spark |
|-----------|-------|
| 2.0.x     | 2.0   |
| 1.2.x     | 1.6   |
| 1.1.x     | 1.5   |
| 1.0.x     | 1.4   |

License

Copyright 2015, 2016 Couchbase Inc.

Licensed under the Apache License, Version 2.0.

See the Apache 2.0 license.