Frameless is a proof-of-concept library for working with Spark using more expressive types. It consists of the following modules:
- dataset for more strongly typed Datasets (supports Spark 2.0.x)
- cats for using Spark with cats (supports Cats 0.7.x)
The Frameless project and contributors support the Typelevel Code of Conduct and want all its associated channels (e.g. GitHub, Gitter) to be a safe and friendly environment for contributing and learning.
- TypedDataset: Feature Overview
- Comparing TypedDatasets with Spark's Datasets
- Typed Encoders in Frameless
- Injection: Creating Custom Encoders
- Using Cats with RDDs
- Proof of Concept: TypedDataFrame
Benefits of using TypedDataset compared to Spark's vanilla Dataset:
- Typesafe column referencing and expressions
- Customizable, typesafe encoders
- Typesafe casting and projections
- Enhanced type signature for some built-in functions
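To make the first benefit concrete, here is a minimal sketch of typesafe column referencing. The `Apartment` case class, the `TypedColumnsSketch` object, and the `cityNames` helper are hypothetical names invented for this illustration. The sketch assumes that `TypedDataset.create` picks up an implicit `SQLContext` (later releases are understood to look for an implicit `SparkSession` instead, so both are placed in implicit scope here). Treat it as an illustration under those assumptions rather than the canonical usage.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}
import frameless.TypedDataset

// Hypothetical record type, used only for this illustration.
case class Apartment(city: String, surface: Int, price: Double)

object TypedColumnsSketch {
  // Returns the distinct city names, selected through a column
  // reference that the compiler checks against Apartment's fields.
  def cityNames(apartments: Seq[Apartment])(implicit spark: SparkSession): Set[String] = {
    // Assumption: this Frameless version resolves an implicit SQLContext.
    // The implicit SparkSession parameter covers versions that want that instead.
    implicit val sqlContext: SQLContext = spark.sqlContext

    val ds: TypedDataset[Apartment] = TypedDataset.create(apartments)

    // 'city is resolved at compile time; the result type is inferred as String.
    val cities: TypedDataset[String] = ds.select(ds('city))

    // Drop down to the underlying Spark Dataset to materialize the result.
    cities.dataset.collect().toSet
  }
}
```

With a plain Spark Dataset, a misspelled column name such as `"citi"` only surfaces as a runtime analysis error. In the sketch above, `ds('citi)` should fail to compile because `Apartment` has no such field.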
Frameless is compiled against Scala 2.11.x.
Note that Frameless is still getting off the ground, so breaking changes are likely for at least the next few versions.
To use Frameless, add the following dependencies as needed:
    resolvers += Resolver.sonatypeRepo("releases")

    val framelessVersion = "0.2.0"

    libraryDependencies ++= List(
      "io.github.adelbertc" %% "frameless-cats"    % framelessVersion,
      "io.github.adelbertc" %% "frameless-dataset" % framelessVersion
    )
We require at least one sign-off (thumbs-up, +1, or similar) to merge pull requests. The current maintainers (people who can merge pull requests) are:
Code is provided under the Apache 2.0 license, available at http://opensource.org/licenses/Apache-2.0 and in the LICENSE file. This is the same license that Spark uses.