GraphFrames: DataFrame-based Graphs

This is a package for DataFrame-based graphs on top of Apache Spark. Users can write highly expressive queries by leveraging the DataFrame API, combined with a new API for motif finding. The user also benefits from DataFrame performance optimizations within the Spark SQL engine.
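As a rough illustration of how the DataFrame API and motif finding fit together, here is a minimal Python sketch. It assumes PySpark and the graphframes package are installed and that a SparkSession is passed in. The toy data, the constant names, and the helper `mutual_follows` are hypothetical and exist only for this example.

```python
# Toy data (hypothetical). GraphFrames expects an "id" column on vertices
# and "src"/"dst" columns on edges.
VERTICES = [("a", "Alice"), ("b", "Bob"), ("c", "Charlie")]
EDGES = [("a", "b", "follows"), ("b", "a", "follows"), ("b", "c", "follows")]

# Motif pattern: two vertices connected by edges in both directions.
MUTUAL_MOTIF = "(x)-[e1]->(y); (y)-[e2]->(x)"


def mutual_follows(spark):
    """Return a DataFrame of vertex pairs that follow each other."""
    # Imported lazily so this module still loads where graphframes is absent.
    from graphframes import GraphFrame

    vertices = spark.createDataFrame(VERTICES, ["id", "name"])
    edges = spark.createDataFrame(EDGES, ["src", "dst", "relationship"])
    graph = GraphFrame(vertices, edges)

    # find() returns an ordinary DataFrame whose columns are structs named
    # after the motif variables, so regular DataFrame operations apply.
    matches = graph.find(MUTUAL_MOTIF)
    return matches.selectExpr("x.name AS person", "y.name AS mutual")
```

Given an active SparkSession named `spark`, calling `mutual_follows(spark).show()` should list the Alice and Bob pair, since they are the only two vertices in the toy data with edges in both directions.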

You can find the user guide and API docs at https://graphframes.github.io/graphframes.

Building and running unit tests

To compile this project, run build/sbt assembly from the project home directory. This will also run the Scala unit tests.

To run the Python unit tests, run the run-tests.sh script from the python/ directory. You will need to set SPARK_HOME to your local Spark installation directory.

Spark version compatibility

This project is compatible with Spark 2.4+. However, significant speed improvements have been made to DataFrames in more recent versions of Spark, so you may see speedups from using the latest Spark version.
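If you want to guard against an unsupported Spark installation before building a graph, a small check along these lines may help. This is only a sketch: the helper `is_supported_spark` and the constant `MIN_SPARK` are hypothetical names, not part of GraphFrames, and the 2.4 minimum is taken from the statement above.

```python
# Minimum Spark version stated in this README, as (major, minor).
MIN_SPARK = (2, 4)


def is_supported_spark(version: str) -> bool:
    """Return True if a version string such as "3.3.0" is at least 2.4."""
    # Only major and minor are compared; patch level and suffixes are ignored.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= MIN_SPARK
```

The version string could come from `pyspark.__version__` or from `spark.version` on an active SparkSession.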

Contributing

GraphFrames is a collaborative effort among UC Berkeley, MIT, and Databricks. We welcome open source contributions as well!

Releases

See release notes.