Skip to content

nikelin/gremlin-scala

 
 

Repository files navigation

logo Build Status Join the chat at https://gitter.im/mpollmeier/gremlin-scala

Gremlin-Scala for Apache Tinkerpop 3

A wrapper to use Apache Tinkerpop3 - a JVM graph traversal library - from Scala.

  • Scala friendly function signatures, aiming to be close to the Scala collection library.
    • use standard Scala functions - no need to worry about how to implement things like java.util.function.BiPredicate
  • Beautiful DSL to create vertices and edges
  • Type safe traversals
  • Minimal runtime overhead - only allocates additional instances if absolutely necessary
  • Nothing is hidden away, you can always easily access the underlying Gremlin-Java objects if needed, e.g. to access graph db specifics things like indexes

Getting started

The examples project comes with working examples for different graph databases. Typically you just need to add a dependency on "com.michaelpollmeier" %% "gremlin-scala" % "SOME_VERSION" and one for the graph db of your choice to your build.sbt.

Using the sbt console

  • tl;dr: sbt gremlin-scala/console
  • start sbt in the root project
> projects
[info]     gremlin-scala
[info]     macros
[info]   * root
>
  • Next, change to the gremlin-scala project using project gremlin-scala
  • Finally, to test out the API in a REPL type console

Vertices and edges with type safe properties

import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
val graph = TinkerGraph.open.asScala

// Keys for properties which can later be used for type safe traversals
val Founded = Key[String]("founded")
val Distance = Key[Int]("distance")
val Name = Key[String]("name")
val Population = Key[Int]("population")

// create labelled vertex
val paris = graph + "Paris"

// create vertex with typed properties
val london = graph + ("London", Founded  "43 AD")

// create labelled edges 
paris --- "OneWayRoad" --> london
paris <-- "OtherWayAround" --- london
paris <-- "Eurostar" --> london

// create edge with typed properties
paris --- ("Eurostar", Distance  495) --> london

// type safe access to properties
london.property(Founded) //Property(founded->43 AD)
london.value2(Founded) //43 AD
paris.out("Eurostar").value(Founded).head //43 AD
paris.outE("Eurostar").value(Distance).head //495

More working examples in SchemaSpec.

Other means to access properties

v1.keys // Set(Key("name"), Key("age"))
v1.property(Name) // "marko"
v1.valueMap // Map("name" → "marko", "age" → 29)
v1.valueMap("name", "age") // Map("name" → "marko", "age" → 29)
v1.properties("name", "age").length // 2
v1.properties.length // 2

More working examples in ElementSpec

Compiler helps to eliminate invalid traversals

Gremlin-Scala aims to helps you at compile time as much as possible. Take this simple example:

import gremlin.scala._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
val graph = TinkerGraph.open.asScala
graph.V.outE.inV  //compiles
graph.V.outE.outE //does _not_ compile

In Gremlin-Groovy there's nothing stopping you to create the second traversal - it will explode at runtime, as outgoing edges do not have outgoing edges. In Gremlin-Scala this simply doesn't compile.

Type safe traversals

You can label any step using as(StepLabel) and the compiler will infer the correct types for you in the select step using an HList (a type safe list, i.e. the compiler knows the types of the elements of the list). In Gremlin-Java and Gremlin-Groovy you get a Map[String, Any], so you have to cast to the type you think it will be, which is ugly and error prone. For example:

// use :paste in Scala REPL
import gremlin.scala._
import shapeless._
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
def g = TinkerFactory.createModern.asScala

// select all labelled steps
g.V(1).as("a")
.outE.as("b")
.select
.toList
// returns a `(Vertex, Edge)` for each path


// select subset of labelled steps
val a = StepLabel[Vertex]()
val b = StepLabel[Edge]()
val c = StepLabel[Double]()

g.V(1)
.outE("created").as(b)
.value("weight").as(c)
.select((b, c)) //step labels parsed as tuple of any size
.head
// returns a `(Edge, Double)`

More working examples in SelectSpec. Kudos to shapeless and Scala's sophisticated type system that made this possible.

Marshalling vertices from/to case classes

You can save and load case classes as a vertex - implemented with a blackbox macro. The macro has first class support for ids and labels. Scala's Option types will be automatically unwrapped, i.e. a Some[A] will be stored as the value of type A in the database, or null if it's None. If we wouldn't unwrap it, the database would have to understand Scala's Option type itself.

If you want the case class to contain the vertex id (you don't have to!), then just annotate it with @id. You need to make sure that the id has the right type (depending on the graph database you use), otherwise you will get a class cast exception at runtime. The @label annotation is optional as well.

// this does _not_ work in a REPL
@label("example")
case class Example(@id id: Option[Int],
                   longValue: Long,
                   stringValue: Option[String])

object MyLogic {
  import gremlin.scala._
  import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

  val graph = TinkerGraph.open.asScala
  val example = Example(None, Long.MaxValue, Some("optional value"))
  val v = graph.addVertex(example)
  v.toCC[Example] // like example, but with id set
}

You can also define your own marshaller, if the macro generated one doesn't quite cut it. For that and more examples check out the MarshallableSpec.

Some more advanced traversals

Here are some examples of more complex traversals from the examples repo. If you want to run them yourself, check out the tinkergraph examples in there.

What is Die Hard's average rating?

// use :paste in Scala REPL
graph.V.has("movie", "name", "Die Hard")
  .inE("rated")
  .values("stars")
  .mean
  .head

Get the maximum number of movies a single user rated

// use :paste in Scala REPL
g.V.hasLabel("person")
  .flatMap(_.outE("rated").count)
  .max
  .head

What 80's action movies do 30-something programmers like? Group count the movies by their name and sort the group count map in decreasing order by value.

// use :paste in Scala REPL
g.V
  .`match`(
    __.as("a").hasLabel("movie"),
    __.as("a").out("hasGenre").has("name", "Action"),
    __.as("a").has("year", P.between(1980, 1990)),
    __.as("a").inE("rated").as("b"),
    __.as("b").has("stars", 5),
    __.as("b").outV().as("c"),
    __.as("c").out("hasOccupation").has("name", "programmer"),
    __.as("c").has("age", P.between(30, 40))
  )
  .select[Vertex]("a")
  .map(_.value[String]("name"))
  .groupCount()
  .order(Scope.local).by(Order.valueDecr)
  .limit(Scope.local, 10)
  .head

What is the most liked movie in each decade?

// use :paste in Scala REPL
g.V()
  .hasLabel("movie")
  .where(_.inE("rated").count().is(P.gt(10)))
  .group { v ⇒
    val year = v.value[Integer]("year")
    val decade = (year / 10)
    (decade * 10): Integer
  }
  .map { moviesByDecade ⇒
    val highestRatedByDecade = moviesByDecade.mapValues { movies ⇒
      movies.toList
        .sortBy { _.inE("rated").values("stars").mean().head }
        .reverse.head //get the movie with the highest mean rating
    }
    highestRatedByDecade.mapValues(_.value[String]("name"))
  }
  .order(Scope.local).by(Order.keyIncr)
  .head

Help - it's open source!

If you would like to help, here's a list of things that needs to be addressed:

Further reading

For more information about Gremlin see the Gremlin docs and the Gremlin users mailinglist. Please note that while Gremlin-Scala is very close to the original Gremlin, there are differences to Gremlin-Groovy - don't be afraid, they hopefully all make sense to a Scala developer ;)

Random links:

About

Scala wrapper for Apache TinkerPop 3 Graph DSL

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Scala 98.3%
  • Other 1.7%