An open-source graph database

Cayley

Cayley is an open-source graph database inspired by the graph database behind Freebase and Google's Knowledge Graph.

Its goal is to be a part of the developer's toolbox wherever Linked Data and graph-shaped data (semantic webs, social networks, etc.) are concerned.

Features

  • Written in Go
  • Easy to get running (3 or 4 commands, below)
  • RESTful API
    • or a REPL if you prefer
  • Built-in query editor and visualizer
  • Multiple query languages:
    • JavaScript, with a Gremlin-inspired* graph object.
    • (simplified) MQL, for Freebase fans
  • Plays well with multiple backend stores
  • Modular design; easy to extend with new languages and backends
  • Good test coverage
  • Speed, where possible.

Rough performance testing shows that, on consumer hardware and an average disk, 134 million quads in LevelDB are no problem, and a multi-hop intersection query (films starring X and Y) takes about 150 ms.

* Note that while it's not exactly Gremlin, it certainly takes inspiration from that API. For this flavor, see the documentation.

Getting Started

Grab the latest release binary and extract it wherever you like.

If you prefer to build from source, see the wiki page "How to start hacking on Cayley".

cd to the directory and give it a quick test with:

./cayley repl --dbpath=testdata.nq

You should see a cayley> REPL prompt. Go ahead and give it a try:

// Simple math
cayley> 2 + 2

// JavaScript syntax
cayley> x = 2 * 8
cayley> x

// See all the entities in this small follow graph.
cayley> graph.Vertex().All()

// See only dani.
cayley> graph.Vertex("dani").All()

// See who dani follows.
cayley> graph.Vertex("dani").Out("follows").All()

Sample Data

For somewhat more interesting data, a sample of 30k movies from Freebase comes in the checkout.

./cayley repl --dbpath=30kmoviedata.nq.gz

To run the web frontend, replace the "repl" command with "http":

./cayley http --dbpath=30kmoviedata.nq.gz

Then visit port 64210 on your machine, commonly http://localhost:64210

Running queries

The default environment is based on Gremlin and is simply a JavaScript environment. If you can write jQuery, you can query a graph.

You'll notice we have a special object, graph or g, which is how you can interact with the graph.

The simplest query merely returns a single vertex. Using the 30kmoviedata.nq.gz dataset from above, let's walk through some simple queries:

// Query all vertices in the graph, limit to the first 5 vertices found.
graph.Vertex().GetLimit(5)

// Start with only one vertex, the literal name "Humphrey Bogart", and retrieve it.
graph.Vertex("Humphrey Bogart").All()

// `g` and `V` are synonyms for `graph` and `Vertex` respectively, as they are quite common.
g.V("Humphrey Bogart").All()

// "Humphrey Bogart" is a name, but not an entity. Let's find the entities with this name in our dataset.
// Follow links that are pointing In to our "Humphrey Bogart" node with the predicate "name".
g.V("Humphrey Bogart").In("name").All()

// Notice that "name" is a generic predicate in our dataset.
// Starting with a movie gives a similar effect.
g.V("Casablanca").In("name").All()

// Relatedly, we can ask the reverse; all ids with the name "Casablanca"
g.V().Has("name", "Casablanca").All()

You may start to notice a pattern here: with Gremlin, the query lines tend to:

Start somewhere in the graph | Follow a path | Run the query with "All" or "GetLimit"

g.V("Casablanca") | .In("name") | .All()
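
To make the three stages concrete, here is a toy in-memory model of the pattern in plain JavaScript. It is only a sketch and not how Cayley is implemented: the `quads` array, the `Path` class, the entity ids such as `/en/casablanca`, and the shape of the returned objects are all assumptions made up for illustration.

```javascript
// Toy model of "start | follow | run". NOT Cayley's implementation.
// The sample quads and ids below are hypothetical.
const quads = [
  { subject: "/en/casablanca", predicate: "name", object: "Casablanca" },
  { subject: "/en/bogart", predicate: "name", object: "Humphrey Bogart" },
  { subject: "/en/casablanca", predicate: "starring", object: "/en/bogart" },
];

class Path {
  constructor(nodes) {
    this.nodes = nodes; // the current set of node ids
  }

  // Follow links forward: from subject to object.
  Out(predicate) {
    const next = quads
      .filter((q) => q.predicate === predicate && this.nodes.includes(q.subject))
      .map((q) => q.object);
    return new Path([...new Set(next)]);
  }

  // Follow links backward: from object to subject.
  In(predicate) {
    const next = quads
      .filter((q) => q.predicate === predicate && this.nodes.includes(q.object))
      .map((q) => q.subject);
    return new Path([...new Set(next)]);
  }

  // Run the query: materialize the current node set.
  All() {
    return this.nodes.map((id) => ({ id }));
  }
}

const g = {
  // Start somewhere: at the given ids, or at every node if none are given.
  V(...ids) {
    if (ids.length > 0) return new Path(ids);
    const all = quads.flatMap((q) => [q.subject, q.object]);
    return new Path([...new Set(all)]);
  },
};

// Start at the literal "Casablanca", follow "name" inward, run the query.
const result = g.V("Casablanca").In("name").All();
console.log(result);
```

Nothing is evaluated into a final answer until `All()` is called; each earlier step just narrows or moves the current set of nodes, which is why the stages chain so naturally.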

And these pipelines continue...

// Let's get the list of actors in the film
g.V().Has("name","Casablanca")
  .Out("/film/film/starring").Out("/film/performance/actor")
  .Out("name").All()

// But this is starting to get long. Let's use a morphism -- a pre-defined path stored in a variable -- as our linkage

var filmToActor = g.Morphism().Out("/film/film/starring").Out("/film/performance/actor")

g.V().Has("name", "Casablanca").Follow(filmToActor).Out("name").All()
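
One way to think about a morphism is as a recorded list of steps that can be replayed from any starting point. The plain-JavaScript sketch below models that idea under stated assumptions: the `edges` data, the `/m/perf1` id, and the `out` and `morphism` helpers are hypothetical and are not part of Cayley's API.

```javascript
// Toy model of a morphism as a reusable path. NOT Cayley's implementation.
// Each edge is [subject, predicate, object]; the data is made up.
const edges = [
  ["/en/casablanca", "name", "Casablanca"],
  ["/en/casablanca", "/film/film/starring", "/m/perf1"],
  ["/m/perf1", "/film/performance/actor", "/en/bogart"],
  ["/en/bogart", "name", "Humphrey Bogart"],
];

// One forward hop along `predicate` from every node in `nodes`.
function out(nodes, predicate) {
  return edges
    .filter(([s, p]) => p === predicate && nodes.includes(s))
    .map(([, , o]) => o);
}

// Record a sequence of predicates; return a function that replays
// the hops in order from whatever starting nodes it is given.
function morphism(...predicates) {
  return (nodes) => predicates.reduce((current, p) => out(current, p), nodes);
}

// Define the path once, then reuse it wherever it is needed.
const filmToActor = morphism("/film/film/starring", "/film/performance/actor");

const actorNames = out(filmToActor(["/en/casablanca"]), "name");
console.log(actorNames);
```

Because the path is stored in a variable and not tied to a starting vertex, the same `filmToActor` value can be applied to any set of films.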

There's more in the JavaScript API Documentation, but that should give you a feel for how to walk around the graph.

Disclaimer

Not a Google project, but created and maintained by a Googler, with permission from and assignment to Google, under the Apache License, version 2.0.

Contact
