illyfrancis edited this page Jul 26, 2013 · 6 revisions

Intro to Neo4j by Emil Eifrem

Core abstractions

  • Nodes
  • Relationships between nodes
  • Properties on both

Example: The Matrix

  • First identify the entities
    • node id (mandatory attribute on a node)
    • node attributes
  • A node (Person) knows (relationship) other nodes
    • Relationships also have properties (e.g. the "knows" has age = 3 days)
    • can add arbitrary relationships, e.g. "coded_by"

Code 1 : building a node space

GraphDatabaseService graphDb = ... // get factory (via DI or whatever)

Transaction tx = graphDb.beginTx();  // this is important!!! 

// create Thomas "Neo" Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty("name", "Thomas Anderson");
mrAnderson.setProperty("age", 29);

// create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty("name", "Morpheus");
morpheus.setProperty("rank", "Captain");
morpheus.setProperty("occupation", "Total bad ass");

// create a relationship representing that they know each other
mrAnderson.createRelationshipTo(morpheus, RelTypes.KNOWS);

// create Trinity, Cypher, Agent Smith, Architect similarly

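tx.success();  // mark the transaction as successful
tx.finish();   // commits (or rolls back if success() was never called); normally done in a finally block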

Code 1b : Defining RelationshipTypes

// In package org.neo4j.graphdb
public interface RelationshipType {
  String name();
}

// example of how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType {
  private final String name;
  MyDynamicRelType(String name) { this.name = name; }
  public String name() { return this.name; }
}

There's a standard implementation in the API, by the way (DynamicRelationshipType).

Dynamic relationship types are useful when you don't know the types up front. If you do know them, use a static relationship type.

// Example of a static RelationshipType
enum MyStaticRelTypes implements RelationshipType {
  KNOWS,
  KNOWS_FOR,
}
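To see how the two styles sit side by side, here is a minimal self-contained sketch. The `RelationshipType` interface is re-declared locally as a stand-in so the snippet compiles without Neo4j on the classpath, and `RelTypeDemo`, `StaticTypes`, `DynamicType` and `sameType` are hypothetical names, not part of the Neo4j API. It assumes types are matched by name, which is what lets an enum constant and a runtime-built type refer to the same relationship type.

```java
// Stand-in for org.neo4j.graphdb.RelationshipType so the sketch compiles alone.
interface RelationshipType {
    String name();
}

class RelTypeDemo {
    // Static types: known at compile time, so typos are caught by the compiler.
    // Enum.name() already satisfies the interface method.
    enum StaticTypes implements RelationshipType { KNOWS, LOVES }

    // Dynamic type: built from a string that is only known at runtime.
    static final class DynamicType implements RelationshipType {
        private final String name;
        DynamicType(String name) { this.name = name; }
        @Override public String name() { return name; }
    }

    // Assumption of this sketch: two types are the same if their names match.
    static boolean sameType(RelationshipType a, RelationshipType b) {
        return a.name().equals(b.name());
    }

    public static void main(String[] args) {
        RelationshipType fromEnum = StaticTypes.KNOWS;
        RelationshipType fromString = new DynamicType("KNOWS");
        System.out.println(sameType(fromEnum, fromString)); // prints "true"
    }
}
```

The enum gives compile-time checking for free; the dynamic class is only worth the extra code when type names come from data or user input.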

Graph database is whiteboard friendly (14:30)

The Graph DB model : traversal

  • Traverser framework for high-performance traversing across the node space

Code 2 : Traversing a node space (20:00)

// instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
  Traverser.Order.BREADTH_FIRST,           // (1) can be depth etc
  StopEvaluator.END_OF_GRAPH,              // (3) when to stop
  ReturnableEvaluator.ALL_BUT_START_NODE,  // (4) what to include
  RelTypes.KNOWS,                          // (2) a filter
  Direction.OUTGOING );                    // (2.1)

// Traverse the node space and print out the result
System.out.println("Mr Anderson's friends");
for (Node friend : friendsTraverser) {
  System.out.printf("At depth %d => %s%n",
    friendsTraverser.currentPosition().depth(),
    friend.getProperty("name"));
}
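What the traverser does can be approximated in plain Java. The following is a rough sketch of a breadth-first walk over an in-memory adjacency map, not the Neo4j traversal framework itself. `BreadthFirstSketch`, `sampleGraph` and `friendsByDepth` are hypothetical names, and the KNOWS edges in `sampleGraph` are assumed sample data (the notes only state that Mr Anderson knows Morpheus).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class BreadthFirstSketch {
    // Assumed sample data: outgoing KNOWS edges keyed by person name.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> knows = new HashMap<>();
        knows.put("Thomas Anderson", Arrays.asList("Morpheus", "Trinity"));
        knows.put("Morpheus", Arrays.asList("Trinity", "Cypher"));
        knows.put("Cypher", Arrays.asList("Agent Smith"));
        knows.put("Agent Smith", Arrays.asList("Architect"));
        return knows;
    }

    /**
     * Walks outgoing edges breadth first until the graph is exhausted and
     * returns name -> depth for every node reached except the start node.
     */
    static Map<String, Integer> friendsByDepth(Map<String, List<String>> knows, String start) {
        Map<String, Integer> depthOf = new LinkedHashMap<>(); // insertion order = visit order
        Queue<String> queue = new ArrayDeque<>();
        depthOf.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String friend : knows.getOrDefault(current, Collections.emptyList())) {
                if (!depthOf.containsKey(friend)) { // visit each node only once
                    depthOf.put(friend, depthOf.get(current) + 1);
                    queue.add(friend);
                }
            }
        }
        depthOf.remove(start); // same idea as ALL_BUT_START_NODE
        return depthOf;
    }

    public static void main(String[] args) {
        System.out.println("Mr Anderson's friends");
        friendsByDepth(sampleGraph(), "Thomas Anderson").forEach(
            (name, depth) -> System.out.printf("At depth %d => %s%n", depth, name));
    }
}
```

With the assumed edges, Morpheus and Trinity come out at depth 1, Cypher at depth 2, Agent Smith at depth 3 and the Architect at depth 4.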

Example: Friends in love? (24:30)

// create a traverser that returns all "friends in love"
Traverser loveTraverser = mrAnderson.traverse(
  Traverser.Order.BREADTH_FIRST,
  StopEvaluator.END_OF_GRAPH,

  // select everyone that has an outgoing relationship of type LOVES
  new ReturnableEvaluator() {
    public boolean isReturnableNode(TraversalPosition pos) {
      return pos.currentNode().hasRelationship(
        RelTypes.LOVES, Direction.OUTGOING);
    }
  },

  RelTypes.KNOWS,
  Direction.OUTGOING );

Code 3 : Custom traverser

// Traverse the node space and print out the result
System.out.println("Who's a lover");
for (Node person : loveTraverser) {
  System.out.printf("At depth %d => %s%n",
    loveTraverser.currentPosition().depth(),
    person.getProperty("name"));
}
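The key idea in the custom evaluator is that "which nodes are walked" and "which nodes are returned" are separate decisions. A minimal plain-Java sketch of that separation follows, using a `Predicate` in place of the evaluator. `ReturnFilterSketch` and `traverse` are hypothetical names, the graph is assumed sample data, and none of this is the Neo4j API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Predicate;

class ReturnFilterSketch {
    /** Breadth-first walk over `knows`; only nodes accepted by `returnable` are returned. */
    static List<String> traverse(Map<String, List<String>> knows, String start,
                                 Predicate<String> returnable) {
        List<String> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (returnable.test(current)) {
                result.add(current);
            }
            // Expansion does not depend on the return filter:
            // a node that is not returned is still walked through.
            for (String next : knows.getOrDefault(current, Collections.emptyList())) {
                if (visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> knows = new HashMap<>();
        knows.put("Thomas Anderson", Arrays.asList("Morpheus", "Trinity"));
        knows.put("Morpheus", Arrays.asList("Cypher"));
        // Assumed sample data: people with an outgoing LOVES relationship.
        Set<String> hasOutgoingLoves = new HashSet<>(Arrays.asList("Trinity"));
        System.out.println(traverse(knows, "Thomas Anderson", hasOutgoingLoves::contains));
    }
}
```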

Bonus code: domain model (28:00)

  • How do you implement your domain model?

  • Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive:

    class PersonImpl implements Person {
      private final Node underlyingNode;
      PersonImpl(Node node) { this.underlyingNode = node; }

      public String getName() {
        return (String) this.underlyingNode.getProperty("name");
      }
      public void setName(String name) {
        this.underlyingNode.setProperty("name", name);
      }
    }
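A runnable variant of the same pattern, as a sketch only: the `Node` interface below is a two-method stand-in for the real Neo4j interface, and `MapNode` is a hypothetical map-backed node so the example runs without a database. The `Person` interface is assumed from the `implements Person` clause in the notes.

```java
import java.util.HashMap;
import java.util.Map;

class DelegatorSketch {
    // Stand-in for the two Node methods used here; not the real Neo4j interface.
    interface Node {
        Object getProperty(String key);
        void setProperty(String key, Object value);
    }

    // Hypothetical map-backed node so the example runs without a database.
    static final class MapNode implements Node {
        private final Map<String, Object> props = new HashMap<>();
        @Override public Object getProperty(String key) { return props.get(key); }
        @Override public void setProperty(String key, Object value) { props.put(key, value); }
    }

    interface Person {
        String getName();
        void setName(String name);
    }

    // The domain object keeps no state of its own; every call is delegated.
    static final class PersonImpl implements Person {
        private final Node underlyingNode;
        PersonImpl(Node node) { this.underlyingNode = node; }
        @Override public String getName() { return (String) underlyingNode.getProperty("name"); }
        @Override public void setName(String name) { underlyingNode.setProperty("name", name); }
    }

    public static void main(String[] args) {
        Node node = new MapNode();
        Person person = new PersonImpl(node);
        person.setName("Trinity");
        // The write went straight through to the underlying node.
        System.out.println(node.getProperty("name")); // prints "Trinity"
    }
}
```

Because the wrapper holds no copy of the data, two wrappers around the same node always agree, at the cost of one small accessor pair per property.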

Domain layer frameworks (31:40)

  • The problem is that there's a lot of "cruft" to write
  • Some frameworks that help:
  • Qi4j (www.qi4j.org)
    • framework for doing DDD in pure Java 5
    • Defines Entities / Associations / Properties
      • == Nodes / Rel's / Properties
    • Neo4j is an "EntityStore" backend
  • Jo4neo
  • Spring Data Neo4j

Spring Data Neo4j example

  • SDN is "JPA without the suck" or "JPA for graph dbs"
  • How to declare a simple node-backed POJO

@NodeEntity
class Person {
  @Indexed
  private String name;
  public String getName() { 
    return name; 
  }
  public void setName(String name) {
    this.name = name;
  }
}

What are some use cases?

  • Social data
  • Spatial data (36:01)
    • Nodes - (name, lat, longitude) (e.g. Omni Hotel, lat=33948, long=1937823)
    • Relationship - (ROAD with property length = 3 miles)
    • Then work out the shortest path between two places
  • Social AND spatial data
  • Financial data
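For the spatial use case, "work out the shortest path" over ROAD relationships with a length property is a weighted shortest-path problem. A minimal sketch using Dijkstra's algorithm on an in-memory road map follows. The talk notes do not say which algorithm Neo4j uses, so this is only one plausible way to do it. `ShortestRoadSketch`, `Road`, `addRoad` and `shortestMiles` are hypothetical names, and every place other than the Omni Hotel, as well as the distances, are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class ShortestRoadSketch {
    /** A road segment leading to `to`, with its length in miles. */
    static final class Road {
        final String to;
        final double miles;
        Road(String to, double miles) { this.to = to; this.miles = miles; }
    }

    /** Priority-queue entry: a place and its distance from the origin when enqueued. */
    static final class Entry {
        final String place;
        final double dist;
        Entry(String place, double dist) { this.place = place; this.dist = dist; }
    }

    /** Adds a two-way road between a and b. */
    static void addRoad(Map<String, List<Road>> roads, String a, String b, double miles) {
        roads.computeIfAbsent(a, k -> new ArrayList<>()).add(new Road(b, miles));
        roads.computeIfAbsent(b, k -> new ArrayList<>()).add(new Road(a, miles));
    }

    /** Dijkstra's algorithm; returns total miles, or infinity if `to` is unreachable. */
    static double shortestMiles(Map<String, List<Road>> roads, String from, String to) {
        Map<String, Double> best = new HashMap<>();
        PriorityQueue<Entry> queue =
            new PriorityQueue<>((x, y) -> Double.compare(x.dist, y.dist));
        best.put(from, 0.0);
        queue.add(new Entry(from, 0.0));
        while (!queue.isEmpty()) {
            Entry e = queue.poll();
            if (e.dist > best.getOrDefault(e.place, Double.POSITIVE_INFINITY)) {
                continue; // stale entry: a shorter route was found after it was enqueued
            }
            if (e.place.equals(to)) {
                return e.dist;
            }
            for (Road r : roads.getOrDefault(e.place, Collections.emptyList())) {
                double candidate = e.dist + r.miles;
                if (candidate < best.getOrDefault(r.to, Double.POSITIVE_INFINITY)) {
                    best.put(r.to, candidate);
                    queue.add(new Entry(r.to, candidate));
                }
            }
        }
        return Double.POSITIVE_INFINITY;
    }

    public static void main(String[] args) {
        Map<String, List<Road>> roads = new HashMap<>();
        addRoad(roads, "Omni Hotel", "Station", 3.0);
        addRoad(roads, "Station", "Museum", 2.0);
        addRoad(roads, "Omni Hotel", "Museum", 7.0);
        System.out.println(shortestMiles(roads, "Omni Hotel", "Museum")); // prints 5.0
    }
}
```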

Neo4j System Characteristics

  • Disk-based
    • Native graph storage engine with custom binary on-disk format
  • Transactional
    • JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, multi version concurrency control (MVCC) etc
  • Scales up
    • Many billions of nodes/rels/props on single JVM
  • Robust

Performance?

For this example, assume

  • ~1k persons
  • Avg 50 friends per person
  • pathExists(a,b) limited to depth 4
  • Two backends (MySQL and Neo4j), both warmed up
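To make the benchmark query concrete, here is a minimal sketch of a depth-limited `pathExists` as a level-by-level breadth-first search over an in-memory friend map. It is not the code used in the talk's benchmark; `PathExistsSketch` and the sample chain are hypothetical. With an average of 50 friends per person, a walk that does not deduplicate could follow up to 50^4 = 6,250,000 paths at depth 4, which is why the `visited` set matters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PathExistsSketch {
    /** True if `b` can be reached from `a` in at most `maxDepth` hops. */
    static boolean pathExists(Map<String, List<String>> friends,
                              String a, String b, int maxDepth) {
        if (a.equals(b)) {
            return true;
        }
        Set<String> visited = new HashSet<>();
        visited.add(a);
        List<String> frontier = Collections.singletonList(a);
        // Expand one level per iteration so the depth limit is exact.
        for (int depth = 1; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String person : frontier) {
                for (String friend : friends.getOrDefault(person, Collections.emptyList())) {
                    if (friend.equals(b)) {
                        return true;
                    }
                    if (visited.add(friend)) { // skip people already seen
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical chain a -> b -> c -> d -> e -> f.
        Map<String, List<String>> friends = new HashMap<>();
        friends.put("a", Arrays.asList("b"));
        friends.put("b", Arrays.asList("c"));
        friends.put("c", Arrays.asList("d"));
        friends.put("d", Arrays.asList("e"));
        friends.put("e", Arrays.asList("f"));
        System.out.println(pathExists(friends, "a", "e", 4)); // true: exactly 4 hops
        System.out.println(pathExists(friends, "a", "f", 4)); // false: needs 5 hops
    }
}
```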

Social network pathExists()

Pros and cons compared to RDBMS

Pros

  • No O/R impedance mismatch
  • Can easily evolve schemas
  • Can represent semi-structured info
  • Can represent graphs/networks (with performance)

Cons

  • Lacks tool and framework support
  • Few other implementations => potential lock-in
  • No support for ad-hoc queries (no longer quite true: there are now query languages such as Cypher)

Query lang

  • Cypher - the new graph query language released in 1.4
    • uses pattern matching
  • Example

START user = node:people-index(name = "John")
MATCH (user)-[:FRIEND_OF]->()-[:FRIEND_OF]->(fof)
RETURN fof
  • Example - with filter

START user = node(1,2,3)
MATCH (user)-[:FRIEND_OF]->(friend)
WHERE friend.age > 18
RETURN friend.name
  • Gremlin - a Groovy DSL for graphs
  • Example

start.outE('FRIEND_OF').inV{ it.age > 18 }.name

Language bindings

  • Jython, Scala, Erlang, C#, Ruby, PHP, Groovy etc

There's more

  • Neo4j HA - Clustering mode with master-slave replication
  • Neo4j Server with RESTful API
  • Neo4j Spatial for "all nodes within 20km"
  • Pretty webadmin including Gremlin and Cypher support
  • and more...

Three editions

  • Neo4j Community (GPL)
  • Neo4j Advanced (AGPL - commercially supported) - monitoring and management
  • Neo4j Enterprise (AGPL / commercially supported) - high availability clustering

From neo4j.org Graph DB 101

ACL from Hell

  • Example Access Authorization (28:00)

For Bunnies

5. How do graph databases perform?

Good for read operations. For write operations, comparable to transactional, ACID-compliant relational databases.

6. How do graph databases scale?

  • Vertical scalability for read and write operations
    • adding more hardware will assure near-linear scalability of read/write ops
  • Horizontal scalability for read ops (not for writes)
    • adding more nodes to a HA cluster will allow for more query throughput
    • cache-sharding tech will allow for optimized read speed
