

Kenny Bastani edited this page Jan 15, 2014 · 28 revisions

Graph Concepts

Breadth First


Depth First




Neo4j Internals


Relationship Chain

What is a Relationship Chain?

The relationship chain is a doubly linked list that contains next and previous pointers to relationship records for both the start and end nodes of a given relationship record.

How is a Relationship Chain implemented?

The corresponding Neo4j internals class file for the relationship chain is located in:


The output of the toString() method is a good illustration of how a RelationshipRecord implements the relationship chain.

Why is a Relationship Chain relevant to Neo4j?

The relationship chain is a key component of Neo4j's traversal framework. Each RelationshipRecord has a fixed length of 33 bytes. The layout of these bytes gives runtime traversals a local context in which to operate.

The relationship chain is pivotal for two primary reasons.

  • It allows for deletions within the database by just relinking pointers.
  • During loading at runtime, the thread follows the pointer to the next relationship ID in the chain, which is the next record.
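The two points above can be sketched with a small in-memory model. This is a simplified illustration, not Neo4j's RelationshipRecord: the class RelationshipChainSketch, its field names, and the NONE marker are all hypothetical, and self-loops (a relationship whose start and end node are the same) are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified, hypothetical model of relationship chains; not Neo4j's actual classes. */
class RelationshipChainSketch {
    static final long NONE = -1L; // pointer value meaning "no record"

    /** A relationship with prev/next pointers for both its start-node and end-node chains. */
    static final class Rel {
        final long id, startNode, endNode;
        long startPrev = NONE, startNext = NONE;
        long endPrev = NONE, endNext = NONE;

        Rel(long id, long startNode, long endNode) {
            this.id = id;
            this.startNode = startNode;
            this.endNode = endNode;
        }
    }

    private final Map<Long, Rel> rels = new HashMap<>();      // relationship id -> record
    private final Map<Long, Long> firstRel = new HashMap<>(); // node id -> head of its chain

    // Pointer accessors pick the start-node or end-node fields depending on the node.
    private static long next(Rel r, long node) { return node == r.startNode ? r.startNext : r.endNext; }
    private static long prev(Rel r, long node) { return node == r.startNode ? r.startPrev : r.endPrev; }
    private static void setNext(Rel r, long node, long v) { if (node == r.startNode) r.startNext = v; else r.endNext = v; }
    private static void setPrev(Rel r, long node, long v) { if (node == r.startNode) r.startPrev = v; else r.endPrev = v; }

    /** Adds a relationship at the head of both nodes' chains. */
    void add(long id, long startNode, long endNode) {
        Rel r = new Rel(id, startNode, endNode);
        rels.put(id, r);
        linkAtHead(r, startNode);
        linkAtHead(r, endNode);
    }

    private void linkAtHead(Rel r, long node) {
        Long head = firstRel.get(node);
        if (head != null) {
            setPrev(rels.get(head), node, r.id);
            setNext(r, node, head);
        }
        firstRel.put(node, r.id);
    }

    /** Deletes a relationship purely by relinking its neighbours in both chains. */
    void delete(long id) {
        Rel r = rels.remove(id);
        if (r == null) return;
        unlink(r, r.startNode);
        unlink(r, r.endNode);
    }

    private void unlink(Rel r, long node) {
        long p = prev(r, node), n = next(r, node);
        if (p != NONE) setNext(rels.get(p), node, n);
        else if (n != NONE) firstRel.put(node, n);
        else firstRel.remove(node);
        if (n != NONE) setPrev(rels.get(n), node, p);
    }

    /** Traverses a node's chain by following next pointers from the head. */
    List<Long> chain(long node) {
        List<Long> ids = new ArrayList<>();
        long id = firstRel.getOrDefault(node, NONE);
        while (id != NONE) {
            ids.add(id);
            id = next(rels.get(id), node);
        }
        return ids;
    }

    public static void main(String[] args) {
        RelationshipChainSketch store = new RelationshipChainSketch();
        store.add(1, 10, 20);
        store.add(2, 10, 30);
        store.delete(2);
        System.out.println(store.chain(10));
    }
}
```

Each record takes part in two chains at once, one per endpoint, which is why the accessors first check which side of the relationship the node is on.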

Node Record

What is a Node Record?

Records are the format in which Neo4j's nodes and relationships are represented on disk. A node record always has a fixed size of 14 bytes and points to the node's first relationship and first property.

How is a Node Record implemented?

The node record is stored on disk. It is loaded by the NodeStore and represented in Neo4j as a NodeRecord instance. These NodeRecords are then used to load information about the node into a NodeImpl object.

Why is a Node Record relevant to Neo4j?

Fixed-size records allow direct, fast access by internal id: for example, record #1000 is found at byte offset 14000 (1000 x 14). Whole regions of the store files are mapped into memory. The operating system makes portions of a file available in memory and takes care of syncing them to disk, which makes access to node records even faster. The node record is the on-disk structure that serves as the starting point for a node in the graph.
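The offset arithmetic and the memory mapping can be sketched in a few lines of Java. The 14-byte record size comes from the text above; the field layout inside the record (a one-byte in-use flag followed by two 4-byte ids) and the class name NodeRecordSketch are assumptions made for illustration and may not match the real store format.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of fixed-size record addressing; not Neo4j's NodeStore. */
class NodeRecordSketch {
    static final int RECORD_SIZE = 14; // fixed node record size, as stated in the text

    /** Byte offset of a record: id multiplied by the fixed record size. */
    static long offsetOf(long nodeId) {
        return nodeId * RECORD_SIZE;
    }

    // Assumed layout: [in-use: 1 byte][first relationship id: 4 bytes]
    // [first property id: 4 bytes][remaining 5 bytes unused in this sketch].
    static void write(ByteBuffer store, long nodeId, int firstRel, int firstProp) {
        int pos = (int) offsetOf(nodeId);
        store.put(pos, (byte) 1);
        store.putInt(pos + 1, firstRel);
        store.putInt(pos + 5, firstProp);
    }

    static boolean inUse(ByteBuffer store, long nodeId) {
        return store.get((int) offsetOf(nodeId)) == 1;
    }

    static int firstRelationship(ByteBuffer store, long nodeId) {
        return store.getInt((int) offsetOf(nodeId) + 1);
    }

    static int firstProperty(ByteBuffer store, long nodeId) {
        return store.getInt((int) offsetOf(nodeId) + 5);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nodestore-sketch", ".db");
        file.toFile().deleteOnExit();
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map a region large enough for 2000 records; the OS pages it in and out.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, 2000L * RECORD_SIZE);
            write(mapped, 1000, 42, 7);
            System.out.println("offset=" + offsetOf(1000)
                    + " firstRel=" + firstRelationship(mapped, 1000));
        }
    }
}
```

Because every record has the same size, no index lookup is needed: the id alone determines where to read.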

You can read more in the manual about the file buffer cache and how much a node with or without relationships weighs. Webinar:





What is JMX?

JMX stands for Java Management Extensions, which is a Java based technology that provides a set of services and tools for monitoring applications running on the Java Virtual Machine (JVM).

How is JMX implemented?

A managed bean, also called an MBean, is a Java object, similar to a JavaBean, that exposes a management interface through JMX. An MBean is registered with an MBeanServer and can emit notifications about its internal changes, which other applications can subscribe to.
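A minimal sketch of this mechanism, using only the standard javax.management API, might look as follows. The Counter MBean, the notification type "sketch.counter.changed", and the object name under org.example.sketch are hypothetical and are not part of Neo4j.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;
import javax.management.ObjectName;

/** Hypothetical example of a standard MBean that publishes change notifications. */
class JmxSketch {
    /** Management interface; the standard MBean convention requires the "MBean" suffix. */
    public interface CounterMBean {
        long getCount();
        void increment();
    }

    /** The managed resource; each change is published as a JMX notification. */
    public static class Counter extends NotificationBroadcasterSupport implements CounterMBean {
        private final AtomicLong count = new AtomicLong();
        private final AtomicLong sequence = new AtomicLong();

        @Override
        public long getCount() {
            return count.get();
        }

        @Override
        public void increment() {
            long value = count.incrementAndGet();
            sendNotification(new Notification("sketch.counter.changed", this,
                    sequence.incrementAndGet(), "count=" + value));
        }
    }

    /** Registers a Counter, subscribes the listener, increments, and reads the attribute back. */
    static long registerIncrementAndRead(String objectName, int times,
                                         NotificationListener listener) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(objectName);
        Counter counter = new Counter();
        server.registerMBean(counter, name);
        try {
            server.addNotificationListener(name, listener, null, null);
            for (int i = 0; i < times; i++) {
                counter.increment();
            }
            // The getter getCount() is exposed as the attribute "Count".
            return (Long) server.getAttribute(name, "Count");
        } finally {
            server.unregisterMBean(name);
        }
    }

    public static void main(String[] args) throws Exception {
        long count = registerIncrementAndRead("org.example.sketch:type=Counter", 3,
                (notification, handback) -> System.out.println(notification.getMessage()));
        System.out.println("Count attribute: " + count);
    }
}
```

A tool such as JConsole connected to the same JVM would see the bean under its object name while it is registered.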

Why is JMX relevant to Neo4j?

JMX enables application level monitoring and visibility into memory usage and garbage collection information using an event publish-subscribe pattern. This is critical for understanding resource usage and performance monitoring of the Neo4j internals at runtime.
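As a rough illustration of the kind of information involved, the JVM's built-in platform MXBeans already expose heap usage and garbage collection counts. The sketch below reads them in-process through java.lang.management; it uses only generic JVM beans and does not touch any Neo4j-specific MBeans, whose names are not given here. The class name JvmMonitorSketch is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Reads memory and GC figures from the JVM's standard platform MXBeans. */
class JvmMonitorSketch {
    /** Bytes of heap currently in use, as reported by the MemoryMXBean. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Total number of collections across all collectors that report a count. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "undefined" for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("collections so far: " + totalGcCount());
    }
}
```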

Object Cache

The Object Cache keeps a hash table of Neo4j node implementations, keyed by the node's internal identifier, together with each node's connected relationships grouped by relationship type. It is used to traverse relationships by type and provides direct access to a node's properties.

Source Files:

  • ./src/main/java/org/neo4j/kernel/impl/cache/
  • ./src/main/java/org/neo4j/kernel/impl/core/
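The lookup structure described for the Object Cache can be approximated with nested hash maps. This is only a sketch of the idea, under the assumption that relationships are grouped first by node id and then by type; the class ObjectCacheSketch and its methods are hypothetical and do not mirror the classes in the source directories listed above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: node id -> relationship type -> relationship ids. */
class ObjectCacheSketch {
    private final Map<Long, Map<String, List<Long>>> relsByNodeAndType = new HashMap<>();

    /** Records that the given node has a relationship of the given type. */
    void cacheRelationship(long nodeId, String type, long relId) {
        relsByNodeAndType
                .computeIfAbsent(nodeId, k -> new HashMap<>())
                .computeIfAbsent(type, k -> new ArrayList<>())
                .add(relId);
    }

    /** Returns the cached relationship ids of one type for a node, or an empty list. */
    List<Long> relationships(long nodeId, String type) {
        Map<String, List<Long>> byType =
                relsByNodeAndType.getOrDefault(nodeId, Collections.emptyMap());
        return byType.getOrDefault(type, Collections.emptyList());
    }

    public static void main(String[] args) {
        ObjectCacheSketch cache = new ObjectCacheSketch();
        cache.cacheRelationship(1L, "KNOWS", 100L);
        cache.cacheRelationship(1L, "KNOWS", 101L);
        cache.cacheRelationship(1L, "LIKES", 200L);
        System.out.println(cache.relationships(1L, "KNOWS"));
    }
}
```

Grouping by type means a traversal that only follows one relationship type never has to look at the others.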


Master-Slave Replication

Master Election


Execution Plan


Memory Mapping


In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.
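The funds transfer example can be sketched as an all-or-nothing operation. The code below illustrates atomicity only, with an in-memory map standing in for the database; the class AtomicTransferSketch and its methods are hypothetical, and a real database also has to provide isolation and durability.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory illustration of an atomic funds transfer. */
class AtomicTransferSketch {
    private Map<String, Long> balances = new HashMap<>();

    synchronized void open(String account, long initialBalance) {
        balances.put(account, initialBalance);
    }

    synchronized long balance(String account) {
        return balances.get(account);
    }

    /** Debits and credits together; if either step fails, nothing is changed. */
    synchronized void transfer(String from, String to, long amount) {
        Map<String, Long> working = new HashMap<>(balances); // private working copy
        adjust(working, from, -amount); // debit
        adjust(working, to, amount);    // credit; throws if the account is unknown
        balances = working;             // commit: both changes become visible at once
    }

    private static void adjust(Map<String, Long> accounts, String account, long delta) {
        Long current = accounts.get(account);
        if (current == null) {
            throw new IllegalArgumentException("unknown account: " + account);
        }
        if (current + delta < 0) {
            throw new IllegalStateException("insufficient funds: " + account);
        }
        accounts.put(account, current + delta);
    }

    public static void main(String[] args) {
        AtomicTransferSketch bank = new AtomicTransferSketch();
        bank.open("alice", 100);
        bank.open("bob", 0);
        bank.transfer("alice", "bob", 30);
        System.out.println(bank.balance("alice") + " " + bank.balance("bob"));
    }
}
```

If the credit step throws, the working copy is simply discarded, so the debit never becomes visible on its own.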

In Neo4j, transactions are the way clients interact with the database server: they are the medium through which data is put into and taken out of Neo4j. A transaction can commit multiple changes to the database in a single request.

Isolation Level


Multiversion concurrency control (MVCC) is a concurrency control method commonly used by database management systems to provide concurrent access to the database, and by programming languages to implement transactional memory. It allows readers and writers to work concurrently. There are different forms of MVCC, all of which aim to keep logically different versions of each data item.
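One simple form of the idea is to keep every committed version of a value and let each reader pick the newest version that is not later than its snapshot. The sketch below shows that general technique only; it makes no claim about how any particular database implements MVCC, and the class MvccSketch and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical multi-version store: writers add versions, readers use snapshots. */
class MvccSketch {
    private final Map<String, NavigableMap<Long, String>> versions = new HashMap<>();
    private long committedVersion = 0;

    /** Returns the version a reader should use to get a stable view. */
    synchronized long snapshot() {
        return committedVersion;
    }

    /** Writes a new version of the key instead of overwriting the old one. */
    synchronized long write(String key, String value) {
        long version = ++committedVersion;
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(version, value);
        return version;
    }

    /** Reads the newest value that was committed at or before the snapshot. */
    synchronized String read(String key, long snapshotVersion) {
        NavigableMap<Long, String> history = versions.get(key);
        if (history == null) {
            return null;
        }
        Map.Entry<Long, String> entry = history.floorEntry(snapshotVersion);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        MvccSketch store = new MvccSketch();
        store.write("name", "old");
        long readerSnapshot = store.snapshot();
        store.write("name", "new"); // a later writer does not disturb the reader
        System.out.println(store.read("name", readerSnapshot));
        System.out.println(store.read("name", store.snapshot()));
    }
}
```

A real implementation would also garbage-collect versions that no active snapshot can still see.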

Transactional Memory

Transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing.
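Java has no built-in transactional memory, so the sketch below only emulates the idea in software: a transaction reads an immutable snapshot, computes a new state, and commits with a compare-and-set, retrying if another thread committed first. The class TransactionalMemorySketch and its methods are hypothetical, and this optimistic-retry pattern is a stand-in for the concept, not an implementation of hardware or language-level transactional memory.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical emulation of an atomic group of loads and stores via optimistic retry. */
class TransactionalMemorySketch {
    /** Immutable snapshot of the shared state; a transaction produces a new one. */
    static final class Balances {
        final long a, b;

        Balances(long a, long b) {
            this.a = a;
            this.b = b;
        }
    }

    private final AtomicReference<Balances> state;

    TransactionalMemorySketch(long a, long b) {
        state = new AtomicReference<>(new Balances(a, b));
    }

    /** Runs tx against a snapshot and retries if another thread committed in between. */
    void atomically(UnaryOperator<Balances> tx) {
        while (true) {
            Balances before = state.get();
            Balances after = tx.apply(before);
            if (state.compareAndSet(before, after)) {
                return;
            }
        }
    }

    Balances read() {
        return state.get();
    }

    /** Moves one unit from a to b, perThread times, on each of several threads. */
    static Balances runConcurrentTransfers(int threads, int perThread, long initialA) {
        TransactionalMemorySketch memory = new TransactionalMemorySketch(initialA, 0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    memory.atomically(s -> new Balances(s.a - 1, s.b + 1));
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return memory.read();
    }

    public static void main(String[] args) {
        Balances result = runConcurrentTransfers(4, 1000, 10_000);
        System.out.println(result.a + " " + result.b);
    }
}
```

Because both fields change in one committed step, no thread can ever observe a debit without the matching credit.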
