Batches

jadell edited this page Oct 7, 2013 · 6 revisions

When manipulating data in any type of datastore, it is often useful to commit multiple changes at once. Neo4j supports transactions, and exposes this ability in its REST API via batches.

In neo4jphp, a batch can be used to group together many create, update, delete and index operations that must all succeed or fail together. When the batch is committed, if any individual operation fails, the entire batch fails, thus providing transactional semantics. In some circumstances, using a batch can also improve performance over performing the same operations individually.

Using a Batch

The following code demonstrates creating two nodes, a relationship between them, and then indexing both nodes on various key/value pairs.

$ford = $client->makeNode();
$zaphod = $client->makeNode();

$cousins = $ford->relateTo($zaphod, 'KNOWS');

$characterIndex = new Everyman\Neo4j\Index($client, Everyman\Neo4j\Index::TypeNode, 'characters');

// Create the batch and add operations
$batch = new Everyman\Neo4j\Batch($client);
$batch->save($ford);
$batch->save($zaphod);
$batch->save($cousins);
$batch->addToIndex($characterIndex, $ford, 'personality', 'hoopy');
$batch->addToIndex($characterIndex, $zaphod, 'heads', 2);

// Commit all of the batched operations together
$batch->commit();

The Batch::commit() operation will return true if the batch was successfully committed, and throw an exception otherwise.
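Because a failed commit surfaces as an exception rather than a false return value, calling code usually wants a try/catch around it. The following is a minimal sketch; tryCommit() is a hypothetical helper and not part of neo4jphp. It catches the generic \Exception, since the specific exception class is not named here.

```php
<?php
// Sketch only: tryCommit() is a hypothetical helper, not part of neo4jphp.
// $batch is expected to be an Everyman\Neo4j\Batch (anything with commit()).
function tryCommit($batch)
{
    try {
        // commit() returns true on success and throws on failure
        return $batch->commit();
    } catch (\Exception $e) {
        // The whole batch failed; log it and let the caller decide what to do
        error_log('Batch commit failed: ' . $e->getMessage());
        return false;
    }
}
```

Turning the exception into a boolean is only one option. Code that cannot continue after a failed batch may prefer to let the exception propagate.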

Another neat trick with batches is that not every entity needs to be saved individually. If any node in a relationship needs to be created, it will be created when the relationship is committed. Any node or relationship being indexed will also be created if necessary.

The following code commits the same operations as above, without explicitly saving the nodes:

$batch = new Everyman\Neo4j\Batch($client);
$batch->save($cousins);
$batch->addToIndex($characterIndex, $ford, 'personality', 'hoopy');
$batch->addToIndex($characterIndex, $zaphod, 'heads', 2);
$batch->commit();

Using a Batch to Create Many Nodes

In theory, there is no upper bound on how many operations can be committed in a single batch. In practice, the limit depends on how much memory has been allocated to your Neo4j server.

The following code creates 10000 nodes, then commits them to the server all at once:

$client->startBatch();
for ($i=0; $i < 10000; $i++) {
    $client->makeNode()->save();
}
$client->commitBatch();
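If a single batch is too large for the memory available to the server, one possible workaround is to commit in smaller chunks. The sketch below assumes a hypothetical helper, createNodesInChunks(), and an arbitrary chunk size; neither comes from neo4jphp itself. It uses only the startBatch(), makeNode(), save() and commitBatch() calls shown above.

```php
<?php
// Sketch only: createNodesInChunks() is a hypothetical helper.
// $client is expected to be an Everyman\Neo4j\Client.
function createNodesInChunks($client, $total, $chunkSize)
{
    if ($chunkSize < 1) {
        throw new \InvalidArgumentException('chunkSize must be at least 1');
    }

    $commits = 0;
    for ($start = 0; $start < $total; $start += $chunkSize) {
        $end = min($start + $chunkSize, $total);

        // Each chunk is its own batch
        $client->startBatch();
        for ($i = $start; $i < $end; $i++) {
            $client->makeNode()->save();
        }
        $client->commitBatch();
        $commits++;
    }

    // Number of batches that were committed
    return $commits;
}
```

The trade-off is that each chunk succeeds or fails on its own: if a later chunk fails, the chunks committed before it are not rolled back.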

Transparent Batches

It is often desirable to have data manipulations from many parts of an application applied in the same batch. In these cases, it can be inconvenient to pass around a Batch object. Instead, it is possible to start a batch in the client, so that all subsequent data manipulations are batched together until that batch is committed. Here is the syntax for these implicit batches:

$batch = $client->startBatch();

// None of the following operations are sent to the server until...
$nodeA->save();
$nodeB->save();
$nodeA->relateTo($nodeB, 'KNOWS')->save();
$nameIndex->add($nodeB, 'name', 'foo');

// ...now
$client->commitBatch();

// or
$batch->commit();

If an implicit batch needs to be cancelled, call Client::endBatch(). The batch can still be committed if a reference to the batch returned by the original Client::startBatch() call is retained, but all subsequent data manipulation operations will be sent to the server one at a time.

$batch = $client->startBatch();

// These operations are part of $batch
$nodeA->save();
$nameIndex->add($nodeA, 'name', 'foo');

// We don't want any further operations to be in $batch
$client->endBatch();

// This happens right away
$nodeB->save();

// We can still commit the original $batch
$batch->commit();
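One way to keep an implicit batch from being left open by accident is to wrap the start, work and commit steps in a single function. The sketch below is an assumption about how this could be organized; inBatch() is a hypothetical helper, not a neo4jphp method. If the callback throws, the helper calls Client::endBatch() so that later operations are no longer queued, then rethrows.

```php
<?php
// Sketch only: inBatch() is a hypothetical helper, not part of neo4jphp.
// $client is expected to be an Everyman\Neo4j\Client.
function inBatch($client, callable $work)
{
    $client->startBatch();
    try {
        // Every save/index call made inside $work is queued, not sent
        $work();
    } catch (\Exception $e) {
        // Cancel the implicit batch so later operations are sent immediately
        $client->endBatch();
        throw $e;
    }

    // Send everything queued by $work; returns whatever commitBatch() returns
    return $client->commitBatch();
}
```

A call might look like inBatch($client, function () use ($nodeA, $nodeB) { $nodeA->save(); $nodeB->save(); });, where $nodeA and $nodeB are the nodes from the examples above.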