Neo4j has a batch insertion facility intended for initial imports, which bypasses transactions and other checks in favor of performance. This is useful when you have a big dataset that needs to be loaded once.
Batch insertion is included in the neo4j-kernel component, which is part of all Neo4j distributions and editions.
Be aware of the following points when using batch insertion:
- The intended use is the initial import of data, but you can use it on an existing database if that database is shut down first.
- Batch insertion is not thread safe.
- Batch insertion is non-transactional.
- Batch insertion does not enforce constraints on the data while it is being inserted.
- Batch insertion re-populates all existing indexes, and any indexes created during batch insertion, on shutdown.
- Batch insertion verifies all existing constraints, and any constraints created during batch insertion, on shutdown.
- Unless shutdown is successfully invoked at the end of the import, the database files will be corrupt.
Warning: Always perform batch insertion in a single thread (or use synchronization so that only one thread at a time accesses the batch inserter) and invoke shutdown at the end of the import.
Warning: Because batch insertion doesn’t enforce constraints during data loading, the batch inserter will fail on shutdown if the inserted data violates any constraint.
To bulk load data using the batch inserter you’ll need to write a Java application that uses the low-level BatchInserter interface.
Tip: You can’t have multiple threads using the batch inserter concurrently without external synchronization.
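One way to provide the external synchronization mentioned above is to wrap the inserter so that every call goes through a single lock. The sketch below is only an illustration: NodeSink, InMemorySink and SynchronizedSink are hypothetical stand-ins written for this example, not part of Neo4j, and the real BatchInserter has a much larger interface.

```java
import java.util.ArrayList;
import java.util.List;

public class SynchronizedInserterSketch {
    /** Hypothetical stand-in for two inserter calls; NOT the Neo4j interface. */
    public interface NodeSink {
        long createNode(String name);
        void shutdown();
    }

    /** Deliberately not thread safe: a plain ArrayList with no locking. */
    public static class InMemorySink implements NodeSink {
        public final List<String> nodes = new ArrayList<>();
        public boolean shutDown = false;
        public long createNode(String name) { nodes.add(name); return nodes.size() - 1; }
        public void shutdown() { shutDown = true; }
    }

    /** Wrapper that lets only one thread at a time reach the delegate. */
    public static class SynchronizedSink implements NodeSink {
        private final NodeSink delegate;
        public SynchronizedSink(NodeSink delegate) { this.delegate = delegate; }
        public synchronized long createNode(String name) { return delegate.createNode(name); }
        public synchronized void shutdown() { delegate.shutdown(); }
    }

    /** Runs several writer threads against one shared sink and returns the node count. */
    public static int insertFromThreads(int threads, int perThread) {
        InMemorySink raw = new InMemorySink();
        NodeSink sink = new SynchronizedSink(raw);
        List<Thread> workers = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                final int id = t;
                Thread worker = new Thread(() -> {
                    for (int i = 0; i < perThread; i++) {
                        sink.createNode("n" + id + "-" + i);
                    }
                });
                workers.add(worker);
                worker.start();
            }
            // Wait for every writer before shutting down.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        } finally {
            sink.shutdown();
        }
        return raw.nodes.size();
    }

    public static void main(String[] args) {
        System.out.println(insertFromThreads(4, 1000));
    }
}
```

Because the lock serializes all writes, several threads gain little over a single writer thread; the wrapper mainly helps when the input is already produced by several threads.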
You can get hold of an instance of BatchInserter by using BatchInserters.
Here’s an example of the batch inserter in use:
component=neo4j-kernel-docs source=examples/BatchInsertDocTest.java tag=insert
When creating a relationship you can set properties on it by passing a map of properties, rather than null, as the last parameter to createRelationship.
It’s important that the call to shutdown is inside a finally block, to ensure that it gets called even if exceptions are thrown.
If the batch inserter isn’t cleanly shut down, the consistency of the store is not guaranteed.
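The shutdown-in-finally pattern can be sketched without a Neo4j dependency. RecordingInserter below is a hypothetical stand-in that only records calls; its method names mirror the ones mentioned in the text, but its signatures are simplified assumptions (for instance, the relationship type is a plain String here).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ShutdownInFinallySketch {
    /** Hypothetical stand-in that records calls; the real inserter writes store files. */
    public static class RecordingInserter {
        public final List<Map<String, Object>> nodes = new ArrayList<>();
        public final List<Map<String, Object>> relationshipProps = new ArrayList<>();
        public boolean shutDown = false;

        public long createNode(Map<String, Object> properties) {
            nodes.add(properties);
            return nodes.size() - 1;
        }

        public long createRelationship(long from, long to, String type,
                                       Map<String, Object> properties) {
            if (from < 0 || to < 0 || from >= nodes.size() || to >= nodes.size()) {
                throw new IllegalArgumentException("unknown node id");
            }
            relationshipProps.add(properties); // null means "no properties"
            return relationshipProps.size() - 1;
        }

        public void shutdown() { shutDown = true; }
    }

    /** Inserts two nodes and one relationship; shutdown runs even if the import fails. */
    public static void runImport(RecordingInserter inserter, boolean failMidway) {
        try {
            Map<String, Object> first = new HashMap<>();
            first.put("name", "Mattias");
            long a = inserter.createNode(first);

            Map<String, Object> second = new HashMap<>();
            second.put("name", "Chris");
            long b = inserter.createNode(second);

            if (failMidway) {
                throw new IllegalStateException("simulated failure during import");
            }

            // Pass a property map as the last argument, or null for no properties.
            Map<String, Object> relProps = new HashMap<>();
            relProps.put("since", 2010);
            inserter.createRelationship(a, b, "KNOWS", relProps);
        } finally {
            inserter.shutdown();
        }
    }

    public static void main(String[] args) {
        RecordingInserter inserter = new RecordingInserter();
        runImport(inserter, false);
        System.out.println("shut down: " + inserter.shutDown);
    }
}
```

Note that a finally block only guarantees that shutdown is attempted; if the import itself failed midway, the imported data is still incomplete and should be rebuilt.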
Tip: The source code for the examples on this page can be found here: BatchInsertDocTest.java
You can pass custom configuration options to the BatchInserter (see [configuration-batchinsert] for information on the available options), for example:
component=neo4j-kernel-docs source=examples/BatchInsertDocTest.java tag=configuredInsert
Alternatively you could store the configuration in a file:
link:../batchinsert-config[role=include]
You can then refer to that file when initializing BatchInserter:
component=neo4j-kernel-docs source=examples/BatchInsertDocTest.java tag=configFileInsert
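As a rough illustration of the file-based approach, the sketch below reads key=value settings into a string-to-string map using only the JDK. It assumes that the configuration file uses Java properties syntax and that the inserter accepts its configuration as such a map; ConfigFileSketch and loadConfig are hypothetical names, and the setting shown is only an example key, so check the configuration reference for the options that apply to your version.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class ConfigFileSketch {
    /** Reads key=value pairs (Java properties syntax) into a String-to-String map. */
    public static Map<String, String> loadConfig(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> config = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            config.put(key, props.getProperty(key));
        }
        return config;
    }

    public static void main(String[] args) {
        // A StringReader keeps the sketch self-contained; a real import would
        // open the configuration file with a FileReader instead.
        String fileContents = "# example setting only\ndbms.pagecache.memory=512m\n";
        Map<String, String> config = loadConfig(new StringReader(fileContents));
        System.out.println(config);
    }
}
```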
Although it’s a less common use case, the batch inserter can also be used to import data into an existing database. However, you will need to ensure that the existing database is shut down before you write to it.
Warning: Since the batch inserter bypasses transactions, there is a possibility of data inconsistency if the import process crashes midway. We strongly recommend that you take a backup of your existing database before using the batch inserter against it.