Commit
Revamped parallelism of parts of batch importer
The main story here is composed of two things:

- A new ForkedProcessorStep, which parallelizes work inside each batch, with batches executed one by one. This avoids the difficulty of parallelizing steps that contain a costly section which isn't parallelizable. With this new step, the items in a batch can be striped so that each forked processor knows which parts to process.
- Better mechanical sympathy: most stages are now optimized to work with batch sizes matching pages in the page cache of the store they (mainly) work with.

The forked processor simplifies a couple of stages: there are no longer any artificial additional steps for splitting or otherwise modifying batches to make them easier to parallelize. The whole stage also scales better with added processors, because the old way of parallelizing those stages often involved a single-threaded step that acted as a divider of work. Such a step would often become the bottleneck in the end anyway.

As for mechanical sympathy, the main problem previously was that the reader and the writer of stages which read from and wrote to the same store contended with each other. Given the smaller batch size, there were multiple batches of read records for any given page. The later step in the stage that updated the store would often update the same page, so the reader (still reading that page) needed to do much more retry-reading, which slowed the whole stage down. Now, with the aligned batch sizes, the reader doesn't contend with the writers in the page cache.

Additionally, the main store-updating step has been split into steps updating entities and properties separately, so that entity updating can go even faster.

The net result of this change as a whole should be that the disk is more often the only main bottleneck. On test machines and development laptops, a 2x-3x performance improvement of the importer has been observed.
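The striping idea described above can be sketched as follows. This is a minimal illustration, not the ForkedProcessorStep from this commit: the class name `StripedBatchSketch`, the method `processStriped`, and the doubling used as per-item work are all hypothetical. It assumes one simple striping scheme, where forked processor `id` handles items `id`, `id + processors`, `id + 2 * processors`, and so on, so that no single-threaded divider-of-work step is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of striping one batch across several processors.
final class StripedBatchSketch
{
    /**
     * Processes a single batch with the given number of processors.
     * Each processor derives its own stripe from its id, so nothing has to
     * split the batch up front. Returns once every stripe has completed.
     */
    static long[] processStriped( long[] batch, int processors )
    {
        long[] result = new long[batch.length];
        ExecutorService executor = Executors.newFixedThreadPool( processors );
        try
        {
            List<Future<?>> forks = new ArrayList<>();
            for ( int id = 0; id < processors; id++ )
            {
                final int stripe = id;
                forks.add( executor.submit( () ->
                {
                    // Visit items stripe, stripe + processors, stripe + 2 * processors, ...
                    for ( int i = stripe; i < batch.length; i += processors )
                    {
                        result[i] = batch[i] * 2; // stand-in for the real per-item work
                    }
                } ) );
            }
            for ( Future<?> fork : forks )
            {
                fork.get(); // the batch is done only when all stripes are done
            }
        }
        catch ( InterruptedException | ExecutionException e )
        {
            throw new IllegalStateException( "Striped processing failed", e );
        }
        finally
        {
            executor.shutdown();
        }
        return result;
    }
}
```

Because every processor writes only to the indexes of its own stripe, the workers need no locking among themselves, and waiting on each `Future` makes their writes visible to the caller.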
Showing 46 changed files with 1,592 additions and 865 deletions.
66 changes: 66 additions & 0 deletions
...kernel/src/main/java/org/neo4j/unsafe/impl/batchimport/AssignRelationshipIdBatchStep.java
@@ -0,0 +1,66 @@
/*
 * Copyright (c) 2002-2016 "Neo Technology,"
 * Network Engine for Objects in Lund AB [http://neotechnology.com]
 *
 * This file is part of Neo4j.
 *
 * Neo4j is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
package org.neo4j.unsafe.impl.batchimport;

import org.neo4j.kernel.impl.store.id.IdGeneratorImpl;
import org.neo4j.kernel.impl.store.record.RelationshipRecord;
import org.neo4j.unsafe.impl.batchimport.input.InputRelationship;
import org.neo4j.unsafe.impl.batchimport.staging.BatchSender;
import org.neo4j.unsafe.impl.batchimport.staging.Configuration;
import org.neo4j.unsafe.impl.batchimport.staging.ProcessorStep;
import org.neo4j.unsafe.impl.batchimport.staging.StageControl;

/**
 * Assigns record ids to {@link Batch} for later record allocation. Since this step is single-threaded
 * we can safely assign these ids here.
 */
public class AssignRelationshipIdBatchStep extends ProcessorStep<Batch<InputRelationship,RelationshipRecord>>
{
    private long nextId;

    public AssignRelationshipIdBatchStep( StageControl control, Configuration config, long firstRelationshipId )
    {
        super( control, "ASSIGN", config, 1 );
        this.nextId = firstRelationshipId;
    }

    @Override
    protected void process( Batch<InputRelationship,RelationshipRecord> batch, BatchSender sender ) throws Throwable
    {
        // Assign first record id and send
        batch.firstRecordId = nextId;
        sender.send( batch );

        // Set state for the next batch
        nextId += batch.input.length;
        if ( nextId <= IdGeneratorImpl.INTEGER_MINUS_ONE &&
                nextId + batch.input.length >= IdGeneratorImpl.INTEGER_MINUS_ONE )
        {
            // There's this pesky INTEGER_MINUS_ONE ID again. Easiest is to simply skip this batch of ids
            // or at least the part up to that id and just continue after it.
            nextId = IdGeneratorImpl.INTEGER_MINUS_ONE + 1;
        }
    }

    public long getNextRelationshipId()
    {
        return nextId;
    }
}
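To illustrate the id-skipping logic in isolation, here is a simplified, self-contained sketch. It is not the code from this commit: the class `BatchIdAssignmentSketch` and its `assign` method are hypothetical, and `RESERVED_ID` is assumed to be `0xFFFFFFFF`, standing in for `IdGeneratorImpl.INTEGER_MINUS_ONE`. Unlike the step above, which looks ahead using the length of the batch it just sent, this variant checks the current batch's own size before assigning its first id.

```java
// Hypothetical, simplified model of assigning the first record id of each batch.
final class BatchIdAssignmentSketch
{
    // Assumed value of the reserved id; stands in for IdGeneratorImpl.INTEGER_MINUS_ONE.
    static final long RESERVED_ID = 0xFFFFFFFFL;

    private long nextId;

    BatchIdAssignmentSketch( long firstId )
    {
        this.nextId = firstId;
    }

    /**
     * Returns the first id for a batch of the given size. If the id range
     * [nextId, nextId + batchSize - 1] would contain the reserved id, the whole
     * range is moved to start right after the reserved id, leaving a small gap.
     */
    long assign( int batchSize )
    {
        if ( nextId <= RESERVED_ID && nextId + batchSize > RESERVED_ID )
        {
            nextId = RESERVED_ID + 1;
        }
        long first = nextId;
        nextId += batchSize;
        return first;
    }
}
```

The method is only safe when called from a single thread, which matches the reason the step above is constructed with exactly one processor.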
116 changes: 0 additions & 116 deletions
...kernel/src/main/java/org/neo4j/unsafe/impl/batchimport/CalculateDenseNodePrepareStep.java
This file was deleted.