Batch inserter removed? #197

Closed
MrJaba opened this Issue May 3, 2012 · 12 comments

@MrJaba

Hi,

In commit 724cd0c the Batch Inserter was removed, and I can't see where it's been moved to. Is this intentional? Is there another way of getting lots of data into Neo4j at speed?

Thanks for the awesome project!

Tom

@andreasronge
Neo4jRB member

Yes, it has been removed. I've just not had time to fix it, and I'm always keen on removing code :-)
Not sure if it's going to be included in neo4j-core or shipped as a separate gem. Btw, did the old batch inserter work well for you? I'm not sure whether the batch inserter works on an already existing database. Have you tried it?

@MrJaba

Ah right I see, removing code is good :)

I'm afraid I haven't tried it on an already existing database (I recreate it each time). It was working well for me the last time I tried it, inserting things incredibly quickly, but I now seem to be getting a different issue with it on version 1.3.1!

I did like the insertion speed. I have a lot of data to insert into neo4j and I'm not sure how else I would insert it at speed. Any recommendations?

@andreasronge
Neo4jRB member

Well, you can always use the Neo4j Java batch API directly; for an example, see https://github.com/andreasronge/neo4j-perf/blob/master/batch_import.rb
Is that good enough?
Would be interesting to know if using the batch inserter on an already existing database works or not.
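
For reference, a minimal sketch of what driving the Java batch API from JRuby might look like. It assumes an `inserter` object that responds to `create_node(props)` and `shutdown`, which is how JRuby exposes the `createNode` and `shutdown` methods of Neo4j's Java `BatchInserter`. How that object is constructed depends on the Neo4j version and is left out here. The `bulk_load` helper, the record shape, and `FakeInserter` are hypothetical names used for illustration only.

```ruby
# Hypothetical helper: loads an enumerable of property hashes through any
# object that behaves like Neo4j's Java BatchInserter under JRuby.
# Returns the node ids reported by the inserter.
def bulk_load(inserter, records)
  records.map do |props|
    # The Java API expects String keys, so normalise symbol keys first.
    string_props = props.each_with_object({}) { |(k, v), h| h[k.to_s] = v }
    inserter.create_node(string_props)
  end
ensure
  # Always shut the inserter down, even on error, so the store is
  # flushed and closed properly.
  inserter.shutdown
end

# Stand-in for the real inserter (an assumption, not part of Neo4j),
# so the sketch can be run without a JVM.
class FakeInserter
  attr_reader :nodes, :closed

  def initialize
    @nodes = []
    @closed = false
  end

  def create_node(props)
    @nodes << props
    @nodes.size # pretend node id
  end

  def shutdown
    @closed = true
  end
end
```

With the real API, `inserter` would be the Java batch inserter opened on the store directory; the linked script is the place to look for the actual setup.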

@MrJaba

Hi, apologies for the delay, and thanks for the link to the direct API batch insert. I'll try a batch insert on an existing database when I get a chance and will let you know how it goes. Thanks again for your help!

@jayniz jayniz pushed a commit to jayniz/neo4j that referenced this issue May 14, 2012
@andreasronge andreasronge Support for cypher query language [#197 state:resolved] 293b1ae
@jayniz jayniz pushed a commit to jayniz/neo4j that referenced this issue May 14, 2012
@andreasronge andreasronge Cypher queries with lucene index should now work. Notice Lucene index files has been renamed. [#197 state:resolved]

Needed to rename the lucene index since the cypher query does not allow index names with a '-' in the filename.
Provided an upgrade script (neo4j-upgrade) which both upgrade the database and rename all the lucene index files.
b5d41f8
@MrJaba

Hi,

I had a chance to run a batch insert against my already existing database, and it happily opened it and added the extra nodes. I did a full batch load of 1,000,000 records, stopped the db, then added a further 5,000 records, which worked perfectly.

HTH

@andreasronge
Neo4jRB member

Good news. Not sure when I will get time to reimplement the batch inserter Ruby API.
Maybe it could be a pull request from you.
Maybe it should be included in both neo4j-core and neo4j-wrapper.

@MrJaba

I'm not sure I'll have time either, but I'd love to try and help. Any hints you could give to point me in the right direction?

@markburns

@MrJaba I happened upon this issue whilst looking for something else, but you might be interested in our Geoff gem. It's still not fully production ready and there are aspects of the code that could be improved, but it's a DSL that we're using to set up our test cases. It basically generates Geoff syntax files from a Ruby DSL and imports them.

https://github.com/ShutlAdmin/geoff

@andreasronge
Neo4jRB member
@markburns

Thanks. We have some intermittent unpredictable failures with it that we're yet to get to the bottom of. It seems to be related to the callback triggered by the neo4j framework after deleting the rule nodes. Not sure of the exact circumstances just yet but a workaround is just to leave your rule nodes as shared state between tests.

Not ideal, but we delete the whole test database for each suite run, so I can't imagine that having the rule nodes present throughout a single suite run will cause major issues. Hopefully we can fix the bug, though, and contribute back to neo4j-wrapper.

@andreasronge
Neo4jRB member
@vpacher

That is a very good pointer. However, rule nodes should be cleared from memory here: https://github.com/andreasronge/neo4j-wrapper/blob/master/lib/neo4j-wrapper/rule/event_listener.rb#L22

I'll do some more investigation

@saterus saterus closed this Jul 3, 2013