
Error: read ECONNRESET parse file and insert node #43

Open
imtase opened this issue May 30, 2014 · 11 comments
@imtase

imtase commented May 30, 2014

Hello,

I have a file with 32,000 lines and I want to use each one to create a node (using the insertNode function, though I have the same problem when I switch to the Cypher method). With small files I have no problem, but when the file has a lot of lines I get an error:

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: read ECONNRESET
    at errnoException (net.js:904:11)
    at TCP.onread (net.js:558:19)

To load and read the file I use fast-csv.

Thanks for your help.

@evengers

I get the same error.

Similar to imtase, I am stepping line by line through a CSV file: I pause on each line, send a Cypher query, and resume in the callback ... works fine for about 11,000 rows.
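A minimal sketch of that pause/resume pattern, assuming fast-csv's parseFile stream API and node-neo4j's cypherQuery(query, callback); the file name, label, and column name are placeholders:

var csv = require('fast-csv');
var Neo4j = require('node-neo4j');

var db = new Neo4j('http://localhost:7474');

// Parse the CSV one row at a time, holding the parser while each query runs.
var stream = csv.parseFile('rows.csv', { headers: true })
  .on('data', function (row) {
    stream.pause(); // stop emitting rows until the current query finishes
    db.cypherQuery("MERGE (n:Row {name: '" + row.name + "'}) RETURN n", function (err, result) {
      if (err) console.error('query error:', err);
      stream.resume(); // fetch the next row only after the callback
    });
  })
  .on('end', function () {
    console.log('done');
  });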

@philippkueng
Owner

Hi @evengers,

Thanks for the additional hint. I'm in the exam session right now until Monday 25.8 lunchtime. I'll debug it then. Hope that works for you.

Cheers Phil


@evengers

Hi Phil,

I've looked at this a bit further.

I'm fairly certain it is not your library (which is very useful ... thanks!)

If I substitute your db.query ...
neodb.cypherQuery(query, function (err, result) {
  if (err) {
    console.log('got this query error: ', err);
  } else {
    if (verbose) console.log('got the callback ... ', lineCtr);
    // callback finished, query done, so get next line
    ev.emit('getNextLine', 'optional send something on event', function () {});
  } // end if
});

with this ....

// co and co-request are required for this yield-based variant
var co = require('co');
var request = require('co-request');

co(function* () {
  var theBody = astr; // astr comes from the surrounding CSV-parsing code

  var theurl = 'http://0.0.0.0:7474/db/data/cypher';
  var result = yield request({
    headers: { 'Content-Type': 'application/json' },
    uri: theurl,
    method: 'POST',
    body: query
  });

  var responsebody = result.body;
  console.log('got this back ... ', responsebody);

  ev.emit('getNextLine', 'optional send something on event', function () {});
})();

Then the CSV parsing is much faster (I'm guessing 2x faster), but in the end I get the same error.

Every 28k rows I get a timeout.

The error message does seem to be fairly common (google "TIMEOUT util.js net.js 742" and "ECONNRESET util.js net.js") (this one is funny, "chaos monkey" ... https://github.com/strongloop/zone/tree/master/showcase/long-stack-http)

My guess is that there is some sort of neo4j config parameter I have to change to accommodate the large number of POSTs. I will try playing with some of the options described here: http://www.neo4j.org/graphgist?d788e117129c3730a042

BTW, I have looked at other ways of loading the CSV file (transactions etc.).

Also BTW, the (convoluted) query I am running (which works) looks like this:

var theStatements = [
  "MERGE (le:LEI " + leiparam + ")",
  "MERGE (co:COUNTRY " + countryparam + " )",
  "MERGE (ci:CITY " + cityparam + " )",
  "MERGE (le)-[:IN_CITY]-(ci)-[:IN_COUNTRY]-(co)",
  "MERGE (le)-[:IN_COUNTRY]-(co)",
  "MERGE (hco:COUNTRY " + hqcountryparam + " )",
  "MERGE (hci:CITY " + hqcityparam + " )",
  "MERGE (le)-[:IN_HQCITY]-(hci)-[:IN_HQCOUNTRY]-(hco)",
  "MERGE (le)-[:IN_HQCOUNTRY]-(hco)",
  "RETURN *"
];

var query = theStatements.join('\n');

//console.log(query);

queryWithThisString(query);
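For reference, the same kind of statement can be sent with Cypher parameters instead of string concatenation, which also lets neo4j reuse the query plan. This sketch assumes node-neo4j's optional params argument to cypherQuery; the insertRow helper, property names, and row fields are illustrative, not from the thread:

var Neo4j = require('node-neo4j');
var neodb = new Neo4j('http://localhost:7474');

var query = [
  'MERGE (le:LEI {id: {leiId}})',
  'MERGE (co:COUNTRY {name: {country}})',
  'MERGE (ci:CITY {name: {city}})',
  'MERGE (le)-[:IN_CITY]-(ci)-[:IN_COUNTRY]-(co)',
  'RETURN *'
].join('\n');

function insertRow(row) {
  // {leiId}, {country} and {city} are substituted server-side from this object.
  neodb.cypherQuery(query, { leiId: row.lei, country: row.country, city: row.city },
    function (err, result) {
      if (err) return console.log('got this query error: ', err);
      // proceed to the next line as before
    });
}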

Again, Phil, thanks for the useful library ... if you do happen to come across an answer, please do let me know.

Cheers
Steve


@evengers

It is a configuration issue. Solved by bumping up a few parameters. Thanks.

Notes here:

#file is in etc/neo4j
#see http://www.neo4j.org/graphgist?d788e117129c3730a042 for suggestions on config
#see http://stackoverflow.com/questions/17661902/tuning-neo4j-for-performance
#use http://neotechnology.com/calculatorv2/ to figure out size
#also see this http://docs.neo4j.org/chunked/stable/configuration-caches.html
#I was getting timeouts every 28k of cypher posts

# Default values for the low-level graph engine
#neostore.nodestore.db.mapped_memory=25M
#neostore.relationshipstore.db.mapped_memory=50M
#neostore.propertystore.db.mapped_memory=90M
#neostore.propertystore.db.strings.mapped_memory=130M
#neostore.propertystore.db.arrays.mapped_memory=130M
neostore.nodestore.db.mapped_memory=50M
neostore.relationshipstore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=10M

# Enable this to be able to upgrade a store from an older version
#allow_store_upgrade=true

# Enable this to specify a parser other than the default one.
#cypher_parser_version=2.0

# Keep logical logs, helps debugging but uses more disk space, enabled for
# legacy reasons. To limit space needed to store historical logs use values such
# as: "7 days" or "100M size" instead of "true"
#keep_logical_logs=true
keep_logical_logs=10M size

# Autoindexing

# Enable auto-indexing for nodes, default is false
#node_auto_indexing=true

# The node property keys to be auto-indexed, if enabled
#node_keys_indexable=name,age

# Enable auto-indexing for relationships, default is false
#relationship_auto_indexing=true
relationship_auto_indexing=true

# The relationship property keys to be auto-indexed, if enabled
#relationship_keys_indexable=name,age
relationship_keys_indexable=IN_COUNTRY,IN_CITY,IN_HQCOUNTRY,IN_HQCITY

# Enable shell server so that remote clients can connect via Neo4j shell.
#remote_shell_enabled=true

# The network interface IP the shell will listen on (use 0.0.0.0 for all interfaces)
remote_shell_host=127.0.0.1

# The port the shell will listen on, default is 1337
#remote_shell_port=1337


@philippkueng
Owner

@evengers thanks for letting me know about that (also @imtase).

While debugging I ran into the ulimit on OS X, so I wondered: what's the environment you're running the database and server on? Which versions of node, neo4j and node-neo4j are you running?

@evengers

Hi,

I'm running the latest Neo4j community edition under Ubuntu 14 as a guest on my MacBook. Node 0.11.13.

I give the guest 4 GB of memory.

I was using the latest node-neo4j ... now I am experimenting with co-request to compare.

The trick seems to be to set uniqueness constraints (see the sketch after the snippet below) and to raise the config parameters above their install defaults:

# Default values for the low-level graph engine
#neostore.nodestore.db.mapped_memory=25M
#neostore.relationshipstore.db.mapped_memory=50M
#neostore.propertystore.db.mapped_memory=90M
#neostore.propertystore.db.strings.mapped_memory=130M
#neostore.propertystore.db.arrays.mapped_memory=130M
neostore.nodestore.db.mapped_memory=100M
neostore.relationshipstore.db.mapped_memory=500M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=10M
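A hedged example of the uniqueness constraints mentioned above, using the CREATE CONSTRAINT syntax of that Neo4j era; the name property is an assumption about how countryparam and cityparam are shaped:

var Neo4j = require('node-neo4j');
var db = new Neo4j('http://localhost:7474');

// One constraint per label that the MERGE statements touch; each constraint
// also creates a backing index, which speeds up the MERGE lookups.
['COUNTRY', 'CITY'].forEach(function (label) {
  db.cypherQuery('CREATE CONSTRAINT ON (c:' + label + ') ASSERT c.name IS UNIQUE', function (err) {
    if (err) console.error(err);
  });
});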

Thanks!


@evengers

P.S. useful tips here: http://www.slideshare.net/neo4j/optimizing-cypher-32550605


@stormit-vn

We cannot go live without getting this issue fixed. Does anyone have a resolution for it?

@philippkueng
Owner

@stormit-vn hi, sorry to hear that you're still struggling with this. I no longer work with neo4j on a daily basis, but I wonder if it's related to how many connections your host OS allows to the database. In case you're doing a batch import, have you tried using https://caolan.github.io/async/docs.html#queue and specifying a number of workers so that neo4j can keep up?

If this doesn't solve it or point you in the right direction, can you elaborate on your use case and, if possible, share a piece of code that reproduces the problem?
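A minimal sketch of the async.queue throttling suggested above; the concurrency of 4, the Row label, and the stubbed rows array are illustrative assumptions:

var async = require('async');
var Neo4j = require('node-neo4j');

var db = new Neo4j('http://localhost:7474');

// Worker: run one Cypher statement, then let the queue hand out the next
// task by calling done().
var q = async.queue(function (row, done) {
  db.cypherQuery("MERGE (n:Row {id: '" + row.id + "'}) RETURN n", function (err) {
    if (err) console.error('query error:', err);
    done();
  });
}, 4); // at most 4 statements in flight, so neo4j can keep up

// async v2-style drain hook; in async v3 this would be q.drain(fn).
q.drain = function () { console.log('all rows processed'); };

// Push each parsed CSV row onto the queue instead of querying directly;
// a stub stands in for the parser output here.
var rows = [{ id: '1' }, { id: '2' }];
rows.forEach(function (row) { q.push(row); });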

@stormit-vn

@philippkueng Phil, it's just a few connections and just CRUD operations; sometimes we get a read error and sometimes a write error.

However, I just updated our code to specify some configuration when opening a new connection, and so far I haven't received any errors like those. Here is the configuration I am using:

{
  encrypted: false,

  // The maximum total number of connections allowed to be managed by the connection pool, per host.
  // This includes both in-use and idle connections. No maximum connection pool size is imposed
  // by default.
  maxConnectionPoolSize: 100,

  // The maximum allowed lifetime for a pooled connection in milliseconds. Pooled connections older than this
  // threshold will be closed and removed from the pool. Such discarding happens during connection acquisition
  // so that a new session is never backed by an old connection. Setting this option to a low value will cause
  // high connection churn and might result in a performance hit. It is recommended to set the maximum lifetime
  // to a slightly smaller value than the one configured in network equipment (load balancers, proxies, firewalls,
  // etc. can also limit maximum connection lifetime). No maximum lifetime limit is imposed by default. Zero
  // and negative values result in the lifetime not being checked.
  maxConnectionLifetime: 0,

  // The maximum amount of time to wait to acquire a connection from the pool (to either create a new
  // connection or borrow an existing one).
  connectionAcquisitionTimeout: 60000,

  // Specify the maximum time in milliseconds transactions are allowed to retry via the
  // Session#readTransaction() and Session#writeTransaction() functions.
  // These functions will retry the given unit of work on `ServiceUnavailable`, `SessionExpired` and transient
  // errors with exponential back-off using an initial delay of 1 second.
  // Default value is 30000, which is 30 seconds.
  maxTransactionRetryTime: 60000,

  // Provide an alternative load balancing strategy for the routing driver to use.
  // The driver uses "least_connected" by default.
  // Note: strategies are experimental and this option could be removed in a future minor version.
  // loadBalancingStrategy: 'least_connected' | 'round_robin',

  // Specify the socket connection timeout in milliseconds. Numeric values are expected. Negative and zero values
  // result in no timeout being applied. Connection establishment will then be bound by the timeout configured
  // at the operating system level. Default value is 5000, which is 5 seconds.
  connectionTimeout: 30000
}
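For context, a sketch of where such a config object goes, assuming the official neo4j-javascript-driver 1.x that this comment appears to use; the URL, credentials, and trimmed-down config are placeholders:

var neo4j = require('neo4j-driver').v1;

// Stand-in for the full object shown above; it is passed as the third
// argument to neo4j.driver().
var config = { encrypted: false, maxConnectionPoolSize: 100, connectionTimeout: 30000 };

var driver = neo4j.driver(
  'bolt://localhost:7687',
  neo4j.auth.basic('neo4j', 'password'),
  config
);

// Quick smoke test that the driver connects with this configuration.
var session = driver.session();
session.run('RETURN 1 AS n')
  .then(function (result) {
    console.log(result.records[0].get('n').toNumber());
    session.close();
    driver.close();
  });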

@philippkueng
Owner

@stormit-vn I believe you've got your answer with neo4j/neo4j-javascript-driver#126 (comment). While the library you're using speaks Bolt, I believe neo4j responds the same way to HTTP requests when it kills connections. I'll try to replicate it later. Thanks for connecting the issues and bringing this to our attention.
