NotALeaderError in write_transaction() #276

P1zz4br0etch3n · 2019-01-22T16:03:36Z

I get a NotALeaderError in write_transaction() after the application has been running with neo4j-driver for about one day on average. Since it runs 24/7, this happens almost every day. The app runs a transaction every 5 minutes if necessary. After the first error is thrown, no further transactions succeed until the application is restarted.

This might be related: neo4j-contrib/neomodel#335

Neo4j Version: 3.4.9 Enterprise
Neo4j Mode: Causal cluster with 3 cores
Driver version: Python driver 1.7.1
Operating System: Docker base image python:2.7-slim
Packaging Tool: Pipenv

Steps to reproduce

Start Neo4j on self-hosted VM
Start Application on Kubernetes
Let it run for a day

Expected behavior

The application remains able to run write_transaction() against the Neo4j cluster.

Actual behavior

After running for about 1 day the application stops being able to write.
Traceback (modified to hide function/application names):

Traceback (most recent call last):
    results = session.write_transaction(_unit_of_work, time_limit)
  File "/usr/local/lib/python2.7/site-packages/neo4j/__init__.py", line 708, in write_transaction
    return self._run_transaction(WRITE_ACCESS, unit_of_work, *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/neo4j/__init__.py", line 683, in _run_transaction
    tx.close()
  File "/usr/local/lib/python2.7/site-packages/neo4j/__init__.py", line 822, in close
    self.sync()
  File "/usr/local/lib/python2.7/site-packages/neo4j/__init__.py", line 787, in sync
    self.session.sync()
  File "/usr/local/lib/python2.7/site-packages/neo4j/__init__.py", line 538, in sync
    detail_count, _ = self._connection.sync()
  File "/usr/local/lib/python2.7/site-packages/neobolt/direct.py", line 506, in sync
    detail_delta, summary_delta = self.fetch()
  File "/usr/local/lib/python2.7/site-packages/neobolt/direct.py", line 413, in fetch
    raise error
NotALeaderError: No write operations are allowed directly on this database. Writes must pass through the leader. The role of this server is: FOLLOWER


technige · 2019-01-22T16:30:13Z

I'm not sure how easy this'll be to recreate just from the information we have here. But it sounds to me like the routing table has invalid data or isn't being updated correctly.

Getting extra logs from the client would help. You'll need to hook into the logger called "neobolt" to see the conversation between client and server. There's a built-in helper to dump this to stdout, which you might be able to capture. These two lines just need to go at the top of the application:

from neobolt.diagnostics import watch
watch("neobolt")

After that, we should have a much clearer idea what is (or isn't) happening.
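If stdout is hard to capture (for example in a long-running container), one alternative is to attach a file handler to the same "neobolt" logger using Python's standard logging module. This is only a sketch: the helper name capture_neobolt_log, the log path, and the format string are illustrative assumptions, not part of the driver.

```python
import logging


def capture_neobolt_log(path, level=logging.DEBUG):
    """Send records from the "neobolt" logger to a file and return the handler.

    The function name and the format string are hypothetical; only the
    logger name "neobolt" comes from the thread above.
    """
    logger = logging.getLogger("neobolt")
    logger.setLevel(level)
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return handler
```

Calling capture_neobolt_log("/tmp/neobolt.log") once at application start-up, before the driver is created, should write the client/server conversation to that file; the path here is just an example.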

P1zz4br0etch3n · 2019-01-24T13:23:19Z

Thanks for your fast reply. We have been capturing the neobolt logs since yesterday, but the error hasn't occurred yet. I will comment again when it recurs.

P1zz4br0etch3n · 2019-01-28T10:12:08Z

We now have a log file that captured the error, starting from the last successful commit. Do you need the queries we run? If yes, can I send the file to you directly? I don't want to publish them here.

technige · 2019-01-28T10:13:55Z

Yes, please drop me an email: nigel at neo4j dot com.

P1zz4br0etch3n · 2019-01-28T11:49:27Z

Ok, I've sent you an email with the issue title in subject.

technige · 2019-01-28T16:53:52Z

Thanks, received. I'll have a look over the next couple of days.

P1zz4br0etch3n · 2019-02-05T09:06:58Z

Any progress on this?

P1zz4br0etch3n · 2019-02-18T08:59:00Z

We're still getting that error.

technige · 2019-02-27T10:18:19Z

@P1zz4br0etch3n Can you confirm that you are using bolt+routing and not just bolt? We can't see any routing table updates in the log file.

P1zz4br0etch3n · 2019-02-27T13:27:34Z

Yes, we are definitely using bolt+routing.
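The distinction asked about above comes down to the URI scheme handed to the driver: bolt+routing:// versus plain bolt://. A minimal way to double-check a configured URI, assuming it is available as a string, might look like the following. The helper name uses_routing and the host name core1.example.com are made up for illustration.

```python
from urllib.parse import urlparse


def uses_routing(uri):
    """Return True if the URI scheme is bolt+routing, False otherwise.

    Hypothetical helper; it only inspects the scheme and does not
    contact the server.
    """
    return urlparse(uri).scheme == "bolt+routing"


# Example host name is an assumption, not taken from the thread.
routing_uri = "bolt+routing://core1.example.com:7687"
direct_uri = "bolt://core1.example.com:7687"
```

Here uses_routing(routing_uri) is true and uses_routing(direct_uri) is false. A check like this only confirms the configuration; it says nothing about whether the routing table is actually being refreshed at runtime.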

technige · 2019-03-04T19:36:27Z

Will be fixed in the 1.7.2 patch that contains #283 (as well as equivalent patches for 1.5 and 1.6). Due for release on Thursday 7th March 2019.

mvanderkroon mentioned this issue Feb 6, 2019

[WIP] Feature/neo retry logic neo4j-contrib/neomodel#398

Closed

technige closed this as completed Mar 4, 2019

aonamrata mentioned this issue Jun 19, 2020

NotALeaderError in read session #429

Closed
