
Timeouts on Parallel Requests #7

Closed
zivSher opened this issue Jan 2, 2018 · 6 comments


zivSher commented Jan 2, 2018

Hi,
We're using Django 1.10 with PostgreSQL 9.2 and Python 3.5.
We decided to use your package because we need an auto-incrementing field that is not the primary key (on one of our models), and its values should be unique per instance of another model.
The sequences table lives in the same database as our models.
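For context, we call it roughly like this (a simplified sketch; the model and sequence names are illustrative, not our real ones):

```python
from django.db import transaction
from sequences import get_next_value

from myapp.models import Item  # hypothetical model with a non-PK "number" field


def create_item(parent):
    with transaction.atomic():
        # One independent counter per parent object, so numbers only have to
        # be unique within a single parent, not globally.
        number = get_next_value("item_number_{}".format(parent.pk))
        return Item.objects.create(parent=parent, number=number)
```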

The solution seemed to work well until we started getting parallel requests. In that case the database apparently stays locked for too long and we start getting timeouts (our API gateway, Apigee, is configured with a 10-second timeout).

Is that behavior expected, or should the package handle those cases?
Thanks.

@aaugustin (Owner)

How many parallel connections do you have and at what rate are you generating IDs? The technique used by django-sequences on PostgreSQL < 9.5 likely doesn't scale to hundreds of connections creating tens of IDs per second.

Also, do you generate several IDs within the same database transaction? In that case you need to create them in a consistent order to avoid deadlocks.
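For example (illustrative sequence names), always requesting the counters in the same order keeps two concurrent transactions from waiting on each other's row locks:

```python
from django.db import transaction
from sequences import get_next_value


def create_invoice_and_receipt():
    with transaction.atomic():
        # Always "invoices" before "receipts", in every code path that needs
        # both; requesting them in different orders in different places can deadlock.
        invoice_number = get_next_value("invoices")
        receipt_number = get_next_value("receipts")
        # ... create the corresponding rows here ...
```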


zivSher commented Jan 2, 2018

@aaugustin, I'm trying up to 30 parallel connections, at about 10 requests per second.
And no - only one ID per request.
It also happens with a much lower number of parallel connections and requests per second: 5 users and about 2 rps.

@aaugustin (Owner)

That could be pushing the limits of the implementation on PostgreSQL < 9.5, depending on the hardware you're using to run PostgreSQL.

You should check the health of your database when running this workload, with the help of your DBA if you have one.
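For example, while the load test is running you can look at which backends are stuck waiting on locks (a rough sketch through Django's connection; on 9.2 the waiting column flags blocked backends):

```python
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("""
        SELECT pid, state, waiting, now() - query_start AS duration, query
        FROM pg_stat_activity
        WHERE datname = current_database()
        ORDER BY duration DESC
    """)
    for pid, state, waiting, duration, query in cursor.fetchall():
        print(pid, state, waiting, duration, query)
```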

You should also get better results on PostgreSQL ≥ 9.5, which has a more optimized implementation.
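Roughly speaking, on 9.5+ the counter can be bumped with a single upsert instead of a locking select-then-update, something along these lines (illustrative SQL, not necessarily the exact statement the package runs):

```python
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute(
        """
        -- table name assumed for illustration; check the sequences app's migrations
        INSERT INTO sequences_sequence (name, last)
        VALUES (%s, 1)
        ON CONFLICT (name)
        DO UPDATE SET last = sequences_sequence.last + 1
        RETURNING last
        """,
        ["invoices"],
    )
    value = cursor.fetchone()[0]
```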


zivSher commented Jan 2, 2018

@aaugustin, thanks for your answer.

When not using django-sequences, everything seems to work fine under these loads.

Some more technical details that may help diagnose the problem:
We're running gunicorn 19.6.0 configured with 5 threads on 3 servers, PostgreSQL with max_connections set to 856, and Django's CONN_MAX_AGE is currently set to 600.

Does it sound reasonable?

@aaugustin (Owner)

django-sequences is really designed to generate gapless sequences. This is a regulatory requirement in some contexts (e.g. accounting). These tend not to be write-heavy contexts, or at least not concurrent-write-heavy contexts.

Guaranteeing that there are no gaps requires serializing the transactions that increment a given counter, which forces PostgreSQL to do a lot of work to synchronize connections. This is where the overhead comes from.

If you don't need gapless sequences, you can probably find a solution that's less expensive in terms of database load, for instance with native PostgreSQL sequences.
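For instance, a native PostgreSQL sequence never blocks concurrent callers, at the cost of possible gaps (a value obtained in a transaction that later rolls back is lost). A rough sketch:

```python
from django.db import connection

# One-time setup, e.g. in a migration or directly in psql:
#   CREATE SEQUENCE item_numbers;


def next_item_number():
    # nextval() never waits for other transactions, but values consumed by
    # transactions that roll back leave gaps in the numbering.
    with connection.cursor() as cursor:
        cursor.execute("SELECT nextval('item_numbers')")
        return cursor.fetchone()[0]
```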

If you do need gapless sequences, then you need either a database that's fast enough to handle the transactional workload or another solution for generating IDs.

@aaugustin (Owner)

I just noticed this limitation is discussed in the README:

Database transactions that call get_next_value for a given sequence are serialized. In other words, when you call get_next_value in a database transaction, other callers which attempt to get a value from the same sequence will block until the transaction completes, either with a commit or a rollback. You should keep such transactions short to minimize the impact on performance.
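Concretely, that means keeping the transaction that calls get_next_value as short as possible, e.g. (a sketch with a hypothetical model and helpers):

```python
from django.db import transaction
from sequences import get_next_value

from myapp.models import Invoice  # hypothetical model

payload = prepare_payload()  # hypothetical slow work, done before taking the lock

with transaction.atomic():
    number = get_next_value("invoices")  # the sequence row is locked here...
    Invoice.objects.create(number=number, **payload)
# ...and released here, as soon as the transaction commits.

send_confirmation_email(number)  # hypothetical slow work, done after the commit
```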
