
Timeouts on Parallel Requests #7

Closed
zivSher opened this issue Jan 2, 2018 · 6 comments


zivSher commented Jan 2, 2018

Hi,
We're using Django 1.10 with PostgreSQL 9.2 and Python 3.5.
We decided to use your package because we need an auto-incrementing field that is not the primary key (on one of our models), and its values should be unique per instance of another model.
The sequences table lives in the same database as our models.
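For context, we call it roughly like this (a simplified sketch; the model and sequence names are illustrative, not our real ones):

```python
from django.db import transaction
from sequences import get_next_value

from myapp.models import Item  # hypothetical model with a non-PK "number" field


def create_item(parent):
    with transaction.atomic():
        # One independent counter per parent object, so numbers only have to
        # be unique within a single parent, not globally.
        number = get_next_value("item_number_{}".format(parent.pk))
        return Item.objects.create(parent=parent, number=number)
```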

The solution seemed to work well until we started getting parallel requests. In that case the database apparently stays locked for too long and we start getting timeouts (our API gateway, Apigee, is configured with a 10-second timeout).

Is that behavior expected, or should the package handle those cases?
Thanks.

@aaugustin (Owner)

How many parallel connections do you have and at what rate are you generating IDs? The technique used by django-sequences on PostgreSQL < 9.5 likely doesn't scale to hundreds of connections creating tens of IDs per second.

Also, do you generate several IDs within the same database transaction? In that case you need to create them in a consistent order to avoid deadlocks.
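For example (illustrative sequence names), always requesting the counters in the same order keeps two concurrent transactions from waiting on each other's row locks:

```python
from django.db import transaction
from sequences import get_next_value


def create_invoice_and_receipt():
    with transaction.atomic():
        # Always "invoices" before "receipts", in every code path that needs
        # both; requesting them in different orders in different places can deadlock.
        invoice_number = get_next_value("invoices")
        receipt_number = get_next_value("receipts")
        # ... create the corresponding rows here ...
```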


zivSher commented Jan 2, 2018

@aaugustin, I'm trying up to 30 parallel connections, at about 10 requests per second.
And no - only one ID per request.
It also happens with a much lower number of parallel connections and requests per second: 5 users and about 2 rps.

@aaugustin (Owner)

That could be pushing the limits of the implementation on PostgreSQL < 9.5, depending on the hardware you're using to run PostgreSQL.

You should check the health of your database when running this workload, with the help of your DBA if you have one.
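For example, while the load test is running you can look at which backends are stuck waiting on locks (a rough sketch through Django's connection; on 9.2 the waiting column flags blocked backends):

```python
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("""
        SELECT pid, state, waiting, now() - query_start AS duration, query
        FROM pg_stat_activity
        WHERE datname = current_database()
        ORDER BY duration DESC
    """)
    for pid, state, waiting, duration, query in cursor.fetchall():
        print(pid, state, waiting, duration, query)
```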

You should also get better results on PostgreSQL ≥ 9.5, which has a more optimized implementation.
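Roughly speaking, on 9.5+ the counter can be bumped with a single upsert instead of a locking select-then-update, something along these lines (illustrative SQL, not necessarily the exact statement the package runs):

```python
from django.db import connection

with connection.cursor() as cursor:
    cursor.execute(
        """
        -- table name assumed for illustration; check the sequences app's migrations
        INSERT INTO sequences_sequence (name, last)
        VALUES (%s, 1)
        ON CONFLICT (name)
        DO UPDATE SET last = sequences_sequence.last + 1
        RETURNING last
        """,
        ["invoices"],
    )
    value = cursor.fetchone()[0]
```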


zivSher commented Jan 2, 2018

@aaugustin, thanks for your answer.

When not using django-sequences, everything seems to work fine under these loads.

Some more technical details that may help diagnose the problem:
We're running gunicorn 19.6.0 configured with 5 threads on 3 servers, PostgreSQL with max_connections set to 856, and Django's CONN_MAX_AGE is currently set to 600.

Does it sound reasonable?

@aaugustin (Owner)

django-sequences is really designed to generate gapless sequences. This is a regulatory requirement in some contexts (e.g. accounting). These tend not to be write-heavy contexts, or at least not concurrent-write-heavy contexts.

Guaranteeing that there are no gaps requires serializing the transactions that increment a given counter, which forces PostgreSQL to do a lot of work to synchronize connections. This is where the overhead comes from.

If you don't need gapless sequences, you can probably find a solution that's less expensive in terms of database load, for instance with native PostgreSQL sequences.
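For instance, a native PostgreSQL sequence never blocks concurrent callers, at the cost of possible gaps (a value obtained in a transaction that later rolls back is lost). A rough sketch:

```python
from django.db import connection

# One-time setup, e.g. in a migration or directly in psql:
#   CREATE SEQUENCE item_numbers;


def next_item_number():
    # nextval() never waits for other transactions, but values consumed by
    # transactions that roll back leave gaps in the numbering.
    with connection.cursor() as cursor:
        cursor.execute("SELECT nextval('item_numbers')")
        return cursor.fetchone()[0]
```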

If you do need gapless sequences, then you need either a database that's fast enough to handle the transactional workload or another solution for generating IDs.

@aaugustin (Owner)

I just noticed this limitation is discussed in the README:

Database transactions that call get_next_value for a given sequence are serialized. In other words, when you call get_next_value in a database transaction, other callers which attempt to get a value from the same sequence will block until the transaction completes, either with a commit or a rollback. You should keep such transactions short to minimize the impact on performance.
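Concretely, that means keeping the transaction that calls get_next_value as short as possible, e.g. (a sketch with a hypothetical model and helpers):

```python
from django.db import transaction
from sequences import get_next_value

from myapp.models import Invoice  # hypothetical model

payload = prepare_payload()  # hypothetical slow work, done before taking the lock

with transaction.atomic():
    number = get_next_value("invoices")  # the sequence row is locked here...
    Invoice.objects.create(number=number, **payload)
# ...and released here, as soon as the transaction commits.

send_confirmation_email(number)  # hypothetical slow work, done after the commit
```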
