
[-] reset session before returning connection to the pool #589

Conversation

amihajlovic
Contributor

Tasks can leave objects behind after they finish executing.
Most commonly, session-scoped temporary tables and advisory locks.
Those objects persist after the connection is returned to the pool and can cause issues when the connection is reused.

Simple example:
Start pg_timetable with --cron-workers=1 and create a job SELECT timetable.add_job('tester', '* * * * *', 'create temporary table x();');

The first job run will succeed, and all other runs during that connection's lifetime will fail with: ERROR: relation "x" already exists (SQLSTATE 42P07).

This change also enables using SET ROLE in autonomous tasks, as the role is reset when the connection is returned to the pool.
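
For illustration, the kind of per-release cleanup discussed here could consist of statements like the following (a sketch only; the exact statements are up to the implementation):

```sql
-- Illustrative session reset, issued before a connection goes back to the pool:
SELECT pg_advisory_unlock_all();  -- release all session-level advisory locks
DISCARD TEMP;                     -- drop session-scoped temporary tables
RESET ROLE;                       -- undo any SET ROLE from autonomous tasks
RESET ALL;                        -- restore session GUCs to their defaults
```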

@pashagolub pashagolub force-pushed the reset-connection-after-release branch from e91bfee to 255daad Compare July 6, 2023 13:29
@pashagolub
Collaborator

Thanks for your help. That indeed might be a problem. I would rather start with less strict reset statements. IMO temp objects, GUCs and advisory locks are top priority.

@amihajlovic
Contributor Author

> Thanks for your help. That indeed might be a problem. I would rather start with less strict reset statements. IMO temp objects, GUCs and advisory locks are top priority.

Well, I originally wanted to resolve SET ROLE in autonomous tasks and only afterwards realized you are working on that (among other things).

Regarding priority, temp tables & cursors introduce similar issues when leaked. And leaked LISTENs can cause major issues when pg_notify is used in triggers: the Postgres notification queue is fixed-size, and once it fills up, all those triggers calling pg_notify start throwing errors.
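
To make the LISTEN leak concrete, a hypothetical session can be inspected with Postgres' built-in functions (channel name here is illustrative):

```sql
LISTEN some_channel;                   -- registered by a task, never cleaned up
SELECT pg_listening_channels();        -- channels still active on this session
SELECT pg_notification_queue_usage();  -- fraction (0 to 1) of the fixed-size queue in use
UNLISTEN *;                            -- what a session reset would need to issue
```

While the session keeps listening but never calls into the driver to consume notifications, the queue usage creeps up until pg_notify starts failing.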

@pashagolub
Collaborator

Indeed, you're right about custom LISTENs... Let me think about it a little :)

@amihajlovic
Contributor Author

amihajlovic commented Jul 6, 2023

On second thought, maybe we should consider a different approach.
There is little benefit in pooling chain executor connections. Timetable is a job agent after all and jobs are not executed multiple times per second.
With that in mind, maybe the safest approach is to use the pool only for timetable "internal" queries (querying timetable.* objects) and for listening to timetable notifications.
For each chain execution we open a new connection. In that case leakage can only occur between two tasks of the same chain, and that is perfectly acceptable.
This keeps all chain executions completely isolated from one another and, more importantly, isolated from timetable business logic. It will also simplify the code a bit, as remote and local execution are handled with the same logic, just different connection strings.

@pashagolub
Collaborator

Well, we have clients with a lot of jobs, so opening/closing connection will really hurt :(

@amihajlovic
Contributor Author

Separate connection pool for executors? In that case we can safely use DISCARD ALL on release.
That one would be limited by the max connections configs.
And we limit the timetable internal pool to something like 3 connections, as those should be more than enough.
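
For reference, DISCARD ALL covers everything discussed above; per the PostgreSQL documentation it is equivalent to this sequence:

```sql
-- DISCARD ALL (cannot run inside a transaction) expands to:
CLOSE ALL;
SET SESSION AUTHORIZATION DEFAULT;
RESET ALL;
DEALLOCATE ALL;
UNLISTEN *;
SELECT pg_advisory_unlock_all();
DISCARD PLANS;
DISCARD TEMP;
DISCARD SEQUENCES;
```

Note that DEALLOCATE ALL drops prepared statements, which is why it is only safe on a pool dedicated to executors rather than on connections the driver also uses for its own statement caching.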

@pashagolub
Collaborator

I believe there is a simpler way. Your first approach sounds good, just need to test that and we're fine. :) Will take a look later

@amihajlovic
Contributor Author

> Separate connection pool for executors? In that case we can safely use DISCARD ALL on release. That one would be limited by the max connections configs. And we limit the timetable internal pool to something like 3 connections, as those should be more than enough.

And that leads towards dynamic pooling for remote connections. If there are benefits for pooling local chains there is benefit for pooling remote connections as well.

@amihajlovic
Contributor Author

> I believe there is a simpler way. Your first approach sounds good, just need to test that and we're fine. :) Will take a look later

👍
It's the easiest one to implement right now.

@pashagolub
Collaborator

pashagolub commented Jul 6, 2023

> And that leads towards dynamic pooling for remote connections. If there are benefits for pooling local chains there is benefit for pooling remote connections as well.

Might help for a lot of remote tasks, yes. I wonder how many people prefer to use remote chains instead of having a standalone pg_timetable instance for each target database

@pashagolub pashagolub self-assigned this Jul 11, 2023
@pashagolub pashagolub added the 🐞 bug Something isn't working label Jul 11, 2023
@pashagolub pashagolub force-pushed the reset-connection-after-release branch from 5db2c69 to 32f11d2 Compare July 11, 2023 12:21
@pashagolub pashagolub changed the title Reset session before returning connection to the pool [-] reset session before returning connection to the pool Jul 11, 2023
@coveralls

Pull Request Test Coverage Report for Build 5519859534

  • 6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.5%) to 94.939%

Totals

  • Change from base Build 5519918133: 0.5%
  • Covered Lines: 1632
  • Relevant Lines: 1719

💛 - Coveralls

@pashagolub pashagolub merged commit 60d3347 into cybertec-postgresql:master Jul 11, 2023
6 checks passed
@pashagolub
Copy link
Collaborator

Thanks a lot for your help!
