Copy phase is acquiring a lock on my original table I don't know why #94
Comments
Could you give it another test and run this query as your RDS admin user when you see those waiting locks:
Would also be good to just grab a dump of pg_stat_activity like so:
Make sure to REDACT any sensitive info before pasting it into your reply. |
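The two queries referred to in the comment above did not survive in this copy of the thread. As a stand-in, and not a reconstruction of what was actually posted, here is a minimal sketch of one common way to list blocked sessions and dump `pg_stat_activity`. It assumes PostgreSQL 9.6 or later (for `pg_blocking_pids`) and any DB-API connection; the helper name `fetch_rows` is made up for this example.

```python
# Sketch only: NOT the queries from the original comment, which were lost.
# Assumes PostgreSQL 9.6+ so that pg_blocking_pids() is available.
BLOCKED_SQL = """
SELECT a.pid,
       a.usename,
       pg_blocking_pids(a.pid) AS blocked_by,
       a.wait_event_type,
       a.wait_event,
       a.state,
       now() - a.xact_start AS xact_age,
       left(a.query, 200) AS query
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0
ORDER BY a.xact_start
"""

# Plain dump of every session, to be redacted before sharing.
ACTIVITY_SQL = "SELECT * FROM pg_stat_activity"


def fetch_rows(conn, sql):
    """Run sql on a DB-API connection and return the rows as dicts.

    fetch_rows is a hypothetical helper; conn could come from a driver
    such as psycopg2 (assumed to be installed separately).
    """
    cur = conn.cursor()
    try:
        cur.execute(sql)
        columns = [d[0] for d in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()
```

Running `fetch_rows(conn, BLOCKED_SQL)` while the lock waits are visible should show which `pid` each stuck session is waiting on; an empty result while waits are clearly happening may point to the connected role lacking visibility into other sessions.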
BTW, it might also be helpful to see the queries section of Performance Insights from the same time period as the screenshot you posted. Again, remember to blur any sensitive info if you post that. |
Yeah, the output of the above query will definitely be useful. Thinking out loud - I wonder if it's a byproduct of using
|
@shayonj, I also think that it's the byproduct of @jfrost Also, here are the screenshots of the queries section of Performance Insights. Mostly the UPDATE queries on the original tables are stuck. Here |
@shayonj I am not finding locks per se with the query @jfrost shared, but this is the behavior I see while trying to replicate the problem locally. You can see how queries go from taking < 10ms to 400ms and more. Clip of what happens while trying to replicate it locally: https://imgur.com/M5h5ZPm |
If the query I sent over returned zero results while the issue is happening and you can see the lock waits in Performance Insights, then it's possible you are running the query as a user which doesn't have the appropriate permissions. You can do a simple |
Yeah, that's interesting. If you have a reproducible script or similar, that could be super useful. I also see that in the video the latency spikes for a handful of queries, but most are <100ms (?). I will take a deeper look and get back within the week. I have been meaning to refactor parts of this code and introduce the concept of two connections to set up the copy and trigger, like pg_repack, which I think would allow us to also drop the |
@shayonj As of now we have only been able to replicate the locking behaviour on our Amazon RDS instances (it doesn't happen in a local environment). Also, copying data in batches instead of in a single query solves it. So I don't think the serializable transaction is the issue. |
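To make the "copy in batches" workaround concrete, here is a minimal sketch of splitting one large `INSERT ... SELECT` into primary-key ranges. It is not pg-osc's implementation or the code from the fork; the function names and the table names `books` / `books_shadow` are hypothetical, and it assumes an integer primary key.

```python
def batch_ranges(min_id, max_id, batch_size):
    """Yield inclusive (low, high) key ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def copy_statements(source, target, key, min_id, max_id, batch_size):
    """Build one INSERT ... SELECT per key range.

    Identifiers are interpolated as-is, so this is for illustration with
    trusted, hard-coded names only (all names here are hypothetical).
    """
    return [
        f"INSERT INTO {target} SELECT * FROM {source} "
        f"WHERE {key} BETWEEN {low} AND {high}"
        for low, high in batch_ranges(min_id, max_id, batch_size)
    ]


if __name__ == "__main__":
    for stmt in copy_statements("books", "books_shadow", "id", 1, 10, 4):
        print(stmt)
```

Each statement can then run in its own short transaction, so any lock taken during the copy is held for one batch rather than for the whole table.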
@yash-toddleapp Any chance you are running those UPDATE queries on your RDS instance with serializable isolation or something other than read committed? |
@shayonj @jfrost Nah, I don't think so. I found something weirder. The table only locks while What I did was make a list of all the queries being executed, with the help of the --verbose flag, and run them using psql. That does not lock the table. So something in the tool itself is causing the lock, and I don't know what it is. |
Interesting! I will look a bit more deeply in the next few days and get back |
So, I am unable to replicate this in our smoke spec env and am not seeing any consistent locking on the parent table. I am curious
Thanks |
*Also, roughly how big the table is and how many writes/updates it gets. Thanks! |
@shayonj Apparently the issue is that the audit table is getting locked because of the So it gets stuck on the trigger, which wants to acquire a row-level lock on the audit table to insert the log of the insert/update but can't. And yes, this behavior occurs only on RDS; we still haven't figured out why. Running the disable vacuum query before the serializable transaction starts fixes the issue. I have this fork https://github.dev/yash-toddleapp/pg-osc which has this fix along with some other fixes. I'll make a PR once I get time. We have tables as big as 50-100 million rows. |
That’s a good find, and it makes sense, since the serializable transaction shouldn’t be causing issues. I will get my refactor in after your patch. Let me know if I can help with the patch or anything. Thanks! Curious why it’s only happening on RDS. I will try to look into it later, but I can see how the ALTER can cause an access exclusive lock and a lock queue. |
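The lock queue mentioned above can be illustrated with a toy model. This is a simplified sketch of PostgreSQL's table-level locking, not pg-osc code: it assumes, as the comment suggests, that the ALTER on the audit table takes an ACCESS EXCLUSIVE lock, and it relies on the fact that table locks are held until the owning transaction ends. The class and transaction names are invented for the example.

```python
# Two rows of PostgreSQL's table-level lock conflict matrix.
CONFLICTS = {
    "ROW EXCLUSIVE": {
        "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
    },
    "ACCESS EXCLUSIVE": {
        "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE",
        "SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE",
        "EXCLUSIVE", "ACCESS EXCLUSIVE",
    },
}


def conflicts(a, b):
    """True if lock modes a and b cannot be held together."""
    return b in CONFLICTS.get(a, set()) or a in CONFLICTS.get(b, set())


class TableLock:
    """Toy lock manager for a single table (hypothetical helper)."""

    def __init__(self):
        self.held = []     # granted (txn, mode) pairs
        self.waiting = []  # queued (txn, mode) pairs, in arrival order

    def request(self, txn, mode):
        """Grant the lock, or queue it behind any conflicting holder/waiter."""
        for other, other_mode in self.held + self.waiting:
            if other != txn and conflicts(mode, other_mode):
                self.waiting.append((txn, mode))
                return False
        self.held.append((txn, mode))
        return True

    def commit(self, txn):
        """Release txn's locks and retry the queue in order."""
        self.held = [(t, m) for t, m in self.held if t != txn]
        pending, self.waiting = self.waiting, []
        for t, m in pending:
            if t != txn:
                self.request(t, m)


def simulate(alter_inside_copy_txn):
    """Return whether three trigger inserts get their lock during the copy."""
    audit = TableLock()
    if alter_inside_copy_txn:
        # ALTER runs inside the long copy transaction: lock stays held.
        audit.request("copy", "ACCESS EXCLUSIVE")
    else:
        # ALTER runs in its own short transaction before the copy starts.
        audit.request("setup", "ACCESS EXCLUSIVE")
        audit.commit("setup")
    # Application writes fire the trigger, which inserts into the audit table.
    return [audit.request(f"app{i}", "ROW EXCLUSIVE") for i in range(3)]


if __name__ == "__main__":
    print("ALTER inside copy txn:", simulate(True))
    print("ALTER before copy txn:", simulate(False))
```

In the model, every trigger insert queues while the ALTER's lock is held by the long-running copy transaction, and none queue once the ALTER commits on its own beforehand, which matches the ordering fix described in the previous comment.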
I proposed a simplified change here: #97 Feel free to add to it or open new PRs with any other fixes you found, also if you are able to test/verify, then that'd be great too. Thanks! |
v0.9.2 is now out with the fix: https://github.com/shayonj/pg-osc/releases/tag/v0.9.2 |
Closing this. Thanks for the report and the brainstorming! Please feel free to open more reports or suggest PRs too. Thanks again! |
It's mostly happening for UPDATE queries, and I don't know why. I thought it would release the lock, but the curve kept going up. I had to intervene manually and stop the process. It came back down to normal as soon as I stopped it.