
feat(db): add postgres triggers to cleanup expired rows on ttl-enabled schema tables #10389

Merged
merged 4 commits into master from feat/pg-cleanup-trigger on Mar 28, 2023

Conversation

@bungle (Member) commented Feb 28, 2023

Summary

This is an alternative to #10331. @windmgc please check this out. What do you think? If OK, please take over and do the rest of the changes (i.e. add migrations so that the cleanup happens for other entities too, similar to what I did for oauth2_tokens here).

Fix

FTI-4746 and KAG-516
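For illustration, here is a minimal sketch of the trigger idea being proposed, assuming a statement-level trigger on oauth2_tokens with a probability gate and a row limit; the function name, probability, and limit are illustrative assumptions, not the exact migration in this PR:

```sql
-- Hypothetical sketch of a TTL-cleanup trigger; names and constants
-- are illustrative, not the exact code merged in this PR.
CREATE OR REPLACE FUNCTION delete_expired_oauth2_tokens() RETURNS trigger AS $$
BEGIN
  -- Only a small fraction of insert statements pays the cleanup cost,
  -- and the number of rows deleted per run is capped.
  IF random() < 0.01 THEN
    DELETE FROM oauth2_tokens
     WHERE ctid IN (SELECT ctid
                      FROM oauth2_tokens
                     WHERE ttl < CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
                     LIMIT 100);
  END IF;
  RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER oauth2_tokens_ttl_trigger
  AFTER INSERT ON oauth2_tokens
  FOR EACH STATEMENT
  EXECUTE PROCEDURE delete_expired_oauth2_tokens();
```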

@hanshuebner (Contributor)

This is a clever approach. Are we fine with the random response-time increase that we would see for those insertions that perform a garbage collection?

@hbagdi (Member) commented Feb 28, 2023

This programs the database in a way that is very hard to control when something goes wrong or needs tweaking.
Is creating such stored procedures and then invoking them from control-plane nodes a better balance? That would let us invoke the procedure on a schedule and connection controlled by the CP, and we (developers) could fine-tune that based on the feedback we receive.
Kong is sensitive to database latencies, especially when there is a short burst of write traffic, and such a change could potentially worsen database performance and increase the time it takes for Kong's configuration to propagate across the cluster in hybrid mode.

I must say that the cleverness is appealing especially since it offloads all the work to the database server!
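For contrast, a minimal sketch of the alternative @hbagdi describes: a plain stored function (the name and batch size are hypothetical, not an actual Kong API) that control-plane nodes would invoke on their own schedule and connection, keeping the cleanup out of the insert path:

```sql
-- Hypothetical CP-invoked cleanup function; the name and batch size
-- are assumptions for illustration.
CREATE OR REPLACE FUNCTION cleanup_expired_rows(batch_size integer)
RETURNS integer AS $$
DECLARE
  deleted integer;
BEGIN
  DELETE FROM oauth2_tokens
   WHERE ctid IN (SELECT ctid
                    FROM oauth2_tokens
                   WHERE ttl < CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
                   LIMIT batch_size);
  GET DIAGNOSTICS deleted = ROW_COUNT;
  RETURN deleted;  -- report how many rows were cleaned up
END;
$$ LANGUAGE plpgsql;

-- A CP node would then run, on whatever schedule it chooses:
-- SELECT cleanup_expired_rows(1000);
```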

@hanshuebner (Contributor)

Is creating such stored procedures and then invoking them from control-plane nodes a better balance? That would let us invoke the procedure on a schedule and connection controlled by the CP, and we (developers) could fine-tune that based on the feedback we receive.

I think that if we have a mechanism for that, scheduling the garbage collection from the CP instead of randomly invoking it from insertions is better. The concern I have is that if we piggyback the GC onto insertions, we'd see the additional cost as latency increases in our proxy path. I'm sure we have other non-obvious contributing factors, but I'd generally want to keep clear of adding cost to the proxy path that would be very difficult to analyze when looking at the latency profile.

@bungle (Member, Author) commented Mar 1, 2023

Is creating such stored procedures and then invoking them from control-plane nodes a better balance? That would let us invoke the procedure on a schedule and connection controlled by the CP, and we (developers) could fine-tune that based on the feedback we receive.

I think that if we have a mechanism for that, scheduling the garbage collection from the CP instead of randomly invoking it from insertions is better.

All of this needs to be measured. Perhaps we don't have any problems with the solution I am proposing (it has parameters for the probability and for the number of rows to be deleted). Perhaps having a cluster of 50 traditional nodes, or 10 control planes scheduled to bombard the database, is far worse. Perhaps adding a cluster mutex lock only makes things more complex and complicated (the cluster mutex is still an additional query to the database, and then you need to deal with situations where the lock is not released).

Inserting new access tokens and authorization codes is not that frequent. It usually means one per user per certain amount of time (e.g. per session). Say there are 100-1000 logins per second (which is quite a bit already): this will only affect the login of one or two users, depending on how we set up the variables. And login in particular is not that sensitive to latency. Login is usually slow anyway, since you need to do all sorts of password hashing etc. that should be computationally intensive and, yes, take time, most probably much more than it takes Postgres to delete some rows. Outside the OAuth2 plugin, nobody has reported any problems.

@bungle (Member, Author) commented Mar 1, 2023

The concern I have is that if we piggyback the GC onto insertions, we'd see the additional cost as latency increases in our proxy path.

Data planes and DB-less proxy nodes don't insert anything. The CP inserts, and there the latency issue matters much less (on CPs, insertions are a small factor). The only problem we have is the OAuth2 plugin in TRADITIONAL mode (where there is no official separation of CP and DP; an easy fix is to make the timer detect whether ADMIN_LISTEN is configured and enable it only on nodes that have it).

bungle added a commit that referenced this pull request Mar 1, 2023
… listeners

### Summary

PRs #10389 and #10331 pursued different ways to improve our situation with cleanup timers.

This PR makes it so that the cleanup only happens on nodes that have admin listeners. Fairly simple, but it may already cover the majority of the problem space.
hanshuebner pushed a commit that referenced this pull request Mar 1, 2023
… listeners (#10405)

### Summary

PRs #10389 and #10331 pursued different ways to improve our situation with cleanup timers.

This PR makes it so that the cleanup only happens on nodes that have admin listeners. Fairly simple, but it may already cover the majority of the problem space.
@windmgc (Contributor) commented Mar 1, 2023

I've been thinking about this the whole day, and so far I still have some concerns:

  • The backporting problem. I'm okay with the trigger approach if we're going to fix this in a new version, but it is hard to backport to an older version when solving FTI problems. Manually creating triggers on older versions does not seem sustainable and might cause problems when upgrading to newer versions.
  • The latency increase, which needs some testing (haven't had time to do that yet, will do).

Do you think it is a good idea to remove the RANDOM call and just DELETE two rows on every INSERT, since we already have proper indexes?

bungle added a commit that referenced this pull request Mar 1, 2023
### Summary

The PR #10405 changed cleanup to only happen on nodes that have Admin API listeners.

This PR makes deletion happen in batches of at most 50,000 rows.

Inspired by #10331 and #10389.
@bungle bungle force-pushed the feat/pg-cleanup-trigger branch 2 times, most recently from aeef455 to ace3f86 on March 1, 2023 14:35
@bungle (Member, Author) commented Mar 1, 2023

@hanshuebner and @hbagdi, thanks for the great feedback. Based on the suggestion from @windmgc, I modified this PR so that on each insert statement the trigger now removes at most 2 rows. I also removed the usage of RANDOM. The penalty of deleting at most 2 rows per insert statement should be minuscule.
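In sketch form, the revised trigger body reduces to an unconditional, tightly capped delete; again this is illustrative, not the exact migration:

```sql
-- Illustrative revision: the RANDOM gate is gone, and at most 2
-- expired rows are removed per insert statement.
CREATE OR REPLACE FUNCTION delete_expired_oauth2_tokens() RETURNS trigger AS $$
BEGIN
  DELETE FROM oauth2_tokens
   WHERE ctid IN (SELECT ctid
                    FROM oauth2_tokens
                   WHERE ttl < CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
                   LIMIT 2);
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```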

@hbagdi (Member) commented Mar 1, 2023

The penalty of deleting at most 2 rows per insert statement should be minuscule.

I have forever been told by my DBA not to use triggers, or to use them minimally and be careful with them. I've never gotten a good answer when I push on why not.
Provided that the overhead of running the trigger on every insert is low, we are good. If that is not the case, we may be adding cost. I'm out of my depth with my knowledge of databases here.

Inserting new access tokens and authorization codes is not that frequent. It usually means one per user per certain amount of time (e.g. per session). Say there are 100-1000 logins per second (which is quite a bit already).

I could be wrong, but isn't this change about all tables that have TTLs? At least that is what the original PR (#10331) was about.

The CP inserts, and there the latency issue matters much less

Write traffic via the CP is generally bursty in nature, and latency differences do matter. But I do see your point that the latency matters less there.

Anyway, this change looks much better than it did before.
Should this move out of Draft at this point?

@windmgc (Contributor) commented Mar 22, 2023

Is it possible to check the DB for custom tables and then fire up the timer only if there are custom tables?

After consideration, I'd like to keep the original timer running in the background. It seems a bit hard to tell whether an entity is a "custom" one; from Kong's perspective, "oauth2_tokens" is also a custom entity loaded by a plugin. It also seems unreliable to judge by bundled_plugins, since a custom entity may not be provided by a plugin. I can tell for sure whether the entity we're cleaning is a core entity (based on CORE_ENTITIES), but running the timer only on non-core entities sounds the same as running it on all entities to me.

And SKIP LOCKED is a really good improvement here; it prevents potential conflicts when multiple cleanup routines run on one table. With batched queries and SKIP LOCKED, I think it should be relatively safe to keep both the trigger and the timer.

@bungle @hbagdi If you both agree with keeping the original timer code, I'll go this way.
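As a sketch of what the batched, SKIP LOCKED delete could look like for the background timer (table name is illustrative; the 50,000-row batch size comes from the commit referenced above):

```sql
-- Hypothetical batched cleanup for the timer: rows already locked by
-- a concurrent cleanup (the trigger or another timer run) are skipped
-- rather than waited on, so the two mechanisms don't block each other.
WITH expired AS (
  SELECT ctid
    FROM oauth2_tokens
   WHERE ttl < CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
   LIMIT 50000
     FOR UPDATE SKIP LOCKED
)
DELETE FROM oauth2_tokens
 WHERE ctid IN (TABLE expired);
```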

@bungle (Member, Author) commented Mar 22, 2023

Is it possible to check the DB for custom tables and then fire up the timer only if there are custom tables?

After consideration, I'd like to keep the original timer running in the background. It seems a bit hard to tell whether an entity is a "custom" one; from Kong's perspective, "oauth2_tokens" is also a custom entity loaded by a plugin. It also seems unreliable to judge by bundled_plugins, since a custom entity may not be provided by a plugin. I can tell for sure whether the entity we're cleaning is a core entity (based on CORE_ENTITIES), but running the timer only on non-core entities sounds the same as running it on all entities to me.

And SKIP LOCKED is a really good improvement here; it prevents potential conflicts when multiple cleanup routines run on one table. With batched queries and SKIP LOCKED, I think it should be relatively safe to keep both the trigger and the timer.

@bungle @hbagdi If you both agree with keeping the original timer code, I'll go this way.

Yes, I approve this approach. Now that we don't need it as much for core anymore, perhaps also change it to run less often. I do not know what a good value would be. Once per 5 minutes? Once per hour? But I feel once per minute is too much.

@hbagdi (Member) commented Mar 22, 2023

I don't have strong opinions either way.
If making the timer run less frequently can result in a large behavior change, then we should publicly document it. Otherwise, this is OK.

@windmgc (Contributor) commented Mar 23, 2023

I think increasing it to once per 5 minutes is enough, and it won't cause a huge behavior change.

One thing to note is that from 3.x we replaced the timer library with timer-ng, which can also guarantee that, if a timer is created by timer.every, only one instance will run at a time. Before 3.x we didn't have such a guarantee.

@windmgc windmgc force-pushed the feat/pg-cleanup-trigger branch 2 times, most recently from 4e14988 to f2172c5 on March 23, 2023 09:16
@windmgc windmgc changed the title feat(db): remove postgres cleanup timer and use postgres triggers instead feat(db): add postgres triggers to cleanup expired rows on ttl-enabled schema tables Mar 23, 2023
@windmgc (Contributor) commented Mar 23, 2023

@bungle Could you please review it again?

I've restored the original timer code and increased the interval to 300s.

@hbagdi (Member) commented Mar 28, 2023

Can we call out this change of frequency in the changelog, please?
It is technically a breaking change (though I agree that its impact will be small).

@bungle bungle merged commit 3e88acf into master Mar 28, 2023
27 checks passed
@bungle bungle deleted the feat/pg-cleanup-trigger branch March 28, 2023 17:56
@hbagdi (Member) commented Mar 28, 2023

@windmgc Please cherry-pick this PR to EE.

@hbagdi (Member) commented Mar 29, 2023

ping @windmgc

@hbagdi (Member) commented Apr 4, 2023

ping @windmgc

@outsinre (Contributor) commented Apr 9, 2023

Since this was not yet cherry-picked, it is now covered by the CE2EE master merge.

@kikito (Member) commented Apr 13, 2023

@outsinre Note that this has now been cherry-picked to EE.

@outsinre (Contributor)

@kikito Noticed. Thanks to @bungle for the debugging on the CE2EE master merge.
