call partman.partition_data_proc during partman.run_maintenance #655
Comments
The problem with this is that when the default gets data, it can often be A LOT of data. If that's the case, it could cause an extremely expensive write operation to kick off during normal maintenance. I instead recommend setting up whatever monitoring application you have in your environment to run partman.check_default() and alert you when data lands in the default partition.
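A minimal version of such a monitoring check might look like the sketch below, assuming your monitoring tool can run an ad-hoc SQL query and alert on a non-empty result (the p_exact_count parameter is from current pg_partman; passing false skips the full row count so the check stays cheap):

```sql
-- Returns one row per default partition that currently holds data.
-- false skips exact row counts, keeping the check inexpensive;
-- alert whenever this returns any rows at all.
SELECT * FROM partman.check_default(p_exact_count => false);
```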
Thank you for the quick response. Understood; yes, we were planning to do these calls in our own scheduler.
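Run from an external scheduler, that can be as simple as the sketch below, with public.events standing in for the real parent table:

```sql
-- Move rows stranded in the default partition into their proper
-- children. partition_data_proc is a procedure that commits per batch,
-- so a large backlog is moved without one long-running transaction.
CALL partman.partition_data_proc('public.events');
```

Scheduling this in a low-traffic window keeps the potentially expensive writes out of the normal maintenance run, as suggested above.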
Yeah, I've thought of doing this as well with a flag. But then I've given people a flag for something that could potentially be very disruptive. I'd rather provide the means to monitor for it and have users go out of their way to fix the situation properly. The other thing is that if you're seeing data go into the default frequently enough that you feel this needs to be automated, I'd say there are likely other problems that need to be fixed first, such as the premake value being too low or maintenance not being run often enough to create new partitions in time.
Yes, but creating a high number of future partitions has an impact on performance. Queries like "give me all the rows newer than yesterday" will scan over all these partitions. For our implementation, rows ending up in the default will be an exception and, as you correctly remark, should be low volume; otherwise there is a design issue. Thank you for the clarification.
But those tables are empty for the most part, and as of PG12+ the performance impact of having a higher number of partitions (1000+) is negligible until you start getting into REALLY high numbers. And I'd say if your partition counts are getting that high, you may want to re-evaluate your partitioning interval and seriously consider retention options to remove unneeded data from that partition set. I'd encourage you to test and see what the performance impact is. If it's not negligible, I'd write up the scenario and share it on the developer mailing lists so they can see what the problem is.
Adding more partitions increases the parse/plan time. We prevent generic plans for these tables to allow partition pruning. This is simple to test with a partman-managed table: parsing time goes up as the number of partitions increases.
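One way to run that test, sketched against a hypothetical public.events table (create_parent's exact signature varies between pg_partman versions, so treat the named parameters as illustrative):

```sql
-- Build a partition set with a large number of pre-created children.
SELECT partman.create_parent(
    p_parent_table => 'public.events',
    p_control      => 'created_at',
    p_interval     => '1 day',
    p_premake      => 1000
);

-- Force a fresh plan per execution so plan-time pruning is not hidden
-- behind a cached generic plan (PG12+).
SET plan_cache_mode = force_custom_plan;

PREPARE recent(timestamptz) AS
    SELECT * FROM public.events WHERE created_at > $1;

-- The Planning Time line in the output is the parse/plan overhead in
-- question; rerun after adding more partitions to watch it grow.
EXPLAIN (SUMMARY) EXECUTE recent(now() - interval '1 day');
```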
If that millisecond difference is demonstrably affecting your application, I can certainly understand that. In most cases I've seen myself, that difference didn't really matter vs the overhead of having to deal with the cleanup of the default table. One thing I may suggest, if you're really getting down to that level of performance being important, is to take advantage of pg_partman's predictable naming pattern and query the child tables directly. If you know the time condition you're asking for at the application level, dynamically generate the query there to target the exact child tables you need. That completely bypasses all partition pruning.
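A sketch of that approach, with a hypothetical daily-partitioned public.events table (child-name suffixes vary by pg_partman version and interval, so confirm the pattern for your setup):

```sql
-- The application knows the time window it wants, so it computes the
-- child table names itself (e.g. events_p20240115 for 2024-01-15) and
-- queries them directly, skipping the parent and partition pruning.
SELECT * FROM public.events_p20240115
WHERE created_at >= DATE '2024-01-15'
UNION ALL
SELECT * FROM public.events_p20240116
WHERE created_at >= DATE '2024-01-15';
```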
This is a very useful set of procedures.
One problem we have is that the run_maintenance task fails when the default partition holds data belonging to a new partition.
Couldn't partman.run_maintenance call partman.partition_data_proc instead of raising a P0001 error?
The code could use partman.check_default to see whether the default partition holds data that needs to be moved for the new partition being added. It could then call partman.partition_data_proc and, if needed, partman.partition_gap_fill to generate the missing partitions, as in the sketch below.
After that it would run the normal maintenance.
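A minimal sketch of that flow as an out-of-band script, with public.events as a stand-in parent table; the real feature would live inside run_maintenance itself, but the ordering would be the same:

```sql
-- 1. See whether the default partition holds any stranded rows.
SELECT * FROM partman.check_default(p_exact_count => false);

-- 2. If so, move them into their proper children; the procedure
--    creates the child tables it needs and commits per batch.
CALL partman.partition_data_proc('public.events');

-- 3. Create any child tables still missing from the sequence.
SELECT partman.partition_gap_fill('public.events');

-- 4. Then run the normal maintenance.
SELECT partman.run_maintenance('public.events');
```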