Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pg_partman doesn't maintain the order of execution of tables #667

Closed
databasetech0073 opened this issue Jun 9, 2024 · 4 comments
Closed
Assignees

Comments

@databasetech0073
Copy link

databasetech0073 commented Jun 9, 2024

Hello,
We have around 10 different partition tables for which the partition maintenance is done using pg_partman extension. These tables having foreign key dependency between them. We just calls partman.run_maintanance_proc() through pg_cron without any parameter being passed to it and it was working fine. So we can see only one entry in the cron.job table which is scheduled to run daily once.

It was all working fine and we were seeing the historical partition being dropped and new partitions being created without any issue. But suddenly we started seeing, its getting failed with error "ERROR: cant not drop schema1.tab1_part_p2023_12_01 because other objects depend on it".

Then we realized , it may be the case that its trying to run the partition maintenance for the Parent partition table first before the child partition table. So not sure how pg_partman handles the ordering of tables while doing the partition maintenance as we don't see any parameter to drive the ordering of the partition maintenance in part_config and we were under impression pg_partman will take care of the sequence of partition maintenance automatically. We didn't do any version upgrade, so not sure but suspecting as we did update the "premake" to 60 from initial 30 for all the tables and so if that made this happen by changing some internal order of tables in part_config.

So want to understand if anybody encountered such issues?

And to handle above issue , if we should call the partition maintenance of each of the TABLE by passing the table name explicitly to the run_maintanance_proc(), something as below. Is this advisable?

SELECT cron.unschedule('run_maintenance_proc');

SELECT cron.schedule(
    'daily_partition_maintenance',
    '0 2 * * *',
    $$
    DO $$
    BEGIN
        -- Run maintenance for child tables first
        PERFORM partman.run_maintenance_proc('schema1.child_table1');
        PERFORM partman.run_maintenance_proc('schema1.child_table2');
        -- Add more child tables as needed

        -- Run maintenance for parent tables next
        PERFORM partman.run_maintenance_proc('schema1.parent_table1');
        PERFORM partman.run_maintenance_proc('schema1.parent_table2');
        -- Add more parent tables as needed
    END;
    $$;
$$
);


Or else

create a function like below and then call/schedule it through pg_cron

CREATE OR REPLACE FUNCTION run_partition_maintenance()
RETURNS void AS $$
BEGIN
    -- Run maintenance for child tables first
    PERFORM partman.run_maintenance_proc('schema1.child_table1');
    PERFORM partman.run_maintenance_proc('schema1.child_table2');
    -- Add more child tables as needed

    -- Run maintenance for parent tables next
    PERFORM partman.run_maintenance_proc('schema1.parent_table1');
    PERFORM partman.run_maintenance_proc('schema1.parent_table2');
    -- Add more parent tables as needed
END;
$$ LANGUAGE plpgsql;

SELECT cron.unschedule('run_maintenance_proc');

SELECT cron.schedule(
    'daily_partition_maintenance',
    '0 2 * * *',
    'CALL run_partition_maintenance()'
);
@keithf4
Copy link
Collaborator

keithf4 commented Jun 10, 2024

Version 5.1 added the option to specify the order of maintenance for partition sets that will be run when you call run_mainteannce_proc() without any arguments.

https://github.com/pgpartman/pg_partman/releases/tag/v5.1.0

Since it appears that you're on version 4.x, please see the release notes for migrating to 5.x since version 4.x is only getting critical bug fixes going forward.

https://github.com/pgpartman/pg_partman/releases/tag/v5.0.1

Without specifying an order for the rows being returned in a tableset, there is no guaranteed order. If there are no changes to the underlying tables, you may see that it generally does come back in a similar order most times, but that is just a coincidence. Before this feature in 5.1 was added, there had been no way to guarantee maintenance order besides passing the parent table argument to the maintenance function and calling it multiple times in the order you need.

So your method of making a custom function to call tables in a specific order is correct. Just be sure to set automatic_maintenance in part_config to false for those tables so that if something else is calling the maintenance function without a parameter, it doesn't try and run them.

@keithf4 keithf4 self-assigned this Jun 10, 2024
@databasetech0073
Copy link
Author

databasetech0073 commented Jun 11, 2024

Thank you so much for the information.

With regards to the custom function approach, I was thinking to remove the "partman.run_maintenance_proc()" job from the pg_cron scheduler. But I believe you are asking to set the "automatic_maintenance" to "OFF" for the specific tables in part_config. Isn't both the option will achieve the same thing i.e. will not run the tables from the automatic maintenance scheduler rather will execute through the newly scheduled cron which will call the function which will have the fixed order of execution of tables?
Or
Should we not remove the "partman.run_maintenance_proc()" in the pg_cron but just set the "automatic_maintenance" OFF in part_config?

@keithf4
Copy link
Collaborator

keithf4 commented Jun 11, 2024

So technically, if every single table you manage in partman will be called directly as a parameter to run_maintenance_proc() and you never call run_maintenance_proc() without passing a table parameter, then you don't need to worry about setting automatic_maintenance to off.

However, my advice would still be to set it for any partition set that you are calling maintenance for directly. Just saves you some possible heartache of someone else possibly also needing to use partman in your database and not knowing about your manual calls so they just go and schedule a non-parameterized call for maintenance.

@keithf4
Copy link
Collaborator

keithf4 commented Jul 9, 2024

Closing this issue for now. Please feel free to respond or reopen if you still have any questions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants