don't remove LOCK file when PGSQL stopped with quorum lost. #28

greenx opened this issue Nov 1, 2013 · 10 comments

Comments

@greenx

greenx commented Nov 1, 2013

Hi!
I found another problem.
I am experimenting with quorum loss.
After quorum is lost, the LOCK file is not deleted.
I looked at the code:

    if  [ "$1" = "master" -a "$OCF_RESKEY_CRM_meta_notify_slave_uname" = " " ]; then
        ocf_log info "Removing $PGSQL_LOCK."
        rm -f $PGSQL_LOCK
    fi

Where is $OCF_RESKEY_CRM_meta_notify_slave_uname defined, and what does it mean?
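
As far as I understand, the agent does not define this variable itself. Pacemaker passes it through the environment for master-slave resources with notifications enabled, and it lists the nodes on which the resource is running as a slave. One way to see what the agent actually receives is to dump those variables from inside the agent. The following is only a sketch: `dump_notify_vars` is a hypothetical helper that is not part of the pgsql agent, and the node names are made up for the simulation.

```shell
#!/bin/sh
# Hypothetical debugging helper, NOT part of the pgsql resource agent.
# Prints every notify-related variable present in the environment, sorted.
dump_notify_vars() {
    env | grep '^OCF_RESKEY_CRM_meta_notify_' | sort
}

# Simulated value for illustration only (node2 and node3 are made-up names).
# In a real cluster Pacemaker sets this variable, not the agent.
OCF_RESKEY_CRM_meta_notify_slave_uname="node2 node3"
export OCF_RESKEY_CRM_meta_notify_slave_uname

dump_notify_vars
```

Calling such a helper at the start of the stop action and writing its output to the agent's log would show what the variable holds at the moment quorum is lost.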

@greenx
Author

greenx commented Nov 1, 2013

I found one answer: http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#_multi_state_notifications

But the meaning is still not quite clear.
When quorum is lost, the resources are stopped.
I built a test setup of four nodes.
If two nodes go down, quorum is lost and Pacemaker applies the 'stop' policy.
But there is still one slave.

Or do the nodes have to be switched on and off in the correct order?

@t-matsuo
Owner

t-matsuo commented Nov 2, 2013

> If two nodes go down, quorum is lost and Pacemaker applies the 'stop' policy.
> But there is still one slave.

It may be a bug in Pacemaker.

@greenx
Author

greenx commented Nov 5, 2013

Do you mean that after quorum is lost, Pacemaker should send a message to stop all the resources and reset this variable ($OCF_RESKEY_CRM_meta_notify_slave_uname), because the master-slave resource must be stopped too?

@t-matsuo
Owner

t-matsuo commented Nov 5, 2013

> Do you mean that after quorum is lost, Pacemaker should send a message to stop all the resources
> and reset this variable ($OCF_RESKEY_CRM_meta_notify_slave_uname), because the master-slave resource
> must be stopped too?

Yes.

@Wintermute3

In the comparison above:

"$OCF_RESKEY_CRM_meta_notify_slave_uname" = " "

Why is there a space between the quotes in the comparison target? It seems like an error to me. OCF_RESKEY_CRM_meta_notify_slave_uname would have been trimmed on import, so blank strings would have become empty strings, no?
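
To make the difference concrete, here is a minimal sketch of how that test behaves for a few values. `would_remove_lock` is a hypothetical helper, not part of the pgsql agent, and `node2` is a made-up node name. Whether Pacemaker really passes a single space when no slaves remain is an assumption that this snippet does not verify.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the pgsql resource agent.
# Applies the same string test as the agent snippet to the given value and
# reports whether the lock file would be removed.
would_remove_lock() {
    slave_uname="$1"
    if [ "$slave_uname" = " " ]; then
        echo "remove"
    else
        echo "keep"
    fi
}

would_remove_lock " "        # single space: the test matches
would_remove_lock ""         # empty string: the test does not match
would_remove_lock "node2 "   # a slave is still listed: the test does not match
```

If the value ever arrives as an empty string rather than a single space, the test never matches and the lock file stays in place.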

@playmobil77d

Hi,
Is there a solution to this problem?

@greenx
Author

greenx commented Nov 12, 2015

I don't know. My project was closed, and I have mostly lost interest in the topic.

@furynick

I have the same problem with a 2-node cluster.

The LOCK file is never deleted because OCF_RESKEY_CRM_meta_notify_slave_uname always contains a value, so the slave can't be restarted.

Perhaps this file could be deleted when the slave stops successfully, after a WAL sync check.

@t-matsuo
Owner

Hi furynick,

Sorry, I moved on to another task 5 years ago.

This agent was merged into the ClusterLabs repository and is now maintained by that community.
Could you open a new issue in the ClusterLabs resource-agents repository?
Someone may respond.

@furynick

Thanks for the reply. I already found a related issue at ClusterLabs#699.
