Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ocf:heartbeat:mariadb documentation #1441

Closed
kees-closed opened this issue Dec 8, 2019 · 8 comments
Closed

ocf:heartbeat:mariadb documentation #1441

kees-closed opened this issue Dec 8, 2019 · 8 comments

Comments

@kees-closed
Copy link

kees-closed commented Dec 8, 2019

I followed the resource-agents/heartbeat/README.mariadb.md instructions. First comment is that the commands are out of date.

I had to create the resource with this command pcs resource create mariadb ocf:heartbeat:mariadb node_list='rp1 rp2' replication_user=mariabackup replication_passwd='password123' promotable promoted-max=1 notify=true. If this is the correct way, then please update the README.md.

The other instructions seems to be valid and in line with the official mariadb documentation.

The synchronization between the master and slave work fine without Pacemaker, but when I run the Pacemaker command to create the cloned resource, both resources remain in a slave role.

I was assuming Pacemaker would handle this for me, like it can do with other clone resources such as Redis and DRBD. Is this assumption wrong? Please check if the instructions in the README.md need updates in order to make it work as designed.

Some more info:

  • The database that's being replicated is called 'nextcloud'.
  • As already mentioned, replication works fine when MariaDB is started with systemd on both nodes.
  • The MySQL user mariabackup is configured on both nodes, as stated in pcs resource describe ocf:heartbeat:mariadb, this user has the following roles assigned: replication client, replication slave, SUPER, PROCESS, RELOAD
  • A Redis cloned resource is configured as well on these nodes, promoting a resource works fine in that setup.
  • Below is the output of the resource scores, for some reason the mariadb resources have a score of -1.
root@rp2:~# crm_simulate -s -L

Current cluster status:
Online: [ rp1 rp2 ]

 Clone Set: redis-www-clone [redis-www] (promotable)
     Masters: [ rp2 ]
     Slaves: [ rp1 ]
 Clone Set: mariadb-clone [mariadb] (promotable)
     Slaves: [ rp1 rp2 ]

Allocation scores:
clone_color: redis-www-clone allocation score on rp1: 0
clone_color: redis-www-clone allocation score on rp2: 0
clone_color: redis-www:0 allocation score on rp1: 0
clone_color: redis-www:0 allocation score on rp2: 1101
clone_color: redis-www:1 allocation score on rp1: 102
clone_color: redis-www:1 allocation score on rp2: 0
native_color: redis-www:0 allocation score on rp1: 0
native_color: redis-www:0 allocation score on rp2: 1101
native_color: redis-www:1 allocation score on rp1: 102
native_color: redis-www:1 allocation score on rp2: -INFINITY
redis-www:0 promotion score on rp2: 1001
redis-www:1 promotion score on rp1: 2
clone_color: mariadb-clone allocation score on rp1: 0
clone_color: mariadb-clone allocation score on rp2: 0
clone_color: mariadb:0 allocation score on rp1: 0
clone_color: mariadb:0 allocation score on rp2: 100
clone_color: mariadb:1 allocation score on rp1: 100
clone_color: mariadb:1 allocation score on rp2: 0
native_color: mariadb:0 allocation score on rp1: 0
native_color: mariadb:0 allocation score on rp2: 100
native_color: mariadb:1 allocation score on rp1: 100
native_color: mariadb:1 allocation score on rp2: -INFINITY
mariadb:0 promotion score on rp2: -1
mariadb:1 promotion score on rp1: -1

Transition Summary:
  • The full configuration of the cluster:
root@rp2:~# pcs config show
Cluster Name: neobits
Corosync Nodes:
 rp1 rp2
Pacemaker Nodes:
 rp1 rp2

Resources:
 Clone: redis-www-clone
  Meta Attrs: promotable=true promoted-max=1
  Resource: redis-www (class=ocf provider=heartbeat type=redis)
   Attributes: config=/etc/redis/redis_www.conf
   Operations: demote interval=0s timeout=120s (redis-www-demote-interval-0s)
               monitor interval=45s timeout=60s (redis-www-monitor-interval-45s)
               monitor interval=20s role=Master timeout=60s (redis-www-monitor-interval-20s)
               monitor interval=60s role=Slave timeout=60s (redis-www-monitor-interval-60s)
               notify interval=0s timeout=90s (redis-www-notify-interval-0s)
               promote interval=0s timeout=120s (redis-www-promote-interval-0s)
               start interval=0s timeout=120s (redis-www-start-interval-0s)
               stop interval=0s timeout=120s (redis-www-stop-interval-0s)
 Clone: mariadb-clone
  Meta Attrs: notify=true promotable=true promoted-max=1
  Resource: mariadb (class=ocf provider=heartbeat type=mariadb)
   Attributes: node_list="rp1 rp2" replication_passwd=password123 replication_user=mariabackup
   Operations: demote interval=0s timeout=120s (mariadb-demote-interval-0s)
               monitor interval=20s timeout=30s (mariadb-monitor-interval-20s)
               monitor interval=10s role=Master timeout=30s (mariadb-monitor-interval-10s)
               monitor interval=30s role=Slave timeout=30s (mariadb-monitor-interval-30s)
               notify interval=0s timeout=90s (mariadb-notify-interval-0s)
               promote interval=0s timeout=120s (mariadb-promote-interval-0s)
               start interval=0s timeout=120s (mariadb-start-interval-0s)
               stop interval=0s timeout=120s (mariadb-stop-interval-0s)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 migration-threshold: 2
 failure-timeout: 1m
 resource-stickiness: 100
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: neobits
 dc-version: 2.0.1-9e909a5bdd
 have-watchdog: false
 last-lrm-refresh: 1578322867
 no-quorum-policy: ignore
 redis-www_REPL_INFO: rp2
 stonith-enabled: false

Quorum:
  Options:
  • The MariaDB config in use by rp1:
[mysqld]
bind-address=192.168.122.248
server_id=1
log-basename=master
log-bin
binlog-format=row
binlog-do-db=nextcloud
  • The MariaDB config in use by rp2:
[mysqld]
bind-address=192.168.122.36
server-id=2
replicate-do-db=nextcloud
log-basename=master
log-bin
binlog-format=row
  • In the pacemaker.log an error was reported that bc was not installed, after installing the promoting of resources finally worked. However, not stable.
@tsrubar
Copy link

tsrubar commented Apr 22, 2020

I dont know whats wrong, but even after install and setting permission 755 i keep getting:
Error: Agent 'ocf:heartbeat:mariadb' is not installed or does not provide valid metadata: Metadata query for ocf:heartbeat:mariadb failed: Input/output error, use --force to override

@oalbrigt
Copy link
Contributor

You shouldnt need to chmod it.

Just run ./autogen.sh && ./configure and then copy heartbeat/mariadb to /usr/lib/ocf/resource.d/heartbeat/.

You might also need to copy heartbeat/mysql-common.sh to /usr/lib/ocf/lib/heartbeat/.

@kees-closed
Copy link
Author

For me the issue was resolved. The problem was the initial setup before setting up the cluster. It had nothing to do with Pacemaker itself. Please close, if @tsrubar issue is resolved as well.

@tsrubar
Copy link

tsrubar commented Apr 23, 2020

Thanks, it helped and finally made me able to create ocf:heartbeat:mariadb resource. I cloned this repo to /usr/lib/ocf/resource.d/ and pulled everything to be up-to-date, needed to yum install -y docbook-style-xsl glib2-devel and then I ran ./autogen.sh && ./configure what finished just fine. I think that actual documentation would really need updating to make it more clear. Thanks again guys for your help.

EDIT: ok, so now I am for few hours dealing with this: i have 2 instances of mariadb configured exactly according to manual instructions, when run manually, they are working perfectly, but when i stop them and let the pcs start them up as a resource, im keep getting:
`Failed Resource Actions:

  • mariadb_server_start_0 on node1 'unknown error' (1): call=80, status=complete, exitreason='Resource is not configured as master/slave',
    last-rc-change='Thu Apr 23 15:52:07 2020', queued=0ms, exec=106ms
  • mariadb_server_start_0 on node2 'unknown error' (1): call=140, status=complete, exitreason='Resource is not configured as master/slave',
    last-rc-change='Thu Apr 23 15:52:10 2020', queued=0ms, exec=106ms`

When i tried to create resource the same way as @AquaL1te mentioned, i always got the error:
Error: missing value of 'promotable' option.
What am I possibly doing wrong?

@tsrubar
Copy link

tsrubar commented Apr 24, 2020

So, i managed to create and start master/slave resource, using a cib file. I dont know why pcs didnt accept my instructions directly, but now it works. Also some struggling with selinux took the place...

@tsrubar
Copy link

tsrubar commented Apr 24, 2020

For me the issue was resolved. The problem was the initial setup before setting up the cluster. It had nothing to do with Pacemaker itself. Please close, if @tsrubar issue is resolved as well.

@AquaL1te please, what was yor problem? i am in the same stage now as you were, that no server come up as a master. Thanks

@kees-closed
Copy link
Author

@tsrubar it was just a quick test for my home setup. I haven't used it in "production" because of lack of time. However, I did document all the steps so that when I'll resume that hobby project, I won't make the same mistakes. I hope it's useful for you as well. Also check out the official documentation. Make sure replication and manual failover works before you start adding it to the HA cluster. The replication users are mostly important, since they need extra permissions when using Pacemaker, which is not covered in the MariaDB manual for obvious reasons.

Below is my documentation from a few months ago. Some stuff might not be applicable for you, but hopefully it will give you a rough idea of what I did.


Troubleshooting commands

  • Show all users with; select host, user, password from mysql.user;
  • Show user privileges with; SHOW GRANTS FOR 'mariabackup'@'%';
  • Delete a user with; DROP USER 'mariabackup'@'%';
  • To see an operational overview; SHOW ENGINE INNODB STATUS \G;
  • INNODB_TRX table indicates whether the transaction is waiting for a lock, to see this overview use; SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX \G;

Preperation

  • apt install mariadb-server
  • mysql_secure_installation
  • In /etc/mysql/mariadb.conf.d an extra config (90-replication.cnf) needs to be added with the following (as an Ansible template):
[mysqld]
bind-address=192.168.122.248
skip-show-database
local-infile=0
max_connect_errors=5
server_id=1
log-basename=pacemaker
log-bin
binlog-format=row
binlog-do-db=nextcloud
  • Where base-name, server_id and bind address need to be included in the template if IP is used

  • Nextcloud may also connect by using the local socket: /run/mysqld/mysqld.sock

  • Disable all units, will be handled by Pacemaker; systemctl disable --now mariadb mysql

  • Prepare the database, by first creating it; create database nextcloud;

  • Create replication user with; CREATE USER 'pacemaker'@'%' IDENTIFIED BY 'password123'; - Figure out if you need to provide the peer IP in the host field, probably you do, then the user 'pacemaker' on the peer can login and replicate.

  • Grant the replication user with network privileges; GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'pacemaker'@'rp%.neobits.nl' IDENTIFIED BY 'password123';

  • And with local permissions; GRANT REPLICATION CLIENT, REPLICATION SLAVE, SUPER, PROCESS, RELOAD on *.* to 'pacemaker'@'localhost' identified by 'password123';

  • In order to test tables for monitoring, create the following user; GRANT SELECT ON mysql.user TO 'pacemaker'@'localhost' IDENTIFIED BY 'password456';

  • Export the database on the primary node and import it on the secondary node.

  • Run on both nodes; CHANGE MASTER TO MASTER_USE_GTID = slave_pos;

  • pcs resource create mariadb ocf:heartbeat:mariadb node_list='rp1 rp2' replication_user=pacemaker replication_passwd='password123' promotable promoted-max=1 notify=true

CHANGE MASTER TO
  MASTER_HOST='rp1',
  MASTER_USER='pacemaker',
  MASTER_PASSWORD='password123',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='mysqld-bin.000012',
  MASTER_LOG_POS=343,
  MASTER_CONNECT_RETRY=10;

@tsrubar
Copy link

tsrubar commented Apr 24, 2020

@AquaL1te Many thanks, you are my hero :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants