Create Post Take-Master Processes Hook #859

Merged: shlomi-noach merged 8 commits into openark:master on May 1, 2019

Contributor

@daniel-2647 daniel-2647 commented Apr 5, 2019

Description

This PR adds a hook that is called only after a successful Take-Master event. It lets custom scripts run in cases where the post-failover hooks are not called, e.g. a graceful failover of an intermediate master (see #799 for full details).

This PR introduces one new config option:

PostTakeMasterProcesses : "some PostTakeMasterHook here"

Issue #799, opened a few months ago, details the reasoning behind this PR.
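
A minimal sketch of a script that PostTakeMasterProcesses could point to. It assumes the option takes a list of shell commands, like the PostGracefulTakeoverProcesses example later in this thread, and that orchestrator exports ORC_SUCCESSOR_HOST and ORC_FAILED_HOST to the hook. The script path and the function name describe_take_master are hypothetical.

```shell
#!/bin/sh
# Hypothetical hook script, e.g. /usr/local/orchestrator/post-take-master.sh.
# Assumed config entry (list form mirrors PostGracefulTakeoverProcesses):
#   "PostTakeMasterProcesses": [
#     "/usr/local/orchestrator/post-take-master.sh >> /tmp/recovery.log 2>&1"
#   ]

# describe_take_master: print one log line for the take-master event.
# ORC_SUCCESSOR_HOST and ORC_FAILED_HOST are assumed to be exported by
# orchestrator; "unknown" is printed when they are not set.
describe_take_master() {
    promoted="${ORC_SUCCESSOR_HOST:-unknown}"
    demoted="${ORC_FAILED_HOST:-unknown}"
    echo "take-master complete: promoted=${promoted} demoted=${demoted}"
}
```

A real script would end with a call to describe_take_master, followed by whatever site-specific updates are needed.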

  • contributed code is using same conventions as original code
  • code is formatted via gofmt (please avoid goimports)
  • code is built via ./build.sh
  • code is tested via go test ./go/...

Added PostTakeMasterProcessesEnabled and PostTakeMasterProcesses to config
Imported go/os and created new functions to enable Post Take-Master Hooks
Added sample config options to enable Post TakeMaster Processes
@daniel-2647
Contributor Author

What does this error mean?

+gofmt -s -w go/
+git diff --exit-code --quiet
The command "script/cibuild" exited with 1.

@daniel-2647
Contributor Author

daniel-2647 commented Apr 9, 2019

Sample outcome when using this functionality (this PR addresses issue #799):

  1. Pre take-master:
[root@lab-proxysql-1 ~]# orchestrator-client -c topology -i lab-applogdb-1:3306
lab-shadowmaster:3306   [0s,ok,10.3.13-MariaDB-log,rw,MIXED,>>]
+ lab-applogdb-1:3306   [0s,ok,10.3.13-MariaDB-log,rw,MIXED,>>,GTID]
  + lab-applogdb-2:3306 [0s,ok,10.3.13-MariaDB-log,ro,MIXED,>>,GTID]
  + lab-applogdb-3:3306 [0s,ok,10.3.13-MariaDB-log,ro,MIXED,>>,GTID]

[root@lab-proxysql-1 ~]# echo "select * from mysql_servers" | proxysql
hostgroup_id	hostname	port	status	weight	compression	max_connections	max_replication_lag	use_ssl	max_latency_ms	comment
0	lab-applogdb-1	3306	ONLINE	1000	0	1000	0	0	0
1	lab-applogdb-2	3306	ONLINE	200	0	1000	0	0	0
1	lab-applogdb-3	3306	ONLINE	10	0	1000	0	0	0
1	lab-applogdb-1	3306	ONLINE	1000	0	1000	0	0	0

[root@lab-proxysql-1 ~]# consul kv get -recurse
mysql/intmaster/STAGE/lab-applogdb-1:192.168.96.101
mysql/master/STAGE/lab-shadowmaster:3306:lab-shadowmaster:3306
mysql/master/STAGE/lab-shadowmaster:3306/hostname:lab-shadowmaster
mysql/master/STAGE/lab-shadowmaster:3306/ipv4:192.168.96.107
mysql/master/STAGE/lab-shadowmaster:3306/ipv6:
mysql/master/STAGE/lab-shadowmaster:3306/port:3306
mysql/slave/STAGE/lab-applogdb-2:192.168.96.102
mysql/slave/STAGE/lab-applogdb-3:192.168.96.103
  2. After a successful take-master (the hook is called only at the end of the take-master process, and only on success), the hook runs and custom changes are applied:
[root@lab-proxysql-1 ~]# orchestrator-client -c topology -i lab-applogdb-1:3306
lab-shadowmaster:3306   [0s,ok,10.3.13-MariaDB-log,rw,MIXED,>>]
+ lab-applogdb-2:3306   [0s,ok,10.3.13-MariaDB-log,rw,MIXED,>>,GTID]
  + lab-applogdb-1:3306 [0s,ok,10.3.13-MariaDB-log,ro,MIXED,>>,GTID]
  + lab-applogdb-3:3306 [0s,ok,10.3.13-MariaDB-log,ro,MIXED,>>,GTID]

[root@lab-proxysql-1 ~]# consul kv get -recurse
mysql/intmaster/STAGE/lab-applogdb-2:192.168.96.102
mysql/master/STAGE/lab-shadowmaster:3306:lab-shadowmaster:3306
mysql/master/STAGE/lab-shadowmaster:3306/hostname:lab-shadowmaster
mysql/master/STAGE/lab-shadowmaster:3306/ipv4:192.168.96.107
mysql/master/STAGE/lab-shadowmaster:3306/ipv6:
mysql/master/STAGE/lab-shadowmaster:3306/port:3306
mysql/slave/STAGE/lab-applogdb-1:192.168.96.101
mysql/slave/STAGE/lab-applogdb-3:192.168.96.103

[root@lab-proxysql-1 ~]# echo "select * from mysql_servers" | proxysql
hostgroup_id	hostname	port	status	weight	compression	max_connections	max_replication_lag	use_ssl	max_latency_ms	comment
1	lab-applogdb-1	3306	ONLINE	200	0	1000	0	0	0
0	lab-applogdb-2	3306	ONLINE	1000	0	1000	0	0	0
1	lab-applogdb-2	3306	ONLINE	1000	0	1000	0	0	0
1	lab-applogdb-3	3306	ONLINE	10	0	1000	0	0	0
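
The transition between the two states above can be sketched as follows. This assumes the swap was triggered with orchestrator-client's take-master command against the replica being promoted (lab-applogdb-2:3306 in this example); the wrapper function take_master_and_show and the ORCHESTRATOR_CLIENT override are hypothetical.

```shell
#!/bin/sh
# ORCHESTRATOR_CLIENT can be overridden, e.g. to point at a wrapper script.
ORCHESTRATOR_CLIENT="${ORCHESTRATOR_CLIENT:-orchestrator-client}"

# take_master_and_show INSTANCE: ask orchestrator to swap INSTANCE with its
# current master, then print the resulting topology so it can be compared
# with the "pre" state. The topology is only printed if take-master succeeds.
take_master_and_show() {
    "$ORCHESTRATOR_CLIENT" -c take-master -i "$1" &&
        "$ORCHESTRATOR_CLIENT" -c topology -i "$1"
}
```

For the lab cluster shown here, the call would be take_master_and_show lab-applogdb-2:3306.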

Collaborator

@shlomi-noach shlomi-noach left a comment

Thank you for this PR! I've made a few requests for changes.

Six review comments on go/inst/instance_topology.go (outdated, resolved)
…g.go, instance_topology.go and orchestrator-sample.conf.json
@shlomi-noach
Collaborator

Thank you! I apologize for the delayed response. Completely swamped, and will be out next week. Will hope to respond by end of this month.

@daniel-2647
Contributor Author

> Thank you! I apologize for the delayed response. Completely swamped, and will be out next week. Will hope to respond by end of this month.

No worries at all Shlomi, thanks for all the work 🥇

@shlomi-noach
Collaborator

> What does this error mean?

The tests ensure code is formatted via go fmt.
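
The CI log quoted earlier runs gofmt -s -w go/ followed by git diff --exit-code --quiet, so the build fails whenever gofmt would change a file. A rough way to run the same check locally before pushing is sketched below; the function name check_formatted and the GOFMT override are hypothetical, while -s and -l are standard gofmt flags.

```shell
#!/bin/sh
# GOFMT can be overridden; by default the gofmt binary on PATH is used.
GOFMT="${GOFMT:-gofmt}"

# check_formatted DIR: list the files under DIR that "gofmt -s" would rewrite.
# Returns 0 when nothing needs reformatting, 1 when at least one file does,
# and 2 when gofmt itself fails (e.g. on a syntax error).
check_formatted() {
    unformatted=$("$GOFMT" -s -l "$1") || return 2
    if [ -n "$unformatted" ]; then
        echo "files that need gofmt -s -w:"
        echo "$unformatted"
        return 1
    fi
    return 0
}
```

From the repository root, check_formatted go/ || gofmt -s -w go/ would then fix any reported files before committing.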

Collaborator

@shlomi-noach shlomi-noach left a comment

Looking good, thank you!

@shlomi-noach shlomi-noach changed the base branch from master to take-master-hook April 30, 2019 12:19
@shlomi-noach shlomi-noach changed the base branch from take-master-hook to master April 30, 2019 12:20
@shlomi-noach
Collaborator

For next time, to make the process smoother, please use a branch other than master 🙏

@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=concertmaster April 30, 2019 12:23 Inactive
@daniel-2647
Contributor Author

Thanks Shlomi, appreciate you looking at the request.

@daniel-2647
Contributor Author

> For next time, to make the process smoother, please use a branch other than master 🙏

Will do sir, my apologies for the confusion and the extra work I've caused. Thanks for looking through this PR and approving the changes.

@shlomi-noach shlomi-noach temporarily deployed to production/mysql_cluster=conductor May 1, 2019 08:28 Inactive
@shlomi-noach shlomi-noach merged commit a641bc5 into openark:master May 1, 2019
@yakirgb

yakirgb commented Jun 25, 2020

Hi @daniel-2647, a general question on your comment:

[root@lab-proxysql-1 ~]# consul kv get -recurse
mysql/intmaster/STAGE/lab-applogdb-1:192.168.96.101
mysql/master/STAGE/lab-shadowmaster:3306:lab-shadowmaster:3306
mysql/master/STAGE/lab-shadowmaster:3306/hostname:lab-shadowmaster
mysql/master/STAGE/lab-shadowmaster:3306/ipv4:192.168.96.107
mysql/master/STAGE/lab-shadowmaster:3306/ipv6:
mysql/master/STAGE/lab-shadowmaster:3306/port:3306
mysql/slave/STAGE/lab-applogdb-2:192.168.96.102
mysql/slave/STAGE/lab-applogdb-3:192.168.96.103

How did you add/update consul mysql/slave or mysql/intmaster? In your take-over hook?

@daniel-2647
Contributor Author

> How did you add/update consul mysql/slave or mysql/intmaster? In your take-over hook?

I created a postfailoverhook.sh script that is called by the post-failover hook. In the config, I have something like this:

"PostGracefulTakeoverProcesses": [
    "/usr/local/orchestrator/postfailoverhook.sh >> /tmp/recovery.log 2>&1",
    "echo 'Planned takeover complete' >> /tmp/recovery.log"
  ],

The shell script then makes calls to update the consul store. A sample is below (not complete code):

# Drop the promoted host from the slave pool.
echo "Removing new master from slave pool: ${slave_pool}/${NewMaster}"
${CONSUL_EXEC} kv delete "${slave_pool}/${NewMaster}"

# Register the demoted host (ORC_FAILED_HOST) in the slave pool.
echo "Adding old master to slave pool: ${ORC_FAILED_HOST}"
${CONSUL_EXEC} kv put "${slave_pool}/${ORC_FAILED_HOST}" "${OldMasterIP}"
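
The snippet above covers only the slave pool. A fuller sketch that also moves the intermediate-master key is given below, assuming the key layout from the consul kv get -recurse output earlier in this thread (mysql/slave/ENV/host and mysql/intmaster/ENV/host, each holding an IP). The function name swap_consul_roles, its parameters, and the default for CONSUL_EXEC are hypothetical; how the IPs are looked up is left out.

```shell
#!/bin/sh
# CONSUL_EXEC mirrors the variable used in the snippet above;
# defaulting it to "consul" is an assumption.
CONSUL_EXEC="${CONSUL_EXEC:-consul}"

# swap_consul_roles ENV NEW_HOST NEW_IP OLD_HOST OLD_IP
# Moves NEW_HOST from the slave pool to the intmaster pool and OLD_HOST the
# other way around. Stops at the first consul command that fails.
swap_consul_roles() {
    env_name=$1
    new_host=$2
    new_ip=$3
    old_host=$4
    old_ip=$5
    slave_pool="mysql/slave/${env_name}"
    intmaster_pool="mysql/intmaster/${env_name}"

    "$CONSUL_EXEC" kv delete "${slave_pool}/${new_host}" &&
        "$CONSUL_EXEC" kv put "${slave_pool}/${old_host}" "$old_ip" &&
        "$CONSUL_EXEC" kv delete "${intmaster_pool}/${old_host}" &&
        "$CONSUL_EXEC" kv put "${intmaster_pool}/${new_host}" "$new_ip"
}
```

For the lab example, swap_consul_roles STAGE lab-applogdb-2 192.168.96.102 lab-applogdb-1 192.168.96.101 would produce the post take-master key layout shown earlier.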

@yakirgb

yakirgb commented Aug 11, 2020

Thank you, @daniel-2647.

@yakirgb

yakirgb commented Aug 11, 2020

Also, PreTakeMasterProcesses can be useful.
