Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add structure warning to replication-analysis when all masters are read_only #878

Merged
merged 22 commits into from Jun 20, 2019

Conversation

jfudally
Copy link
Contributor

@jfudally jfudally commented May 8, 2019

Add structure warning to replication-analysis when all master nodes are read_only

A Pull Request should be associated with an Issue.

We wish to have discussions in Issues. A single issue may be targeted by multiple PRs.
If you're offering a new feature or fixing anything, we'd like to know beforehand in Issues,
and potentially we'll be able to point development in a particular direction.

Related issue: #865

Thank you! We are open to PRs, but please understand if for technical reasons we are unable to accept each and any PR

Description

This PR adds NoWriteableMasterStructureWarning to replication-analysis during scenarios when every master node in the cluster is read_only, omitting intermediate masters.

# Cluster with cyclical replication between two masters, and an intermediate master
bash-4.4# ./orchestrator -c topology -i mysql_1
+ mysql_1:3306     [0s,ok,5.7.25-log,rw,ROW]
  + mysql_2:3306   [0s,ok,5.7.25-log,ro,ROW]
    + mysql_3:3306 [0s,ok,5.7.25-log,ro,ROW]
+ mysql_4:3306     [0s,ok,5.7.25-log,rw,ROW]
  + mysql_5:3306   [0s,ok,5.7.25-log,ro,ROW]

# No warnings for the intermediate master
bash-4.4# ./orchestrator -c replication-analysis -i mysql_1

# mysql_1 master is now R/O
bash-4.4# ./orchestrator -c topology -i mysql_1
+ mysql_1:3306     [0s,ok,5.7.25-log,ro,ROW]
  + mysql_2:3306   [0s,ok,5.7.25-log,ro,ROW]
    + mysql_3:3306 [0s,ok,5.7.25-log,ro,ROW]
+ mysql_4:3306     [0s,ok,5.7.25-log,rw,ROW]
  + mysql_5:3306   [0s,ok,5.7.25-log,ro,ROW]

# No warnings from 1 of 2 co-masters being R/O
bash-4.4# ./orchestrator -c replication-analysis -i mysql_1

# Both co-masters mysql_1 & mysql_3 are now R/O
bash-4.4# ./orchestrator -c topology -i mysql_1
+ mysql_1:3306     [0s,ok,5.7.25-log,ro,ROW]
  + mysql_2:3306   [0s,ok,5.7.25-log,ro,ROW]
    + mysql_3:3306 [0s,ok,5.7.25-log,ro,ROW]
+ mysql_4:3306     [0s,ok,5.7.25-log,ro,ROW]
  + mysql_5:3306   [0s,ok,5.7.25-log,ro,ROW]

🎉 
bash-4.4# ./orchestrator -c replication-analysis -i mysql_1
mysql_4:3306 (cluster mysql_1:3306): NoWriteableMasterStructureWarning

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • code is formatted via gofmt (please avoid goimports)
  • code is built via ./build.sh
  • code is tested via go test ./go/...

@jfudally jfudally requested a review from shlomi-noach May 8, 2019 15:15
@@ -64,6 +64,7 @@ const (
DifferentGTIDModesStructureWarning = "DifferentGTIDModesStructureWarning"
ErrantGTIDStructureWarning = "ErrantGTIDStructureWarning"
NoFailoverSupportStructureWarning = "NoFailoverSupportStructureWarning"
NoWriteableNodesWarning = "NoWriteableNodesStructureWarning"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be concerned with non-writable master, for sure. The rest of the nodes are practically irrelevant (that is, we expect them to be read-only anyhow).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd name this: NonWritableMasterWarning

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shlomi-noach Make sense, I've updated to warn if the current master is read-only.

@shlomi-noach
Copy link
Collaborator

To elaborate, on a normal topology, we only expect the master to be writable. Right now we're interested in a scenario where the master is read-only. We normally expect (and hope) for replicas to be read-only, too, but I've seen cases out there where intermediate master could be taking writes. Since we're aiming at the global audience, my suggestion is to let go of looking into the replicas.

So we should only care if we have a read-only master.

Another thing is that we want to check is the case where we have co-masters. In a co-master (circular master-master) setup, it's OK if one of the co-masters is read-only. It's also OK (though generally discouraged) if both co-masters are writable. It's not OK if both are read-only...

@jfudally jfudally requested a review from shlomi-noach May 8, 2019 16:27
@jfudally
Copy link
Contributor Author

jfudally commented May 8, 2019

I'l take a look at the co-master scenarios as well

@jfudally
Copy link
Contributor Author

@shlomi-noach I've updated the PR body. I've tested new logic to omit warnings on intermediate masters being R/O, and if at least 1 co-master is still R/W.

@@ -90,15 +91,17 @@ func GetReplicationAnalysis(clusterName string, hints *ReplicationAnalysisHints)
SELECT
master_instance.hostname,
master_instance.port,
MIN(master_instance.data_center) AS data_center,
MIN(master_instance.physical_environment) AS physical_environment,
master_instance.read_only AS read_only,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

MIN(master_instance.data_center) AS data_center,
MIN(master_instance.physical_environment) AS physical_environment,
master_instance.read_only AS read_only,
SUM(master_instance.is_co_master) AS co_master_count,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be either 0 or 1, not what you'd expect. See later discussion.

@@ -296,6 +301,10 @@ func GetReplicationAnalysis(clusterName string, hints *ReplicationAnalysisHints)
a.CountDelayedReplicas = m.GetUint("count_delayed_replicas")
a.CountLaggingReplicas = m.GetUint("count_lagging_replicas")

if a.IsCoMaster && m.GetUint("read_only") == 1 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please read m.GetUint("read_only") into a variable first. In fact, it's really interesting information, make that a member in the ReplicationAnalysis struct

@@ -296,6 +301,10 @@ func GetReplicationAnalysis(clusterName string, hints *ReplicationAnalysisHints)
a.CountDelayedReplicas = m.GetUint("count_delayed_replicas")
a.CountLaggingReplicas = m.GetUint("count_lagging_replicas")

if a.IsCoMaster && m.GetUint("read_only") == 1 {
coMasterROCount++
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This look can iterate over multiple clusters. You only want to count the number of co masters per cluster. So this trick of coMasterROCount++ won't work.

A cluster with two co-masters appears in orchestrator as a single cluster, not as two clusters. One of the two co-masters is deemed to be the "primary" master. If one is read-only and another is writable, it's the writable. If both are read-only or both are writable, I believe the choice is alphabetic.

So, as I'm writing this, I'm rethinking our earlier discussion. if the master is a co_master and is read_only, then the other co-master is also read_only (or else IT would have been the primary master!). This simplifies things and you can get rid of coMasterROCount.

@@ -477,6 +486,14 @@ func GetReplicationAnalysis(clusterName string, hints *ReplicationAnalysisHints)
if a.MaxReplicaGTIDErrant != "" {
a.StructureAnalysis = append(a.StructureAnalysis, ErrantGTIDStructureWarning)
}

if a.IsMaster && m.GetUint("read_only") == 1 && m.GetString("master_host") == "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"master_host" is already read in a.AnalyzedInstanceMasterKey. You can ask if a.AnalyzedInstanceMasterKey.IsValid() (== non empty)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given earlier discussion, I think we can skip the check about master_host altogether.

if a.IsMaster && m.GetUint("read_only") == 1 && m.GetString("master_host") == "" {
a.StructureAnalysis = append(a.StructureAnalysis, NoWriteableMasterStructureWarning)
}
if a.IsCoMaster && coMasterROCount == m.GetInt("co_master_count") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given earlier discussion, I think we can skip this block altogether.

@jfudally jfudally changed the title Add structure warning to replication-analysis when all nodes are read_only Add structure warning to replication-analysis when all masters are read_only Jun 3, 2019
Copy link
Collaborator

@shlomi-noach shlomi-noach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Let's deploy to our staging environment and experiment

@@ -90,6 +91,7 @@ func GetReplicationAnalysis(clusterName string, hints *ReplicationAnalysisHints)
SELECT
master_instance.hostname,
master_instance.port,
master_instance.read_only AS read_only,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inconsistent indentation here

@jfudally
Copy link
Contributor Author

jfudally commented Jun 6, 2019

Sounds good...as soon as Travis decides to work I can deploy to our test environment.

@jfudally jfudally temporarily deployed to production/mysql_cluster=mysql6 June 6, 2019 21:11 Inactive
@shlomi-noach
Copy link
Collaborator

Travis is used for our external facing builds; our internal builds are all that is needed (and at this time we will need to kick the build manually).

@jfudally jfudally temporarily deployed to production/mysql_cluster=mysql6 June 7, 2019 07:15 Inactive
@jfudally jfudally temporarily deployed to production/mysql_cluster=concertmaster June 10, 2019 13:33 Inactive
@jfudally jfudally temporarily deployed to production/mysql_cluster=mysql6 June 10, 2019 14:24 Inactive
@jfudally jfudally had a problem deploying to production/mysql_cluster=concertmaster June 10, 2019 14:24 Failure
@jfudally jfudally temporarily deployed to production/mysql_cluster=mysql6 June 10, 2019 14:26 Inactive
@jfudally jfudally temporarily deployed to production/mysql_cluster=concertmaster June 10, 2019 14:26 Inactive
@jfudally jfudally temporarily deployed to production/mysql_cluster=mysql6 June 20, 2019 08:51 Inactive
@shlomi-noach shlomi-noach had a problem deploying to production/mysql_cluster=concertmaster June 20, 2019 08:51 Failure
@shlomi-noach
Copy link
Collaborator

Good to go! Identifies a read-only master on a single-master topology. Future PR to handle co-masters.

@shlomi-noach shlomi-noach merged commit b0cab8d into master Jun 20, 2019
@shlomi-noach shlomi-noach deleted the jfudally-ro-structure-warning branch June 20, 2019 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants