Fix table configuration issues #4182

Closed
timmaxw opened this issue May 5, 2015 · 5 comments

Comments

@timmaxw
Member

timmaxw commented May 5, 2015

In raft we currently don't show issues in the web UI if there are problems with table configuration. This needs to be fixed.

This is a bit tricky because we can no longer check the table config on each server separately; instead, the servers hosting the data have to check the table config and then forward the issues over the network to the server reporting the issues. We already have a mechanism for this: the local issue tracker logic. But it's still a non-trivial amount of work.

Probably this should be done after #3897.

@timmaxw timmaxw added this to the raft milestone May 5, 2015
@timmaxw
Member Author

timmaxw commented May 5, 2015

Actually, there's a lazier way to do it: when the user requests the current_issues table, we can fetch the configuration of every table over the network, then analyze it for issues on the machine that's reporting to the user. This would require a lot less development effort than the alternative.
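To make the lazier approach concrete, here is a minimal sketch of it, not the actual RethinkDB code. It assumes two hypothetical callbacks: `fetch_config`, which stands in for the network fetch and returns `None` when no server hosting the table responds, and `analyze`, which inspects one fetched config on the reporting machine. The dict shape of an issue is also an assumption made for illustration.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical signatures; none of these names come from the RethinkDB codebase.
FetchConfig = Callable[[str], Optional[dict]]        # table name -> config, or None if unreachable
Analyze = Callable[[str, dict], Optional[dict]]      # (table name, config) -> issue, or None if healthy


def current_issues(table_names: Iterable[str],
                   fetch_config: FetchConfig,
                   analyze: Analyze) -> List[dict]:
    """Build the issue list on demand, on the server answering the query."""
    issues = []
    for name in table_names:
        # Pull the table's config over the network at query time.
        config = fetch_config(name)
        if config is None:
            # No hosting server answered, so there is nothing to analyze.
            issues.append({"table": name, "type": "config_unreachable"})
            continue
        # All analysis happens locally, on the reporting machine.
        issue = analyze(name, config)
        if issue is not None:
            issues.append(issue)
    return issues
```

The trade-off in this sketch is that every read of `current_issues` costs one fetch per table, in exchange for not having to push issues between servers ahead of time.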

@timmaxw
Member Author

timmaxw commented May 5, 2015

Table configuration issues we need to report:

  • Some server(s) listed in the config are unavailable, but the table is still available for reads and writes.
  • The table is unavailable for writes, but auto-failover should kick in any second now.
  • We've lost a quorum for one or more shards, so auto-failover is impossible and we can't perform writes (on this side of the netsplit, anyway). Variant: we've also lost a quorum for the table as a whole.
  • None of the servers hosting the table are available, so we couldn't even fetch the config.
  • All servers are available, but the table's write ack setting is too high to be satisfiable for some shards, so the table is not available for writes.

For any given table we should report at most one of the above issues. If multiple issues apply for different shards, we should report whichever one is most severe. We're already doing something similar in 2.0, but the details of the issues and how to detect them are different.
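The "at most one issue per table, most severe wins" rule could be sketched roughly as below. The enum names and, in particular, the severity ordering are assumptions for illustration only; the thread lists the issue kinds but does not rank them.

```python
from enum import IntEnum
from typing import Iterable, Optional


class TableIssue(IntEnum):
    # Hypothetical ranking, higher value = more severe. The ordering is an
    # assumption; the issue thread does not specify one.
    SERVERS_UNAVAILABLE_TABLE_OK = 1   # some servers down, reads and writes still work
    WRITES_DOWN_FAILOVER_PENDING = 2   # writes unavailable, auto-failover expected shortly
    WRITE_ACKS_UNSATISFIABLE = 3       # all servers up, write ack setting too high
    SHARD_QUORUM_LOST = 4              # quorum lost for one or more shards
    TABLE_QUORUM_LOST = 5              # quorum lost for the table as a whole
    CONFIG_UNREACHABLE = 6             # no hosting server reachable, config not fetched


def issue_for_table(shard_issues: Iterable[Optional[TableIssue]]) -> Optional[TableIssue]:
    """Collapse per-shard findings (None = healthy shard) into one issue per table."""
    found = [issue for issue in shard_issues if issue is not None]
    # max() compares by the integer severity; default=None covers a healthy table.
    return max(found, default=None)
```

For example, a table with one healthy shard, one shard reporting `SERVERS_UNAVAILABLE_TABLE_OK`, and one reporting `SHARD_QUORUM_LOST` would surface only `SHARD_QUORUM_LOST` under this ranking.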

@Tryneus
Member

Tryneus commented May 11, 2015

Working on this.

@Tryneus Tryneus self-assigned this May 11, 2015
@Tryneus
Member

Tryneus commented May 20, 2015

Up in review 2904.

@Tryneus
Member

Tryneus commented May 20, 2015

This has been approved and merged into raft in commit 664657a.

3 participants