Cluster Setup #5706

Merged
merged 4 commits into from
Feb 18, 2016
@jwilder jwilder commented Feb 16, 2016

This PR addresses most of the issues from #5673. Specifically, it changes the following:

  • Ensures node IDs are the same when a node is running both meta and data services
  • Allows bind addresses that omit the hostname or IP to work correctly and bind to all interfaces by default. For example, bind-address = ":8088" now binds to all interfaces instead of just localhost.
  • Fixes the top-level hostname config option to allow overriding all bind address hostnames. This allows a node to advertise a different hostname than the one defined in the bind address setting. For example, if the config is bind-address = ":8088" and hostname = "influx1", the node will bind to all interfaces on port 8088 and remote nodes will reach this node using the address influx1:8088. If a hostname is not specified, we default to localhost for backwards compatibility. This may change to os.Hostname() in the future if/when #5672 (Add gossip protocol for determining node addresses) is implemented.
  • Adds the -hostname command-line option back to allow specifying both -join and -hostname as command-line flags if desired.
  • Enforces a configuration precedence: the config file is overridden by env vars, which are overridden by command-line flags. These options are applied in that order and update the Config used by the services and code.
  • Adds the join config file option back to the meta config. This allows join servers to be specified in a config file, via env vars, or via command-line flags, with the same precedence as -hostname.
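
To illustrate the bind-address and hostname interaction described above, here is a minimal sketch. The function name `advertiseAddr` is hypothetical and not taken from this PR. Keeping an explicit bind host when no hostname override is set is also an assumption; the description only states the localhost default.

```go
package main

import (
	"fmt"
	"net"
)

// advertiseAddr is a hypothetical helper (not from the PR). It returns the
// address remote nodes would use to reach this node, given a "host:port"
// bind address (host may be empty) and an optional hostname override.
func advertiseAddr(bindAddr, hostname string) (string, error) {
	host, port, err := net.SplitHostPort(bindAddr)
	if err != nil {
		return "", err
	}
	switch {
	case hostname != "":
		// The top-level hostname overrides whatever the bind address says.
		host = hostname
	case host == "":
		// No override and no bind host: default to localhost, as described.
		host = "localhost"
	}
	// Assumption: an explicit bind host is kept when there is no override.
	return net.JoinHostPort(host, port), nil
}

func main() {
	addr, err := advertiseAddr(":8088", "influx1")
	if err != nil {
		panic(err)
	}
	fmt.Println(addr) // prints "influx1:8088"
}
```

The listener itself would still bind to ":8088" (all interfaces); only the advertised address changes.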

Fixes #5669

@corylanou

@jwilder jwilder added this to the 0.11.0 milestone Feb 16, 2016
@jwilder jwilder changed the title Use same node ID for meta and data nodes Cluster Setup Feb 17, 2016
@@ -1251,6 +1258,14 @@ func (e errRedirect) Error() string {
return fmt.Sprintf("redirect to %s", e.host)
}

type errCommand struct {

Is there a reason to make this a struct instead of just an aliased type? type errCommand string

No particular reason other than the error above did it this way: #5706 (diff)
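
For readers comparing the two options discussed here, a small sketch of both forms follows. The `msg` field and the name `errCommandString` are hypothetical; the diff only shows `type errCommand struct {` without its fields. Both forms satisfy the `error` interface and can be told apart from other errors with a type assertion.

```go
package main

import "fmt"

// Struct form, mirroring the style of errRedirect in the diff.
// The msg field is a hypothetical name for illustration.
type errCommand struct {
	msg string
}

func (e errCommand) Error() string { return e.msg }

// String-based form suggested in the review. It is given a different name
// here only so both versions can live in one file.
type errCommandString string

func (e errCommandString) Error() string { return string(e) }

func main() {
	errs := []error{
		errCommand{msg: "node not found"},
		errCommandString("node not found"),
	}
	for _, err := range errs {
		// A type switch distinguishes the two error types.
		switch err.(type) {
		case errCommand:
			fmt.Println("struct form:", err)
		case errCommandString:
			fmt.Println("string form:", err)
		}
	}
}
```

The struct form leaves room to add fields later without changing call sites that construct the error by field name; the string form is shorter.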

e-dard commented Feb 18, 2016

Small nit but LGTM 👍

-hostname is back \o/

This fixes several issues related to the bind address and hostname:
* Allows bind addresses where a hostname or IP is not specified to
work correctly and bind to all interfaces by default.
* Fixes the top-level "hostname" config option to allow overriding
all bind address hostnames.  This allows a node to advertise a different
hostname than what is defined in the bind address setting.
* Adds the -hostname command-line option back to allow specifying
both -join and -hostname as command-line flags.
* Enforces a configuration precedence and overriding ability defined
as config file is overridden by env vars which are overridden by command-line
flags.

Fixes #5670 #5671
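
The precedence described in this commit (config file, then env vars, then command-line flags) can be sketched as follows. The `Config` fields, the `applyOverrides` function, and the `EXAMPLE_*` environment variable names are all hypothetical; the real names are not shown in this PR page. An empty value is treated as "not set", which is a simplifying assumption.

```go
package main

import (
	"flag"
	"fmt"
)

// Config holds only the two options discussed here; field names are hypothetical.
type Config struct {
	Hostname string
	Join     string
}

// applyOverrides starts from values loaded from the config file, then applies
// env vars, then command-line flags, so that later sources win.
// getenv would be os.Getenv in a real program; it is injected for testability.
func applyOverrides(c Config, getenv func(string) string, args []string) (Config, error) {
	// Env vars override the config file. Variable names are hypothetical.
	if v := getenv("EXAMPLE_HOSTNAME"); v != "" {
		c.Hostname = v
	}
	if v := getenv("EXAMPLE_JOIN"); v != "" {
		c.Join = v
	}

	// Command-line flags override env vars.
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	hostname := fs.String("hostname", "", "hostname to advertise")
	join := fs.String("join", "", "comma-separated servers to join")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if *hostname != "" {
		c.Hostname = *hostname
	}
	if *join != "" {
		c.Join = *join
	}
	return c, nil
}

func main() {
	fromFile := Config{Hostname: "from-file", Join: "file1:8091"}
	env := map[string]string{"EXAMPLE_HOSTNAME": "from-env", "EXAMPLE_JOIN": "env1:8091"}
	getenv := func(k string) string { return env[k] }

	c, err := applyOverrides(fromFile, getenv, []string{"-hostname", "from-flag"})
	if err != nil {
		panic(err)
	}
	fmt.Println("hostname:", c.Hostname) // flag wins over env and file
	fmt.Println("join:", c.Join)         // env wins over file; no flag given
}
```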
Dropping a meta node that had already been removed from the config
would fail because the raft.RemovePeers call would return an error
that the address was unknown.  This change skips calling RemovePeer
if the peer doesn't exist.

Dropping a non-existing ID would hang for 10 seconds because the
meta.Client retryUntilExec didn't differentiate between command errors
and redirect errors.  In this case, the command would return an error
but we'd try 10 more times and ultimately give up and return the error.
We now return immediately if the command returned an error because
retrying it will not succeed.
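
A rough sketch of the retry behavior described above follows. It is not the actual meta.Client code: the signature, the `exec` callback, the `msg` field, and the `maxRetries` constant are assumptions, and any delay between attempts is omitted. The point is only that a command error ends the loop at once, while a redirect error moves on to another server.

```go
package main

import (
	"errors"
	"fmt"
)

// errRedirect matches the shape visible in the diff.
type errRedirect struct{ host string }

func (e errRedirect) Error() string { return fmt.Sprintf("redirect to %s", e.host) }

// errCommand marks an error returned by the command itself.
// The msg field is hypothetical.
type errCommand struct{ msg string }

func (e errCommand) Error() string { return e.msg }

const maxRetries = 10 // assumed retry budget

// retryUntilExec is a simplified stand-in. exec is a hypothetical callback
// that runs the command against one server. The number of attempts is
// returned so the behavior is easy to observe.
func retryUntilExec(servers []string, exec func(server string) error) (attempts int, err error) {
	if len(servers) == 0 {
		return 0, errors.New("no servers configured")
	}
	current := 0
	for attempts < maxRetries {
		attempts++
		err = exec(servers[current])
		if err == nil {
			return attempts, nil
		}
		if _, ok := err.(errCommand); ok {
			// The command ran and failed; retrying will not succeed.
			return attempts, err
		}
		// Redirect or transport error: try the next server.
		current = (current + 1) % len(servers)
	}
	return attempts, err
}

func main() {
	servers := []string{"meta1:8091", "meta2:8091"}
	attempts, err := retryUntilExec(servers, func(string) error {
		return errCommand{msg: "node not found"}
	})
	fmt.Println(attempts, err) // prints "1 node not found"
}
```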

Finally, the join loop had no delay and would immediately try to join
the other nodes hundreds of times a second.  We now pause a second if we've
tried every node at least once.
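
The join-loop change can be sketched like this. The function name, the `join` callback, and the `maxPasses` bound are hypothetical; the bound exists only so the example terminates. The pause is a parameter here, whereas the commit describes a one-second pause.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnreachable is a placeholder failure used by the example.
var errUnreachable = errors.New("peer unreachable")

// joinCluster tries each peer in turn. After every peer has been tried once
// without success, it pauses before the next pass instead of spinning.
func joinCluster(peers []string, join func(peer string) error, pause time.Duration, maxPasses int) (string, error) {
	if len(peers) == 0 {
		return "", errors.New("no peers to join")
	}
	for pass := 0; pass < maxPasses; pass++ {
		if pass > 0 {
			// Every peer was tried at least once; back off before retrying.
			time.Sleep(pause)
		}
		for _, p := range peers {
			if err := join(p); err == nil {
				return p, nil
			}
		}
	}
	return "", errors.New("unable to join any peer")
}

func main() {
	calls := 0
	join := func(peer string) error {
		calls++
		if calls <= 2 { // the whole first pass fails
			return errUnreachable
		}
		return nil
	}
	peer, err := joinCluster([]string{"influx1:8091", "influx2:8091"}, join, 10*time.Millisecond, 5)
	fmt.Println(peer, err, calls)
}
```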