Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node redirection #2154

Merged
merged 3 commits into from
Apr 4, 2015
Merged

Node redirection #2154

merged 3 commits into from
Apr 4, 2015

Conversation

jwilder
Copy link
Contributor

@jwilder jwilder commented Apr 2, 2015

This change allows a given node to handle broker (/raft, /messages) and data (/data_nodes_*, etc..) endpoints even when they are not operating as that role. The behavior works as follows:

For a broker only node, requests for broker endpoints are handled as before. Requests for data node endpoints are returned with a HTTP redirect to the first data node URL that the broker knows about. Brokers learn about data node URLs through the heartbeat for the topics they are subscribed.

For a data only node, requests for broker endpoints are handled as before. Requests for broker node endpoints are returned with a HTTP redirect to the first broker node URL that the data node knows about. Data nodes learn about brokers boot time through join URLs or through the config store in the metastore on a restart.

When a node is running as both a broker and a data node, the all requests are handled by the node itself.

@jwilder
Copy link
Contributor Author

jwilder commented Apr 2, 2015

4f5bf04 might fix #2152

return
}

if h.Server == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could this happen? If we're not a broker, and not a server, something has gone badly wrong, right?? If we do want this check, I feel like it should be at the top of the function, i.e.

if h.Server == nil && h.Broker == nil {
    http.Error(w, "invalid node state", http.StatusInternalError)
    return
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can happen when starting several servers concurrently and they race to connect to each other. Happens in the tests frequently and could happen in real world too. The real issue is that the listeners are started before the broker/server are ready at server startup. I originally tried to fix this by starting the listener last but it caused other startup problems. Going to be revisiting that and this block after #1934 lands.

@otoolep
Copy link
Contributor

otoolep commented Apr 3, 2015

OK, this generally makes sense. I'd like some feedback on my review comments.

@jwilder
Copy link
Contributor Author

jwilder commented Apr 3, 2015

All comments addressed.

@otoolep
Copy link
Contributor

otoolep commented Apr 3, 2015

Thanks for the changes. +1

@otoolep
Copy link
Contributor

otoolep commented Apr 3, 2015

CHANGELOG needs updating. I think this is a feature.

This is a pre-requisite for #1934.  When running separate
broker and data nodes, you currently need to know what role
a host is performing.  This complicates cluster setup in
that you must configure separate broker URLs and data node
URLs.

This change allows a broker only node to redirect data nodes endpoints
to a valid data node and a data only node to redirect broker
endpoints to a valid broker.
When a data node starts up, the broker URLs were not set before
they were actually being used.  The call to client.Open() in
turn triggers the raft streamer and heartbeat which try to connect
to the broker.   If those started before the subsequent client.SetURLs()
call, you would see the following error in the logs at startup:

[messaging] 2015/04/01 11:59:22 reconnecting to broker: url={  <nil>  /messaging/messages index=2&streaming=true&topicID=0 }, err=Get /messaging/messages?index=2&streaming=true&topicID=0: unsupported protocol scheme ""

Fixing this race uncovered another bug where the join urls would be
cleared the first time the broker was started.  In this case, the
join urls should be left alone since they were set properly w/ SetURLs.

Fixes #2152
jwilder added a commit that referenced this pull request Apr 4, 2015
@jwilder jwilder merged commit e6a7d6b into master Apr 4, 2015
@jwilder jwilder deleted the redirects branch April 4, 2015 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants