Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STORM-1492 With nimbus.seeds set to default, a nimbus for localhost may appear "Offline" #1082

Merged
merged 1 commit into from
Sep 11, 2017

Conversation

HeartSaVioR
Copy link
Contributor

First of all, AFAIK, it's just an issue of API output, and it doesn't hurt functionality although we don't fix it.

  • when determining nimbus addresses from nimbus.seeds, filter out localhost addresses
    • localhost address: 127.0.0.1, localhost, 0:0:0:0:0:0:0:1
  • there may be a chance that nimbus (localhost) is not reachable so we can't get info from alive-nimbuses
    • but it means ui process itself can't reach nimbus so we can ignore this case
  • there may be another change that user specifies nimbus.seeds to both unreachable localhost address and other reachable nimbuses fqdn
    • ui process can reach other nimbus and localhost itself is lost
    • but I think we can treat it as configuration error

@HeartSaVioR
Copy link
Contributor Author

Three builds are failing, which are all intermittent.

Close and reopen to retrigger.

@HeartSaVioR HeartSaVioR closed this Feb 5, 2016
@HeartSaVioR HeartSaVioR reopened this Feb 5, 2016
@HeartSaVioR
Copy link
Contributor Author

Probability of random test failures becomes very high, though I don't know why.
Anyway, I ran test which is same to travis CI but without native profile (it doesn't run on my Mac) and build succeed on storm-core. So test failure is not related to this patch.

Btw, build fails on storm-hdfs, throwing java.lang.NoClassDefFoundError: org/apache/storm/thrift/TBase which I don't know why it occurs.

@HeartSaVioR
Copy link
Contributor Author

@ptgoetz @revans2
Could you please take a look? Or Bobby you want to craft your pull request with your suggestion on issue comment?
Otherwise we may want to define STORM-1492 as 'Won't fix'?

@wuchong
Copy link
Member

wuchong commented Feb 26, 2016

I find that when we set nimbus.seeds as ip, such as nimbus.seeds: [10.0.0.10]. Then a nimbus for 10.0.0.10 will be 'Offline'.

That's because the nimbus information saved in zk is CanonicalHostName. But the nimbus seeds we set maybe ip. At last, the nimbus with ip will be 'Offline'.

offline-nimbuses (clojure.set/difference nimbus-seeds alive-nimbuses)

2016-02-26 2 51 43

Change all nimbus-seeds to hostname, before do set/difference, will fix it.

@HeartSaVioR
Copy link
Contributor Author

@wuchong
You're right. Actually this PR just tries to address weird behavior when users leave the setting to default.

Btw, there're more to cover if we want to address this to more ideal way.
For example, canonical host name for each machine can be overwritten by setting storm.local.hostname.
Please refer NimbusConf#fromConf for details.

In result, every nimbuses should know what's the return value of NimbusConf#fromConf for other nimbuses, and every nimbuses should match nimbus.seeds and corresponding values by 1 on 1.
I don't have an idea how to do it for now, and I really appreciate ideas which address this.

Otherwise users should describe the same value which NimbusConf#fromConf will return, for nimbus.seeds.

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 this change looks fine to me. We do need to upmerge it as Utils has moved around some.

…ay appear "Offline"

* when determining nimbus addresses from nimbus.seeds, filter out localhost addresses
  * localhost address: 127.0.0.1, localhost, 0:0:0:0:0:0:0:1
* there may be a chance that nimbus (localhost) is not reachable so we can't get info from alive-nimbuses
  * but it means ui process itself can't reach nimbus so we can ignore this case
* there may be another change that user writes nimbus.seeds both unreachable localhost address and other reachable nimbuses fqdn
  * ui process can reach other nimbus and localhost itself is lost
  * but I think we can treat it as configuration error
@HeartSaVioR
Copy link
Contributor Author

@revans2 Thanks for reviewing. Just got rebased against current master.

@revans2
Copy link
Contributor

revans2 commented Sep 11, 2017

Still +1

@asfgit asfgit merged commit 6fbd3d4 into apache:master Sep 11, 2017
asfgit pushed a commit that referenced this pull request Sep 11, 2017
…o STORM-1492Q

STORM-1492 With nimbus.seeds set to default, a nimbus for localhost may appear "Offline"

This closes #1082
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants