Multiple IP address per NIC awareness #3333

Closed
wey-gu opened this issue Nov 19, 2021 · 10 comments

@wey-gu (Contributor) commented Nov 19, 2021

version: 2.6

If, for some reason, the host has multiple IP addresses on the same network, metad won't boot up:

https://discuss.nebula-graph.com.cn/t/topic/6540/10

[root@node2 logs]# cat nebula-metad.node2.root.log.ERROR.20211118-165644.35357
Log file created at: 2021/11/18 16:56:44
Running on machine: node2
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E1118 16:56:44.078560 35357 MetaDaemon.cpp:256] 10.210.38.69 is not a valid ip in current host, candidates: 10.210.38.70,127.0.0.1
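
For reference, here is a minimal standalone sketch (not NebulaGraph's actual code) of the kind of check behind this message: enumerate the host's IPv4 addresses with getifaddrs() and verify that the configured --local_ip is among them, printing the candidate list otherwise. Build it with g++ and pass the value of --local_ip to see which addresses the host actually exposes.

```cpp
// check_local_ip.cpp -- sketch only, NOT NebulaGraph's implementation.
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Collect every IPv4 address on the host, one entry per address, so a NIC
// carrying several addresses contributes several candidates.
static std::vector<std::string> hostIPv4Addresses() {
  std::vector<std::string> addrs;
  struct ifaddrs* ifas = nullptr;
  if (getifaddrs(&ifas) != 0) {
    return addrs;
  }
  for (struct ifaddrs* ifa = ifas; ifa != nullptr; ifa = ifa->ifa_next) {
    if (ifa->ifa_addr == nullptr || ifa->ifa_addr->sa_family != AF_INET) {
      continue;
    }
    char buf[INET_ADDRSTRLEN];
    auto* sin = reinterpret_cast<struct sockaddr_in*>(ifa->ifa_addr);
    if (inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf)) != nullptr) {
      addrs.emplace_back(buf);
    }
  }
  freeifaddrs(ifas);
  return addrs;
}

int main(int argc, char** argv) {
  std::string localIp = argc > 1 ? argv[1] : "10.210.38.69";
  auto candidates = hostIPv4Addresses();
  if (std::find(candidates.begin(), candidates.end(), localIp) == candidates.end()) {
    std::string joined;
    for (const auto& addr : candidates) {
      joined += joined.empty() ? addr : "," + addr;
    }
    std::fprintf(stderr, "%s is not a valid ip in current host, candidates: %s\n",
                 localIp.c_str(), joined.c_str());
    return 1;
  }
  std::printf("%s is an address of this host; ok to bind\n", localIp.c_str());
  return 0;
}
```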
@wey-gu added the type/bug label on Nov 19, 2021
@wey-gu (Contributor, Author) commented Nov 19, 2021

After follow-up questions, this appears not to be a bug, but a rare case where the host's IP address configuration is unusual.

@microeastcowboy commented

version: 3.3

I hit the same error when the NIC carries one more IP address (a VIP) on the same network.

[screenshot: 图片_lanxin_20221115144406]
The log is displayed as follows:
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
E20221115 14:22:23.162267 3723 MetaDaemon.cpp:131] 10.48.20.233 is not a valid ip in current host, candidates: 192.168.1.1,192.168.1.0,127.0.0.1,172.17.0.1,10.42.99.128

@wey-gu Do we have a better way to solve this problem?

@wey-gu (Contributor, Author) commented Nov 15, 2022

OK, I see. Now we have a minimal reproduction: when an interface has multiple addresses, only one of them is considered as a listening/configurable candidate.

Until a fix lands, could you please put the VIP on an IP range other than the one NebulaGraph uses for inter-node traffic? (Alternatively you could control the ordering of the VIP and the physical IP, though that seems impractical, since the floating VIP always ends up as the secondary address.)

@wey-gu reopened this on Nov 15, 2022
@wey-gu removed the need info label on Nov 15, 2022
@wey-gu (Contributor, Author) commented Nov 15, 2022

@Sophie-Xie with the help of @microeastcowboy, we are now able to reproduce this issue.

It's caused by the implicit assumption that each interface has only one address, which isn't always true.
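
For illustration only (this is not the actual NebulaGraph code), one way such a one-address-per-NIC assumption can drop candidates is storing discovered addresses in a map keyed by interface name, so a secondary address on eth0 silently overwrites the primary; that pattern would be consistent with the reproduction below, where the candidate list contains 192.168.2.4 but not 10.0.0.4.

```cpp
// one_per_nic.cpp -- sketch of the buggy shape, NOT NebulaGraph's code.
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cstdio>
#include <map>
#include <string>

// Buggy shape: one slot per NIC name.
std::map<std::string, std::string> onePerInterface() {
  std::map<std::string, std::string> byNic;
  struct ifaddrs* ifas = nullptr;
  if (getifaddrs(&ifas) != 0) {
    return byNic;
  }
  for (struct ifaddrs* ifa = ifas; ifa != nullptr; ifa = ifa->ifa_next) {
    if (ifa->ifa_addr == nullptr || ifa->ifa_addr->sa_family != AF_INET) {
      continue;
    }
    char buf[INET_ADDRSTRLEN];
    auto* sin = reinterpret_cast<struct sockaddr_in*>(ifa->ifa_addr);
    if (inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf)) != nullptr) {
      byNic[ifa->ifa_name] = buf;  // "eth0" -> 192.168.2.4 replaces 10.0.0.4
    }
  }
  freeifaddrs(ifas);
  return byNic;
}

int main() {
  for (const auto& [nic, addr] : onePerInterface()) {
    std::printf("%s -> %s\n", nic.c_str(), addr.c_str());  // one line per NIC, secondaries lost
  }
  return 0;
}
```

A fix would keep every (interface, address) pair instead of one address per NIC, as in the enumeration sketch earlier in this thread.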

@wey-gu changed the title from "Multiple IP address awareness" to "Multiple IP address per NIC awareness" on Nov 15, 2022
@Sophie-Xie added this to the v3.4.0 milestone on Nov 15, 2022
@critical27 (Contributor) commented

10.48.20.233 is not a valid ip in current host, candidates: 192.168.1.1,192.168.1.0,127.0.0.1,172.17.0.1,10.42.99.128

Do you have another log line like the one below?

10.42.99.128 is not a valid ip in current host, candidates: 192.168.1.1,192.168.1.0,127.0.0.1,172.17.0.1,10.42.99.128

@wey-gu (Contributor, Author) commented Nov 15, 2022

10.48.20.233 is not a valid ip in current host, candidates: 192.168.1.1,192.168.1.0,127.0.0.1,172.17.0.1,10.42.99.128

Do you have another log line like the one below?

10.42.99.128 is not a valid ip in current host, candidates: 192.168.1.1,192.168.1.0,127.0.0.1,172.17.0.1,10.42.99.128

Yes, I just did a quick reproduction :)

Oh, you mean you'd expect multiple lines to be logged, with the other address of the NIC listed in the candidates?

No, there is only this single log line, and the process did not come up.

$ sudo ip addr add 192.168.2.4/24 dev eth0

$ ip a
...
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0d:3a:81:05:8b brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.4/24 brd 10.0.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.2.4/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20d:3aff:fe81:58b/64 scope link
       valid_lft forever preferred_lft forever

$ grep local_ip /usr/local/nebula/etc/nebula-graphd.conf
--local_ip=10.0.0.4

$ sudo /usr/local/nebula/scripts/nebula.service restart graphd

$ tail /usr/local/nebula/logs/nebula-graphd.INFO
Log file created at: 2022/11/15 07:48:47
Running on machine: atlas-0
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
E20221115 07:48:47.858683 197064 GraphDaemon.cpp:110] 10.0.0.4 is not a valid ip in current host, candidates: 172.19.0.1,192.168.49.1,127.0.0.1,172.17.0.1,192.168.2.4

@critical27 (Contributor) commented

I see, @wey-gu. I will send you a patch later this week or next week; would you help verify it?

@wey-gu (Contributor, Author) commented Nov 15, 2022

I see, @wey-gu. I will send you a patch later this week or next week; would you help verify it?

Sure! Thanks @critical27, drop me the patch and I can verify it real quick :)

@microeastcowboy commented

@wey-gu Thank you very much for your suggestions and comments. I have moved the VIP to other addresses.

@wey-gu (Contributor, Author) commented Nov 17, 2022

@critical27 I tested the patch (applied with git apply), and it all looks good now.

$ ip -f inet addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.0.0.4/24 brd 10.0.0.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.2.4/24 scope global eth0
       valid_lft forever preferred_lft forever
$ sudo /usr/local/nebula/scripts/nebula.service restart graphd
[INFO] Stopping nebula-graphd...
[INFO] Done
[INFO] Starting nebula-graphd...
[INFO] Done
$ tail /usr/local/nebula/logs/nebula-graphd.INFO
I20221117 03:16:08.163239 213325 WebService.cpp:124] Web service started on HTTP[19669]
I20221117 03:16:08.163314 213324 GraphDaemon.cpp:136] Number of networking IO threads: 2
I20221117 03:16:08.163331 213324 GraphDaemon.cpp:145] Number of worker threads: 2
I20221117 03:16:08.167465 213324 MetaClient.cpp:80] Create meta client to "127.0.0.1":9559
I20221117 03:16:08.167500 213324 MetaClient.cpp:81] root path: /usr/local/nebula, data path size: 0
I20221117 03:16:08.182987 213324 MetaClient.cpp:3114] Load leader ok
I20221117 03:16:08.183965 213324 MetaClient.cpp:162] Register time task for heartbeat!
I20221117 03:16:08.184466 213324 GraphSessionManager.cpp:331] Total of 0 sessions are loaded
I20221117 03:16:08.185299 213324 Snowflake.cpp:16] WorkerId init success: 1
I20221117 03:16:08.185449 213352 GraphServer.cpp:59] Starting nebula-graphd on 10.0.0.4:9669
$ grep local_ip /usr/local/nebula/etc/nebula-graphd.conf
--local_ip=10.0.0.4
$ date
Thu Nov 17 03:17:02 UTC 2022
