Dead Node #23

Open
challengeteamttdh opened this issue May 2, 2016 · 18 comments

@challengeteamttdh

Hi,

I am using this library in my application and have run into an issue: when I shut down a node and start it again, the other nodes do not see it.
Example: I have two nodes, 1 and 2. First I start both nodes. Then I shut down node 2 and start it again, but node 2 does not know that node 1 is UP, and node 1 does not know that node 2 is UP.
Please help me resolve this issue.
Thanks for your support.

@edwardcapriolo
Owner

Questions

  1. Are you sure your system clocks are in sync?
  2. How long was the node down for?
  3. Can you set the logging on both servers to debug and record any relevant output?
  4. What is your configuration? What are the initial contact points?

We have a unit test which does this with 5 nodes, so it would be interesting to understand whether the same logic works with two nodes.

@challengeteamttdh
Author

Currently I am applying gossip to a Spring Boot application. Each node is an instance of Spring Boot. Please give me some advice on applying gossip to a Spring Boot application.

This is my configuration for Application 1 with port 8081.
[{
"cluster":"",
"id":"",
"port":8081,
"gossip_interval":1000,
"cleanup_interval":10000,
"members":[
{"cluster": "","id": "", "host":"192.168.1.90", "port":8084},
{"cluster": "","id": "", "host":"192.168.1.90", "port":8083},
{"cluster": "","id": "", "host":"192.168.1.90", "port":8082}
]
}]

This is my configuration for Application 2 with port 8082.
[{
"cluster":"",
"id":"",
"port":8082,
"gossip_interval":1000,
"cleanup_interval":10000,
"members":[
{"cluster": "","id": "", "host":"192.168.1.90", "port":8084},
{"cluster": "","id": "", "host":"192.168.1.90", "port":8083},
{"cluster": "","id": "", "host":"192.168.1.90", "port":8081}
]
}]
Let me know if I'm wrong.

@edwardcapriolo
Owner

Each node needs an id. In your case you can generate a string or a UUID that persists between restarts.
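
A minimal sketch of one way to get an id that survives restarts, assuming the id can simply be kept in a small file next to the application. The class name `NodeIds` and the id file are hypothetical and are not part of the gossip library; only standard JDK calls are used.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper, not part of the gossip library.
final class NodeIds {
    private NodeIds() {}

    /**
     * Returns the id stored in idFile. On first use (file missing or empty)
     * a random UUID is generated, written to the file, and returned, so
     * later restarts read the same value back.
     */
    static String loadOrCreate(Path idFile) throws IOException {
        if (Files.exists(idFile)) {
            String stored = new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
            if (!stored.isEmpty()) {
                return stored;
            }
        }
        String id = UUID.randomUUID().toString();
        Files.write(idFile, id.getBytes(StandardCharsets.UTF_8));
        return id;
    }
}
```

Calling `NodeIds.loadOrCreate(Paths.get("gossip-node.id"))` at startup and using the result as the `"id"` value would give each node a stable identity, as long as each node has its own id file.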

@challengeteamttdh
Author

My application is using

    <dependency>
      <groupId>io.teknek</groupId>
      <artifactId>gossip</artifactId>
      <version>0.0.3</version>
    </dependency>

Maybe it does not generate an id when this constructor is used:

    public GossipService(StartupSettings startupSettings)
        throws InterruptedException, UnknownHostException {
      this(InetAddress.getLocalHost().getHostAddress(), startupSettings.getPort(), "",
          startupSettings.getGossipMembers(), startupSettings.getGossipSettings(), null);
    }

Is that right? Version 0.0.3 is different from the latest code on GitHub. Do you have an updated version on Maven?

@edwardcapriolo
Owner

Yes. This looks like a bug in that version. The id was not required in the original versions but now it is. Can you please try the trunk version? I will release the current trunk later today.

@challengeteamttdh
Author

I reviewed the code. I think the problem is in the StartupSettings class:
RemoteGossipMember member = new RemoteGossipMember(memberJSON.getString("cluster"), memberJSON.getString("host"), memberJSON.getInt("port"), "");
It also needs to generate an ID for the RemoteGossipMember.
I have switched to the latest code, but it still fails. Please help me resolve this issue. I hope you can make a release today.
Thanks for your support on this issue.

@edwardcapriolo
Owner

RemoteGossipMember member = new RemoteGossipMember(memberJSON.getString("cluster"),
memberJSON.getString("host"), memberJSON.getInt("port"), "");

This code is ok. We would not know the remote id until we connect to that host.

Can you give a stripped-down example of your Spring Boot application?

@challengeteamttdh
Author

When I use the latest code, an exception occurs. I don't know why.

Exception in thread "pool-6-thread-1" java.lang.NullPointerException
    at com.google.code.gossip.manager.PassiveGossipThread.run(PassiveGossipThread.java:102)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)

This is my configuration for gossip
[{
"cluster":"1",
"id":"1",
"port":8081,
"gossip_interval":1000,
"cleanup_interval":10000,
"members":[
{"cluster": "1","id": "4", "host":"192.168.1.90", "port":8084, "heartbeat":0},
{"cluster": "1","id": "3", "host":"192.168.1.90", "port":8083, "heartbeat":0},
{"cluster": "1","id": "2", "host":"192.168.1.90", "port":8082, "heartbeat":0}
]
}]

@challengeteamttdh
Author

This is my sample code. Please review it.
https://github.com/challengeteamttdh/springbootgossip
Thanks. My application has a scheduled task that runs every 20s. It prints a number based on the number of live nodes and the position of this node among them. When a node goes DOWN or comes UP, the other nodes need to know so that they can update the rule for printing numbers.
Thank you very much for your support.

@edwardcapriolo
Owner

    if (memberJSONObject.length() == 5
                  && cluster.equals(memberJSONObject.get(GossipMember.JSON_CLUSTER))) {

This is a new piece of code. I will look at this.
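
To make the quoted condition easier to reason about, here is a small stand-alone sketch of the same shape of check, assuming a member record is a map with the five keys used in the configs in this thread (cluster, id, host, port, heartbeat). `ClusterFilter`, its `member` helper, and the `"cluster"` key constant are hypothetical stand-ins, not the library's classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the check quoted above; not the library's code.
final class ClusterFilter {
    // Assumed key name, standing in for GossipMember.JSON_CLUSTER.
    static final String JSON_CLUSTER = "cluster";

    private ClusterFilter() {}

    /** Accepts a member only if it has exactly five fields and the same cluster name. */
    static boolean accepts(String localCluster, Map<String, Object> member) {
        return member.size() == 5 && localCluster.equals(member.get(JSON_CLUSTER));
    }

    /** Builds a five-field member record like the entries in gossip.conf. */
    static Map<String, Object> member(String cluster, String id, String host, int port, long heartbeat) {
        Map<String, Object> m = new HashMap<>();
        m.put(JSON_CLUSTER, cluster);
        m.put("id", id);
        m.put("host", host);
        m.put("port", port);
        m.put("heartbeat", heartbeat);
        return m;
    }
}
```

Under this reading, a record whose cluster name differs from the local one is skipped, and a null local cluster name would make `localCluster.equals(...)` throw a NullPointerException. Whether either of these matches the behaviour at PassiveGossipThread.java:102 would need to be checked against the real source.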

@edwardcapriolo
Owner

I found the bug you mentioned. The startup settings code was not setting the cluster name. I am looking at the unit test there because it is suspect. Sorry for the problems. Really cool app; I want to take a deeper look at it. Please try the latest trunk again. Sorry for the issues; the cluster name is a new bit and I do not use the StartupSettings code path!

@challengeteamttdh
Author

I updated the code and the exception no longer occurs. But when I start 2 instances of Spring Boot on ports 8081 and 8082, with the corresponding gossip.conf files:

  • 8081
    [{
    "cluster":"1",
    "id":"1",
    "port":8081,
    "gossip_interval":1000,
    "cleanup_interval":10000,
    "members":[
    {"cluster": "1","id": "4", "host":"192.168.1.90", "port":8084, "heartbeat":0},
    {"cluster": "1","id": "3", "host":"192.168.1.90", "port":8083, "heartbeat":0},
    {"cluster": "1","id": "2", "host":"192.168.1.90", "port":8082, "heartbeat":0}
    ]
    }]
  • 8082
    [{
    "cluster":"2",
    "id":"2",
    "port":8082,
    "gossip_interval":1000,
    "cleanup_interval":10000,
    "members":[
    {"cluster": "2","id": "4", "host":"192.168.1.90", "port":8084, "heartbeat":0},
    {"cluster": "2","id": "3", "host":"192.168.1.90", "port":8083, "heartbeat":0},
    {"cluster": "2","id": "1", "host":"192.168.1.90", "port":8081, "heartbeat":0}
    ]
    }]

The nodes still don't know when a member node is UP.
First, I set the port to 8081 in application.properties, use the gossip.conf for 8081, and start Spring Boot.
Second, I set the port to 8082 in application.properties, use the gossip.conf for 8081, and start Spring Boot.
However, the nodes do not know whether each other is UP or DOWN.
Please take a look at this. I would really love to integrate your gossip code into my application. Please spend some time helping me resolve this issue.

@edwardcapriolo
Owner

Great. Keep in mind that getMemberList() does not include yourself, so in a two-node cluster each node sees itself plus a getMemberList() of size 1.

@challengeteamttdh
Author

What do you mean? Is my implementation incorrect? What do I need to do to fix this?

@edwardcapriolo
Owner

The only thing I am saying is: the member list does not include the local member. The local member is assumed.
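
A rough sketch of how an application could account for this when counting nodes, assuming it already has the ids of the remote members it currently sees as live. `ClusterView` and both method names are hypothetical. The methods take plain lists so that the sketch does not depend on any particular gossip API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; the inputs are plain id lists, not library types.
final class ClusterView {
    private ClusterView() {}

    /** Total live nodes: the local node (assumed alive) plus the remote members seen as live. */
    static int liveNodes(List<String> remoteLiveIds) {
        return 1 + remoteLiveIds.size();
    }

    /** Zero-based position of the local node among all live ids, ordered lexicographically. */
    static int position(String localId, List<String> remoteLiveIds) {
        List<String> all = new ArrayList<>(remoteLiveIds);
        all.add(localId);
        Collections.sort(all);
        return all.indexOf(localId);
    }
}
```

In the two-node case, node "1" seeing only node "2" gets `liveNodes` of 2 and `position` of 0, while node "2" gets `position` of 1.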

@challengeteamttdh
Author

Do you have any ideas for my application? I don't know how to apply gossip to it. How does one instance of Spring Boot learn about the other instances?

@edwardcapriolo
Owner

How do you start two copies of the application?

mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8081'
mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8082'

When I do this they take the same config.

@challengeteamttdh
Author

You need to change the port in gossip.conf to match the port of the Spring Boot instance.
This is gossip.conf for port 8081:
[{
"cluster":"1",
"id":"1",
"port":8081,
"gossip_interval":1000,
"cleanup_interval":10000,
"members":[
{"cluster": "1","id": "4", "host":"192.168.1.90", "port":8084, "heartbeat":0},
{"cluster": "1","id": "3", "host":"192.168.1.90", "port":8083, "heartbeat":0},
{"cluster": "1","id": "2", "host":"192.168.1.90", "port":8082, "heartbeat":0}
]
}]
Then run mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8081'

This is gossip.conf for port 8082:
[{
"cluster":"2",
"id":"2",
"port":8082,
"gossip_interval":1000,
"cleanup_interval":10000,
"members":[
{"cluster": "2","id": "4", "host":"192.168.1.90", "port":8084, "heartbeat":0},
{"cluster": "2","id": "3", "host":"192.168.1.90", "port":8083, "heartbeat":0},
{"cluster": "2","id": "1", "host":"192.168.1.90", "port":8081, "heartbeat":0}
]
}]
Then run mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8082'

Is it necessary to run with the same gossip.conf?
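
One way to avoid editing gossip.conf between launches would be to keep one file per instance and pick it by port. This is only a sketch under that assumption: the `gossip-<port>.conf` naming scheme and the `GossipConfLocator` class are hypothetical, and the application would still need to pass the chosen file to whatever code loads its gossip settings.

```java
import java.io.File;

// Hypothetical helper; the per-port file naming is an assumption.
final class GossipConfLocator {
    private GossipConfLocator() {}

    /**
     * Returns dir/gossip-<port>.conf if that file exists,
     * otherwise falls back to the shared dir/gossip.conf.
     */
    static File forPort(File dir, int port) {
        File perPort = new File(dir, "gossip-" + port + ".conf");
        return perPort.isFile() ? perPort : new File(dir, "gossip.conf");
    }
}
```

With files named gossip-8081.conf and gossip-8082.conf in the same directory, the two `mvn spring-boot:run` commands above could each resolve their own configuration from the `server.port` value.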
