
Advice on building a fail-over solution on top of NFSdb (with two+ nodes) #29

Closed
vguhesan opened this issue Jan 12, 2015 · 7 comments
Labels
New feature Feature requests

Comments

@vguhesan

Hello:
I've been testing the replication options available in NFSdb.

  • I like the benchmark numbers I see on NFSdb
  • I like the fact that I can set up a client to replicate the data onto one or many nodes

In order to use NFSdb in a production environment, fail-over and data redundancy become a concern.

Are there any plans to implement fail-over, or is there sample code/pseudo-code you can share that would help me build a fail-over solution on top of NFSdb?

Scenario:

  • {Node1} has a server running NFSdb
  • {Node2} is running a client replicating the database
  • Node1 goes down, I need Node2 to switch from a client mode to a server mode.
  • More data is appended into Node2, and at some later point in time Node1 is started
  • Node1 needs to be aware that there is already a master running on Node2 and join it as a client (instead of as a server)

Is this possible using the (multicast) foundation in NFSdb?

Any advice on how this can be achieved?

Thank you and looking forward to your feedback.

Venkatt

@bluestreak01
Member

Hi Venkatt,

The bad news is that at present NFSdb can only run in-process, so its fail-over is that of the parent process. There is an on-going effort to have NFSdb run out-of-process, in which case it will have its own client with fail-over built in.

The good news is that getting a client to be a server at the same time can be done pretty easily. I'll write up an example very shortly.

Server recovery after fail over is relatively easy. Because updates are incremental it is possible to wrap a journal in a client instance and have it replicate from former client-now-server.
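The incremental catch-up described above can be sketched as follows. This is a hypothetical model, not the NFSdb API: the recovering former master replays only the transactions it missed from the former-client-now-server.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of incremental catch-up after fail-over.
// None of these names come from the actual NFSdb API.
public class CatchUpSketch {
    // Updates are incremental: only entries past the local
    // high-water mark need to travel over the wire.
    static List<String> catchUp(List<String> serverLog, int lastAppliedTxn) {
        return new ArrayList<>(serverLog.subList(lastAppliedTxn, serverLog.size()));
    }

    public static void main(String[] args) {
        List<String> serverLog = List.of("txn1", "txn2", "txn3", "txn4");
        // Node 1 went down after txn2; node 2 (now the server) appended txn3 and txn4.
        System.out.println("to replay: " + catchUp(serverLog, 2));
    }
}
```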

Multicast is not supported for data, not yet anyway. This is partly because the NFSdb protocol allows each client to have a different state, and replication is tailored to the state of each client. But in business-as-usual operation over a dedicated network link, I guess multicast would have an advantage. Maybe one day :)

@bluestreak01
Member

Hi Venkatt,

I had a recap of replication and fail-over, and it isn't possible to fail over the writer automatically. A client can reconnect if the server goes down, but that is all that is automatic in the current version.

Making automated fail-over for your scenario is not difficult, and there is a plan to do it now! Here is the sample logic:

  1. Node 1 and Node 2 are identical; they both run "ClusterNode", which is both server and client.
  2. On startup, the nodes automatically decide who the master is; this will depend on startup order.
  3. "ClusterNode" will signal the application that it is a master and provide you with a way to get JournalWriter(s).
  4. On the client, "ClusterNode" will signal your code that it is in standby mode.
  5. Both server and client maintain a heartbeat, and once it is lost the "client" will become the server and notify your code of the state change.
  6. When the old server is restarted it will assume the role of client automatically, due to the other node being present.
  7. The client will automatically recover itself and start replicating from the server node.
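The lifecycle above could be sketched roughly like this. All names here are hypothetical (the real ClusterNode API had not been published at this point); the sketch only simulates the role transitions described in the steps.

```java
// Hypothetical sketch of the ClusterNode lifecycle described above.
// None of these names come from the actual NFSdb API.
public class ClusterNodeSketch {
    enum Role { MASTER, STANDBY }

    interface RoleListener { void onRoleChange(Role newRole); }

    private Role role;
    private final RoleListener listener;

    ClusterNodeSketch(boolean peerAlreadyMaster, RoleListener listener) {
        this.listener = listener;
        // Steps 2 and 6: startup order decides the role; if another node
        // is already master, this node joins as a standby client.
        this.role = peerAlreadyMaster ? Role.STANDBY : Role.MASTER;
        listener.onRoleChange(role);
    }

    // Step 5: heartbeat loss promotes a standby to master.
    void onHeartbeatLost() {
        if (role == Role.STANDBY) {
            role = Role.MASTER;
            listener.onRoleChange(role);
        }
    }

    Role role() { return role; }

    public static void main(String[] args) {
        ClusterNodeSketch node1 =
                new ClusterNodeSketch(false, r -> System.out.println("node1 is " + r));
        ClusterNodeSketch node2 =
                new ClusterNodeSketch(true, r -> System.out.println("node2 is " + r));
        // Node 1 dies; node 2 loses the heartbeat and takes over.
        node2.onHeartbeatLost();
        System.out.println("node2 is now " + node2.role());
    }
}
```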

Let me know if this works for you.

Vlad

@vguhesan
Author

Vlad:

I believe that the last model you have described with "ClusterNode" will work. Is "ClusterNode" a class that you will be adding in an upcoming release, or is this something I can develop with your guidance and/or examples? Please advise.

What I can do on my application side is programmatically determine whether the underlying instance is running in master or standby mode. If it is in standby mode, I can have my web application send an HTTP 302 redirect for the REST API to the master server, which will consume the POST data normally.

Question: in the example you described, is there a way I could get a list of all the other nodes participating in the group? For example, if I POST to the client, can it determine the master's IP and send a redirect URL with the correct master IP?
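The standby-side redirect could be sketched like this. The `isMaster`/`masterHost` inputs are assumptions here; in practice they would come from the cluster API being discussed.

```java
// Hypothetical sketch of the standby-side redirect: a standby node
// answers REST POSTs with a redirect pointing at the current master.
public class RedirectSketch {
    static String routePost(boolean isMaster, String masterHost, String path) {
        if (isMaster) {
            return "200 handle locally";
        }
        // Standby: tell the REST client to retry against the master.
        return "302 Location: http://" + masterHost + path;
    }

    public static void main(String[] args) {
        System.out.println(routePost(true, null, "/journal/append"));
        System.out.println(routePost(false, "10.0.0.2:8080", "/journal/append"));
    }
}
```

One caveat worth noting: many HTTP clients turn a POST that receives a 302 into a GET on the new location; HTTP 307 preserves the method and body, so it may be the safer status code for this use case.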
Please advise on how I can proceed forward with the "ClusterNode" implementation.
Thanks in advance.

Venkatt

bluestreak01 added a commit that referenced this issue Jan 16, 2015
@bluestreak01
Member

Implementing the cluster will require changes in both server and client code, so I'll do that. The changes are not very complex, so it won't be long.

It should be possible to announce the cluster winner to the other nodes. After voting for a master, all remaining nodes will have to connect their clients to it, and this information can be published to the app code.

I'll post more details on the usage model very soon; I need to prove that all the parts work first.

@bluestreak01
Member

Hi Venkatt,

I have an example of creating a cluster of producers for you: ClusteredProducerMain.java

Although it is for two producers, you can extend it to three or more as you need. An important thing to be aware of is that each cluster node must have a unique integer instance id. It is used in logging and also for tie-break voting in case two nodes start up at the same time.
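The tie-break by instance id might look something like this. Whether NFSdb favours the lower or the higher id is an assumption on my part; the sketch below picks the lower.

```java
// Hypothetical tie-break sketch: when two nodes start simultaneously,
// the unique integer instance id decides the winner. Lower-id-wins is
// an assumption here, not confirmed NFSdb behaviour.
public class TieBreakSketch {
    static int electMaster(int idA, int idB) {
        return Math.min(idA, idB);
    }

    public static void main(String[] args) {
        System.out.println("master: instance " + electMaster(1, 2));
    }
}
```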

As things stand, it is safe to have nodes started by either monitoring tools or schedulers; if they come up at the same time they will resolve their roles automatically.

The shutdown procedure is as graceful as possible and will wait for all in-flight network transmissions before cutting the wire. I will expose a timeout API though, in case waiting is not an option; in that case in-flight transactions may be lost.

There is more work needed to make readers fail over between cluster nodes and automatically error-correct. But that should not take long at all.

Let me know if you think the current API can be improved in some way, or if anything doesn't work for you.

Regards,
Vlad

@vguhesan
Author

Hi Vlad,

Thank you very much for devising this solution. I will try this out either tonight or in the next few days and get back to you.

Best Regards,
Venkatt Guhesan


@bluestreak01
Member

This feature is complete; let's open another issue should we discover defects with it.

bluestreak01 added a commit that referenced this issue Mar 25, 2015
…d Chang and Roberts algorithm (http://en.wikipedia.org/wiki/Chang_and_Roberts_algorithm). Modifications include:

- dead node detection
- acks
- hop counting to prevent infinite loops
- leader reassertion to prevent the current leader from being demoted.
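For reference, the basic Chang and Roberts ring election that the commit builds on can be sketched as below. The NFSdb modifications listed above (dead-node detection, acks, hop counting, leader reassertion) are not reproduced; this is only the textbook core, with hypothetical names.

```java
import java.util.List;

// Sketch of the basic Chang-Roberts ring election: an id token
// circulates around a unidirectional ring, each node forwards the
// larger of the token and its own id, and a node that sees its own
// id come back knows it is the leader.
public class ChangRoberts {
    static int elect(List<Integer> ring) {
        int n = ring.size();
        int token = ring.get(0); // node 0 starts the election
        // Two laps are enough for the max id to return to its owner.
        for (int hop = 1; hop <= 2 * n; hop++) {
            int node = ring.get(hop % n);
            if (token == node) {
                return node; // own id returned: this node is the leader
            }
            token = Math.max(token, node);
        }
        throw new IllegalStateException("no leader elected");
    }

    public static void main(String[] args) {
        // Ring of four nodes with distinct ids; the max id should win.
        System.out.println("elected: " + elect(List.of(3, 7, 2, 5)));
    }
}
```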