High Availability of switchio itself? #60
Comments
Hi @robwilkes thanks for considering the project :) Couple notes:
For HA I've thought about trying to introduce the Raft protocol but have never had a reason to toy with it. This would definitely be an interesting problem to solve, though I have little experience with it. Maybe @moises-silva can comment.
You are correct: currently this is the default, but it could be changed easily, though testing would take a bit of work.
No, unfortunately, but we'd gladly accept a PR for a test.
Although Raft is nice and it'd be interesting, there are probably simpler solutions (easier to achieve in the short term) to get good enough HA.

For many years I've wanted FreeSWITCH ESL to be able to make outbound connection(s) when the module is loaded (and take care of reconnects). This is different from the existing outbound socket mode in that it is meant to be a global control connection, not per session. This would mean FreeSWITCH would initiate the connection to a switchio server (or pool) on startup. You could then let FreeSWITCH connect via haproxy to your switchio pool. When FreeSWITCH recovers after a crash and loads mod_event_socket, it would reconnect to an available switchio server (and haproxy takes care of deciding who is available, load balancing or whatever).

This could be implemented with relative ease in mod_event_socket or, if that becomes hard to push upstream for the version of FreeSWITCH you're using (e.g. the maintainers of FreeSWITCH may not want to add it to v1.6/v1.8), it can be a separate switchio proxy component, in Python or something else, that always runs side by side with your FreeSWITCH instance. That certainly doesn't solve the problem of existing call state, but it allows you to serve new calls immediately with switchio upon FS recovery.

Now, for step two: recovering the state of ongoing calls. This is where Raft could help, but you could also follow FreeSWITCH's approach and save the relevant state in a database. FreeSWITCH basically delegates the call state to the database, and if you already have a database cluster for FreeSWITCH you can reuse the same setup for switchio. This means switchio would store the state that needs to be preserved after a crash in a database using an ODBC driver (e.g. https://github.com/aio-libs/aioodbc).

Finally, you monitor/control the cluster using whatever else you're already using for FreeSWITCH cluster resources (e.g. corosync/pacemaker).

Those are just some thoughts off the top of my head.
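To make step two a bit more concrete, here is a minimal sketch of persisting per-call state so that a restarted controller can pick it up again. Everything in it is an assumption for illustration: the `CallStateStore` class, the `call_state` table and the shape of the state dict are hypothetical and not part of switchio, and the standard-library `sqlite3` module stands in for the ODBC-backed database cluster mentioned above.

```python
import json
import sqlite3


class CallStateStore:
    """Hypothetical helper: keeps one JSON blob of state per call UUID."""

    def __init__(self, path):
        # sqlite3 is only a stand-in here; a real deployment would point at
        # the shared database cluster (for example through an ODBC driver).
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS call_state ("
            "uuid TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self.conn.commit()

    def save(self, uuid, state):
        """Insert or overwrite the state recorded for a call."""
        self.conn.execute(
            "INSERT OR REPLACE INTO call_state (uuid, state) VALUES (?, ?)",
            (uuid, json.dumps(state)),
        )
        self.conn.commit()

    def remove(self, uuid):
        """Forget a call, e.g. once it has hung up."""
        self.conn.execute("DELETE FROM call_state WHERE uuid = ?", (uuid,))
        self.conn.commit()

    def load_all(self):
        """Return {uuid: state} for every call still recorded."""
        rows = self.conn.execute("SELECT uuid, state FROM call_state")
        return {uuid: json.loads(state) for uuid, state in rows}

    def close(self):
        self.conn.close()
```

The idea would be to call `save()` whenever a call reaches a state worth preserving, `remove()` on hangup, and `load_all()` once at startup after a crash to find the calls that were still in progress. What to store, and how to re-attach to those calls afterwards, is left open here.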
Hi Guys,
Just discovered this project and really like the look of it, and the documentation is excellent.
I will more than likely move over to it, as the language/syntax looks really nice. However, before/whilst I do, I'm hoping you could answer a question for me.
Whilst switchio can be used to control a FreeSWITCH cluster, I'm wondering how you can make switchio itself highly available, or clustered?
I have a need to build a highly available solution in FreeSWITCH, with programmable call control, and what I have currently built is:
I am able to failover FreeSWITCH back and forth repeatedly with no issue, however the socket is broken, and so too is my ability to manage the call.
Do you have a solution, or idea, how this would work with switchio (or switchy)?
I presume I cannot have multiple switchio instances with the same config pointing to the same FreeSWITCH servers, as they will both want to manage the call simultaneously?
Another option might be to use VMware Fault Tolerance (limited to a single vCPU from memory, so won't scale well). Any idea how switchio will behave if I 'sofia recover' the calls to another FreeSWITCH server?
I will eventually test this last one myself and report back when I do, if I haven't heard anything prior.
I know they're not easy questions and I appreciate you taking the time to read it.