Prepare crate for library use #139

Merged: 2 commits merged into shadowsocks:master from modularize-wait-ops on Jan 24, 2019

Conversation

mvd-ows
Contributor

@mvd-ows mvd-ows commented Dec 21, 2018

This PR's primary purpose is to decouple the shadowsocks service from the console. Doing so will enable one to use the crate as a library, and start/stop the service independently of the hosting process' lifetime.

The CTRL+C handler is no longer hardwired into the service logic, and no longer invokes a process shutdown when activated, but merely a shutdown of the service.

Additionally, it improves the monitoring of plugin subprocesses and ensures they are cleaned up when the shadowsocks service is stopped.

@zonyitoo
Collaborator

This PR has conflicts with the latest commit: Arc<Config> has been replaced by Context.

@mvd-ows
Contributor Author

mvd-ows commented Jan 1, 2019

Alright, I'll look into this at the beginning of next week.

@zonyitoo
Collaborator

zonyitoo commented Jan 2, 2019

BTW, since local::run and server::run both return Future<Item = (), Error = io::Error>, I think it would be better to add an option to Config which allows you to disable the default signal monitor. Then you can do whatever you want with the returned Future<..>. That would be the simplest, most elegant, and most flexible way.

@faern
Contributor

faern commented Jan 7, 2019

In the context of being a library it's a bit strange, and dangerous, to even have code that touches on signals and direct process exits. Basically, that's something only a binary should ever do. Also having the binaries' main function end with a panic, because they are not supposed to terminate from that execution path, is a bit odd IMO. With the redesign proposed here I feel that the entire Tokio runtime is being taken care of better. Futures are being properly evaluated and the reactor can gracefully shut down before the process exits cleanly by returning from main.

EDIT: To decouple the signal/break and process exit even more from the library it would be nice not to have the shorthand run implementations in place. The binaries should probably have to call create_signal_monitor by themselves. Best would be if that function was not even exposed by the library, but was added as a module under the binaries in some way that does not duplicate the code.

@zonyitoo
Collaborator

zonyitoo commented Jan 7, 2019

Also having the binaries' main function end with a panic, because they are not supposed to terminate from that execution path, is a bit odd IMO.

That's because Rust's main function doesn't have a return value. If the server exits with an error, it should be reported to the process manager (supervisord, systemd, initd). Otherwise, the process manager cannot detect the failure and restart the process.

I completely agree with removing that signal handler from the library:

  • There are many ways to handle signals, and most of them are quite simple. A pure function that listens to signal events is not part of shadowsocks' feature set.
  • Users have to handle signal events themselves, for example kill the process, call runtime.shutdown_now(), or something else to shut down the service gracefully.

zonyitoo added a commit that referenced this pull request Jan 7, 2019
@zonyitoo
Collaborator

zonyitoo commented Jan 7, 2019

This is a commit for demonstrating my idea.

@mvd-ows
Contributor Author

mvd-ows commented Jan 8, 2019

Thanks. However, that commit doesn't address the case where a different signal monitor is needed. When using the service as a library we need to pass in a signal monitor that can listen for requests from the owning application to shut down the service.

@zonyitoo
Collaborator

zonyitoo commented Jan 8, 2019

Yes, you can.

let config = ...; // Somewhere constructed Config

let option = Options {
    enable_signal_monitor: false, // Disable the default monitor
};
let ss_service = run_opt(config, option);

let your_monitor = ...; // Your monitor Future

// The combined future will return as soon as either of them returns
// (the `select` combinator from futures 0.1).
let combined = your_monitor.select(ss_service);

So you can just let your_monitor return and then the whole service will stop immediately, because the ss_service Future is dropped.

@mvd-ows
Contributor Author

mvd-ows commented Jan 8, 2019

OK. Perhaps there is some misconception on my part.

If you check the code I submitted, you'll see that all subprocesses (plugins) are terminated in a controlled manner.

How can the same thing be achieved with the code you posted?

@zonyitoo
Collaborator

zonyitoo commented Jan 8, 2019

I can't see the difference. Your commit passes a monitor Future into run_with_monitor:

let mon = create_signal_monitor();
run_with_monitor(config, mon);

The implementation of run_with_monitor simply puts the mon into the Vec for futures_unordered, which means that if any of the futures (run_tcp, run_udp, mon) returns, the whole future will return.
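The sketch below is only an illustration of that pattern, not the actual shadowsocks-rust code; the names run_tcp, run_udp and mon come from this discussion, and the signatures are assumptions written against futures 0.1:

use futures::stream::futures_unordered;
use futures::{Future, Stream};
use std::io;

type ServiceFuture = Box<dyn Future<Item = (), Error = io::Error> + Send>;

fn run_with_monitor(
    run_tcp: ServiceFuture,
    run_udp: ServiceFuture,
    mon: ServiceFuture,
) -> impl Future<Item = (), Error = io::Error> {
    // Whichever future finishes first resolves the combined future;
    // the remaining futures are dropped along with it.
    futures_unordered(vec![run_tcp, run_udp, mon])
        .into_future()
        .then(|result| match result {
            Ok((_first, _rest)) => Ok(()), // first future finished cleanly
            Err((e, _rest)) => Err(e),     // first future failed
        })
}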

So it is exactly (nearly) the same as the code I just posted. run_opt will return if either of (run_tcp, run_udp) returns. And selecting on (your_monitor, ss_service) will return if either of (your_monitor, run_opt) returns, which is exactly the same as run_with_monitor in your commit.

Plugins are stored in the Context, which will be destroyed after the ss_service Future is dropped. So you can terminate the plugins just by letting the your_monitor Future return. It is always under control, right?

@zonyitoo
Collaborator

zonyitoo commented Jan 12, 2019

@mvd-ows I read your commits twice, and I think I got your idea of "all subprocesses (plugins) are terminated in a controlled manner".

  1. If the main server exits unexpectedly, the code ensures the plugin monitor keeps running until its end.
  2. When a signal is received, any subprocess exits, or a server listener exits, the abort_signal flag is set and the monitor threads kill all the subprocesses.

I finally got time to review your code seriously. I have to say that you have done a great job. But I still have some questions:

  1. Are there any other ways to monitor subprocesses instead of spawning threads that loop on waitpid? Monitoring SIGCHLD already fulfills the same goal, right?
  2. Plugin::drop() will always call .terminate() on its Popen object. So if the Context of the server is dropped, the plugins' subprocesses will be killed as expected. Why do we still need the abort_signal flag and kill subprocesses by hand?
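For reference, the Drop behaviour being referred to is roughly the following; this is a minimal sketch with assumed names, not the crate's exact code:

use subprocess::Popen;

struct Plugin {
    process: Popen,
}

impl Drop for Plugin {
    fn drop(&mut self) {
        // Ignore the result: the subprocess may already have exited.
        let _ = self.process.terminate();
    }
}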

@mvd-ows
Contributor Author

mvd-ows commented Jan 14, 2019

Hello,

Thanks for your comments, your analysis of my code is correct. Regarding your questions:

  1. Using a dedicated thread for each subprocess is needlessly wasteful. I'm not very happy with how the code looks, but I could not devise a better method using the subprocess crate. If this was Windows-only I would use a single extra thread and the WaitForMultipleObjects API function. I've heard good things about the duct crate. Perhaps using that instead would make a neater design possible?

SIGCHLD doesn't exist as a concept on Windows. On Windows, you have to obtain a process handle to the child process and initialize a wait using that handle. The handle becomes signaled when the process has exited, at which point you can use the GetExitCodeProcess API function if you care about the exit code (a minimal sketch of this is shown after the list).

The only way you could have SIGCHLD on Windows is if some crate you're using is providing this interface on top of the logic I described above. The subprocess crate has zero references to SIGCHLD.

  2. This is an oversight on my part; I didn't notice Plugin::drop until now. But as I understand it, it's the vector of Plugin that is interesting, not the Context? The Context's Config is updated with some address when the plugin is started, but there is no connection to the Popen instance used to represent the plugin process?
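A minimal, Windows-only sketch of the wait mechanism described in point 1, assuming the winapi crate with the relevant um features enabled (the crate choice and the lack of error handling are illustration-only assumptions):

#[cfg(windows)]
fn wait_for_child(handle: winapi::um::winnt::HANDLE) -> u32 {
    use winapi::um::processthreadsapi::GetExitCodeProcess;
    use winapi::um::synchapi::WaitForSingleObject;
    use winapi::um::winbase::INFINITE;

    unsafe {
        // Blocks until the process handle is signaled, i.e. the child has exited.
        WaitForSingleObject(handle, INFINITE);
        let mut exit_code: u32 = 0;
        GetExitCodeProcess(handle, &mut exit_code);
        exit_code
    }
}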

@faern
Contributor

faern commented Jan 14, 2019

The benefit that duct provides is that it allows cloning and sharing of process handles between threads. So it becomes possible to block one thread on waiting (we need this to detect in real time if a plugin dies) and to also have a separate handle that can be used to kill a subprocess.
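A small sketch of that pattern with duct (the plugin command line is a placeholder): the Handle returned by start() can be shared, so one thread can block in wait() to notice a crash immediately, while another copy of the handle is kept around to kill() the subprocess on shutdown:

use std::sync::Arc;
use std::thread;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let handle = Arc::new(duct::cmd!("some-plugin", "--arg").start()?);

    let waiter = Arc::clone(&handle);
    let monitor = thread::spawn(move || {
        // Blocks until the plugin exits, so a crash is detected in real time.
        let _ = waiter.wait();
    });

    // Later, e.g. on shutdown, the other handle can terminate the plugin.
    handle.kill()?;
    monitor.join().unwrap();
    Ok(())
}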

@zonyitoo
Collaborator

not the Context? The Context's Config is updated with some address when the plugin is started, but there is no connection to the Popen instance used to represent the plugin process?

Because the Config struct is designed for passing configuration. For historical reasons, modifying the addresses inside the Config was the easiest way to support plugins. I don't think that is a good design.

Well, I don't know much about Windows. If it is only for compatibility reasons, we could just add a monitor sub-module that implements subprocess monitoring in a platform-dependent way (SIGCHLD on POSIX-compatible platforms and WaitForMultipleObjects on Windows).

subprocess was the only process management crate I could find at that time. I am very happy to switch to duct if it is actually better.

@faern
Contributor

faern commented Jan 14, 2019

Implementing select functionality for waiting for multiple subprocesses via SIGCHLD/WaitForMultipleObjects sounds out of scope for this crate and should likely be something implemented in some generic subprocess library. And a user is not expected to run a ton of plugins I guess. So having one thread per plugin should not really be a problem?

@zonyitoo
Collaborator

@faern How about an ssserver with multiple users?

@faern
Contributor

faern commented Jan 14, 2019

How many plugins do you expect the most advanced setup to have? The extra threads will not do anything, just sit and wait for a process to exit; that is, if we swap to duct, so that we get rid of the polling. Even the polling is very cheap, but a bit less so than duct I guess.

@faern
Contributor

faern commented Jan 14, 2019

A thread will cost way fewer resources than a subprocess. And if the number of threads vs subprocesses always has a 1:1 relationship, the amount of resources "wasted" on threads will be dwarfed by that of the actual subprocesses. So no matter how many plugins are started, the threads will not be the issue.

@faern
Contributor

faern commented Jan 21, 2019

@zonyitoo I could swap out subprocess for duct here. But I now see you have a separate branch, feature-optional-monitor where you also do plugin work. What is the goal here? Should I spend some time on getting rid of the polling here and move the ctrl-c monitoring into the binaries only? But that would be partially wasted if you substantially change the plugin system before this becomes merged.

@faern faern force-pushed the modularize-wait-ops branch 2 times, most recently from 739b38d to 7c0fbde Compare January 21, 2019 10:25
@faern
Contributor

faern commented Jan 21, 2019

Sorry, I force-pushed a completely wrong branch. I reset it. Still working on fixing this PR as well, but that comes later.

@zonyitoo
Collaborator

@faern That branch is for adding an Options struct for configuring the server; currently it is used for disabling the default monitor.

@faern
Contributor

faern commented Jan 21, 2019

@zonyitoo Is the plan to merge that branch? Or was it an experiment? Since it touches a lot of the stuff this PR also changes, it would be good to know whether it's worth spending time coding against the code base as it looks pre-feature-optional-monitor, or whether I should base my changes off of that branch directly.

A different, unrelated question: Why is running the TCP part of the code optional in the server, but not in the client? I'm talking about this code: https://github.com/shadowsocks/shadowsocks-rust/blob/master/src/relay/server.rs#L53
This if does not exist in local.rs; there, the TCP part runs unconditionally.

@zonyitoo
Collaborator

Why is running the TCP part of the code optional in the server,

Because shadowsocks-libev does the same.

but not in the client?

Because it is required for UDP associate in Socks5: before associating UDP ports, authentication and commands must be sent via TCP to the client-side server.

Is the plan to merge that branch? Or was it an experiment?

Currently it is an experiment for adding an optional signal handler, in case the user wants to handle signals on their own.

@faern
Contributor

faern commented Jan 21, 2019

Because it is required for UDP associate in Socks5: before associating UDP ports, authentication and commands must be sent via TCP to the client-side server.

How can it do that if the server does not run a TCP endpoint? It feels like they should be mandatory on both ends, and not just one? shadowsocks-rust did exactly this just a few weeks ago. It was recently made optional. Here is a recent state when it was mandatory, and not behind an if statement: https://github.com/shadowsocks/shadowsocks-rust/blob/92c4058be8004d2cb5e04d7264ad570252e1e1ae/src/relay/server.rs

So what I wonder is if it was a bug to force it, and it's now fixed. From my understanding it feels like it should be mandatory, since otherwise no one can do the initial authentication / port negotiation etc.

Currently it is an experiment for adding an optional signal handler, in case the user wants to handle signals on their own.

Yes, this is exactly the feature this PR is trying to introduce :D which is why I wonder. I will continue work towards running with arbitrary stop signal futures in this PR.

@zonyitoo
Collaborator

How can it do that if the server does not run a TCP endpoint? It feels like they should be mandatory on both ends, and not just one?

No. The Socks5 TCP requirement is between the BROWSER and SSClient, not between SSClient and SSServer:

BROWSER -------socks5------> SSClient -------SSProtocol---------> SSServer

So, do you want me to merge feature-optional-monitor?

@faern
Contributor

faern commented Jan 21, 2019

So, do you want me to merge feature-optional-monitor?

I don't know. I have not read that code. All I want is for this project to have good structure and code quality and be usable both as a library and as binaries 🙂

EDIT: I think we are on to something good in this PR right here. But master has changed quite substantially, so @mvd-ows and I need to take what we have learned from this discussion and apply it on top of master again. From that perspective it would be better if feature-optional-monitor was not merged, because it becomes a moving target for us if the underlying upstream branch changes. I think it's better to discuss a good solution in this PR and come up with a solution here, rather than creating separate branches.

@zonyitoo
Collaborator

zonyitoo commented Jan 21, 2019

Merged.

EDIT: Ah... I hadn't seen your EDIT yet!!! I have already merged!!!

@zonyitoo
Collaborator

I guarantee I will not make any breaking changes until this PR is merged, unless someone finds a fatal bug.

@faern
Contributor

faern commented Jan 23, 2019

While working on this I found a few other things that might be worth fixing, so I did those in a separate branch and PR: #142.

@faern
Contributor

faern commented Jan 24, 2019

I have a really nice solution going on here. Will polish and test it a bit more, and maybe wait for #142 to be discussed/merged before I push to this PR. But this solution solves detecting dying plugins on Windows, and it also does not cost any extra threads per plugin.

@faern
Contributor

faern commented Jan 24, 2019

Now I pushed a completely different implementation, with the same goal as this PR.
First I took the signal monitors that @mvd-ows implemented and moved them into src/bin/monitor, thus eliminating all signal monitoring from the library part of the code base completely.

Then I swapped out the subprocess crate for the tokio-process crate. The tokio-process crate allows handling subprocesses as Futures, so they can be waited on without costing an extra thread. I have changed launch_plugins so that instead of returning a vector of Plugins to hold on to, it returns a future that will resolve with an error if any plugin crashes for any reason. After that it became very easy to plug it in in such a way that the run function monitors the plugins and the tcp/udp futures at the same time.
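The following is only an illustrative sketch of that launch_plugins idea, not the actual implementation; the signature and helper names are assumptions, written against futures 0.1 and the tokio-process CommandExt API (it must be driven inside a tokio runtime):

use futures::future::{select_all, Future};
use std::io;
use std::process::Command;
use tokio_process::CommandExt;

fn launch_plugins(commands: Vec<Command>) -> impl Future<Item = (), Error = io::Error> {
    let children: Vec<_> = commands
        .into_iter()
        .map(|mut cmd| {
            let child = cmd.spawn_async().expect("failed to spawn plugin");
            // Plugins are expected to keep running, so any exit counts as an error.
            child.then(|result| -> Result<(), io::Error> {
                match result {
                    Ok(status) => Err(io::Error::new(
                        io::ErrorKind::Other,
                        format!("plugin exited with {}", status),
                    )),
                    Err(e) => Err(e),
                }
            })
        })
        .collect();

    // Resolves with an error as soon as the first plugin future resolves.
    select_all(children).then(|r| match r {
        Ok(_) => Ok(()),
        Err((e, _index, _rest)) => Err(e),
    })
}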

Also changed some panic!s into just error! logging + returning Results instead. Gives some nicer terminal output for certain cases.

@faern
Contributor

faern commented Jan 24, 2019

On top of making this crate usable as a library, this also solves the problem of the Windows version not properly monitoring plugins for exits. This code should now do that, thanks to tokio-process. It should instantly detect if any subprocess dies.

tokio-process is implemented with SIGCHLD on POSIX platforms, and with RegisterWaitForSingleObject on Windows.

@faern
Contributor

faern commented Jan 24, 2019

Travis is currently having an error on nightly, but it's while building miscreant, so it does not feel like something caused by my code change? 🤔

Compiling miscreant v0.4.2
error: unsupported target platform. Either enable appropriate target-features (+aes,+ssse3) in RUSTFLAGS or enable the 'soft-aes' cargo feature to fallback to a software AES implementation

EDIT: I have accidentally upgraded it from 0.4.0-beta2 to 0.4.2... I'll downgrade it if I can..

@zonyitoo
Collaborator

zonyitoo commented Jan 24, 2019

Just keep it as 0.4.2. For crypto libraries, the newer, the safer.
The other modifications look good to me.
Let's see why the 0.4.2 build failed on Travis.

@faern
Contributor

faern commented Jan 24, 2019

But it fails to build? Is that maybe because the Linux distro on Travis is too old? Do you plan on merging it despite the Travis failures, or how should we proceed?

@zonyitoo
Collaborator

It seems that they require these flags for building: https://github.com/miscreant/miscreant.rs/blob/master/.travis.yml#L18
And miscreant cannot be built with stable.
I will merge it now, because the stable builds without miscreant passed.

@zonyitoo zonyitoo merged commit 36d91c6 into shadowsocks:master Jan 24, 2019
@faern faern deleted the modularize-wait-ops branch January 24, 2019 17:31