
Receiving an ipc channel uses an fd #240

Open
asajeffrey opened this issue Aug 2, 2019 · 4 comments

Comments

@asajeffrey
Member

When ipc channels are sent over ipc on Unix, their fds are sent using sendmsg, and the matching recvmsg creates a new fd. This means that even programs which use a fixed number of channels can end up using an unbounded number of fds. For example:

use ipc_channel::ipc::{self, IpcSender};
use std::thread;

fn main() {
    let (send1, recv1) = ipc::channel::<IpcSender<bool>>().unwrap();
    let (send2, recv2) = ipc::channel::<bool>().unwrap();
    thread::spawn(move || {
        let mut senders = vec![];
        while let Ok(send2) = recv1.recv() {
            let _ = send2.send(true);
            // The fd is private, but this transmute lets us get at it:
            // let fd: &std::sync::Arc<u32> = unsafe { std::mem::transmute(&send2) };
            // println!("fd = {}", *fd);
            // Stop the ipc channel from being dropped
            senders.push(send2);
        }
    });
    for _ in 0..10000 {
        let _ = send1.send(send2.clone());
        let _ = recv2.recv();
    }
}

runs out of fds even though it only uses two ipc channels.

@gterzian
Member

gterzian commented Aug 5, 2019

This could be solved by somehow reducing the number of fds used by the script above, or in practice by refactoring the script so that it does not send a clone of the sender for every operation.

If a program distributes many clones of a sender from one process to many different processes, it will be because the program intends to set up a communication channel between that one process and all the others.

In such a scenario, doesn't the number of processes the machine can manage place an even stronger limit on the feasibility of the design than the number of fds does?

And would there be a way to set up communication with more than one process while avoiding using an fd for each?

If the program only needs a communication channel between two processes to handle many operations, and it does that by distributing a clone of the sender for each operation, could the communication be refactored to use a single sender, thereby avoiding sending a clone per operation?
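The refactoring suggested above can be sketched with `std::sync::mpsc` standing in for ipc-channel (the two APIs are similar in shape; with ipc-channel the same restructuring would mean only one fd crosses the process boundary instead of one per operation). The function name `one_sender_many_ops` is illustrative, not part of any real API:

```rust
use std::sync::mpsc;
use std::thread;

// Instead of sending a fresh clone of a sender for every operation,
// hand the other side ONE sender up front and reuse it for all
// operations. Sketched with std::sync::mpsc as an analogue.
pub fn one_sender_many_ops(n: usize) -> usize {
    let (result_tx, result_rx) = mpsc::channel::<bool>();
    // The worker receives a single sender, once, and reuses it.
    let worker = thread::spawn(move || {
        for _ in 0..n {
            result_tx.send(true).unwrap();
        }
    });
    // Receive all n results over the one shared channel.
    let count = result_rx.iter().take(n).count();
    worker.join().unwrap();
    count
}
```

Under this shape only one channel (and, in the ipc case, one fd) is ever transferred, no matter how many operations follow.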

@asajeffrey
Member Author

We could ask people to refactor their programs, but that's not very satisfactory, as it means the cost model for ipc is very different from the one for mpsc, where sending a channel is cheap.
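For comparison, here is the in-process analogue of the program from the issue description, written with `std::sync::mpsc`. Sending a `Sender` over a channel here just moves a reference-counted handle between threads, so no OS resources accumulate; the helper name `mpsc_send_senders` is made up for the sketch:

```rust
use std::sync::mpsc;
use std::thread;

// With std::sync::mpsc, sending a Sender over a channel allocates no
// new OS resources, so this loop is cheap no matter how large n is.
pub fn mpsc_send_senders(n: usize) -> usize {
    let (meta_tx, meta_rx) = mpsc::channel::<mpsc::Sender<bool>>();
    let (tx, rx) = mpsc::channel::<bool>();
    let worker = thread::spawn(move || {
        let mut count = 0;
        while let Ok(tx) = meta_rx.recv() {
            let _ = tx.send(true);
            count += 1;
        }
        count
    });
    for _ in 0..n {
        meta_tx.send(tx.clone()).unwrap();
        rx.recv().unwrap();
    }
    drop(meta_tx); // close the channel so the worker loop ends
    worker.join().unwrap()
}
```

The equivalent ipc-channel program at the top of this issue exhausts the fd limit, which is exactly the cost-model divergence being discussed.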

@gterzian
Member

gterzian commented Aug 7, 2019

the cost model for ipc is very different than it is for mpsc, where sending a channel is cheap.

I agree, and the question is whether that is something ipc-channel wants to "hide" from its users. It's one thing to have an api similar to other channels; I was personally quite happy not even to be aware of the difference when I started contributing to Servo.

It's another thing to make the current implementation more complicated in an attempt to make using an ipc-channel as cheap as a threaded channel.

@gterzian
Member

gterzian commented Aug 7, 2019

I think we also need to consider that, in practice, caching the fds of cloned senders on the receiving process (the process which repeatedly receives clones of the sender half of a channel) will only help in cases where a channel can be re-used.

In cases where many different channels are created, for example when a page (or several pages) fetches a lot of resources in Servo, each fetch needs a logically separate channel, so you can't just store a sender/receiver pair somewhere and clone the sender for each operation.

Hence even caching fds might not actually solve the "too many files open" problem, if it is mainly caused by creating channels (and sharing their sender only once, which caching will not address).

Such a situation can, I think, only be solved through a routing layer that uses a single channel, yet gives the various "routes" the illusion that they have their own channel.
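The routing-layer idea can be sketched as follows, again using `std::sync::mpsc` in place of a real ipc channel. All the names here (`RouteId`, `Router`, `routing_demo`) are hypothetical; the point is just that many logical routes multiplex over one underlying channel by tagging each message with a route id:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Many logical "routes" share one underlying channel; each message
// carries a RouteId so the receiving side can demultiplex it.
type RouteId = u64;

struct Router {
    tx: mpsc::Sender<(RouteId, String)>,
}

impl Router {
    fn new() -> (Router, thread::JoinHandle<HashMap<RouteId, Vec<String>>>) {
        let (tx, rx) = mpsc::channel::<(RouteId, String)>();
        let handle = thread::spawn(move || {
            // Demultiplex: deliver each message to its route's mailbox.
            let mut mailboxes: HashMap<RouteId, Vec<String>> = HashMap::new();
            while let Ok((route, msg)) = rx.recv() {
                mailboxes.entry(route).or_default().push(msg);
            }
            mailboxes
        });
        (Router { tx }, handle)
    }

    // Each "channel" handed out is just the shared sender plus a route
    // id, so creating one allocates no new OS resources.
    fn route(&self, id: RouteId) -> impl Fn(&str) {
        let tx = self.tx.clone();
        move |msg: &str| {
            let _ = tx.send((id, msg.to_string()));
        }
    }
}

pub fn routing_demo() -> (usize, usize) {
    let (router, handle) = Router::new();
    let fetch_a = router.route(1);
    let fetch_b = router.route(2);
    fetch_a("chunk 1");
    fetch_a("chunk 2");
    fetch_b("done");
    // Drop every sender clone so the demux loop terminates.
    drop(fetch_a);
    drop(fetch_b);
    drop(router);
    let mailboxes = handle.join().unwrap();
    (mailboxes[&1].len(), mailboxes[&2].len())
}
```

With ipc-channel this shape would let an arbitrary number of logically separate routes ride on a single fd, at the cost of a demultiplexing step on the receiving side.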
