Unix: Why not use Unix Domain Sockets for Named Pipes? #14633

bpschoch · 2015-05-27T21:25:21Z

System.IO.Pipes maps to the native Windows implementations of anonymous and named pipes.

Windows anonymous pipes are close in implementation to Unix pipes in that they are one-way and byte-oriented (not datagram- or message-oriented).

Named pipes, on the other hand, can be either full duplex (supporting I/O in both directions) or half duplex (I/O in one direction), and data can be either byte-oriented or message/datagram-oriented. Pipe connections can also be made across systems through the network (I believe using SMB).

Unix has named pipes called FIFOs, but they are one-way only and not message-oriented. The current Pipes port for corefx on Unix uses FIFOs and is therefore a subset implementation.
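To make the one-way limitation concrete, here is a minimal Python sketch (not the corefx implementation). It assumes a POSIX system; the file name `demo.fifo` and the function name are made up for illustration. The read end is opened with `O_NONBLOCK` only so that the open call does not block waiting for a writer.

```python
import errno
import os
import tempfile


def fifo_is_one_way():
    """Send bytes through a FIFO, then try to reply through the read end."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")  # hypothetical name
        os.mkfifo(path)
        # Open the read end first (non-blocking so open() returns at once),
        # then the write end.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        wfd = os.open(path, os.O_WRONLY)
        try:
            os.write(wfd, b"hello")
            data = os.read(rfd, 1024)
            try:
                # A descriptor opened read-only cannot carry data back.
                os.write(rfd, b"reply")
                reverse_error = None
            except OSError as exc:
                reverse_error = exc.errno
        finally:
            os.close(rfd)
            os.close(wfd)
    return data, reverse_error
```

Getting a reply back to the writer would need a second FIFO for the opposite direction, which is part of why a FIFO-based port ends up as a subset.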

Instead of using FIFOs, why don't we use Unix Domain Sockets? They support most of the functionality of Windows named pipes, including full/half duplex and byte- or message-oriented transfer. They don't, however, support cross-system connections. See an overview of Unix Domain Sockets here: http://www.thomasstover.com/uds.html
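As a rough illustration of those properties, the following Python sketch contrasts a byte-oriented (`SOCK_STREAM`) and a message-oriented (`SOCK_DGRAM`) Unix domain socket pair, and sends a reply in the opposite direction over the same pair. It assumes a POSIX system and uses `socket.socketpair` for brevity instead of a named socket; the function name is made up.

```python
import socket


def compare_socket_modes():
    """Contrast byte-oriented and message-oriented Unix domain sockets."""
    # Byte-oriented (SOCK_STREAM): write boundaries are not preserved,
    # so the reader just sees a stream of bytes.
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    tx.sendall(b"one")
    tx.sendall(b"two")
    tx.close()  # closing lets the reader drain until end-of-stream
    stream_bytes = b""
    while True:
        chunk = rx.recv(1024)
        if not chunk:
            break
        stream_bytes += chunk
    rx.close()

    # Message-oriented (SOCK_DGRAM): each send is delivered as one receive.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    left.send(b"one")
    left.send(b"two")
    messages = [right.recv(1024), right.recv(1024)]

    # Duplex: the same pair of endpoints also carries data the other way.
    right.send(b"ack")
    reply = left.recv(1024)
    left.close()
    right.close()
    return stream_bytes, messages, reply
```

The stream side only guarantees the concatenated bytes, while the datagram side returns the two writes as two separate reads, which is the closest analogue to the message transmission mode of Windows named pipes.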

Ultimately, what is the goal of the corefx library on a different platform? To bind to existing similar functionality? Or to provide strict portability of the APIs across platforms (what I'm calling emulation)?

If we strictly bind, then there will always be limitations and differences in the functionality provided, which decreases portability. However, by binding we are hooking into existing OS capabilities and can in theory hook up to other non-.NET apps on the host platform.

If we strictly emulate (e.g. grabbing a chunk of shared memory and using semaphores to strictly emulate named pipe functionality under Unix), then we can be extremely portable, but we can't communicate with anything else.

A possible solution is to extend what can be used as a 'pipe name' in the System.IO.Pipes implementation: the syntax could carry hints about what to bind to in the underlying implementation, while a plain name (without any extended syntax) would default to an emulated, fully functional solution.

Comments?

stephentoub · 2015-05-28T12:36:34Z

@bpschoch, thanks for the suggestion. For the initial implementation, I went with named pipes / FIFOs as the named pipes implementation because a) it would allow for integration with other tools (as you mentioned), and b) it provided a good-enough mapping of most of the features (as you say, not all) and seemed to make logical sense to use the OS' implementation of the mechanism. That said, if you believe that Unix Domain Sockets would yield a better implementation, we'd certainly be open to exploring that; if you're interested, please feel free to submit a PR with such an implementation, and we'll happily take a look.

bpschoch · 2015-05-28T13:49:47Z

I had multiple purposes in making that post if you read between the lines:

Can we do better in implementing pipes functionality on Unix? Yes, using Unix Domain Sockets.
Do we then have a completely portable implementation on Unix? No (i.e. side effects, number of servers, read/write semantics, etc.).
Start a discussion on portability and what it means (may only be relevant for pipes, but perhaps for more).

What do I mean by 'portable'? My purest definition is that I can take a working .NET app running on Windows, move it to a non-Windows environment, and have it work (limited by what is provided in the library, e.g. there is no forms support because the library doesn't support forms).

So that being the case, I should be able to take an app that uses pipes as documented in MSDN for Windows, move it to a non-Windows platform, and it should work.

The tension we have is that other platforms don't support the same underlying functionality, so if you want pure portability then the only option is emulation of the functionality. However, as I brought up, emulation prevents interoperability, which I would argue is no longer a portability issue but an adaptation issue (i.e. things I have to tweak to make it work on another platform).

So it seems that there are potentially two competing goals: one, to provide a portable implementation that works without changing the app; two, to provide a binding to the underlying platform's similar but different implementation.

First is this even a valid concern?
If yes, then what are possible approaches?

'Decorate' the pipeName field in the constructor with something to indicate what is desired. If taking that approach, I would argue that for a plain name we support pure emulation, and decorations would signal the appropriate underlying binding, e.g. "pipe:<pipe name>" or "socket:<pipe name>".
Add new pipe options enum to indicate what is desired
Add static property somewhere to indicate what is desired.
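The first option (decorated pipe names) could be sketched roughly as follows. This is only an illustration of the proposal in Python, not anything that exists in System.IO.Pipes; the prefixes `pipe` and `socket` come from the example above, while the `emulated` default label and all identifiers are assumptions.

```python
from typing import NamedTuple

# Hypothetical prefixes taken from the proposal; nothing here is a real API.
KNOWN_BINDINGS = ("pipe", "socket")
DEFAULT_BINDING = "emulated"


class PipeTarget(NamedTuple):
    binding: str  # which underlying mechanism to use
    name: str     # the pipe name with any decoration removed


def parse_pipe_name(raw: str) -> PipeTarget:
    """Split an optional 'binding:' decoration off a pipe name."""
    prefix, sep, rest = raw.partition(":")
    if sep and prefix in KNOWN_BINDINGS and rest:
        return PipeTarget(prefix, rest)
    # Plain names (and unknown prefixes) fall back to full emulation.
    return PipeTarget(DEFAULT_BINDING, raw)
```

With this sketch, `parse_pipe_name("socket:orders")` selects the socket binding for the name `orders`, while a plain `orders` keeps the emulated default.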

And yes, I can do an implementation, but I wanted to get through this discussion first as it may affect the implementation.

Bernie

stephentoub · 2015-05-29T18:12:42Z

If we can achieve better portability, we should do so; I believe that should be the goal first and foremost: getting the best possible behavior/perf/etc. that matches the existing .NET APIs. If we have a choice between two mechanisms, both of which have equivalent portability but one of which uses the "more appropriate" underlying mechanism, then we should of course use that one, but if we can emulate the functionality better on top of a different underlying mechanism, we should do so. It's then just a question of whether we can actually do it better; that's not just a question of behavior, but also of performance, reliability, etc. I don't think we should have two completely separate underlying implementations and switch between them based on some user request; we could consider something like that as a separate feature in the distant future, but for now we should pick one implementation and go with it.

bpschoch · 2015-06-05T16:47:49Z

I'm researching the possibility of using shared memory to fully implement named pipes (except for cross-machine use). My research includes trying to understand the exact semantics of how named pipes work (e.g. when things get blocked), and I'm a bit confused about input and output buffers. I found information here (see the remarks section): https://msdn.microsoft.com/en-us/library/windows/desktop/aa365150%28v=vs.85%29.aspx describing how output buffers work, but I'm not sure of the relevance of input buffers. It seems that when you read, if there is nothing in the output buffer, you wait until there is. Perhaps one of you on the 'inside' could look at the Win32 implementation of named pipes to find the semantics around the input buffers.

Thanks
Bernie

stephentoub · 2015-07-15T15:58:49Z

@bpschoch, checking in on this. Is this something you're still investigating?

bpschoch · 2015-07-15T20:27:41Z

Well, it got put on hold. I was waiting on some feedback on how the buffers work. I was also finishing up the refactoring of the pipes tasks (which I got sidetracked on), but I'm doing a PR now.

bpschoch · 2015-08-03T17:57:00Z

Named Pipes Overview (part 1)

Core Classes

Named pipes in System.IO.Pipes are separated into two classes: one for client use and one for server use.
Basically, an app that will be the server, instantiates a NamedPipeServerStream class and can specify the following information:

Pipe name as a string
enum Pipe direction e.g. In or Out (one-way) or InOut (two way)
int maximum number of server instances (or a special constant for no limit)
enum Transmission mode e.g. byte oriented or message oriented (1 write equals 1 read)
enum PipeOptions: async or not; write-through (bypass cache)
pair of ints to specify recommended sizes for the input and output buffers

There is also a special case where you can pass in an existing SafePipeHandle.

An app that will be a client, instantiates a NamedPipeClientStream class and can specify the following information:

Pipe name as a string
optional server name for cross system pipes
enum Pipe direction e.g. In or Out (one-way) or InOut (two way)
enum PipeOptions: async or not; write-through (bypass cache)
Impersonation level (what to allow the server to find out/impersonate about the client)

As is the case with the server, the client also has a special case where you can pass in an existing SafePipeHandle.

Operations

Basically, a server is the first to start: it creates a NamedPipeServerStream, specifying all the options that describe the operation of the pipe. The server then calls one of the WaitForConnection() method variants to pause until a client connects. A client connects to the server by creating a NamedPipeClientStream and calling one of the Connect() variants. When both of these calls are complete, a connection is made and the two sides can communicate using the Read() and Write() variants.
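For comparison, the same wait/connect/read/write sequence maps fairly directly onto a Unix domain socket. The Python sketch below is an analogy only and says nothing about how corefx implements it; the socket file name `demo.sock`, the function names, and the upper-casing "protocol" are made up for illustration.

```python
import os
import socket
import tempfile
import threading


def run_echo_exchange(message: bytes) -> bytes:
    """One server and one client exchange a message over a named socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # plays the role of the pipe name
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)

        def serve_one_client():
            # accept() blocks until a client connects, much like the
            # server-side wait described above.
            conn, _ = server.accept()
            with conn:
                conn.sendall(conn.recv(1024).upper())

        worker = threading.Thread(target=serve_one_client)
        worker.start()

        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(path)      # the client-side connect step
            client.sendall(message)   # write
            reply = client.recv(1024) # read

        worker.join()
        server.close()
        return reply
```

One difference worth noting: a single listening socket can accept many clients, whereas the named pipe model uses one server stream instance per connected client.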

Both client and server versions of named pipes have various properties to examine and in some cases reset pipe properties. One example is that a client can fetch the number of server instances on the pipe.

From what I understand, you need to create an additional NamedPipeServerStream instance for each simultaneous client that needs to be connected. The additional NamedPipeServerStream instances specify the same pipe name and options.

Summary

Named pipes support either simplex (one-way) or duplex (two-way) communication between client and server, with either byte- or message-oriented boundaries. Clients and servers operate differently: clients connect to servers, while servers wait for connections from clients. Named pipes can also reach across the network to servers on other systems using SMB protocols.

bpschoch · 2015-08-05T20:02:09Z

Named Pipes Implementation Matrix (part 2)

| Named Pipe Feature | Win32 | *nix FIFO | *nix Domain Sockets | Shared Memory Emulation |
| --- | --- | --- | --- | --- |
| One-way (In or Out) | Y | Y | | Y |
| Two-way (InOut) | Y | N | | Y |
| Byte-oriented | Y | Y | | Y |
| Msg-oriented | Y | N | | Y |
| True Async support | Y | N | | Y |
| Specify Buffer Sizes | Y | Y | | Y |
| Fetch Buffer Sizes | Y | Y*2 | | Y |
| Support # of Servers | Y | N | | Y |
| Fetch # of Servers | Y | N | | Y |
| WaitForPipeToDrain | Y | N | | Y |
| Impersonation Support | Y | N | | N |
| Network Support | Y | N*1 | N*1 | N*1 |

*1 can be implemented via proxy process plus client code to access proxy
*2 Only out buffer size and not on OSX

(NOTE: in progress)

bpschoch · 2015-08-05T20:12:10Z

Named Pipes; Shared Memory Implementation

Conceptual Idea

A shared memory segment would be obtained to transact sending information back and forth on the 'pipe'. A *nix semaphore would be used for inter-process synchronization.
The base of the memory segment would contain the core configuration information for the 'pipe' e.g. number of servers, buffer sizes, pipe properties. Following this base portion would be an array of memory for each potential server. The base of memory for each server would contain semaphore information and the logical state of the pipe e.g. what's queued etc.
Because the raw elements of communications are all in the shared memory, most of the capabilities of Named Pipes can be emulated properly.
Now because pipes have names, something needs to map a name to the shared memory which could be a file name under a special directory.
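A minimal Python sketch of that idea follows, under several assumptions: the header layout (a magic value, the maximum number of servers, and the two buffer sizes), the `.shm` file suffix, and all function names are hypothetical, and the per-server semaphore and queue state described above are left out entirely. A file-backed `mmap` stands in for the shared memory segment.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical header: magic, max servers, in-buffer size, out-buffer size.
HEADER = struct.Struct("<4sIII")
MAGIC = b"NPIP"


def segment_path(directory: str, pipe_name: str) -> str:
    """Map a pipe name to a backing file under a special directory."""
    return os.path.join(directory, pipe_name + ".shm")


def create_segment(directory, pipe_name, max_servers, in_size, out_size):
    """Create the shared segment and write the 'pipe' configuration."""
    path = segment_path(directory, pipe_name)
    # Header followed by one buffer region per potential server.
    total = HEADER.size + max_servers * (in_size + out_size)
    with open(path, "w+b") as backing:
        backing.truncate(total)
        with mmap.mmap(backing.fileno(), total) as mem:
            HEADER.pack_into(mem, 0, MAGIC, max_servers, in_size, out_size)
    return path


def read_header(path):
    """Attach to an existing segment and read back its configuration."""
    with open(path, "r+b") as backing:
        with mmap.mmap(backing.fileno(), 0) as mem:
            magic, max_servers, in_size, out_size = HEADER.unpack_from(mem, 0)
    if magic != MAGIC:
        raise ValueError("not a pipe segment")
    return max_servers, in_size, out_size


def demo():
    with tempfile.TemporaryDirectory() as special_dir:
        path = create_segment(special_dir, "orders", 4, 4096, 4096)
        return read_header(path)
```

Here `demo()` creates a segment for a pipe named `orders` and reads the configuration back as a second process would after looking up the name.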

bpschoch · 2015-08-05T23:43:40Z

@stephentoub comments?

stephentoub · 2015-08-14T19:40:34Z

Interesting idea. I have concerns around lifetime management and how that would work, since many of the concepts you're describing are not cleaned up when all open descriptors go away, when processes die, etc. With the current implementation, worst case is we're left with a temporary FIFO file on disk that doesn't contain any interesting data. But I believe with the approach you describe, we could end up in a situation where state from one run ends up persisting unintentionally to subsequent runs. I think we'd really need to see a strong proof-of-concept before switching to such an approach.

bpschoch · 2015-08-15T20:26:50Z

My first goal was to see if the functionality can be implemented (assuming portability is important), which I believe it can through shared memory.

I agree the lifetime issues are important. Worst case, a caretaker Unix proc can run that supervises what's going on and does any cleanup. The obvious question that then comes to mind is what happens if the caretaker proc goes away. Yes, that is a problem, but it could be mitigated by having the users of the supervised resources (e.g. shared memory and/or semaphores in this case) make sure that a copy of the caretaker is always running. BTW, such a caretaker proc could also deal with other similar issues the library may run into.

Ok then on to a proof of concept:
But there are some questions that need more research, as the *nix variants have multiple shared memory and semaphore implementations, each with pros and cons. It's been a few years since I was a hard-hitting *nix programmer, so I need to catch up some and examine some of these trade-offs.

It would also be nice to ultimately implement the solution on the .NET library side rather than the C side, but then the appropriate 'binding' libraries would need to be implemented, say for semaphores etc. Do you think that this would be a good approach vs. doing all the emulation on the C side of things?

Also, what do you think of a caretaker proc in general? It would only start up on demand, when something in the library required supervision.

stephentoub · 2015-08-16T00:14:44Z

> Also what do you think of a caretaker proc in general?

I personally think we very much want to avoid needing something like that.

Mart-Bogdan · 2016-06-25T23:30:57Z

@bpschoch there is one BIG problem with the shared memory approach:

Anyone can connect to that shared region of RAM and mess with its structure, which isn't possible with kernel-based pipes/whatever. That's a security hole :-( .

One change to your matrix:
regarding Impersonation and sockets there should be Y*3

Domain sockets support getting the caller process's user ID/group ID, though only root can "impersonate" another user's UID/GID.

So full support isn't possible, but I guess the only way to do impersonation in .NET is by using P/Invoke and a handle, or does the server pipe have such a method?
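For reference, fetching the peer's identity on a Unix domain socket can look like the Python sketch below. It is Linux-specific (it relies on the `SO_PEERCRED` socket option, which other Unix systems expose differently or not at all), it only reads the peer's credentials and does not impersonate anyone, and the function names are made up.

```python
import os
import socket
import struct

# Layout of Linux's struct ucred: pid_t pid, uid_t uid, gid_t gid.
UCRED = struct.Struct("iII")


def peer_credentials(sock: socket.socket):
    """Return (pid, uid, gid) of the process on the other end (Linux only)."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, UCRED.size)
    return UCRED.unpack(raw)


def demo():
    # Both ends live in this process, so the peer is ourselves.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        return peer_credentials(a)
    finally:
        a.close()
        b.close()
```

A server could use these values for access checks, which is weaker than Windows impersonation but covers the "find out who the client is" part.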

danmoseley · 2017-12-08T19:20:26Z

For those interested: general domain sockets support was added in dotnet/corefx#25246

bpschoch closed this as completed Aug 3, 2015

bpschoch reopened this Aug 5, 2015

joshfree assigned stephentoub Oct 12, 2015

stephentoub closed this as completed Feb 16, 2016

stephentoub reopened this Mar 11, 2016

stephentoub closed this as completed in dotnet/corefx#6833 Mar 15, 2016

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the 1.0.0-rc2 milestone Jan 31, 2020

CarolEidt mentioned this issue Apr 4, 2019

Revisit the handling of SIMD types during crossgen #10152

Open

ghost locked as resolved and limited conversation to collaborators Jan 6, 2021
