New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added: IPv6, UDP and UNIX sockets #332

Merged
merged 12 commits into from Jan 26, 2015

Conversation

Projects
None yet
7 participants
@ysbaddaden
Member

ysbaddaden commented Jan 20, 2015

I refactored and extended the socket stdlib. It somehow models the Ruby library. I skipped the Socket methods (like connect, bind, listen) since the C methods are directly accessible. Classes are organized as follows:

- FileDescriptorIO
  - Socket
    - IPSocket
      - TCPSocket
        - TCPServer
      - UDPSocket
    - UNIXSocket
      - UNIXServer

I changed the signature of TCPServer.new because I couldn't get initialize(port, backlog = 128) to compile along with initialize(host, port, backlog = 128). I didn't try hard, so maybe it's doable.

I also added IPv6 support by replacing gethostbyname for getaddrinfo. This is working fine on my Linux computer, but Darwin uses different sockaddr structs, so maybe I broke support for Macs (I can't check). I tried to have an Addrinfo class/struct but didn't get anything working to create sockets from (I got segfaults, invalid af family errors, etc. I blame pointer casts). I'll push another pull request later.

Note: using "::" as a host for TCPServer#new or UDPSocket#bind as the extra benefit of binding to all IPv4 and IPv6 addresses with a single server.

@asterite

This comment has been minimized.

Show comment
Hide comment
@asterite

asterite Jan 20, 2015

Contributor

Wow!

Just some days ago @waj did this. He originally wanted to do it in Crystal but we were missing UDP, IP, etc-related functionalities and since it had a bit of rush he chose to do it in Node since everything was already done there. So it's really amazing that some days later you come with this pull request! :-)

It'll take us some time to review the code, but it looks very good so far. I can see you started coding the darwin-related funcionality, so I'll wait for that (until the specs pass). Meanwhile I'll review it and enjoy it :-)

Thank you so much!!

Contributor

asterite commented Jan 20, 2015

Wow!

Just some days ago @waj did this. He originally wanted to do it in Crystal but we were missing UDP, IP, etc-related functionalities and since it had a bit of rush he chose to do it in Node since everything was already done there. So it's really amazing that some days later you come with this pull request! :-)

It'll take us some time to review the code, but it looks very good so far. I can see you started coding the darwin-related funcionality, so I'll wait for that (until the specs pass). Meanwhile I'll review it and enjoy it :-)

Thank you so much!!

@ysbaddaden

This comment has been minimized.

Show comment
Hide comment
@ysbaddaden

ysbaddaden Jan 21, 2015

Member

I tried to port Prax a few weeks ago. It went well until I tried some IPv6 and UNIX sockets, so I started hacking, and learnt a lot the hard way :)

I have a type error on Travis for UNIXSocket that I don't have on Linux. I do understand it StaticArray(UInt8, X) vs Pointer(UInt8) but I don't get why it passes on Linux but not on Darwin :'(
Fixed. Now there are some invalid af family, which mean a C struct, like addrinfo, are Darwin/BSD specific.

Member

ysbaddaden commented Jan 21, 2015

I tried to port Prax a few weeks ago. It went well until I tried some IPv6 and UNIX sockets, so I started hacking, and learnt a lot the hard way :)

I have a type error on Travis for UNIXSocket that I don't have on Linux. I do understand it StaticArray(UInt8, X) vs Pointer(UInt8) but I don't get why it passes on Linux but not on Darwin :'(
Fixed. Now there are some invalid af family, which mean a C struct, like addrinfo, are Darwin/BSD specific.

end
class Socket < FileDescriptorIO
def afamily(family)

This comment has been minimized.

@ysbaddaden

ysbaddaden Jan 21, 2015

Member

this method is required because the C addrinfo struct is an Int32 whereas the constants and socket methods take a UInt16 or UInt8 (depending on platform).

@ysbaddaden

ysbaddaden Jan 21, 2015

Member

this method is required because the C addrinfo struct is an Int32 whereas the constants and socket methods take a UInt16 or UInt8 (depending on platform).

This comment has been minimized.

@asterite

asterite Jan 21, 2015

Contributor

It's ok. You can also do it like this:

def afamily(family)
  C::AF_INET6.class.cast(family)
end

All Int and Float types have this class method to cast a value to their types (ir's written in Crystal so you can search for it, but as most things it's not yet documented, although it appears in the API docs (but apparently there's a bug with the anchors)). In this case the class is known at compile-time (it's a constant) so the end result will be as efficient as a bitcast (no dispatch or method call involved when compiled in release mode).

You can make that method protected or private, I guess.

@asterite

asterite Jan 21, 2015

Contributor

It's ok. You can also do it like this:

def afamily(family)
  C::AF_INET6.class.cast(family)
end

All Int and Float types have this class method to cast a value to their types (ir's written in Crystal so you can search for it, but as most things it's not yet documented, although it appears in the API docs (but apparently there's a bug with the anchors)). In this case the class is known at compile-time (it's a constant) so the end result will be as efficient as a bitcast (no dispatch or method call involved when compiled in release mode).

You can make that method protected or private, I guess.

@ysbaddaden

This comment has been minimized.

Show comment
Hide comment
@ysbaddaden

ysbaddaden Jan 21, 2015

Member

Design question: shall we allow to pass symbols along constants? Like:

UDPSocket.new(:INET6)
UNIXServer.new("/tmp/sock", :STREAM)
Member

ysbaddaden commented Jan 21, 2015

Design question: shall we allow to pass symbols along constants? Like:

UDPSocket.new(:INET6)
UNIXServer.new("/tmp/sock", :STREAM)
@asterite

This comment has been minimized.

Show comment
Hide comment
@asterite

asterite Jan 21, 2015

Contributor

I actually started preferring the use of enums. They are a bit more verbose but give you compile-time safety, which I think matters more. They are also self-documenting.

class Socket
  enum Type
    STREAM = C::SOCK_STREAM
    DATAGRAM = C::SOCK_DGRAM
    RAW = C::SOCK_RAW
  end
end

class UNIXServer < UNIXSocket
  # I don't prefix Type with Socket:: here because we inherit from Socket and it'll work
  # The type restriction is both for documentation and for type-safety
  def initialize(@path : String, socktype = Type::STREAM : Type, backlog = 128)
    ...
  end
end

UNIXServer.new("/tmp/sock", Socket::Type::STREAM)

Another benefit is that you can document each of the enum's constant, plus the constant, so in terms of documentation it's really much better. With symbols you have to document them in the method comment, and if they appear in many methods you have to duplicate this or refer to one of the other methods. Then you have to make a code to switch over the given symbol, translating values, or raise if an unexpected symbol was given.

I'm sure later we'll come up with a way to eliminate this small verbosity, but for now I prefer it (but others might not).

I'd also add a Socket::Family enum.

Enums are not used everywhere yet in the std because they appeared outside lib declarations in version 0.5.3, just a couple of months ago, but we should use them more.

Contributor

asterite commented Jan 21, 2015

I actually started preferring the use of enums. They are a bit more verbose but give you compile-time safety, which I think matters more. They are also self-documenting.

class Socket
  enum Type
    STREAM = C::SOCK_STREAM
    DATAGRAM = C::SOCK_DGRAM
    RAW = C::SOCK_RAW
  end
end

class UNIXServer < UNIXSocket
  # I don't prefix Type with Socket:: here because we inherit from Socket and it'll work
  # The type restriction is both for documentation and for type-safety
  def initialize(@path : String, socktype = Type::STREAM : Type, backlog = 128)
    ...
  end
end

UNIXServer.new("/tmp/sock", Socket::Type::STREAM)

Another benefit is that you can document each of the enum's constant, plus the constant, so in terms of documentation it's really much better. With symbols you have to document them in the method comment, and if they appear in many methods you have to duplicate this or refer to one of the other methods. Then you have to make a code to switch over the given symbol, translating values, or raise if an unexpected symbol was given.

I'm sure later we'll come up with a way to eliminate this small verbosity, but for now I prefer it (but others might not).

I'd also add a Socket::Family enum.

Enums are not used everywhere yet in the std because they appeared outside lib declarations in version 0.5.3, just a couple of months ago, but we should use them more.

@asterite

This comment has been minimized.

Show comment
Hide comment
@asterite

asterite Jan 25, 2015

Contributor

@ysbaddaden If you want you can rebase and I'll merge it.

Contributor

asterite commented Jan 25, 2015

@ysbaddaden If you want you can rebase and I'll merge it.

@ysbaddaden

This comment has been minimized.

Show comment
Hide comment
@ysbaddaden

ysbaddaden Jan 26, 2015

Member

Just rebased off master, along with some corrections. TCPSocket and UDPSocket are still broken on OSX. I don't own a MAC, so I can't really test.

Member

ysbaddaden commented Jan 26, 2015

Just rebased off master, along with some corrections. TCPSocket and UDPSocket are still broken on OSX. I don't own a MAC, so I can't really test.

@asterite

This comment has been minimized.

Show comment
Hide comment
@asterite

asterite Jan 26, 2015

Contributor

Thanks! I'll merge soon and add the missing bits for Mac :-)

Contributor

asterite commented Jan 26, 2015

Thanks! I'll merge soon and add the missing bits for Mac :-)

asterite added a commit that referenced this pull request Jan 26, 2015

Merge pull request #332 from ysbaddaden/std-socket
Added: IPv6, UDP and UNIX sockets

@asterite asterite merged commit a8d26a8 into crystal-lang:master Jan 26, 2015

1 check failed

continuous-integration/travis-ci The Travis CI build failed
Details
@asterite

This comment has been minimized.

Show comment
Hide comment
@asterite

asterite Jan 26, 2015

Contributor

Well, it took me a while to figure out why it didn't work in mac.

I did this man getaddrinfo. In some place it says:

hints is an optional pointer to a struct addrinfo, as defined by <netdb.h>:

     struct addrinfo {
             int ai_flags;           /* input flags */
             int ai_family;          /* protocol family for socket */
             int ai_socktype;        /* socket type */
             int ai_protocol;        /* protocol for socket */
             socklen_t ai_addrlen;   /* length of socket-address */
             struct sockaddr *ai_addr; /* socket-address for socket */
             char *ai_canonname;     /* canonical name for service location */
             struct addrinfo *ai_next; /* pointer to next in list */
     };

Exactly the same as you defined it. Only that I always get ai_addr to be zero. I replicated a small C code in Crystal that used getaddrinfo and I got different results. I really didn't understand what was going on. Then I searched for the definition of addrinfo in /usr/include:

struct addrinfo {
    int ai_flags;   /* AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST */
    int ai_family;  /* PF_xxx */
    int ai_socktype;    /* SOCK_xxx */
    int ai_protocol;    /* 0 or IPPROTO_xxx for IPv4 and IPv6 */
    socklen_t ai_addrlen;   /* length of ai_addr */
    char    *ai_canonname;  /* canonical name for hostname */
    struct  sockaddr *ai_addr;  /* binary address */
    struct  addrinfo *ai_next;  /* next structure in linked list */
};

😠 😡 👿

Contributor

asterite commented Jan 26, 2015

Well, it took me a while to figure out why it didn't work in mac.

I did this man getaddrinfo. In some place it says:

hints is an optional pointer to a struct addrinfo, as defined by <netdb.h>:

     struct addrinfo {
             int ai_flags;           /* input flags */
             int ai_family;          /* protocol family for socket */
             int ai_socktype;        /* socket type */
             int ai_protocol;        /* protocol for socket */
             socklen_t ai_addrlen;   /* length of socket-address */
             struct sockaddr *ai_addr; /* socket-address for socket */
             char *ai_canonname;     /* canonical name for service location */
             struct addrinfo *ai_next; /* pointer to next in list */
     };

Exactly the same as you defined it. Only that I always get ai_addr to be zero. I replicated a small C code in Crystal that used getaddrinfo and I got different results. I really didn't understand what was going on. Then I searched for the definition of addrinfo in /usr/include:

struct addrinfo {
    int ai_flags;   /* AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST */
    int ai_family;  /* PF_xxx */
    int ai_socktype;    /* SOCK_xxx */
    int ai_protocol;    /* 0 or IPPROTO_xxx for IPv4 and IPv6 */
    socklen_t ai_addrlen;   /* length of ai_addr */
    char    *ai_canonname;  /* canonical name for hostname */
    struct  sockaddr *ai_addr;  /* binary address */
    struct  addrinfo *ai_next;  /* next structure in linked list */
};

😠 😡 👿

@jhass

This comment has been minimized.

Show comment
Hide comment
@jhass

jhass Jan 27, 2015

Member

lol, that's subtle. Though, does that mean I can now claim that OS X is (no longer) POSIX compliant? :P

Member

jhass commented Jan 27, 2015

lol, that's subtle. Though, does that mean I can now claim that OS X is (no longer) POSIX compliant? :P

@ysbaddaden

This comment has been minimized.

Show comment
Hide comment
@ysbaddaden

ysbaddaden Jan 27, 2015

Member

Come on... I'll check, but I'm pretty sure this is not what headers in xnu source code says, nor what I found on freebsd.

I'm glad you found it, better sockets!

Member

ysbaddaden commented Jan 27, 2015

Come on... I'll check, but I'm pretty sure this is not what headers in xnu source code says, nor what I found on freebsd.

I'm glad you found it, better sockets!

@asterite

This comment has been minimized.

Show comment
Hide comment
@asterite

asterite Jan 27, 2015

Contributor

There's still one spec failing in travis related to UDP socket. I had it failing on my machine, did a change (stored the @family and pass it instead of nil in connect and bind) and it started working. I reverted the change and it kept working 😮 . I don't know what's going on, but I'll try to test it on another mac, see if I can reproduce it.

Contributor

asterite commented Jan 27, 2015

There's still one spec failing in travis related to UDP socket. I had it failing on my machine, did a change (stored the @family and pass it instead of nil in connect and bind) and it started working. I reverted the change and it kept working 😮 . I don't know what's going on, but I'll try to test it on another mac, see if I can reproduce it.

@ysbaddaden ysbaddaden deleted the ysbaddaden:std-socket branch Jan 27, 2015

asterite added a commit that referenced this pull request Jan 27, 2015

@chendo

This comment has been minimized.

Show comment
Hide comment
@chendo

chendo Jul 17, 2015

Not sure if this is the best place to ask, but is there a method similar to Ruby's IPSocket#recvfrom? It returns data from a datagram up to a specified length, generally used with UDP sockets where you want to deal with datagrams rather than a stream.

I can't use unbuffered_read because it's protected. Should I call out to C functions to do this?

chendo commented Jul 17, 2015

Not sure if this is the best place to ask, but is there a method similar to Ruby's IPSocket#recvfrom? It returns data from a datagram up to a specified length, generally used with UDP sockets where you want to deal with datagrams rather than a stream.

I can't use unbuffered_read because it's protected. Should I call out to C functions to do this?

@tatey

This comment has been minimized.

Show comment
Hide comment
@tatey

tatey Aug 27, 2015

Contributor

@chendo I've had a crack at implementing recvfrom in IPSocket. Getting something going is easy enough, but this implementation doesn't feel in the spirit of the FileDescriptorIO class.

# src/socket/libc.cr
LibC
  fun recvfrom(sock : Int, buffer : Void*, length : SizeT, flags : Int, addr : SockAddr*, addr_len : SocklenT*) : SSizeT
end
# src/socket/ip_socket.cr
class IPSocket
  def recvfrom(length)
    raise ArgumentError.new "negative length" if length < 0

    slice = Slice(UInt8).new(length)
    bytes_read = LibC.recvfrom(fd, (slice.to_unsafe as Void*), LibC::SizeT.cast(length), 0, nil, nil)
    if bytes_read != -1
      return slice
    end

    if LibC.errno == Errno::EAGAIN
      wait_readable
    else
      raise Errno.new("Error receiving")
    end
  ensure
    add_read_event unless readers.empty?
  end
end

Should recvfrom append directly to the internal buffer (@in_buffer_rem), or should it exist stand alone and return the slice? If it's appended directly you'd get an API like this:

 client << "message"
 server.recvfrom(1024)
 server.read(1024) # => "message..."

Where as if it's stand alone we'd get something like:

 client << "message"
 server.recvfrom(1024) # => [109, ...]

The latter feels more UNIX-y and is how it's used in Ruby. I'm looking for direction on how to make this API fit nicely into Crystal.

Contributor

tatey commented Aug 27, 2015

@chendo I've had a crack at implementing recvfrom in IPSocket. Getting something going is easy enough, but this implementation doesn't feel in the spirit of the FileDescriptorIO class.

# src/socket/libc.cr
LibC
  fun recvfrom(sock : Int, buffer : Void*, length : SizeT, flags : Int, addr : SockAddr*, addr_len : SocklenT*) : SSizeT
end
# src/socket/ip_socket.cr
class IPSocket
  def recvfrom(length)
    raise ArgumentError.new "negative length" if length < 0

    slice = Slice(UInt8).new(length)
    bytes_read = LibC.recvfrom(fd, (slice.to_unsafe as Void*), LibC::SizeT.cast(length), 0, nil, nil)
    if bytes_read != -1
      return slice
    end

    if LibC.errno == Errno::EAGAIN
      wait_readable
    else
      raise Errno.new("Error receiving")
    end
  ensure
    add_read_event unless readers.empty?
  end
end

Should recvfrom append directly to the internal buffer (@in_buffer_rem), or should it exist stand alone and return the slice? If it's appended directly you'd get an API like this:

 client << "message"
 server.recvfrom(1024)
 server.read(1024) # => "message..."

Where as if it's stand alone we'd get something like:

 client << "message"
 server.recvfrom(1024) # => [109, ...]

The latter feels more UNIX-y and is how it's used in Ruby. I'm looking for direction on how to make this API fit nicely into Crystal.

@technorama

This comment has been minimized.

Show comment
Hide comment
@technorama

technorama Aug 27, 2015

Contributor

How you would rely on buffering with recvfrom? Packets can arrive in any order and from a multitude of hosts. I think recvfrom should return "record sockaddr, buffer". The sockaddr struct is necessary to distinguish packet sources when running servers or p2p.

def recvfrom length
  slice = Slice(UInt8).new length
  recvfrom slice
end

# allows reuse of slices in tight loops
def recvfrom slice
  # full implementation
Contributor

technorama commented Aug 27, 2015

How you would rely on buffering with recvfrom? Packets can arrive in any order and from a multitude of hosts. I think recvfrom should return "record sockaddr, buffer". The sockaddr struct is necessary to distinguish packet sources when running servers or p2p.

def recvfrom length
  slice = Slice(UInt8).new length
  recvfrom slice
end

# allows reuse of slices in tight loops
def recvfrom slice
  # full implementation
@chendo

This comment has been minimized.

Show comment
Hide comment
@chendo

chendo Aug 28, 2015

@tatey yeah as @technorama said, you wouldn't really be able to buffer, and it does need to return sockaddr. That said, I'm not sure what the behaviour of Ruby is when you don't read the entire packet. I'd think it would just drop it but I'd have to check to be sure.

chendo commented Aug 28, 2015

@tatey yeah as @technorama said, you wouldn't really be able to buffer, and it does need to return sockaddr. That said, I'm not sure what the behaviour of Ruby is when you don't read the entire packet. I'd think it would just drop it but I'd have to check to be sure.

@waj

This comment has been minimized.

Show comment
Hide comment
@waj

waj Aug 28, 2015

Member

Would it make sense to actually disable the possibility of use a UDPSocket as a true IO? The current design inherits from FileDescriptorIO but we could redefine read and write to raise exceptions.

Member

waj commented Aug 28, 2015

Would it make sense to actually disable the possibility of use a UDPSocket as a true IO? The current design inherits from FileDescriptorIO but we could redefine read and write to raise exceptions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment