
Fixed errors with the previous changes to socket selector. #683

Closed · binary1248 wants to merge 2 commits into master from bugfix/socket_selector

Conversation

3 participants
binary1248 (Member) commented Aug 18, 2014

Fixes unintended behaviour on Windows introduced by the last change, due to Windows' different interpretation of FD_SETSIZE. (#153)

Fixed SocketSelector not being able to accept sockets with IDs larger than FD_SETSIZE on Windows (#153) and added the same checks to other affected methods as well.
LaurentGomila (Member) commented on src/SFML/Network/SocketSelector.cpp in ddd259e Jul 17, 2014

Is this check, and the one in remove(), really needed? What do FD_CLR and FD_ISSET do with a handle which is out of range? (sorry I have no access to the definition of these macros right now)

binary1248 (Member) replied Jul 17, 2014

You really don't want to know 😉. Implementations are free to implement fd_set as a simple bitfield consisting of FD_SETSIZE bits (so... something like int bits[32] for FD_SETSIZE = 1024). Those FD_ macros are often implemented directly in asm as bit manipulation instructions without any bounds checking, so specifying a handle which is out of range will write to some memory location, possibly even on the stack, doing all kinds of nasty things (like overwriting the return address, etc.).

See here:

    The behavior of these macros is undefined if the fd argument is less than 0 or greater than or equal to FD_SETSIZE, or if fd is not a valid file descriptor, or if any of the arguments are expressions with side effects.
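For illustration, here is a minimal sketch of the hazard being described (illustrative only, not the actual glibc or SFML source; names like fd_set_sketch and kSetSize are made up):

const int kSetSize = 1024; // stands in for FD_SETSIZE

struct fd_set_sketch
{
    // 1024 bits packed into 32-bit words, i.e. the "int bits[32]" above
    unsigned int bits[kSetSize / 32];
};

// Unchecked, like the real macros: fd = 5000 writes far past the array.
void fdSetUnchecked(int fd, fd_set_sketch* set)
{
    set->bits[fd / 32] |= (1u << (fd % 32)); // out-of-bounds write if fd >= 1024
}

// Roughly the kind of guard this pull request adds before touching the set.
bool fdSetChecked(int fd, fd_set_sketch* set)
{
    if (fd < 0 || fd >= kSetSize)
        return false; // refuse the handle instead of invoking undefined behaviour
    set->bits[fd / 32] |= (1u << (fd % 32));
    return true;
}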

LaurentGomila (Member) replied Jul 17, 2014

I see, thanks for the explanation.

But what about the FD_ISSET check before calling FD_CLR? This operation should be safe even if the descriptor was not added to the set.
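The two variants under discussion look roughly like this (a sketch of the pattern only, not the actual SocketSelector source):

#include <sys/select.h>

// Guarded removal: an extra read of the bit before the write.
void removeGuarded(int handle, fd_set* allSockets)
{
    if (FD_ISSET(handle, allSockets))
        FD_CLR(handle, allSockets);
}

// Direct removal: FD_CLR is safe even if the bit was never set,
// as long as the handle itself is within [0, FD_SETSIZE).
void removeDirect(int handle, fd_set* allSockets)
{
    FD_CLR(handle, allSockets);
}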

binary1248 (Member) replied Jul 17, 2014

Reading from 1 bit is cheaper than writing to 1 bit. Reading from 1 bit is even cheaper than writing to 2 bits. It all has to do with cache 😉.

On Windows, the difference is bigger, since Windows has to iterate over the registered handles to actually remove the correct one: fd_set isn't a bitfield there, but more like a std::vector. FD_CLR performance on Linux: O(1). FD_CLR performance on Windows: O(N).
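Winsock's fd_set really is an array of handles plus a count, so its FD_CLR has to search and compact. Paraphrased from the winsock2.h declarations (the member names fd_count and fd_array are real; the surrounding code is a sketch):

typedef unsigned long long SocketHandle; // stands in for Winsock's SOCKET

struct fd_set_win
{
    unsigned int fd_count;      // number of handles currently in the set
    SocketHandle fd_array[64];  // FD_SETSIZE defaults to 64 on Windows
};

// FD_CLR on Windows, roughly: linear scan, then shift the tail down.
void fdClrWin(SocketHandle fd, fd_set_win* set)
{
    for (unsigned int i = 0; i < set->fd_count; ++i)
    {
        if (set->fd_array[i] == fd)
        {
            for (unsigned int j = i; j + 1 < set->fd_count; ++j)
                set->fd_array[j] = set->fd_array[j + 1];
            --set->fd_count;
            break;
        }
    }
}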

LaurentGomila (Member) replied Jul 17, 2014

Ok but I don't think we need that extra nanosecond when we remove a socket from the selector. I prefer cleaner code, unless experience proves that we need this specific optimization.

binary1248 (Member) replied Jul 17, 2014

Actually... when I said that the difference is bigger on Windows... that is on the order of milliseconds for large FD_SETSIZEs... so this is just a matter of scalability. With a handful of sockets, it will not be that prominent. But if you want, I can show you benchmarks of Windows not scaling well when using a lot of sockets with select(). Any "scalable" (although it is a joke to use Windows for this) networking software on Windows uses IOCP because of this.

LaurentGomila (Member) commented on ddd259e Jul 17, 2014

This commit makes me think of a few modifications:

  • The members of SocketSelectorImpl are still using the CamelCase notation instead of camelCase
  • We should store pointers to sf::Socket instances in the selector, and build the FD_SET only when needed; this would solve the problem of the selector's handle becoming invalid if the corresponding socket is created/recreated after it has been added to the selector (see the sketch after this list)
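A sketch of that proposed design (never merged; the types are stand-ins, since in real SFML Socket::getHandle() is protected and only SocketSelector can reach it):

#include <sys/select.h>
#include <cstddef>
#include <vector>

struct SocketStandIn
{
    int handle; // stands in for sf::Socket's internal descriptor
    int getHandle() const { return handle; }
};

struct SelectorSketch
{
    std::vector<const SocketStandIn*> sockets; // hypothetical member

    // Rebuild the fd_set on demand, so a recreated handle is picked up.
    fd_set buildSet() const
    {
        fd_set set;
        FD_ZERO(&set);
        for (std::size_t i = 0; i < sockets.size(); ++i)
            FD_SET(sockets[i]->getHandle(), &set);
        return set;
    }
};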

binary1248 (Member) replied Jul 17, 2014

    The members of SocketSelectorImpl are still using the CamelCase notation instead of camelCase

You mean PascalCase, right? 😋

    We should store pointers to sf::Socket instances in the selector, and build the FD_SET only when needed; this would solve the problem of the selector's handle becoming invalid if the corresponding socket is created/recreated after it has been added to the selector

I don't think this "solves" the problem per se. With the current implementation (and fixes), no matter what the user does, the SocketSelector will not crash, even if the user does something wrong like forgetting to remove a socket before it gets destroyed. Using the SocketSelector after destroying a socket without removing it will function as normal, since the user will not be able to call isReady on it anyway. As soon as another socket with the same ID gets "added" back to the selector, it will be as if the user never did anything wrong, since the handle just stayed registered even though an associated socket didn't exist for a while.

If you store pointers to sf::Socket instances, the selector will crash every time it has to do something (like recreate the fd_set) and the socket doesn't exist any longer. This will probably show the user they did something wrong in a more obvious way, but if the current implementation makes such mistakes harmless, I don't think it's a bad thing. I prefer a more resilient API over one that forces the user to care about doing non-standard things that don't even have a performance penalty.

Also... for me, any operation that results in the internal handle being recreated is equivalent to creating a new sf::Socket object altogether. You really can't expect the SocketSelector to still function properly after you disconnect and re-connect a TCP connection, or if you rebind a UDP port (which would only make sense if you changed the bound port number anyway). I would rather add some kind of explanation to the SocketSelector documentation that any such operation requires removing and re-adding the socket back to the selector.

And finally, from a performance point of view, having to iterate over a container of sf::Socket pointers every frame and check whether their handles changed would incur a major performance hit. With that system, there is no way for a socket to notify the SocketSelector that its handle changed, unless you want to register the SocketSelector in the socket itself, and that would make things much more complicated 😉. Not to mention, iterating through a container of pointers and calling a member function on them will cause a lot of cache misses and become the main bottleneck in connection-intensive applications. SFML already punishes people who want to do some sort of "high performance" networking by still relying on select(). There is no need to make it more horrible for them. Such changes can wait for the network module rewrite 😉.

LaurentGomila (Member) replied Jul 17, 2014

Hmm yes, you have good points.

But I still think that code like the following should work; it's really not intuitive that it doesn't, even if that's explained in the documentation.

sf::UdpSocket socket;
selector.add(socket);
socket.bind(3000);

It's not obvious that binding a UDP socket to a port recreates the internal socket and gets a new descriptor -- in fact it shouldn't, it's just an implementation detail that made something (I don't even remember what) easier to handle for me.

But yes, I guess it can wait until the next major rewrite.

    SFML already punishes people who want to do some sort of "high performance" networking by still relying on select()

Why?
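Until then, the workaround binary1248 suggests documenting (remove and re-add after any handle-recreating operation) would look like this:

#include <SFML/Network.hpp>

int main()
{
    sf::SocketSelector selector;
    sf::UdpSocket socket;

    selector.add(socket);
    socket.bind(3000);       // may recreate the internal descriptor
    selector.remove(socket); // drop the stale handle...
    selector.add(socket);    // ...and register the current one
}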

binary1248 (Member) replied Jul 17, 2014

    Why?

Because select() still relies on fd_set, which is fixed on most Unix systems with FD_SETSIZE = 1024 compiled into the kernel itself, due to it being implemented as a bitfield. On Windows, you have more freedom. If you bother to define FD_SETSIZE to some value before including winsock2.h, it will use your value instead, so fd_set's maximum size on Windows really only depends on the maximum array size you can allocate.

Also... the whole point of Tank's change was to prevent handles higher than FD_SETSIZE from being inserted into the bitfield on Unix systems, since that would result in undefined behaviour. Nobody said that the first socket handle you get from the operating system will be below FD_SETSIZE 😉. You might have 1024 other kinds of handles (files, pipes, etc.) already and then decide to create your first sf::Socket. When you try to add it to the sf::SocketSelector, you will trigger Tank's warning, since the operating system has to give you a handle that is distinct from the ones before, and it isn't smart enough to think "Hmm... SFML needs socket handles to be below 1024... I better start all file and pipe handles above 1024 then..." 😉. This is of course assuming you don't try to handle more than 1024 connections simultaneously with SFML anyway. I actually tried to do that once... and ended up writing SFNUL 😋.

There are enough performance benchmarks comparing select()-like facilities. You can search for "select vs poll vs epoll vs kqueue". Most of the time the benchmarks don't even contain data for select() since it is that horrible.
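For example, the Windows override has to happen before the header is included:

// FD_SETSIZE is only a default (64) on Windows; redefine it before
// including the winsock headers and fd_set grows accordingly. On most
// Unix systems, redefining it does not change the kernel-side limit.
#define FD_SETSIZE 4096
#include <winsock2.h>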

LaurentGomila (Member) replied Jul 17, 2014

Yes, it's true that the limitations of select() are really annoying. I didn't know that it compared so badly to other equivalent solutions.

By the way, nothing prevents us from using a more optimized / less limited OS-specific implementation in sf::SocketSelector. It's just more work ;)

binary1248 (Member) replied Jul 17, 2014

Well... I was thinking that the next major rework of the networking module should provide more "modern", i.e. asynchronous, alternatives to classic POSIX networking. If possible, SFML should abstract away things like blocking reads/writes and focus more on the communication aspect of networking: getting data over the wire, reliably or unreliably. Since it is such a broad topic, it would probably be a good idea to outsource the lower-level details to an external library like ASIO. If we were to write everything ourselves, we would end up writing Impls for each platform that would be multiple thousands of lines long, since we can't assume that only a single backend will always be available on each platform. In any case, this is something to consider after C++11. It will make our lives muuuuch simpler 😉.

binary1248 (Member) commented Jul 18, 2014

Amended commit can be found at d5d7d21.

LaurentGomila (Member) replied Jul 18, 2014

Thanks!

Renamed SocketSelectorImpl's members to use camelCase, removed unnecessary checks from SocketSelector::add and SocketSelector::remove.

@binary1248 binary1248 added the bug label Aug 18, 2014

@binary1248 binary1248 self-assigned this Aug 18, 2014

eXpl0it3r (Member) commented Aug 18, 2014

I tested it and it seems to fix the issue; if anyone else wants to try, here's the ugly testing code. 😉

Merged in b96d330

@eXpl0it3r eXpl0it3r closed this Aug 18, 2014

@eXpl0it3r eXpl0it3r deleted the bugfix/socket_selector branch Aug 18, 2014

@eXpl0it3r eXpl0it3r added this to the 2.2 milestone Aug 18, 2014

@eXpl0it3r eXpl0it3r added the p:windows label Aug 18, 2014
