select() portability and FD limits on RTEMS #300

Open
mdavidsaver opened this issue Nov 1, 2022 · 11 comments

Comments

@mdavidsaver
Member

From https://docs.rtems.org/branches/master/user/migration/v4_11-to-v5.html

In RTEMS 5.1, the list of free file descriptors has a LIFO ordering in contrast to previous versions where it was a FIFO. This means if an application regularly opens and closes files (or sockets) it sees the whole range of file descriptors. The reason for this change was to increase the time before file descriptors are reused to more likely catch a file descriptor use after close.

This change may surface application issues. If the configured file descriptor maximum (CONFIGURE_MAXIMUM_FILE_DESCRIPTORS) is greater than the FD_SETSIZE defined by Newlib to 64, then calls to select() are undefined behaviour and may corrupt the thread stack. In particular, FD_SET() may result in an out of bounds access. It is possible to define a custom FD_SETSIZE. The application must ensure that the custom FD_SETSIZE is defined before <sys/select.h> is included in all modules used by the application, for example via a global compiler command line define. This applies also to all third-party libraries used by the application.

This is the reason that the configured maximum number of file descriptors has been reduced from 150 (RTEMS <= 4.x) to 64 (RTEMS >= 5.x)

#define CONFIGURE_LIBIO_MAXIMUM_FILE_DESCRIPTORS 150

#define CONFIGURE_MAXIMUM_FILE_DESCRIPTORS 64

To be clear, this reduction is made out of an abundance of caution. FD_SETSIZE mainly (only?) affects the select() function. Applications/sites which don't call select() may safely raise this limit.

At present, IOC core and the PVA modules do not use select().

The only usage of select() in Base is in the fdManager code, which is mainly encountered in the PCAS and ca-gateway code.
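As a hedged illustration of the FD_SET() out-of-bounds risk quoted above, code that must keep calling select() can refuse descriptors that do not fit in an fd_set. The helper name `fd_set_checked` is hypothetical and is not part of EPICS Base or RTEMS. This is a minimal sketch, assuming a POSIX-style <sys/select.h>:

```c
#include <stdbool.h>
#include <sys/select.h>

/* Hypothetical helper (not an EPICS or RTEMS API): add fd to *set only if
 * it fits inside the fd_set. Returns false when the fd is rejected. */
static bool fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        /* FD_SET() on this fd would access memory outside *set. */
        return false;
    }
    FD_SET(fd, set);
    return true;
}
```

A caller would treat a `false` return as "this descriptor cannot be watched with select()" and either report an error or fall back to another mechanism. If a larger FD_SETSIZE is defined globally on the compiler command line, as the RTEMS note describes, the same check follows the new value automatically.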

@mdavidsaver
Member Author

mdavidsaver commented Nov 1, 2022

The only usage of select() in Base is the fdManager utility[1]. I'm not aware of any usage in IOCs.

int status = select (this->maxFD, pReadSet, pWriteSet, pExceptSet, &tv);

Another module I've looked at is asyn, which only uses select() on vxWorks. On RTEMS the SO_SNDTIMEO socket option is used instead. It could also use poll(), which avoids the FD_SETSIZE limit.

Searching around in epics-modules, I find several directly calling select().

I'm not sure if any of these are, or could be, run on RTEMS. Nor have I investigated whether they may use alternatives. (poll() being the most obvious)
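For reference, the poll() alternative mentioned above might look roughly like the following when waiting on a single descriptor. The helper name `wait_readable` is hypothetical, and the sketch assumes poll() is available on the target, which has not been verified here for any particular RTEMS configuration. Because poll() takes an array of descriptors rather than a fixed-size bitmap, the value of the fd is not bounded by FD_SETSIZE.

```c
#include <poll.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if a read() would not block, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0) {
        return n;               /* 0: timeout, -1: poll() failed */
    }
    if (pfd.revents & POLLNVAL) {
        return -1;              /* fd is not an open descriptor */
    }
    return 1;                   /* POLLIN, POLLHUP or POLLERR reported */
}
```

A module that currently builds an fd_set for one socket could call something like `wait_readable(sock, 1000)` instead, at the cost of handling the different return convention.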

fyi. @MarkRivers @kasemir @coretl

@MarkRivers

dxpSITORO and mca use select() in vendor libraries for discovering devices on the network. These could possibly run on RTEMS but it seems unlikely, since they run on Linux and Windows.

measComp uses select() in test programs to test for keyboard input on Linux. It cannot be used on RTEMS, since the vendor library is only supported on Linux and Windows.

I think tpmac is obsolete, and replaced by pmac https://github.com/dls-controls/pmac. In that module select() seems to be called only in pmacSerial.c which only runs on vxWorks.

@coretl

coretl commented Nov 7, 2022

> I think tpmac is obsolete, and replaced by pmac https://github.com/dls-controls/pmac. In that module select() seems to be called only in pmacSerial.c which only runs on vxWorks.

Yes, tpmac has been replaced by pmac, although I think our version of vxWorks at DLS is too old to compile it.

@gilesknap I think our plan is to run RTEMS on the MVME5500 boards that talk to the VME PMACs. Would this select() issue affect us when doing this?

@gilesknap

@coretl it looks like this might affect us. I expect that pmacAsynVMEPortSrc is going to need reviewing for RTEMS anyway though.

@mdavidsaver
Member Author

I've updated the doc. comment in posix/rtems_config.c with fe9995c in an attempt to avoid further confusion.

It is also worth noting that FD_SETSIZE on Linux/glibc is 1024, and can not be increased by user applications. I guess this implies that no one has ever (successfully) had more than ~512 clients/IOCs connected via a cagateway?

@coretl

coretl commented May 2, 2023

> It is also worth noting that FD_SETSIZE on Linux/glibc is 1024, and can not be increased by user applications. I guess this implies that no one has ever (successfully) had more than ~512 clients/IOCs connected via a cagateway?

We might be approaching that limit, but I don't know if we've ever crossed it.

@peteleicester do you know if we've ever exceeded this number?

@mdavidsaver
Member Author

@coretl @peteleicester One symptom of exceeding FD_SETSIZE with ca-gateway would be seeing "fd > FD_SETSIZE ignored" printed to stderr.

@peteleicester

I don't recall seeing this error in any of our gateway logs to date.

@anjohnson
Member

There might be some code in the gateway to prevent it from trying to use more FDs than are available. I know Ken Evans added a feature that reserves an FD so it could still open a file (maybe for a logfile rotation) when there were too many FDs in use, but I don't remember any details. I know that APS had to configure the gateway processes to use a smaller than default thread stack size on a 32-bit OS because the number of threads created for all our client-side connections was more than could fit in the virtual address space.

@mdavidsaver
Member Author

recsync uses select(), although it hasn't so far worked on RTEMS. cf. ChannelFinder/recsync#68

@kiwichris
Contributor

kiwichris commented Feb 20, 2024

I am in the process of changing the FD_SETSIZE in RTEMS from 64 to 256. The current size is the default for newlib. This will require changes to RTEMS tools and in EPICS to the number of file ops allocated. Maybe that setting becomes the FD_SETSIZE so it tracks the tools a user is using?

The RTEMS ticket with the patch is https://devel.rtems.org/ticket/4993
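One way the "setting becomes the FD_SETSIZE" idea could be expressed is sketched below. This is only an illustration and not the actual content of posix/rtems_config.c or of the RTEMS patch. It assumes a C11 compiler for `_Static_assert`, and assumes that <sys/select.h> from the toolchain in use provides FD_SETSIZE.

```c
#include <sys/select.h>

/* Sketch only: let the descriptor limit follow the toolchain's FD_SETSIZE
 * instead of a hard-coded 64, unless the site has already chosen a value. */
#ifndef CONFIGURE_MAXIMUM_FILE_DESCRIPTORS
#define CONFIGURE_MAXIMUM_FILE_DESCRIPTORS FD_SETSIZE
#endif

/* Fail the build if a site override would let descriptors exceed what an
 * fd_set can hold, which is the case where select() becomes unsafe. */
_Static_assert(CONFIGURE_MAXIMUM_FILE_DESCRIPTORS <= FD_SETSIZE,
               "descriptor limit exceeds FD_SETSIZE; select() is unsafe");
```

Sites that never call select() and want a higher limit would need to relax or remove the assertion, so whether such a check belongs in Base is a design question for the maintainers.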


7 participants