Description
`runexec` has an option `--maxOutputSize` and will truncate the tool log if it exceeds the given size. However, this is applied only after the run has ended. While the run is executing, the output file can grow arbitrarily and occupy disk space (it is created outside of the container and is not affected by any limits).
This is of course suboptimal and we should try to find a solution that avoids this.
So far I see the following possibilities:
- Send tool output to a pipe, and have a thread/process that continuously reads from the pipe and writes to the output file as long as the output is not too large. This would also solve #535 (Prevent truncation of output file by tool), but there is some implementation complexity and overhead due to the additional thread/process (and if that thread/process is not fast enough, it could block the tool when it tries to write more).
- Send tool output to a file on a `tmpfs` in the container. Then it would be limited by the memory limit just like any other file that the tool creates. After the run, `runexec` would copy the file from the `tmpfs` to the real disk. This would solve the most important part of #535 (Prevent truncation of output file by tool), but the tool could still truncate its own output. And of course this would only work in container mode, and there would be no way to watch the tool output while the run is executing.
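
To make the first possibility more concrete, here is a minimal sketch of the pipe-plus-reader-thread idea. It is not how `runexec` is implemented; the function names `copy_with_limit` and `run_with_limited_output` are hypothetical, and the sketch simply keeps the first `max_bytes` bytes and discards the rest.

```python
import subprocess
import sys
import threading


def copy_with_limit(pipe, out_path, max_bytes, chunk_size=65536):
    """Drain `pipe` into `out_path`, writing at most `max_bytes` bytes.

    Hypothetical helper: once the limit is reached, the remaining output
    is still read (and discarded) so that the tool never blocks on a
    full pipe.
    """
    written = 0
    with open(out_path, "wb") as out:
        while True:
            # read1() returns as soon as some data is available.
            chunk = pipe.read1(chunk_size)
            if not chunk:  # EOF: all writers closed the pipe
                break
            remaining = max_bytes - written
            if remaining > 0:
                piece = chunk[:remaining]
                out.write(piece)
                out.flush()  # lets users watch the file during the run
                written += len(piece)
    return written


def run_with_limited_output(args, out_path, max_bytes):
    """Run `args`, sending stdout and stderr through a size-limited pipe."""
    proc = subprocess.Popen(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    result = {}

    def reader():
        result["written"] = copy_with_limit(proc.stdout, out_path, max_bytes)

    thread = threading.Thread(target=reader)
    thread.start()
    returncode = proc.wait()
    thread.join()
    return returncode, result["written"]
```

Because the tool only ever sees a pipe, it cannot truncate the output file itself, which is the property relevant for #535. The cost is the extra thread and the copy through user space mentioned above.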
The implementation effort of the first option varies depending on how well the truncation should work. If we just want the last `n` or first `n` bytes of the log, then we could just pipe through `tail` or `head`. If we want the first and last `n/2` bytes (because these are the most important), we would need to reimplement `head` and `tail` internally. And if we want to keep the current behavior, which is to get as many full lines from the beginning and end of the file as possible while still not exceeding the maximum size, the bookkeeping of our implementation would be even more complex.
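
As a rough illustration of the middle variant (first and last `n/2` bytes), the following sketch keeps a bounded head and a bounded tail while data streams in. The class name `HeadTailBuffer` is hypothetical, and the sketch works on raw bytes only; it does not attempt the line-aware behavior of the current truncation.

```python
class HeadTailBuffer:
    """Keep roughly the first and last max_bytes/2 bytes of a stream.

    Hypothetical helper for illustration; memory use is bounded by
    max_bytes plus the size of the largest single chunk written.
    """

    def __init__(self, max_bytes):
        self.head_limit = max_bytes // 2
        self.tail_limit = max_bytes - self.head_limit
        self.head = bytearray()
        self.tail = bytearray()
        self.dropped = 0  # number of bytes discarded from the middle

    def write(self, data):
        # Fill the head first; it is never modified once full.
        room = self.head_limit - len(self.head)
        if room > 0:
            self.head += data[:room]
            data = data[room:]
        if data:
            # Append to the tail and drop its oldest bytes on overflow.
            self.tail += data
            overflow = len(self.tail) - self.tail_limit
            if overflow > 0:
                del self.tail[:overflow]
                self.dropped += overflow

    def getvalue(self, marker=b"\n[... output truncated ...]\n"):
        # The marker is added on top of max_bytes; a real implementation
        # would have to decide whether it counts towards the limit.
        if self.dropped:
            return bytes(self.head) + marker + bytes(self.tail)
        return bytes(self.head) + bytes(self.tail)
```

A reader thread could call `write()` for every chunk it takes from the pipe and store `getvalue()` at the end of the run. The drawback is that the tail is only known once the run has finished, so the output file could not be watched in full while the run is executing.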