Prevent creation of large log files #536

Open
@PhilippWendler

runexec has an option --maxOutputSize and will truncate the tool log if it exceeds the given size. However, the limit is applied only after the run has ended. While the run is executing, the output file can grow arbitrarily and occupy disk space (it is created outside of the container and is not affected by any limits).

This is of course suboptimal and we should try to find a solution that avoids this.

So far I see the following possibilities:

  1. Send tool output to a pipe, and have a thread/process that continuously reads from the pipe and writes to the output file as long as the output is not too large. This would also solve #535 (Prevent truncation of output file by tool), but there is some implementation complexity and overhead due to the additional thread/process (and if that thread/process is not fast enough, it could block the tool when it tries to write more).
  2. Send tool output to a file on a tmpfs in the container. Then it would be limited by the memory limit just like any other file that the tool creates. After the run, runexec would copy the file from the tmpfs to the real disk. This would solve the most important part of #535 (Prevent truncation of output file by tool), but the tool could still truncate its own output. And of course this would only work in container mode, and there would be no way to watch the tool output while the run is executing.
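
To make option 1 more concrete, here is a minimal sketch of such a reader thread in Python. It is not runexec's implementation. The names `copy_with_limit` and `run_with_limited_output` are hypothetical, and the sketch simply keeps the first `max_bytes` bytes. Once the limit is reached it keeps draining the pipe and discards the data, so that the tool does not block on a full pipe.

```python
import subprocess
import threading


def copy_with_limit(src, dst, max_bytes, chunk_size=64 * 1024):
    """Copy from src to dst, writing at most max_bytes (hypothetical helper).

    After the limit is reached, the remaining data is still read and then
    discarded, so the process writing to the pipe is never blocked by us.
    Returns the total number of bytes read from src.
    """
    written = 0
    total = 0
    while True:
        # Note: on a buffered pipe, read() may wait until chunk_size bytes
        # are available or EOF is reached.
        chunk = src.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        remaining = max_bytes - written
        if remaining > 0:
            dst.write(chunk[:remaining])
            written += min(len(chunk), remaining)
    return total


def run_with_limited_output(args, output_path, max_bytes):
    """Run args and cap the size of the log file (hypothetical helper)."""
    with open(output_path, "wb") as out:
        proc = subprocess.Popen(
            args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        reader = threading.Thread(
            target=copy_with_limit, args=(proc.stdout, out, max_bytes)
        )
        reader.start()
        returncode = proc.wait()
        reader.join()
    return returncode
```

Because the tool only ever sees the write end of the pipe, it cannot truncate or seek in the real output file, which is why this approach would also address #535.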

The implementation effort of 1. varies depending on how well the truncation should work. If we just want the last n or first n bytes of the log, then we could just pipe through tail or head. If we want the first and last n/2 bytes (because these are the most important), we would need to reimplement head and tail internally. And if we want to keep the current behavior, which is to get as many full lines from the beginning and end of the file as possible while still not exceeding the maximum size, the bookkeeping of our implementation would be even more complex.
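
As an illustration of the middle variant (first and last n/2 bytes), the bookkeeping could look roughly like the following sketch. The class name `HeadTailBuffer` is hypothetical. It works on raw bytes and keeps the tail in memory, so it does not reproduce the current line-based behavior.

```python
class HeadTailBuffer:
    """Keep the first and last max_bytes/2 bytes of a stream (sketch)."""

    def __init__(self, max_bytes):
        self.head_limit = max_bytes // 2
        self.tail_limit = max_bytes - self.head_limit
        self.head = bytearray()
        self.tail = bytearray()
        self.dropped = 0  # number of bytes discarded from the middle

    def write(self, data):
        # Fill the head first; it never changes once it is full.
        room = self.head_limit - len(self.head)
        if room > 0:
            self.head += data[:room]
            data = data[room:]
        # Everything else goes to the tail, which is trimmed from the front.
        if data:
            self.tail += data
            excess = len(self.tail) - self.tail_limit
            if excess > 0:
                del self.tail[:excess]
                self.dropped += excess

    def getvalue(self):
        return bytes(self.head) + bytes(self.tail)
```

A reader thread would call `write()` for each chunk read from the pipe and store `getvalue()` in the output file at the end of the run. Keeping only full lines would additionally require tracking newline positions in both parts.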
