@maxux maxux commented Apr 17, 2020

To solve #718, we will move the container logging feature from contd to a shim. Shims provide a way to hand logging off to an external library or binary.

This pull request moves the logging functions from contd to a new shim-zoslog binary dedicated to logging.

Once this pull request is ready and approved, restarting contd will no longer cut logs, and the FIFO will always have a reader, unless something really bad happens to the new shim-zoslog, which still needs to be tested :)

codecov bot commented Apr 17, 2020

Codecov Report

Merging #738 into master will increase coverage by 9.32%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #738      +/-   ##
==========================================
+ Coverage   24.74%   34.07%   +9.32%     
==========================================
  Files          63       68       +5     
  Lines        4470     5113     +643     
==========================================
+ Hits         1106     1742     +636     
+ Misses       3204     3118      -86     
- Partials      160      253      +93     
| Impacted Files | Coverage Δ |
|---|---|
| pkg/container/container.go | 0.00% <0.00%> (ø) |
| pkg/provision/reservation.go | 3.40% <0.00%> (-13.27%) ⬇️ |
| pkg/provision/source.go | 24.77% <0.00%> (-3.35%) ⬇️ |
| pkg/app/boot.go | 0.00% <0.00%> (ø) |
| pkg/provision/primitives/network.go | |
| pkg/provision/primitives/container.go | |
| pkg/provision/primitives/kubernetes.go | |
| pkg/provision/primitives/cache/cache.go | |
| pkg/provision/primitives/debug.go | |
| pkg/provision/primitives/volume.go | |
| ... and 24 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8ef3939...e81cefb.

@zaibon zaibon left a comment

A couple of remarks on the code itself.
I would also like us to decide whether we keep logging to disk at all.
Normally nobody would ever be able to read those logs, so I'm not sure it is worth creating them at all.
If we decide not to always log to disk, then the code needs to be adapted to start the logging shim only when the user asks for it.

I'm also a bit concerned about the memory usage of those shims. Maybe we could measure how much capacity they use and count that in the resource unit computation of the reservation.
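
One way to take such a measurement on Linux is to read the `VmRSS` line from `/proc/<pid>/status`. The sketch below is only an illustration: `parseVmRSS` is a hypothetical helper, it reads the current process rather than a real shim, and resident set size is just one possible metric.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseVmRSS extracts the resident set size in kB from the text of a
// /proc/<pid>/status file. It returns an error if no VmRSS line is present.
func parseVmRSS(status string) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(status))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		// Expected shape of the line: "VmRSS:     230 kB"
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			return strconv.Atoi(fields[1])
		}
	}
	return 0, fmt.Errorf("VmRSS not found")
}

func main() {
	// Reads this process's own status; a real check would use the shim's PID.
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read status:", err)
		os.Exit(1)
	}
	kb, err := parseVmRSS(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse status:", err)
		os.Exit(1)
	}
	fmt.Printf("VmRSS: %d kB\n", kb)
}
```

Summing this value over all shims on a node would give a rough figure to feed into the resource unit computation.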

Another idea to limit the amount of memory used by those daemons is to have only one per user per node. But this requires a bit more management.

@zaibon zaibon linked an issue Apr 20, 2020 that may be closed by this pull request
maxux and others added 8 commits April 23, 2020 04:11
Since we potentially need to run a lot of logger instances (one per
container), each one needs the smallest possible memory footprint.

After some tests, the Go version was using ~3.6 MB of memory for a 3 MB
binary. With a basic Rust PoC, memory dropped to ~1.6 MB.

With this C version, which bundles static jansson and hiredis
libraries, the stripped binary size is 87K (138K fully static with
musl).

At runtime, using glibc, shim-logs with 2 redis connections (one for
stdout and one for stderr) consumes ~230 KB of memory while
transferring data. The static musl version seems to use even less, but
I could not get accurate values.

It only depends on the hiredis and jansson external libs, which are
compiled statically by default by the Makefile.

In addition, there is a tools directory containing a wrapper that
simulates a binary execution the way containerd would do it. This is
useful for testing without actually running a container.
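
For readers unfamiliar with what such a wrapper has to reproduce, here is a minimal Go sketch. It is not the wrapper from the tools directory. It assumes the containerd binary-logging convention: the logger receives container stdout on fd 3, container stderr on fd 4, a ready-signal pipe on fd 5, and the `CONTAINER_ID` and `CONTAINER_NAMESPACE` environment variables. `runLogger`, `feed`, and the `demo`/`default` values are hypothetical, and `sh` with `cat` stands in for the real logger binary.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// feed writes s to w and closes it so the child sees EOF.
func feed(w *os.File, s string) error {
	defer w.Close()
	_, err := io.WriteString(w, s)
	return err
}

// runLogger starts bin with fake container streams attached the way this
// sketch assumes containerd does it, and returns what bin printed on stdout.
func runLogger(bin string, args []string, stdout, stderr string) (string, error) {
	outR, outW, err := os.Pipe()
	if err != nil {
		return "", err
	}
	errR, errW, err := os.Pipe()
	if err != nil {
		return "", err
	}
	waitR, waitW, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer waitR.Close()

	cmd := exec.Command(bin, args...)
	// ExtraFiles[i] becomes fd 3+i in the child: 3=stdout, 4=stderr, 5=ready.
	cmd.ExtraFiles = []*os.File{outR, errR, waitW}
	cmd.Env = append(os.Environ(), "CONTAINER_ID=demo", "CONTAINER_NAMESPACE=default")
	var captured bytes.Buffer
	cmd.Stdout = &captured
	startErr := cmd.Start()
	// The child holds its own copies now; drop ours so EOF can propagate.
	outR.Close()
	errR.Close()
	waitW.Close()
	if startErr != nil {
		outW.Close()
		errW.Close()
		return "", startErr
	}
	if err := feed(outW, stdout); err != nil {
		errW.Close()
		return "", err
	}
	if err := feed(errW, stderr); err != nil {
		return "", err
	}
	err = cmd.Wait()
	return captured.String(), err
}

func main() {
	// "cat" on fds 3 and 4 plays the role of the logger binary.
	out, err := runLogger("sh", []string{"-c", "cat <&3; cat <&4"},
		"hello from stdout\n", "hello from stderr\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, "runLogger:", err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Replacing `sh -c ...` with the path to the logger binary and its arguments would exercise it without starting a container.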
@zaibon zaibon merged commit 6dc778a into master Apr 23, 2020
@zaibon zaibon deleted the logs-shim branch April 23, 2020 18:05

Development

Successfully merging this pull request may close these issues.

provisiond: stuck on a 0-db namespace decommission

3 participants