New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging #534

Merged
merged 19 commits into from Apr 29, 2016

Conversation

Projects
None yet
4 participants
@samoht
Copy link
Member

samoht commented Apr 27, 2016

No description provided.

talex5 and others added some commits Mar 14, 2016

Add configure/runtime keys to parse log patterns.
`-l foo:debug,info` means "print all the info message and the debug message
coming from the source named 'foo'"
Re-implement the logging support.
The new solution is a bit less powerful, but hopefuly simpler to
parametrise -- also it can now be configured using the command-line
options both at configuration and runtime.
@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 27, 2016

@Drup @talex5 can you please review? I'm not super happy with the solution, but that's the best mixed between your two viewpoints that I can found.

The neat thing about the approach, though, is that you can configure the log reporters at configure and runtime ... (ie. I think I'm happy with the user-facing interface).

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 27, 2016

An example of usage:

$ mirage configure -l foo:debug --no-opam && make
[ ... ]
$ ./main.native
Hello World!
2016-04-27 17:26.15: DBG [foo] Haha
Hello World!
2016-04-27 17:26.16: DBG [foo] Haha
$ ./main.native -l bar:debug,foo:debug
Hello World!
2016-04-27 17:26.31: DBG [bar] Sleeping!
2016-04-27 17:26.31: DBG [foo] Haha
Hello World!
2016-04-27 17:26.32: DBG [bar] Sleeping!
2016-04-27 17:26.32: DBG [foo] Haha

@samoht samoht force-pushed the samoht:logging branch from 425a0ef to ee378d7 Apr 27, 2016

@samoht samoht force-pushed the samoht:logging branch from ee378d7 to 2da5f0f Apr 27, 2016

| None -> jobs
| Some tracing ->
tracing :: jobs
let jobs = in_parallel jobs in

This comment has been minimized.

@Drup

Drup Apr 27, 2016

Member

Why is this needed ? Unless you changed it, the jobs are already in parallel.

This comment has been minimized.

@samoht

samoht Apr 28, 2016

Member

Then the current implementation has an issue then. I'll fix mirage/functoria#58 (comment)

match ring_size with
| None -> Fmt.string ppf "None"
| Some i -> Fmt.pf ppf "Some %d" i
in

This comment has been minimized.

@Drup

Drup Apr 27, 2016

Member

That looks like Fmt.(Dump.option int).

@Drup

This comment has been minimized.

Copy link
Member

Drup commented Apr 27, 2016

What are you not happy with ? It corresponds quite closely to what I had in mind, and the implementation of the command line stuff is nicely shared between configure and runtime.

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 27, 2016

Runtime logging configuration is certainly very nice. I believe that is independent of the other changes, right?

So, things we lose in exchange for the different implementation are:

  • It is no longer possible to get the debug logs dumped on error. However, you could do this in your unikernel, which should be OK since it's mostly useful for long-running processes. If the unikernel is crashing reliably at boot you might as well just start at debug level. So this is only a problem if you have a unikernel that very occasionally crashes at boot and you want debug logs only in that case.
  • It is only possible to define a single log reporter, since every reporter you define now automatically becomes the installed reporter. This is probably not a major limitation.
  • It is no longer possible to get the clock initialised before using it. However, the default Mirage clock doesn't need to be initialised, and the CLOCK type doesn't even allow this (it has time: unit -> float, rather than time: t -> float). So we don't lose anything here unless we want to support reporters with other dependencies (e.g. a syslog reporter that needs a network stack).

Anyway, I'm happy to have anything merged, so if @Drup is happy then I am too :-)

val mprof_trace : size:int -> unit -> tracing impl
(** Use mirage-profile to trace the unikernel. On Unix, this creates
and mmaps a file called "trace.ctf". On Xen, it shares the trace
buffer with dom0. @param size: size of the ring buffer to use. *)

This comment has been minimized.

@talex5

talex5 Apr 27, 2016

Contributor

Was this reformatting intentional? It complicates the diff and seems less readable to me, but if it's necessary for some style guide then fine.

@param tracing Enable tracing.
@param reporter Configure logging. The default log reporter is
[Some {!default_reporter}]. To disable logging, use the [None].

This comment has been minimized.

@talex5

talex5 Apr 27, 2016

Contributor

the [None] -> [None]

(** [default_reporter ?clock ?level ()] is the log reporter that
prints log messages to the console, timestampted with [clock]. If
not provided, the default clock is {!default_clock]}. [level] is
the default log threshold. It is [Logs.Warning] is not

This comment has been minimized.

@talex5

talex5 Apr 27, 2016

Contributor

is not -> if not

@@ -14,6 +14,8 @@
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*)

This comment has been minimized.

@talex5

talex5 Apr 27, 2016

Contributor

Ah, I hadn't noticed this module. I could have put that annoying helper functor here :-/

@Drup

This comment has been minimized.

Copy link
Member

Drup commented Apr 27, 2016

It is no longer possible to get the debug logs dumped on error.

Indeed, but I think we should have a different solution for that, in particular in relation to the use of pp_error everywhere for error reporting.

It is no longer possible to get the clock initialised before using it.

I'm not sure I understand. What do you mean ? Those are still normal devices, so everything should be initialized in order properly.

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 27, 2016

When I said "errors", I meant unexpected errors (i.e. exceptions). Although it could sometimes be useful to get detailed logs for normal errors like disk-not-found too.

About initialisation: A syslog logger will want to start logging before its dependencies are connected. The order would be:

  1. Initialise clock.
  2. Start logging time-stamped events to ring buffer.
  3. Initialise network.
  4. Dump ring to network.
  5. Initialise unikernel / other dependencies.

(and maybe do 3 and 5 in parallel)

The new system no longer allows this, right? It only works in the current case because the default logger only needs a clock, and clocks don't need initialising.

You could perhaps have a separate syslog device that takes over from the default one, though.

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 27, 2016

More generally, the new system assumes the logger and its dependencies will initialise instantly. This should be true for the default clock and the default logger. However, if a logger takes time to get ready, early log messages will be lost.

@Drup

This comment has been minimized.

Copy link
Member

Drup commented Apr 27, 2016

I think you could do that just fine, actually. Use the ring-buffer logger by default, then have another job (that is started in parallel with everyone) that launches the actual logger and swap reporter with set_reporter.

More generally, the new system assumes the logger and its dependencies will initialise instantly. This should be true for the default clock and the default logger. However, if a logger takes time to get ready, early log messages will be lost.

That was the point of the init list, It's initialized before the other jobs, but I realized this patch actually doesn't use it, which is indeed problematic.

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 28, 2016

The generated code looks like that:

let mirage_logs1 = lazy (
  let __clock1 = Lazy.force clock1 in
  __clock1 >>= function
  | `Error _e -> fail (Failure "clock1")
  | `Ok _clock1 ->
  let console_threshold =
  Some (Mirage_runtime.threshold ~default:Logs.Warning (Key_gen.logs ()))
  in
  let ring_size = None in
  let reporter = Mirage_logs1.create ?ring_size ?console_threshold () in
  let level = match (Key_gen.logs ()) with
    | [] -> None
    | l  -> Some (Mirage_runtime.log_level l)
  in
  Logs.set_level level;
  Mirage_logs1.set_reporter reporter;
  Lwt.return (`Ok ())
  )

let mirage1 = lazy (
  let __key1 = Lazy.force key1 in
  let __mirage_logs1 = Lazy.force mirage_logs1 in
  let __f11 = Lazy.force f11 in
  __key1 >>= function
  | `Error _e -> fail (Failure "key1")
  | `Ok _key1 ->
  __mirage_logs1 >>= function
  | `Error _e -> fail (Failure "mirage_logs1")
  | `Ok _mirage_logs1 ->
  __f11 >>= function
  | `Error _e -> fail (Failure "f11")
  | `Ok _f11 ->
  Lwt.return_unit
  )

So the log reporter is a normal device, and is ready when the dependencies are ready (I reverted the previous implementation that I showed you @talex5).

The current missing feature compared to the previous one is the ability to dump the ring buffer on early errors. I'd be tempted to either add an at_exit call somewhere in the reporter, or add the ability to add default exception handler in functoria somehow.

samoht added some commits Apr 28, 2016

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 28, 2016

More updates after talking with @Drup. Now the init stuff seems to work (although lookup-by-name is a bit fragile, but should be alright).

I've also added a way to disable command-line parsing: with these new init hooks, it's now very simple to do.

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 28, 2016

Note: this requires mirage/functoria#58

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

Also, I tried:

let main =
  foreign "Unikernel.Main" (console @-> reporter @-> job)

let reporter = default_reporter ~level:Logs.Debug ()

let () =
  register "console" [main $ default_console $ reporter]
    ~reporter:(Some reporter)

But this fails, because it passes () as the reporter instance:

File "main.ml", line 61, characters 36-49:
Error: This expression has type unit but an expression was expected of type
         Mirage_logs1.t = Mirage_logs.Make(Clock).t
@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

Another strange thing I missed before: it seems that -l sets console_threshold. It's more common to want to set the source threshold (which controls which log messages are generated), rather than the console threshold (which controls which messages are printed to the console).

An example of why we have both:

qubes-mirage-firewall was occasionally crashing with an unhelpful error from XenStore. I wanted to configure debug logging from xenstore so I could see what the request was. However, this leads to very verbose console output and slows everything down in the working case. So instead I set the source threshold to Debug and the console threshold to Info and enable the ring buffer. The debug messages are collected into the ring while it runs. If it crashes, the ring gets dumped and I see the debug messages leading up to the crash.

So probably foo:debug should mean to set foo's source level. And maybe foo:debug:info could mean to collect at debug level but only send info level to the console (although this won't work reliably now that we're not calling Mirage_logs.run).

Finally, it seems to be setting Logs.set_level (the default level) to the maximum of the configured levels, which makes no sense to me. I guess this means that if I ask for bar:error then Error becomes the default level for unspecified sources? What is this for? (correction noted by @samoht: in Logs, Error is considered "low", so we actually take the min. However, doing this is inefficient because we generate logs and throw them away. We just need to make sure that the console level is included in the corresponding source's level.)

@hannesm

This comment has been minimized.

Copy link
Member

hannesm commented Apr 29, 2016

@talex5 IMHO: command-line arguments should set the source level. If you want to have the advanced-ringbuffer-writing-at-other-level-dump-on-error thing, this should get another set of command line arguments

EDIT: and provide sane defaults: a ringbuffer of reasonable size, and all log sources on debug level

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

@hannesm I think we're saying the same thing: console thresholds are an advanced feature and the default should be source levels. I don't have any opinion about whether to include console levels as an optional extra in the regular -l, to have a separate option, or to exclude them completely from the CLI.

If console thresholds aren't used, there is no need for a ring buffer, since everything will have been printed anyway.

This might change if we add a syslog reporter. Ideally, such a reporter would automatically enable the ring buffer. However, I don't believe this is possible with the new API because if you set syslog as your reporter then it won't initialise until after networking is configured, so you'd have to configure the default reporter and wire it up to syslog manually. (yes, I still think the original design was better ;-) )

Make the API more consistent.
Now ~argv and ~reporter just take impl values, and there is a no_argv
and no_reporter values to use empty implementations.
@hannesm

This comment has been minimized.

Copy link
Member

hannesm commented Apr 29, 2016

@talex5 we do not agree entirely I believe:
IMHO the command line argument should control what you see on the output (console/syslog).

in case you want to control (IMHO you don't, it should be debug for all log sources) what is written to a potential ring buffer which is dumped on error, this should be done via a disjoint set of command line arguments.

for me, the output to console/syslog is the thing a user wants to control, whereas the ring buffer is the special thingy a user shouldn't need to care about

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

If you log at debug for all sources then everything will run really slowly, so this certainly shouldn't be the default. If you don't use console thresholds, then setting the source level has the effect of controlling what appears on the console anyway.

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 29, 2016

I've fixed the issues reported by @talex5 and I've updated the API to be more consistent, and now individual sources' levels are set. There is also a warning if you specify a source which doesn't exist:

$ mirage configure --no-opam -l info,foo:debug,toto:debug
$ make
[...]
$ ./main.native
Warning: toto is not a valid log source.
Hello World!
2016-04-29 10:04.52: DBG [foo] Haha

The generated code source is now:

let mirage_logs1 = lazy (
  let __clock1 = Lazy.force clock1 in
  __clock1 >>= function
  | `Error _e -> fail (Failure "clock1")
  | `Ok _clock1 ->
  let console_threshold =
  Some (Mirage_runtime.threshold ~default:Logs.Warning (Key_gen.logs ()))
  in
  let ring_size = None in
  let reporter = Mirage_logs1.create ?ring_size ?console_threshold () in
  Mirage_runtime.set_level ~default:Logs.Warning (Key_gen.logs ());
  Mirage_logs1.set_reporter reporter;
  Lwt.return (`Ok reporter)
  )
@hannesm

This comment has been minimized.

Copy link
Member

hannesm commented Apr 29, 2016

@talex5 depends on what you store in the ring buffer, doesn't it? shouldn't there be only closures which get evaluated once you hit the error and want to print the data?

Remove console_threshold
After a quick chat with @talex5 it appears that we don't need that as we now
set the source levels correctly. And this function is slow anyway.
@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

@hannesm I don't do that because if the logger is printing mutable state, it might be different by the time the ring is dumped (or not exist at all, as in the case of dumping a Xen shared page, for example).

Also, it might cause things to be freed later than expected.

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 29, 2016

After a quick chat with @talex5, more code-generation simplification:

let mirage_logs1 = lazy (
  let __clock1 = Lazy.force clock1 in
  __clock1 >>= function
  | `Error _e -> fail (Failure "clock1")
  | `Ok _clock1 ->
  let ring_size = None in
  let reporter = Mirage_logs1.create ?ring_size () in
  Mirage_runtime.set_level ~default:Logs.Warning (Key_gen.logs ());
  Mirage_logs1.set_reporter reporter;
  Lwt.return (`Ok reporter)
  )
@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

BTW, the previous default log level was Info, which corresponds to the fairly verbose output mirage libraries give currently with printf / Console.log. Changing it to Warning means mirage will now produce no output by default. Is that what we want?

@talex5

This comment has been minimized.

Copy link
Contributor

talex5 commented Apr 29, 2016

In particular, if you want to see a unikernel's IP address when it boots, you'll want the level set at Info.

@hannesm

This comment has been minimized.

Copy link
Member

hannesm commented Apr 29, 2016

IMHO default should be info, and we should adjust the levels (e.g. IP address) when updating libraries to use logs... I find e.g. the ARP messages on console unnecessary in the default configuration

@samoht

This comment has been minimized.

Copy link
Member

samoht commented Apr 29, 2016

Ok logs default is now Info. I'm trying to fix the CI scripts, then test it a bit more and then we can start integrating all @talex5 patches to the various libraries. yay!

@samoht samoht force-pushed the samoht:logging branch from cc018c8 to 8066362 Apr 29, 2016

@samoht samoht merged commit 1c1c3ac into mirage:master Apr 29, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@samoht samoht deleted the samoht:logging branch Apr 29, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment