
Enable users to access the jupyter server logs #684

Open
consideRatio opened this issue Feb 1, 2022 · 12 comments

@consideRatio
Contributor

Problem

A long-standing challenge for JupyterHub admins and users has been that only admins have had access to the user servers' logs. In this Jupyter forum post, @manics suggests that a workaround is to copy the stdout/stderr streams via a custom intervention.

I wonder if this could be solved in a different way that would make it easier to expose Jupyter server logs to users.

Proposed Solution

Could we allow jupyter_server to be configured to emit its logs in some way, for example by writing them continuously to a file or exposing them via a REST API?

Additional context

The key problem I'd like to solve is ensuring that JupyterHub users can have as much access to the logs as a JupyterHub admin does. Currently, a JupyterHub user has no such access. With a feature like this, where jupyter_server could be configured to emit logs to a file or via a REST API, I imagine a JupyterLab extension could be developed to provide easy access to the server logs. This would help a JupyterHub admin help their users, for example by asking them to include these logs when asking for help.

This isn't the first time a need like this has surfaced, but I've never had a good idea of how to go about it. This idea seems somewhat reasonable to me at a glance. The most recent need surfaced here: https://discourse.pangeo.io/t/start-up-errors-on-pangeo-google-cloud-deployments/2101.

  • Could something like this be reasonable to implement in jupyter_server?
  • Are there examples of other software that expose their logs in a similar way to draw experience from?

/cc: @manics, @akhmerov, @sgibson91, @fperez, who I think may be interested in this discussion.

@welcome

welcome bot commented Feb 1, 2022

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@yuvipanda
Contributor

JupyterHub used to allow this with extra_log_file, but that was deprecated a while ago by @minrk in https://github.com/jupyterhub/jupyterhub/blob/36cad38ddf00c3fe92d813fd7bf8715fb876d006/jupyterhub/app.py#L1398. The reasoning given is:

                extra_log_file only redirects logs of the Hub itself,
                and will discard any other output, such as
                that of subprocess spawners or the proxy.
                It is STRONGLY recommended that you redirect process
                output instead, e.g.
                    jupyterhub &>> '{}'

@bollwyvl
Contributor

bollwyvl commented Feb 1, 2022

Streaming logs also came up in the context of jupyter-server-proxy and jupyter-lsp (can't find a link).

As it might need to ship a non-trivial UI, and we've been trying to shed those, perhaps it's worth thinking of this as a jupyter_server_logs package that offers, in addition to a facility for capturing logs and REST/WebSocket handlers to view them, opt-in logging of jupyter-server itself, as well as a standalone/embeddable log viewer application.

Such a log viewer could use xterm.js and/or Lumino's datagrid for structured logs, perhaps.
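
To make the handler side concrete, here is a hypothetical sketch of what such an extension's REST endpoint might look like. jupyter_server_logs does not exist; the handler name, route registration, and LOG_FILE path are all assumptions, not existing jupyter-server API surface:

# Hypothetical sketch of a REST handler a jupyter_server_logs extension
# could register; ServerLogsHandler and LOG_FILE are assumptions.
from jupyter_server.base.handlers import APIHandler
from tornado import web

LOG_FILE = "jupyter_server.log"  # assumed location of the captured log


class ServerLogsHandler(APIHandler):
    @web.authenticated
    def get(self):
        """Return the tail of the captured server log as plain text."""
        try:
            with open(LOG_FILE) as f:
                lines = f.readlines()[-200:]
        except FileNotFoundError:
            raise web.HTTPError(404, "no captured log file found")
        self.set_header("Content-Type", "text/plain")
        self.finish("".join(lines))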

@kevin-bates
Member

This sounds like a matter of configuring the handlers of a LoggingConfigurable class (from which Jupyter Server derives); see ipython/traitlets#688. At least that feels like the right thing to do.

cc: @oliver-sanders

@minrk
Contributor

minrk commented Feb 2, 2022

As I mentioned in the JupyterHub issue, I don't think Python logging is the right level at which to do this. Instead, I think process-level FD capture is where it should happen.

In repo2docker, we do this with an entrypoint, but you can also duplicate stdout/err to a file in-process (at least on non-Windows) with os.dup2, as seen in wurlitzer.
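
For illustration, the FD-level redirection looks roughly like this (a minimal sketch of the os.dup2 approach, not wurlitzer itself; the log path is a placeholder):

import os
import sys

# Redirect the process's stdout/stderr file descriptors to a log file.
# Because this operates on FDs rather than Python-level streams, output
# from C extensions and subprocesses that inherit the FDs is captured too.
log_fd = os.open("server-output.log", os.O_WRONLY | os.O_CREAT | os.O_APPEND)

sys.stdout.flush()
sys.stderr.flush()
os.dup2(log_fd, sys.stdout.fileno())  # FD 1 now writes to the log file
os.dup2(log_fd, sys.stderr.fileno())  # FD 2 now writes to the log file
os.close(log_fd)

print("this line ends up in server-output.log")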

Logging would be the right place for more structural capture of a specific subset of events, though.

@oliver-sanders
Contributor

oliver-sanders commented Feb 2, 2022

    Streaming logs

    capturing logs and REST/WebSocket handlers to view them

    Logging would be the right place for more structural capture of a specific subset of events, though.

I think the logging level would be the most flexible approach.

For my purposes, I would like to be able to configure a persistent rotating log in a standard location, plus additional logging handlers with different filters for specific purposes, to assist with monitoring and debugging. I'm mostly interested in the output of one particular server extension, so I might want to separate its logging from that of other extensions.

So for me, Python's logging config object would be a pretty ideal solution. I think it would probably be fairly straightforward to implement this at the Traitlets level, where we already support a subset of logging configuration (see the issue linked above), but I have been a bit too distracted of late to try it out.
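
To illustrate, this is the kind of standard-library dict config such a feature would enable (a sketch only; "MyExtensionApp" is a hypothetical extension logger name and the filenames are placeholders):

import logging.config

logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'rotating_file': {
            'class': 'logging.handlers.RotatingFileHandler',
            'level': 'DEBUG',
            'filename': 'jupyter_server.log',
            'maxBytes': 10 * 1024 * 1024,  # rotate at 10 MB
            'backupCount': 5,              # keep five rotated files
        },
        'extension_file': {
            'class': 'logging.FileHandler',
            'level': 'DEBUG',
            'filename': 'my_extension.log',
        },
    },
    'loggers': {
        # a persistent rotating log for the server itself...
        'ServerApp': {'level': 'DEBUG', 'handlers': ['rotating_file']},
        # ...with one extension's output separated into its own file
        'MyExtensionApp': {'level': 'DEBUG', 'handlers': ['extension_file']},
    },
})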

@minrk
Contributor

minrk commented Feb 2, 2022

Supporting Python logging config in LoggingConfigurable is definitely something we should do and would cover your case. But I wouldn't say that it addresses this issue. For general server output, I think process-level capture is the only robust approach. It's the only one that will reliably capture logged output from kernels and other subprocesses (which may not be Python), including crash messages, for example.

This snippet in a jupyter_server_config.py tees the server's own stdout/err (including subprocesses) to a single file:

import atexit
import os
import sys

from wurlitzer import Wurlitzer


class Tee:
    def __init__(self, stream, file, mode="a"):
        # accept either an already-open file object or a path to open
        if hasattr(file, "write"):
            self.log_file = file
        else:
            self.log_file = open(file, mode=mode)

        self.stream = stream

    def write(self, buf):
        # write to both the log file and the original stream
        for f in (self.stream, self.log_file):
            f.write(buf)
            f.flush()


log_file = open("test.log", "a")

# duplicate the original FDs before redirecting, so we can still write to them
real_stdout = os.fdopen(os.dup(sys.stdout.fileno()), "w")
real_stderr = os.fdopen(os.dup(sys.stderr.fileno()), "w")

w = Wurlitzer(
    stdout=Tee(real_stdout, log_file),
    stderr=Tee(real_stderr, log_file),
)

w.__enter__()
# pass the standard context-manager exit arguments so the call is valid at shutdown
atexit.register(w.__exit__, None, None, None)

and similar logic could be behind a capture_output flag on the application.

I don't know how to do this on Windows, but I know someone does.

@yuvipanda
Contributor

Based on my experience operating clusters over the last few years, I tend to agree with @minrk that capturing stdout/stderr is the way to go - not everything goes into Python logging, and sometimes that is out of our control. This also matches the learnt wisdom of the twelve-factor app methodology for how this should be done.

@minrk
Contributor

minrk commented Feb 3, 2022

Another option is to make this a feature request for Spawners so that JupyterHub could have a logs API that users can access for their own servers. Almost all spawners use an underlying mechanism that captures logs (k8s, docker, systemd).

An advantage of sending logs to a file, though, is that the file can live on a persistent volume and be checked after a crash. Container-based log capture is typically inaccessible after the container stops, unless you take a step up to the cloud provider's log-aggregator API instead of talking to k8s/docker directly, which would be harder to do at the Spawner level.
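
For a Kubernetes-backed spawner, for example, the underlying call is already there; a hypothetical logs API might wrap it like this (sketch only: get_user_server_logs is not an existing Spawner method, and the pod-naming scheme and namespace are assumptions):

# Hypothetical sketch of a Spawner-level logs API for a k8s-backed spawner.
from kubernetes import client, config


def get_user_server_logs(username, namespace="jhub", tail_lines=200):
    """Fetch recent logs for a user's single-user server pod (sketch)."""
    config.load_incluster_config()  # or load_kube_config() outside the cluster
    v1 = client.CoreV1Api()
    pod_name = f"jupyter-{username}"  # assumed KubeSpawner-style pod name
    return v1.read_namespaced_pod_log(
        name=pod_name,
        namespace=namespace,
        tail_lines=tail_lines,
    )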

@oliver-sanders
Contributor

FYI, if anyone is interested or would like to test or review: I have raised a Traitlets PR to handle the Python logging side of things - ipython/traitlets#698 (the stdout/err redirection mentioned above is a whole other thing).

Here's some example configuration for adding a FileHandler to the "base" server and to the JupyterLab server extension application:

# jupyter_config.py
from pathlib import Path
 
# direct Jupyter Server logs to jupyter_server.log
# (preserving the default stderr "console" logging)
c.ServerApp.logging_config = {
    'version': 1,
    'handlers': {
        'file': {
            'class': 'logging.FileHandler',
            'level': 'DEBUG',
            'filename': Path.cwd() / 'jupyter_server.log',
        },
    },
    'loggers': {
        'ServerApp': {  
            'level': 'DEBUG',  
            'handlers': ['console', 'file'],  
        },  
    }
}
 
# direct Jupyter Lab logs to jupyter_lab.log
# (preserving the default stderr "console" logging)
c.LabApp.logging_config = {
    'version': 1,
    'handlers': {
        'file': {
            'class': 'logging.FileHandler',
            'level': 'DEBUG',
            'filename': Path.cwd() / 'jupyter_lab.log',   
        },
    },
    'loggers': {
        'LabApp': {
            'level': 'DEBUG',
            'handlers': ['console', 'file'],
        },
    }
}

Handlers, formatters, and levels can be adjusted to preference.

Note: because server extension applications are separate Traitlets applications from the "base" server, they use different loggers and so must be configured separately.

Example:

$ jupyter lab --config jupyter_config.py &
...
$ head -n 5 jupyter_server.log
Looking for jupyter_config in /var/tmp/...
Loaded config file: /var/tmp/...
Paths used for configuration of jupyter_server_config: 
    /etc/jupyter/jupyter_server_config.json
Paths used for configuration of jupyter_server_config: 
$ head -n 5 jupyter_lab.log
Looking for jupyter_lab_config in /etc/jupyter
Looking for jupyter_lab_config in /usr/local/etc/jupyter
Looking for jupyter_lab_config in ~/<env>/etc/jupyter
Looking for jupyter_lab_config in ~/.local/etc/jupyter
Looking for jupyter_lab_config in ~/.jupyter

@oliver-sanders
Contributor

Traitlets 5.2.0 now provides a logging_config trait which allows additional file handlers to be configured, hope it helps.

This was the proposed solution from the OP and satisfies the use cases outlined there. I have opened a PR to bump Jupyter Server onto Traitlets 5.2.1 (when it's released) and to document usage of logging_config - #844.

This does not solve the trickier stdout/err redirection mentioned in other comments. I think that is a Spawner feature (as we couldn't reliably implement it from within the application itself), so it is out of scope for Jupyter Server itself. I'll leave you to decide what you want to do with this issue.

@athornton

I disagree that process output redirection is the right thing. I want WARN and above to go to stderr, and INFO and below to go to stdout; you can't do that at the process level.
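
For reference, Python logging can do that split with two stream handlers and a level filter (a minimal sketch):

import logging
import sys

logger = logging.getLogger("ServerApp")
logger.setLevel(logging.DEBUG)

# INFO and below go to stdout...
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(lambda record: record.levelno < logging.WARNING)

# ...while WARNING and above go to stderr.
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)

logger.addHandler(stdout_handler)
logger.addHandler(stderr_handler)

logger.info("routine message")      # -> stdout
logger.warning("something is off")  # -> stderr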

Also, the logging_config documentation refers to c.Application.logging_configurable, which is wrong: it's logging_config, not logging_configurable.
