# Logging in Jupyter notebooks

As your notebook code grows more complex, your logging has to keep up. The progression usually looks like this:

 - add `print` statements in your cells
 - tire of constantly commenting/uncommenting the `print` calls
 - research logging and learn that you can use Python's loggers in notebooks! Yay!
 - research further and figure out that you can also control the built-in loggers of your imports.
 - Look at this verbose output 😖!! Discover the power of visual separation with colors!

This notebook demonstrates the above evolution. 

## Basic Logging

The cell below shows how you'd set up basic logging. The `reload(logging)` is needed because `logging.basicConfig` configures the root logger only once: it does nothing if the root logger already has handlers. In a Jupyter environment _(and you might face this in your standalone scripts as well)_, something in the infrastructure has usually configured the logger already, so subsequent calls to `logging.basicConfig` have no effect. This is where `reload` comes in: it resets the module state, which allows the next `logging.basicConfig` call to take effect.

> If you already have code with `basicConfig` **without** a `reload(logging)`, you can restart the notebook kernel for a new log level to take effect, for example when switching from `logging.DEBUG` to `logging.INFO`.
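
On Python 3.8 and later, `logging.basicConfig` also accepts `force=True`, which removes any handlers already attached to the root logger before applying the new configuration. Below is a minimal sketch of that alternative to `reload(logging)`. The name `setup_logging_force` is hypothetical and not part of `jupyter_utils.py`.

```python
import logging

def setup_logging_force(level=logging.DEBUG):
    # force=True (Python 3.8+) drops the root logger's existing handlers,
    # so this call takes effect even if something configured logging earlier.
    logging.basicConfig(
        format="%(asctime)s %(levelname)s:%(message)s",
        level=level,
        datefmt="%I:%M:%S",
        force=True,
    )

setup_logging_force(logging.INFO)
logging.info("visible at INFO")
logging.debug("hidden at INFO")
```

Unlike `reload`, this leaves the `logging` module object and any loggers already created by your imports untouched. It only resets the root logger's handlers and level.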

Further down this notebook there is code for 
 - Color coding log-levels _(not important enough to make it into the lib but you could add it in your fork)_
 - Strategies to separate the clutter of the log from the actual output.

👇 is going into the `jupyter_utils.py` module

In [5]:
import logging

def setup_logging(level=logging.DEBUG):
    """
    Supply one of logging.DEBUG | INFO | WARNING | ERROR
    """
    # Set up logging.
    # Note that the module needs to be reloaded for our config to take effect, as Jupyter
    # already configures it: this makes all future configs no-ops unless a reload
    # is performed.
    from importlib import reload
    reload(logging)
    logging.basicConfig(format='%(asctime)s %(levelname)s:%(message)s',
                        level=level,
                        datefmt='%I:%M:%S')

And this is how you would introduce logging into your code _(running in a Jupyter cell)_.

In [6]:
setup_logging(logging.DEBUG)

def my_func():
    # do stuff and then
    logging.debug("My debug statement")
    logging.warning("My warning")
    logging.error("My error")

my_func()

12:20:10 DEBUG:My debug statement
12:20:10 WARNING:My warning
12:20:10 ERROR:My error


👆 should show the three log statements emitted by `my_func()` in the cell above.

## Control logging from imports - OpenAI - Env vars

With LLM libraries _(or any Python package, for that matter)_, there are many times when you want visibility into low-level behavior, particularly anything that might cause latency spikes: HTTP response codes, rate limits, and automatic retries. Surfacing their log traces gives you additional detail, hopefully enough to diagnose the issue. These log statements also help discovery: you spot something that might be relevant and then go research the API in more detail.

This section demonstrates controlling `OpenAI` logging when using their LLM APIs. Their documentation shows that the `OPENAI_LOG` environment variable controls their loggers. Other LLM vendors may offer something similar.

In [1]:
# To see logs from OpenAI's Python library itself, set its log level here.
# Normally, limit this to warning/error and keep your own logging at debug level.
# If this doesn't work right away, restart the kernel after changing the log level.
import os
os.environ["OPENAI_LOG"]="debug"

import openai

# Expects an OPENAI_API_KEY env var
def get_completion(prompt, model="gpt-4o-mini", temperature=0) -> str:
    chat_history = [{"role":"user", "content":prompt}]
    response = openai.chat.completions.create(
        model=model,
        messages=chat_history,
        temperature=temperature)
    return response.choices[0].message.content

In [2]:
print(get_completion("Why is the sky blue"))

[2025-06-01 12:35:18 - openai._base_client:482 - DEBUG] Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'idempotency_key': 'stainless-python-retry-303e8deb-ca6f-404d-b56b-fd3fbbb6c252', 'json_data': {'messages': [{'role': 'user', 'content': 'Why is the sky blue'}], 'model': 'gpt-4o-mini', 'temperature': 0}}
[2025-06-01 12:35:18 - openai._base_client:965 - DEBUG] Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
[2025-06-01 12:35:22 - httpx:1025 - INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2025-06-01 12:35:22 - openai._base_client:1003 - DEBUG] HTTP Response: POST https://api.openai.com/v1/chat/completions "200 OK" Headers([('date', 'Sun, 01 Jun 2025 19:35:22 GMT'), ('content-type', 'application/json'), ('transfer-encoding', 'chunked'), ('connection', 'keep-alive'), ('access-control-expose-headers', 'X-Request-ID'), ('openai-organization', 'user-uxl7oko9mdo17utucmetfrwn'), ('openai-processing-

The sky appears blue primarily due to a phenomenon called Rayleigh scattering. When sunlight enters the Earth's atmosphere, it is made up of different colors, each with varying wavelengths. Blue light has a shorter wavelength compared to other colors like red or yellow.

As sunlight passes through the atmosphere, it collides with gas molecules and small particles. Because blue light is scattered in all directions more than other colors due to its shorter wavelength, we see a predominance of blue when we look up at the sky.

During sunrise and sunset, the sky can appear red or orange because the sunlight has to pass through a greater thickness of the atmosphere. This longer path scatters the shorter blue wavelengths out of our line of sight, allowing the longer red wavelengths to dominate.


In [6]:
from importlib import reload
import logging
import openai
reload(logging)
reload(openai)

<module 'openai' from '/home/vamsi/mambaforge/envs/ml-pip/lib/python3.12/site-packages/openai/__init__.py'>

You'll notice two things 👆

 - Those logs 😍
 - Such log! much noise! Where's my actual output at 😟

We'll get to decluttering the visual a bit later.

## Control logging when no env vars are available

> This is useful when you want to change things other than log levels as well. See the formatter example below

It's nice that OpenAI provides the `OPENAI_LOG` env var: it makes this very easy to control. However, in cases where you don't have access to such a variable, you can manipulate the logger directly: you just have to get to the logger in use.

### Examine the available loggers

Note that loggers are usually initialized at the module level on first use, so you'll likely need to exercise some code before they show up. All of this is just to get the logger by name. Once you know the name, it usually doesn't change unless some major revision occurs.

In [47]:
import logging
from IPython.display import display, Markdown

# Use this to explore available loggers.
# If there is a logger and you are not provided an env var to control the log level,
# you can directly call logger.setLevel(logging.DEBUG) to collect logs.
def get_available_loggers():
    return [logging.getLogger(name) for name in logging.root.manager.loggerDict]


# To see them all.
all_loggers = get_available_loggers()
print([l.name for l in all_loggers])

# Say we are interested only in openai
# Long list, I want this formatted nicely. Markdown formatting is easy enough to generate
# compared to HTML
openai_logger_names = [l.name for l in get_available_loggers() if 'openai' in l.name]
display(Markdown(
    "\n".join([f" * {item}" for item in openai_logger_names])
    ))

['httpx', 'rich', 'openai', 'openai._legacy_response', 'openai._response', 'openai._base_client', 'openai.resources.beta.realtime.realtime', 'openai.resources.beta.realtime', 'openai.resources.beta', 'openai.resources', 'openai.audio.transcriptions', 'openai.audio', 'openai.resources.uploads.uploads', 'openai.resources.uploads', 'httpcore.http11', 'httpcore', 'httpcore.connection', 'httpcore.proxy']


 * openai
 * openai._legacy_response
 * openai._response
 * openai._base_client
 * openai.resources.beta.realtime.realtime
 * openai.resources.beta.realtime
 * openai.resources.beta
 * openai.resources
 * openai.audio.transcriptions
 * openai.audio
 * openai.resources.uploads.uploads
 * openai.resources.uploads
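
One caveat when reading `logging.root.manager.loggerDict` directly: its values are not always `Logger` objects. When only a dotted child name has been requested, the standard library fills in the intermediate names with `PlaceHolder` entries, and `logging.getLogger(name)` is what turns them into real loggers. A small sketch, using the hypothetical logger name `demo_app.db.pool`:

```python
import logging

# Hypothetical name: only the leaf logger is created explicitly.
logging.getLogger("demo_app.db.pool")

entries = logging.root.manager.loggerDict
print(type(entries["demo_app.db.pool"]).__name__)   # Logger
print(type(entries["demo_app.db"]).__name__)        # PlaceHolder

# getLogger() replaces the placeholder with a real Logger on demand.
db_logger = logging.getLogger("demo_app.db")
print(type(db_logger).__name__)                     # Logger
```

This is why going through `logging.getLogger(name)` is the safer way to fetch a logger once you know its name.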

### Set the log level directly on the selected logger

The query above shows a logger called `openai`: likely the package's top-level logger, with individual sub-modules having child loggers. This is the usual arrangement, so that while testing a sub-module you can set its log level to `INFO`, say, while reducing the noise down to `ERROR` for everything else.

In [None]:
# Say we want to customize the `openai` logger. Child loggers (openai.xxx) that have no level
# of their own defer to the nearest ancestor with an explicit level, so a change here normally
# reaches them. If a child logger has had its own level set, you have to customize that
# child directly.
oai_logger = list(filter(lambda l: l.name == "openai", all_loggers))[0]

# Since we already have all_loggers, I am using a filter on it.
# However, once you know the name, you can also use
# 👉  oai_logger = logging.getLogger('openai')
#-----------------------
# Set the level directly
oai_logger.setLevel(logging.DEBUG)
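
To make the parent/child relationship concrete, here is a small sketch with two hypothetical loggers, `demo_pkg` and `demo_pkg.client`. It relies only on standard-library behavior: a logger whose own level is `NOTSET` uses the level of its nearest ancestor that has one.

```python
import logging

parent = logging.getLogger("demo_pkg")         # hypothetical package logger
child = logging.getLogger("demo_pkg.client")   # hypothetical sub-module logger

parent.setLevel(logging.ERROR)
print(child.level == logging.NOTSET)                     # True: no level of its own
print(logging.getLevelName(child.getEffectiveLevel()))   # ERROR, taken from the parent

# Giving the child its own level overrides the parent for that branch only.
child.setLevel(logging.INFO)
print(logging.getLevelName(child.getEffectiveLevel()))   # INFO
print(logging.getLevelName(parent.getEffectiveLevel()))  # ERROR
```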


## Distinguishing log output from your cell output 

The main problem _(as illustrated in a previous call to OpenAI's completion API)_ is noise. There is simply too much output, and it takes attention away from the result you really care about. Thankfully, there are several easy solutions. The simplest is to make use of the Jupyter notebook's built-in markdown renderer _(also immensely useful when you have LLM output in markdown or want to convert something to markdown for some easy formatting)_.

> It definitely pays to know your markdown. Much simpler and less verbose than HTML.
>
> There are advanced ways to manipulate IPython displays using ipywidgets and display(id). Explore along those lines if organizing logs into a separate cell turns out to be important for your use cases.

### Use a markdown separator

Simply throw in a markdown separator and/or a markdown section heading.

In [29]:
from IPython.display import display, Markdown

def markdown_separator(section_name = None):
    if section_name:
        display(Markdown(f"----\n### {section_name}\n"))
    else:
        display(Markdown("----"))

In [28]:
# Run to completion so all logs are printed out
res = get_completion("Why is the sky blue")

# print separator
markdown_separator("OpenAI Response")

# print your result
# The use of markdown here formats it into the space available.
# Otherwise you'll get horizontal scrollbars
display(Markdown(res))

08:00:53 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'user', 'content': 'Why is the sky blue'}], 'model': 'gpt-4o-mini', 'temperature': 0}}
08:00:53 DEBUG:Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
08:00:53 DEBUG:close.started
08:00:53 DEBUG:close.complete
08:00:53 DEBUG:connect_tcp.started host='api.openai.com' port=443 local_address=None timeout=5.0 socket_options=None
08:00:53 DEBUG:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f9fb520f460>
08:00:53 DEBUG:start_tls.started ssl_context=<ssl.SSLContext object at 0x7f9fb5256e70> server_hostname='api.openai.com' timeout=5.0
08:00:53 DEBUG:start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f9fb520f130>
08:00:53 DEBUG:send_request_headers.started request=<Request [b'POST']>
08:00:53 DEBUG:send_request_headers.complete
08:00:53 DEBUG:send_request_body.started request=<Req

----
# OpenAI Response


The sky appears blue primarily due to a phenomenon called Rayleigh scattering. When sunlight enters the Earth's atmosphere, it is made up of different colors, each with varying wavelengths. Blue light has a shorter wavelength compared to other colors like red or yellow.

As sunlight passes through the atmosphere, it collides with gas molecules and small particles. Because blue light is scattered in all directions more than other colors due to its shorter wavelength, we see a predominance of blue when we look up at the sky.

During sunrise and sunset, the sun is lower on the horizon, and its light has to pass through a greater thickness of the atmosphere. This increased distance scatters the shorter blue wavelengths out of our line of sight, allowing the longer wavelengths like red and orange to dominate, which is why the sky can appear red or orange during those times.

### Color the cell output

Take advantage of the `IPython.display.HTML` object and render any HTML that you want.

In [4]:
# For displaying HTML and Markdown responses from ChatGPT
from IPython.display import display, HTML

# Enhance with more HTML (fg-color, font, etc.) as needed, but a title is usually a good starting point.
def colorBox(txt, title=None):
    if title is not None:
        txt = f"<b>{title}</b><br><hr><br>{txt}"

    display(HTML(f"<div style='border-radius:15px;padding:15px;background-color:pink;color:black;'>{txt}</div>"))

In [5]:

# Run to completion so all logs are printed out
res = get_completion("Why is the sky blue")

# print your result
# Rendering it as HTML wraps the text to the space available.
# Otherwise you'll get horizontal scrollbars
colorBox(res, title="OpenAI Response")

[2025-06-01 12:47:14 - openai._base_client:482 - DEBUG] Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'idempotency_key': 'stainless-python-retry-96af53e8-f605-43d9-9488-cd668e69eb1a', 'json_data': {'messages': [{'role': 'user', 'content': 'Why is the sky blue'}], 'model': 'gpt-4o-mini', 'temperature': 0}}
[2025-06-01 12:47:14 - openai._base_client:965 - DEBUG] Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
[2025-06-01 12:47:17 - httpx:1025 - INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
[2025-06-01 12:47:17 - openai._base_client:1003 - DEBUG] HTTP Response: POST https://api.openai.com/v1/chat/completions "200 OK" Headers({'date': 'Sun, 01 Jun 2025 19:47:17 GMT', 'content-type': 'application/json', 'transfer-encoding': 'chunked', 'connection': 'keep-alive', 'access-control-expose-headers': 'X-Request-ID', 'openai-organization': 'user-uxl7oko9mdo17utucmetfrwn', 'openai-processing-ms': '3222', 

### Color the log output to quickly zero in on errors

Code below mostly copied from https://stackoverflow.com/questions/68807282/rich-logging-output-in-jupyter-ipython-notebook 🙏

In [52]:
import logging
from IPython.display import display, HTML

class DisplayHandler(logging.Handler):
    def emit(self, record):
        message = self.format(record)
        display(message)

class HTMLFormatter(logging.Formatter):
    level_colors = {
        logging.DEBUG: 'lightblue',
        logging.INFO: 'dodgerblue',
        logging.WARNING: 'goldenrod',
        logging.ERROR: 'crimson',
        logging.CRITICAL: 'firebrick'
    }
    
    def __init__(self):
        super().__init__(
            '<span style="font-weight: bold; color: green">{asctime}</span> '
            '[<span style="font-weight: bold; color: {levelcolor}">{levelname}</span>] '
            '{message}',
            style='{'
        )
    
    def format(self, record):
        record.levelcolor = self.level_colors.get(record.levelno, 'black')
        return HTML(super().format(record))    
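
The formatter above works because a `logging.Formatter` can attach extra attributes to the record before the `{}`-style template is filled in. The sketch below shows the same mechanism without IPython, writing plain text into a buffer. The class `LevelTagFormatter`, the `leveltag` attribute, and the logger name `formatter_demo` are all hypothetical stand-ins for illustration.

```python
import io
import logging

class LevelTagFormatter(logging.Formatter):
    # Plain-text stand-in for the color table: one tag per level.
    level_tags = {logging.WARNING: "!!", logging.ERROR: "XX"}

    def __init__(self):
        # style="{" switches the template to str.format-style fields.
        super().__init__("{leveltag} {levelname}:{message}", style="{")

    def format(self, record):
        # Add the custom field to the record so the template can use it.
        record.leveltag = self.level_tags.get(record.levelno, "--")
        return super().format(record)

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(LevelTagFormatter())

demo_logger = logging.getLogger("formatter_demo")
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False  # keep the demo out of the root logger's output

demo_logger.debug("starting")
demo_logger.error("failed")
print(buffer.getvalue())
# -- DEBUG:starting
# XX ERROR:failed
```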

In [56]:
# One of the cells above reveals a logger named `openai`.
# With some trial and error, it turns out that some of these log statements come from `httpx`.
# Let's target both and attach a handler that uses the colorful formatter above.
handler = DisplayHandler()
handler.setFormatter(HTMLFormatter())

for nm in ['openai', 'httpx']:
    # getLogger always returns a real Logger; loggerDict.get() may return None or a PlaceHolder
    lg = logging.getLogger(nm)
    lg.addHandler(handler)
    lg.setLevel(logging.DEBUG)

In [57]:
# Run to completion so all logs are printed out
res = get_completion("Why is the sky blue")

# print your result
# Rendering it as HTML wraps the text to the space available.
# Otherwise you'll get horizontal scrollbars
colorBox(res, title="OpenAI Response")

12:49:04 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'user', 'content': 'Why is the sky blue'}], 'model': 'gpt-4o-mini', 'temperature': 0}}


12:49:04 DEBUG:Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
12:49:04 DEBUG:close.started
12:49:04 DEBUG:close.complete
12:49:04 DEBUG:connect_tcp.started host='api.openai.com' port=443 local_address=None timeout=5.0 socket_options=None
12:49:04 DEBUG:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f9fb51efa70>
12:49:04 DEBUG:start_tls.started ssl_context=<ssl.SSLContext object at 0x7f9fb5256e70> server_hostname='api.openai.com' timeout=5.0
12:49:05 DEBUG:start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7f9fcca99d00>
12:49:05 DEBUG:send_request_headers.started request=<Request [b'POST']>
12:49:05 DEBUG:send_request_headers.complete
12:49:05 DEBUG:send_request_body.started request=<Request [b'POST']>
12:49:05 DEBUG:send_request_body.complete
12:49:05 DEBUG:receive_response_headers.started request=<Request [b'POST']>
12:49:06 DEBUG:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'

12:49:06 INFO:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
12:49:06 DEBUG:receive_response_body.started request=<Request [b'POST']>
12:49:06 DEBUG:receive_response_body.complete
12:49:06 DEBUG:response_closed.started
12:49:06 DEBUG:response_closed.complete


12:49:06 DEBUG:HTTP Response: POST https://api.openai.com/v1/chat/completions "200 OK" Headers({'date': 'Wed, 05 Mar 2025 20:49:07 GMT', 'content-type': 'application/json', 'transfer-encoding': 'chunked', 'connection': 'keep-alive', 'access-control-expose-headers': 'X-Request-ID', 'openai-organization': 'user-uxl7oko9mdo17utucmetfrwn', 'openai-processing-ms': '1427', 'openai-version': '2020-10-01', 'x-ratelimit-limit-requests': '10000', 'x-ratelimit-limit-tokens': '200000', 'x-ratelimit-remaining-requests': '9999', 'x-ratelimit-remaining-tokens': '199977', 'x-ratelimit-reset-requests': '8.64s', 'x-ratelimit-reset-tokens': '6ms', 'x-request-id': 'req_f9715aba22a756b1da8e4aaf0de6d9a9', 'strict-transport-security': 'max-age=31536000; includeSubDomains; preload', 'cf-cache-status': 'DYNAMIC', 'x-content-type-options': 'nosniff', 'server': 'cloudflare', 'cf-ray': '91bc7a5e3f842a85-LAX', 'content-encoding': 'gzip', 'alt-svc': 'h3=":443"; ma=86400'})


12:49:06 DEBUG:request_id: req_f9715aba22a756b1da8e4aaf0de6d9a9


The above colors some of the log output. If you care to, you could experiment with changing the root logger's formatting or extend this to all the available loggers _(from `[logging.getLogger(name) for name in logging.root.manager.loggerDict]`)_.