Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ctypes pickling error #342

Closed
JulianOrteil opened this issue Oct 1, 2020 · 11 comments
Closed

ctypes pickling error #342

JulianOrteil opened this issue Oct 1, 2020 · 11 comments
Labels
enhancement Improvement to an already existing feature

Comments

@JulianOrteil
Copy link

JulianOrteil commented Oct 1, 2020

Good day, Delgan.

I have an error I am encountering in the handlers of loguru. This error is being raised when I use "logger.exception(msg)" in the except block of a try statement.

Some context:
I'm developing an application to interface with NVIDIA's AI Triton server. The server's client-side python bindings are written in both C/C++ and Python. I'm mentioning this because I am developing on Windows, and replicating this issue is fairly hard due to specifics in the environment for the Triton server. I can give you the files and procedures to set up an environment like mine for this if you'd like, but I'll leave that up to you.

The error:
Stacktrace:

Traceback (most recent call last):
  File "C:\Users\jules\.conda\envs\app\lib\site-packages\loguru\_handler.py", line 175, in emit
    self._queue.put(str_record)
  File "C:\Users\jules\.conda\envs\app\lib\multiprocessing\queues.py", line 362, in put
    obj = _ForkingPickler.dumps(obj)
  File "C:\Users\jules\.conda\envs\app\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
ValueError: ctypes objects containing pointers cannot be pickled

Code throwing the issue:

try:
    # <Code here raises 'InferenceServerException'>
except InferenceServerException: # The specific exception being raised (Triton server-specific)
    logger.exception("An issue occurred while contacting the inference server:\n")

The exception being raised does have calls to dll's in it, so that could possibly be the root cause? Again, it is difficult to reproduce this error because of the environment.

Exception data as returned by 'sys.exc_info()':

type: <class 'tensorrtserver.api.InferenceServerException'>
value: InferenceServerException(c_void_p(2735452554832)) # This is what I think is causing the issue
traceback: <normal traceback>

Logger setup code:

import os
import sys

import stackprinter
from loguru import logger


class FilterLevel(object):
    """Used to filter logs based dynamic logging level by using the
    :prop:`level` property which can be either an :obj:`int` or
    :obj:`str`.
    """

    _level: Union[str, int]

    def __init__(self) -> None:
        super().__init__()

        self.level = "INFO"

    def __call__(self, record: dict) -> bool:
        levelno = logger.level(self.level).no
        return record["level"].no >= levelno

    @property
    def level(self) -> Union[str, int]:
        return self._level

    @level.setter
    def level(self, value: Union[str, int]) -> None:

        if not isinstance(value, (str, int)):
            raise TypeError("property 'level' must be of type 'str' or 'int': "
                            f"'{type(value).__name__}'")

        self._level = value

LOGFILE = "some_log_somewhere.log"
LOGGER_FILTER = FilterLevel()
LOGGER_FORMAT = "<green>{time:YYYY-MM-DD HH:mm:ss.SSS}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>"


def _format_stacktrace(record: dict):
    format_ = LOGGER_FORMAT # The default format found in loguru's '_defaults.py'

    if record["exception"] is not None:
        record["extra"]["stack"] = stackprinter.format(record["exception"])
        format_ += "<red><bold>{extra[stack]}</bold></red>\n"

    return format_ + "\n"

# Remove all current handlers
logger.remove()

# Add the needed handlers
logger.add(sys.stderr, format=_format_stacktrace, enqueue=True)
logger.add(LOGFILE, format=_format_stacktrace, filter=LOGGER_FILTER, enqueue=True)

# Create loguru attributes to be accessible from the logger anywhere
# the logger is found object
setattr(logger, "app_logfile", LOGFILE)
setattr(logger, "app_logfilter", LOGGER_FILTER)

Please let me know if there is anything more I can provide.
Jules

@JulianOrteil
Copy link
Author

This app is multithreaded and multiprocessed using the stdlib threading and multiprocessing modules and PyQt5's QThread class. So, using 'enqueue' is kind of forced here 😕

@Delgan
Copy link
Owner

Delgan commented Oct 1, 2020

Hello @JulianOrteil. Thanks for the extensive bug report!

Basically, when enqueue=True the log records need to be serialized using pickle.dumps(). This also applies to the logged exception, if any. The resulting string is sent to a multiprocessing queue and later processed by a background thread.

As you noticed, it seems the raised InferenceServerException() contains a non-picklable value: c_void_p(2735452554832).

Actually, Loguru tries to catch possible errors during the serialization procedure of exceptions:

def __reduce__(self):
try:
pickled_value = pickle.dumps(self.value)
except pickle.PickleError:
return (RecordException, (self.type, None, None))
else:
return (RecordException._from_pickled_value, (self.type, pickled_value, None))

However, what I didn't know is that pickle.dumps() could raise exception of type ValueError. The ValueError does not inherit from PickleError so obviously it's not caught by the try / except implemented in __reduce__.

import ctypes
import pickle

pickle.dumps(lambda: None)
# _pickle.PicklingError: Can't pickle <function <lambda> at 0x7f33528ec0d0>: attribute lookup <lambda> on __main__ failed

pickle.dumps(ctypes.c_void_p(2735452554832)) 
# ValueError: ctypes objects containing pointers cannot be pickled

I will probably change the implementation so it catch exception more generally instead of restricting it to PickleError.

In the meantime, I see two possible workarounds.

If you know for sure where the InferenceServerException is emitted, you can simply replace it's value during logging:

try:
    tensorrt.do_something()
except InferenceServerException:
    type, _, tb = sys.exc_info()
    logger.opt(exception=(type, None, tb)).error("An error occurred")

Otherwise, you can do this from within your _format_stacktrace() function:

if record["exception"] is not None:
    record["extra"]["stack"] = stackprinter.format(record["exception"])
    format_ += "<red><bold>{extra[stack]}</bold></red>\n"
    if record["exception"].type is InferenceServerException:
        record["exception"] = record["exception"]._replace(value=None)

@JulianOrteil
Copy link
Author

Excellent. Thank you for the prompt response.

I'll implement your suggestion and leave this bug report open if you wish to link a pull request to it.

Jules

@JulianOrteil
Copy link
Author

@Delgan

Update:
I'm unable to use the workarounds you proposed. Both produce errors, one in stackprinter and the other in loguru.

Proposed solution 1:

try:
    tensorrt.do_something()
except InferenceServerException:
    type, _, tb = sys.exc_info()
    logger.opt(exception=(type, None, tb)).error("An error occurred")

This produces a ValueError in stackprinter as it is expecting an exception for the value (where None is above). As I do not have control over the standards used in this application (it is for a client), not using stackprinter is unfortunately out of my control.

Proposed solution 2:

  if record["exception"] is not None:
       record["extra"]["stack"] = stackprinter.format(record["exception"])
       format_ += "<red><bold>{extra[stack]}</bold></red>\n"
       if record["exception"].type is InferenceServerException:
           record["exception"].value = None

This produces an AttributeError in loguru. The stacktrace is below:

Traceback (most recent call last):
  File "C:\Users\jules\.conda\envs\app\lib\site-packages\loguru\_handler.py", line 107, in emit
    dynamic_format = self._formatter(record)
  File "C:\Users\jules\Documents\app\app\utils\logging.py", line 88, in _format_stacktrace
    record["exception"].value = None
AttributeError: can't set attribute

I was able to get a work around going, but I just wanted to alert you and anyone else looking at this issue to errors encountered in the proposed solutions.

My solution:
This is the solution I used to work around this problem.
Note: I did not modify the _format_stacktrace() function proposed by @Delgan.

# Defined in the same file causing the error
class TRTISException(Exception):
    """
    Temporary exception used to bypass C pointers in the
    :obj:`tensorrtserver` library. It causes pickling errors in
    :obj:`loguru`.
    """

...

# The new try/except statement
try:
    # <Code here raises 'InferenceServerException'>
except InferenceServerException as e:
    type_, _, tb = sys.exc_info()
    # logger.exception("An issue occurred while contacting the inference server:\n")
    logger.opt(exception=(type_, TRTISException(str(e._msg)), tb)).error("An issue occurred while contacting the inference server:\n")

@Delgan
Copy link
Owner

Delgan commented Oct 2, 2020

@JulianOrteil Yeah, sorry. I realized there will be an error with the second snippet I provided because record["exception"] is an immutable namedtuple() but I forgot to update the comment. >.<

I was thinking the stackprinter would work, though. I created a PR on the stackprinter repository, hopefully it will soon support None value inside exception tuple.

Thanks for updating this ticket with an usable workaround! 😄

@JulianOrteil
Copy link
Author

@Delgan Anytime!
Thank you for creating a PR in the stackprinter repo for me.

I also thought I would bring another workaround to this issue as well. Per the docs, namedtuple() has a _replace() function that creates a new instance of the tuple and just refills it with the updated value. Knowing this, we just replace the entire record["exception"] portion of the record.

Code:

# Updated version of proposed solution 2
# Inside the _format_stacktrace() function
if record["exception"] is not None:
    if type(record["exception"].value) is InferenceServerException:
        record["exception"] = record["exception"]._replace(value=Exception(record["exception"].value._msg))
    record["extra"]["stack"] = stackprinter.format(record["exception"])
    format_ += "<red><bold>{extra[stack]}</bold></red>\n"

@Delgan
Copy link
Owner

Delgan commented Oct 2, 2020

Wow, I never noticed these methods existed nor that they were part of the documentation. O.o

Again, that a great workaround. Thanks for sharing it!

@holyachon
Copy link

Hello. Is this error fixed at latest version 0.6.0?

@Delgan
Copy link
Owner

Delgan commented May 20, 2022

@holyachon @JulianOrteil This issue is not yet fixed but I'll handle it soon.

@Delgan Delgan reopened this May 20, 2022
@pbednarz1
Copy link

Any progress on that?

@Delgan
Copy link
Owner

Delgan commented Aug 31, 2023

I'm closing this ticket because this has been fixed by #905 and should no longer be a problem with the next release. 👍

@Delgan Delgan closed this as completed Aug 31, 2023
jmctune referenced this issue in dqx-translation-project/dqxclarity Oct 9, 2023
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [loguru](https://togithub.com/Delgan/loguru)
([changelog](https://togithub.com/Delgan/loguru/blob/master/CHANGELOG.rst))
| `==0.7.0` -> `==0.7.2` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/loguru/0.7.2?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/loguru/0.7.2?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/loguru/0.7.0/0.7.2?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/loguru/0.7.0/0.7.2?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>Delgan/loguru (loguru)</summary>

###
[`v0.7.2`](https://togithub.com/Delgan/loguru/blob/HEAD/CHANGELOG.rst#072-2023-09-11)

[Compare
Source](https://togithub.com/Delgan/loguru/compare/0.7.1...0.7.2)

\=====================

- Add support for formatting of `ExceptionGroup` errors (`#&#8203;805
<https://github.com/Delgan/loguru/issues/805>`\_).
- Fix possible `RuntimeError` when using
`multiprocessing.set_start_method()` after importing the `logger`
(`#&#8203;974 <https://github.com/Delgan/loguru/issues/974>`\_)
- Fix formatting of possible `__notes__` attached to an `Exception`
(`#&#8203;980 <https://github.com/Delgan/loguru/issues/980>`\_).

###
[`v0.7.1`](https://togithub.com/Delgan/loguru/blob/HEAD/CHANGELOG.rst#071-2023-09-04)

[Compare
Source](https://togithub.com/Delgan/loguru/compare/0.7.0...0.7.1)

\=====================

- Add a new `context` optional argument to `logger.add()` specifying
`multiprocessing` context (like `"spawn"` or `"fork"`) to be used
internally instead of the default one (`#&#8203;851
<https://github.com/Delgan/loguru/issues/851>`\_).
- Add support for true colors on Windows using ANSI/VT console when
available (`#&#8203;934 <https://github.com/Delgan/loguru/issues/934>`*,
thanks `@tunaflsh <https://github.com/tunaflsh>`*).
- Fix possible deadlock when calling `logger.complete()` with concurrent
logging of an asynchronous sink (`#&#8203;906
<https://github.com/Delgan/loguru/issues/906>`\_).
- Fix file possibly rotating too early or too late when re-starting an
application around midnight (`#&#8203;894
<https://github.com/Delgan/loguru/issues/894>`\_).
- Fix inverted `"<hide>"` and `"<strike>"` color tags (`#&#8203;943
<https://github.com/Delgan/loguru/pull/943>`*, thanks `@tunaflsh
<https://github.com/tunaflsh>`*).
- Fix possible untraceable errors raised when logging non-unpicklable
`Exception` instances while using `enqueue=True` (`#&#8203;329
<https://github.com/Delgan/loguru/issues/329>`\_).
- Fix possible errors raised when logging non-picklable `Exception`
instances while using `enqueue=True` (`#&#8203;342
<https://github.com/Delgan/loguru/issues/342>`*, thanks `@ncoudene
<https://github.com/ncoudene>`*).
- Fix missing seconds and microseconds when formatting timezone offset
that requires such accuracy (`#&#8203;961
<https://github.com/Delgan/loguru/issues/961>`\_).
- Raise `ValueError` if an attempt to use nanosecond precision for time
formatting is detected (`#&#8203;855
<https://github.com/Delgan/loguru/issues/855>`\_).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/dqx-translation-project/dqxclarity).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYWluIn0=-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvement to an already existing feature
Projects
None yet
Development

No branches or pull requests

4 participants