Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] UB from __Pyx_pretend_to_initialize not initializing #5278

Closed
rupprecht opened this issue Feb 28, 2023 · 16 comments · Fixed by #5920
Closed

[BUG] UB from __Pyx_pretend_to_initialize not initializing #5278

rupprecht opened this issue Feb 28, 2023 · 16 comments · Fixed by #5920
Milestone

Comments

@rupprecht
Copy link

Describe the bug

Apologies in advance: I don't use cython, but I'm debugging a bug caused by using cython with a fresh version of clang. I'll try to get a better repro soon, but for now I'm just filing this in case the cause/fix is obvious. My repro is representative of something internal, but I haven't reduced it to something standalone yet.

With a very new LLVM commit, we're seeing failures in Cython code. The issue seems to be caused by UB in Cython when returning uninitialized data.

Given some Cython code like this:

T decode(string encoded) except *:
  cdef T result
    if this.decoder.decode(encoded, &result):
      return result
  raise ValueError(f'Failed to decode {repr(encoded)}')

decode(.... <something invalid> ...)

The generated code for will look something like this:

static CYTHON_INLINE void __Pyx_pretend_to_initialize(void* ptr) { (void)ptr; }

template <typename T>
T decode(std::string __pyx_v_encoded) {
  T __pyx_v_result;
  T __pyx_r;
  ...
  __pyx_t_1 = (this->decoder.decode(__pyx_v_encoded, &__pyx_v_result) != 0);
  if (__pyx_t_1) {
    __pyx_r = __pyx_v_result;
    goto __pyx_L0;
  }
  ...
__pyx_L1_error:;
  __Pyx_pretend_to_initialize(&__pyx_r);
__pyx_L0:;
  return __pyx_r;  // This is only *actually* initialized in the happy case.
}

The UB is that __Pyx_pretend_to_initialize doesn't actually do any initialization, and so decode will return the uninitialized __pyx_r value. It's enough to silence static compiler warnings, but the optimizer will see that it doesn't do anything to the variable, and mis-optimize the UB accordingly.

There are two possible changes to __Pyx_pretend_to_initialize that would make it satisfy this requirement:

// Prevent the compiler from optimizing this away, e.g. see
// https://stackoverflow.com/questions/28287064/how-not-to-optimize-away-mechanics-of-a-folly-function
static CYTHON_INLINE void __Pyx_pretend_to_initialize(void* ptr) { asm volatile("" : "+r"(ptr)); }

// Or, make it actually do something. The compiler is hopefully smart enough to optimize this away, but I haven't checked.
// Requires changing the calls from __Pyx_pretend_to_initialize(&x) to __Pyx_pretend_to_initialize(&x, sizeof(x))
static CYTHON_INLINE void __Pyx_pretend_to_initialize(void* ptr, size_t size) { memset(ptr, 0, size); }

Code to reproduce the behaviour:

T decode(string encoded) except *:
  cdef T result
    if this.decoder.decode(encoded, &result):
      return result
  raise ValueError(f'Failed to decode {repr(encoded)}')

decode(.... <something invalid> ...)

Expected behaviour

When passed invalid data, it should raise an error. Instead, it crashes.

Environment

OS: Linux
Python version: 3.10
Cython version: 0.29.32 (but I still see the bug in trunk)
LLVM/Clang version: something very recent; newer than this: llvm/llvm-project@b6a0be8

Additional context

No response

@da-woods
Copy link
Contributor

Thanks. We'll need to look into this properly I think.

I'm a bit wary of the asm version for compiler-compatibility reasons (although maybe it's OK). I'm a bit wary of the memset version because we'd need to make sure that it doesn't get misapplied to default constructed C++ classes.

@alexfh
Copy link

alexfh commented Feb 28, 2023

I'm trying to find a temporary workaround to unblock the upgrade of LLVM, but so far no success. Is there any way we can modify the python source code to make cython generate initialization of __pyx_v_result (or rather __pyx_r) with a default value of type T (which is a template parameter in that code)? I tried cdef T result = T(), cdef T result = {} and even cdef T result = 0, but all of these generate cython syntax errors.

@da-woods
Copy link
Contributor

So I'm a little puzzled by this.

  1. I'm partly a bit puzzled by how you generate this templated C++ code from Cython. Are you using fused types rather than C++ templates?
  2. I'm not quite sure what the undefined behaviour is? Obviously reading the return value could give an arbitrary result but I don't think we do that. At least the material I was able to find for C suggests that taking the address is sufficient to avoid undefined behaviour just from returning it (https://stackoverflow.com/questions/11962457/why-is-using-an-uninitialized-variable-undefined-behavior/40674888#40674888) since it stops the variable being register-only. I'm not an expert in the fine details of either the C or C++ standards.

In terms of workarounds I'm not completely sure, since __pyx_r is Cython-generated rather than something you control yourself. Two (untested) suggestions:

  1. I think returning any T with a default constructor should be fine, so maybe returning std::optional<T> instead of T might work since that definitely default constructs to something empty.
  2. You could write a helper function to initialize it:
    cdef extern from *:
        """
        #define INIT_R memset(&__pyx_r, 0, sizeof(__pyx_r))
        """
        void INIT_R()
    
    then just call INIT_R() at the start of your functions. This is pretty hacky, but I think should work.

@alexfh
Copy link

alexfh commented Mar 1, 2023

  1. I'm partly a bit puzzled by how you generate this templated C++ code from Cython. Are you using fused types rather than C++ templates?

We're using C++ templates:

cdef cppclass CoderHelper[T]:
  Coder[T] _raw_coder

  CoderHelper(Coder[T] raw_coder):
      this._raw_coder = raw_coder

  T decode(string encoded) except *:
    cdef T result
    with nogil:
      if this._raw_coder.Decode(encoded, &result):
        return result
    raise ValueError(f'Failed to decode {repr(encoded)}')
  1. I'm not quite sure what the undefined behaviour is? Obviously reading the return value could give an arbitrary result but I don't think we do that. At least the material I was able to find for C suggests that taking the address is sufficient to avoid undefined behaviour just from returning it (https://stackoverflow.com/questions/11962457/why-is-using-an-uninitialized-variable-undefined-behavior/40674888#40674888) since it stops the variable being register-only. I'm not an expert in the fine details of either the C or C++ standards.

I'm not sure about C, but in C++ returning an uninitialized variable from a function is undefined behavior, regardless of whether the return value of the function is used or not. Compilers can utilize UB for optimization purposes in quite surprising ways (https://en.cppreference.com/w/cpp/language/ub).

As for the suggested workarounds, I'd be glad to try them, but I have problems expressing them within cython syntax. Ideally, I would like some way to tell cython to insert a snippet of C++to the output verbatim. Is there such an option?

@da-woods
Copy link
Contributor

da-woods commented Mar 1, 2023

I didn't actually know you could do that with templates. Thanks for the clarification.

Ideally, I would like some way to tell cython to insert a snippet of C++to the output verbatim. Is there such an option?

There isn't really, but option 2 is fairly close to that

@scoder
Copy link
Contributor

scoder commented Mar 1, 2023

As for the suggested workarounds, I'd be glad to try them, but I have problems expressing them within cython syntax. Ideally, I would like some way to tell cython to insert a snippet of C++to the output verbatim. Is there such an option?

https://docs.cython.org/en/latest/src/userguide/external_C_code.html#including-verbatim-c-code

@alexfh
Copy link

alexfh commented Mar 1, 2023

Thanks for the advice! I came up with the following two options for a (hopefully) short-term workaround:

cdef extern from *:
  """
  template<typename T> T default_value() { return {}; }
  """
  T default_value[T]()

cdef cppclass CoderHelper[T]:
  Coder[T] _raw_coder

  CoderHelper(Coder[T] raw_coder):
      this._raw_coder = raw_coder

  T decode(string encoded) except *:
    cdef T result = default_value[T]()
    with nogil:
      if this._raw_coder.Decode(encoded, &result):
        return result
    raise ValueError(f'Failed to decode {repr(encoded)}')
cdef extern from *:
  """
  #define ASSIGN_DEFAULT_VALUE(var) { (*var) = {}; }
  """
  void ASSIGN_DEFAULT_VALUE(void*)

cdef cppclass CoderHelper[T]:
  Coder[T] _raw_coder

  CoderHelper(Coder[T] raw_coder):
      this._raw_coder = raw_coder

  T decode(string encoded) except *:
    cdef T result
    ASSIGN_DEFAULT_VALUE(&result)
    with nogil:
      if this._raw_coder.Decode(encoded, &result):
        return result
    raise ValueError(f'Failed to decode {repr(encoded)}')

@maxbachmann
Copy link
Contributor

I would certainly not use the asm version. Compilers have very limited possibility to optimize around asm blocks, so this can seriously affect compiler optimizations.

@maxbachmann
Copy link
Contributor

maxbachmann commented Mar 3, 2023

I'm not quite sure what the undefined behaviour is? Obviously reading the return value could give an arbitrary result but I don't think we do that. At least the material I was able to find for C suggests that taking the address is sufficient to avoid undefined behaviour just from returning it (https://stackoverflow.com/questions/11962457/why-is-using-an-uninitialized-variable-undefined-behavior/40674888#40674888) since it stops the variable being register-only. I'm not an expert in the fine details of either the C or C++ standards.

Yes I am not really sure about this either, since in our specific case we actually do take the address:

  __Pyx_pretend_to_initialize(&__pyx_r);
  return __pyx_r;

One thing in there is that it could still be undefined behavior if the system supports trap representations, so maybe that's why it is undefined?

In C we can always just memset the value. In C++ since C++11 we can memset if std::is_trivially_copyable. Non trivially copyable types are out of our hands. I have absolutely no clue what to do about C++98/03.

@da-woods
Copy link
Contributor

da-woods commented Mar 3, 2023

One thing in there is that it could still be undefined behavior if the system supports trap representations, so maybe that's why it is undefined?

I think in C taking the address is sufficient to avoid this. Although that's based on a fairly superficial understanding of the standard.

I think in C++ we probably just need to initialize Type __pyx_r = Type() at the start of the function. That happens anyway for classes with non-trivial default constructor so it's only adding a little extra work for classes with trivial constructors.

@maxbachmann
Copy link
Contributor

Yes I was certainly making this more complicated in my head, than it actually was. Simply always initializing the return value should absolutely be enough

@scoder
Copy link
Contributor

scoder commented Mar 3, 2023

Simply always initializing the return value should absolutely be enough

Please validate that this also works in generator/coroutine functions, where we use goto->yield_label at the beginning.

@maxbachmann
Copy link
Contributor

Please validate that this also works in generator/coroutine functions, where we use goto->yield_label at the beginning.

Do you have an example, where you are concerned? At least the generators in my code base always return PyObjects, which are already set to NULL at the beginning. Can we even have generator/coroutine functions which do not return PyObjects?

@alexfh
Copy link

alexfh commented Mar 3, 2023

Changing T __pyx_r; to T __pyx_r = T();, T __pyx_r = {}; or T __pyx_r{}; should work. This is what I roughly did to work around the issue. FWIW, neither of the two workarounds I mentioned in #5278 (comment) actually worked. They both masked the issue in some configurations, but the generated code was still incorrect and the problem manifested, for example, when compiled with recent clang and thread sanitizer enabled. I had to directly assign to the __pyx_r variable to get rid of the UB in this part of the generated code:

cdef extern from *:
  """
  #define WORKAROUND_FOR_5278() { __pyx_r = T(); }
  """
  void WORKAROUND_FOR_5278()

...
  T decode(string encoded) except *:
    cdef T result
    with nogil:
      if this._raw_coder.Decode(encoded, &result):
        return result
    WORKAROUND_FOR_5278()
    raise ValueError(f'Failed to decode {repr(encoded)}')

da-woods added a commit to da-woods/cython that referenced this issue Mar 5, 2023
rather than relying on __Pyx_pretend_to_initialize which isn't
sufficient to avoid undefined behaviour there.

Fixes cython#5278
da-woods added a commit to da-woods/cython that referenced this issue Dec 22, 2023
In this case we actually have to initialize rather than just
do nothing. Fixes cython#5278.

Supercedes cython#5296 (I think this is better since it limits the
amount of work Cython itself has to do)
da-woods added a commit to da-woods/cython that referenced this issue Dec 22, 2023
In this case we actually have to initialize rather than just
do nothing. Fixes cython#5278.

Supercedes cython#5296 (I think this is better since it limits the
amount of work Cython itself has to do)
da-woods added a commit that referenced this issue Feb 10, 2024
In this case we actually have to initialize rather than just
do nothing. Fixes #5278.

* Move code to utility code file
@da-woods da-woods added this to the 3.1 milestone Feb 10, 2024
@scoder
Copy link
Contributor

scoder commented Feb 27, 2024

Is this something we should backport to 3.0.9?

@da-woods
Copy link
Contributor

Is this something we should backport to 3.0.9?

Probably.

da-woods added a commit to da-woods/cython that referenced this issue Mar 31, 2024
In this case we actually have to initialize rather than just
do nothing. Fixes cython#5278.

* Move code to utility code file
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants