Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows bug #33

Closed
hpwxf opened this issue Jul 2, 2020 · 15 comments · Fixed by #35
Closed

Windows bug #33

hpwxf opened this issue Jul 2, 2020 · 15 comments · Fixed by #35
Labels
bug Something isn't working

Comments

@hpwxf
Copy link
Collaborator

hpwxf commented Jul 2, 2020

To reproduce it:

  • enable Windows support in .travis-ci.yml and run CI test

When test_mat_to_arr.py is reduced to

print("Test armadillo matrix to  numpy array functions.")
import numpy as np

import test_carma as carma

print("Before")

print("Test mat_to_arr with c++ addition.")
sample = np.asarray(
    np.random.normal(size=(10, 2)),
    dtype=np.float,
    order='F'
)
mat = carma.mat_to_arr_plus_one(sample, False)
assert np.allclose(mat, sample + 1)


print("Test mat_to_arr with c++ addition.")
sample = np.asarray(
    np.random.normal(size=(10, 2)),
    dtype=np.float,
    order='F'
)
mat = carma.mat_to_arr_plus_one(sample, True)
assert np.allclose(mat, sample + 1)

print("After")

It is still failing. But if we remove first sub-test or second sub-test, it is OK.
NB: when it fails, there is no output even for the first prints.

  • By removing all other test, we get:
$ ctest -C Release
Test project C:/Users/AzureUser/carma/build
    Start 1: pytest
1/1 Test #1: pytest ...........................Exit code 0xc0000374
***Exception:   0.90 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.92 sec

or by hand:

 PYTHONPATH=../build/tests/Release python test_mat_to_arr.py

PS:

  • test_arr_to_mat.py is OK.
  • test_arraystore.py, test_nparray.py, test_type_caster.py are KO like test_mat_to_arr.py
@RUrlus RUrlus added the bug Something isn't working label Jul 2, 2020
@RUrlus
Copy link
Owner

RUrlus commented Jul 2, 2020

It sounds like this due to reusing the sample var name which will trigger destruction of the memory. If this is true a single test like below should trigger the same behaviour.

print("Test armadillo matrix to  numpy array functions.")
import numpy as np

import test_carma as carma

print("Before")

print("Test mat_to_arr with c++ addition.")
sample = np.asarray(
    np.random.normal(size=(10, 2)),
    dtype=np.float,
    order='F'
)
mat = carma.mat_to_arr_plus_one(sample, False)
assert np.allclose(mat, sample + 1)

# This should trigger a destructor call
sample = None
# This just removes the sample from the namespace
del sample

print("After")

This would also mean this is a fairly significant memory bug.
Any idea on why this would not show up in *NIX systems but it does with windows?

@RUrlus
Copy link
Owner

RUrlus commented Jul 2, 2020

But the above wouldn't explain why the first print statement isn't reached or why test_nparray.py has the same issue. Those tests only read or switch some flags...

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 3, 2020

Exit code 0xc0000374 seems to be a memory corruption (e.g. https://forums.iis.net/t/1150912.aspx?0xc0000374).

If this corruption occurs in the Python interpreter, it is possible that it crashes before to able to print previously computer results or outputs (buffered outputs are not yet flushed).

Since the problem occurs, only this two successive tests (in any order; and ok with only one), it could help to understand where the corruption is.

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 3, 2020

Next thing to do could be to try with valgrind on a *NIX system.

I plan also to add a mode with memory sanitizer options.

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 6, 2020

Your example has also failed (cf https://travis-ci.com/github/hpwxf/carma/jobs/358021677).

I'm going to try to check what happens on Linux with valgrind (using https://pypi.org/project/pytest-valgrind/)

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 6, 2020

An error seems to occur here:

Valgrind has detected an error:

**5286**
**5286** **********************************************************************
**5286** test_windows_bug.py::test_windows_bug
**5286** **********************************************************************
==5286== Mismatched free() / delete / delete []
==5286==    at 0x483708B: operator delete(void*, unsigned long) (vg_replace_malloc.c:585)
==5286==    by 0x1F2A4715: pybind11::capsule carma::create_capsule<double>(double*)::{lambda(void*)#1}::operator()(void*) const (utils.h:97)
==5286==    by 0x1F2A4735: pybind11::capsule carma::create_capsule<double>(double*)::{lambda(void*)#1}::_FUN(void*) (utils.h:91)
==5286==    by 0x1F280925: pybind11::capsule::capsule(void const*, void (*)(void*))::{lambda(_object*)#1}::operator()(_object*) const (pytypes.h:1168)
==5286==    by 0x1F280945: pybind11::capsule::capsule(void const*, void (*)(void*))::{lambda(_object*)#1}::_FUN(_object*) (pytypes.h:1169)
==5286==    by 0x4AA2FA: ??? (in /usr/bin/python3.7)
==5286==    by 0x8D1E847: array_dealloc (in /usr/local/lib/python3.7/dist-packages/numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so)
...

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 6, 2020

If I remove the delete at utils.h:97, there is no more segfault in Windows (https://travis-ci.com/github/hpwxf/carma/jobs/358032696) but there is, of course, a memory leak (this was just to confirm the location of the error).

hpwxf added a commit to hpwxf/carma that referenced this issue Jul 6, 2020
Adds an additional argument to define how to manage associated memory for py::capsule.

Visible at runtime in Windows or using pytest-valgring in Linux.

(closes RUrlus#33)
hpwxf added a commit to hpwxf/carma that referenced this issue Jul 6, 2020
Adds an additional argument to define how to manage associated memory for py::capsule.

Visible at runtime in Windows or using pytest-valgring in Linux.

(closes RUrlus#33)
@RUrlus
Copy link
Owner

RUrlus commented Jul 7, 2020

I think the bug might be in get_data.

    template <typename armaT, typename = std::enable_if_t<is_convertible<armaT>::value>>
    inline typename armaT::elem_type * get_data(armaT * src, bool copy) {
        using T = typename armaT::elem_type;
        if (copy) {
            size_t N = src->n_elem;
            T * data = new T[N];
            std::memcpy(data, src->memptr(), sizeof(T) * N);
            return data;
        } else {
            T * data = src->memptr();
            arma::access::rw(src->mem) = 0;
            return data;
        }
    } /* get_data */

If you don't copy in but do copy out we have:

  • input_array correctly assumes it owns the data and will correctly free the memory when destructed
  • armadillo matrix incorrectly assumes it owns the data and will clear it
  • output_array will correctly free the memory when destructed

However, under conditions the input array will get copied, e.g. wrong memory order, owndata is false, which prevents us from always taking ownership away from armadillo.

Perhaps we can set a global flag that indicates when a copy occurred on the way in and use that to either take away ownership from armadillo when a copy is made out.

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 7, 2020

What do you think about the fix I proposed in #35 ?

@RUrlus
Copy link
Owner

RUrlus commented Jul 7, 2020

Wouldn't an intermediate solution be to use a macro that sets delete when OS is not Windows? That way we can support windows albeit with a memory leak but don't interfere with performance on *NIX systems

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 8, 2020

Since I've planned to use carma in another project (portable on Linux/Mac/Windows), I prefer to solve this problem without any leak in any OS.

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 8, 2020

Using clang sanitizer, the analysis seems to be easier

# Tested on Debian Buster
CXXFLAGS="-fsanitize=address -fno-omit-frame-pointer" CXX=clang++-7 cmake -DBUILD_TESTS=on -DBUILD_EXAMPLES=on -DCMAKE_BUILD_TYPE=Debug ..
LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/7/libasan.so ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-7  ctest -V

I got:

2: ==6943==ERROR: AddressSanitizer: alloc-dealloc-mismatch (malloc vs operator delete []) on 0x60e000001460
2:     #0 0x7f7afc37a0c0 in operator delete[](void*) (/usr/lib/gcc/x86_64-linux-gnu/7/libasan.so+0xdd0c0)
2:     #1 0x7f7ae216932a in operator() /data/include/carma/carma/utils.h:98
2:     #2 0x7f7ae21692b4 in __invoke /data/include/carma/carma/utils.h:92
2:     #3 0x7f7ae2133287 in operator() /data/third_party/pybind11/include/pybind11/detail/../pytypes.h:1168

That means the related bunch of memory has been allocated using malloc not new.

@hpwxf
Copy link
Collaborator Author

hpwxf commented Jul 8, 2020

Understood !

To deallocate a bunch of memory allocated by armadillo, you cannot assume what allocator it is. It depends on the implementation.
cf https://gitlab.com/conradsnicta/armadillo-code/-/blob/9.900.x/include/armadillo_bits/memory.hpp#L54 .
Thus, I will try to use arma::memory::release when copy==false.

hpwxf added a commit to hpwxf/carma that referenced this issue Jul 8, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
hpwxf added a commit to hpwxf/carma that referenced this issue Jul 8, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
hpwxf added a commit to hpwxf/carma that referenced this issue Jul 8, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
hpwxf added a commit to hpwxf/carma that referenced this issue Jul 8, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
hpwxf added a commit to hpwxf/carma that referenced this issue Jul 8, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
@RUrlus
Copy link
Owner

RUrlus commented Jul 9, 2020

Since I've planned to use carma in another project (portable on Linux/Mac/Windows), I prefer to solve this problem without any leak in any OS.

Of course, I was thinking as a stop-gap solution.

@RUrlus
Copy link
Owner

RUrlus commented Jul 9, 2020

Understood !

To deallocate a bunch of memory allocated by armadillo, you cannot assume what allocator it is. It depends on the implementation.
cf https://gitlab.com/conradsnicta/armadillo-code/-/blob/9.900.x/include/armadillo_bits/memory.hpp#L54 .
Thus, I will try to use arma::memory::release when copy==false.

Good point, nooby mistake to assume delete was the appropriate way too free.

What do you think about letting armadillo handle the copy out as well. In that case we can always keep the armadillo matrix/row/... in the capsule and release the memory from armadillo?

In the case of a steal we can move the mat/row/col etc. This might also solve another issue where we can't reliable steal the memory from a Col or Row.

Something like:

    template <typename armaT, typename = std::enable_if_t<is_convertible<armaT>::value>>
    inline py::capsule create_capsule(const armaT & src) {
        using T = typename armaT::elem_type;
        armaT data = armaT(src.memptr(), src.n_rows, src.n_cols, true);
        return py::capsule(data.memptr(), [](void *f){
            T *data = reinterpret_cast<T *>(f);
            #ifndef NDEBUG
            // if in debug mode let us know what pointer is being freed
            std::cerr << "freeing memory @ " << f << std::endl;
            #endif
            arma::memory::release(data);
        });
    } /* create_capsule */

    template <typename armaT, typename = std::enable_if_t<is_convertible<armaT>::value>>
    inline py::capsule create_capsule(armaT && src) {
        using T = typename armaT::elem_type;
        armaT data = std::move(src);
        return py::capsule(data.memptr(), [](void *f){
            T *data = reinterpret_cast<T *>(f);
            #ifndef NDEBUG
            // if in debug mode let us know what pointer is being freed
            std::cerr << "freeing memory @ " << f << std::endl;
            #endif
            arma::memory::release(data);
        });
    } /* create_capsule */

hpwxf added a commit to hpwxf/carma that referenced this issue Jul 10, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
hpwxf added a commit to hpwxf/carma that referenced this issue Jul 11, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes RUrlus#33)
RUrlus pushed a commit that referenced this issue Jul 11, 2020
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes #33)
RUrlus pushed a commit that referenced this issue Aug 16, 2021
Embed deallocator type with data in get_data return value.

Visible at runtime in Windows or using pytest-valgring or clang address sanitizer in Linux

(closes #33)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants