Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move nativeFree calls outside of the memory_mutex to avoid deadlock #2124

Merged
merged 1 commit into from
Apr 13, 2018

Conversation

umar456
Copy link
Member

@umar456 umar456 commented Apr 12, 2018

Narrows the scope of the memory_mutex locks to just around the
modification and reading of the internal memory data members and
elements. This means that actual allocations happen outside of this
mutex's lock.

This was done because the CPU backend requires that the free operation
performs a sync before the pointer is freed. This caused deadlocks
when the main thread was waiting on the worker thread and the
worker thread called nativeFree. This scenario happens when a buffer
node is only referenced in the worker thread and it is removed.

Fixes #2115

@umar456 umar456 added this to the v3.6.0 milestone Apr 12, 2018
@umar456 umar456 requested a review from pavanky April 12, 2018 06:46
Copy link
Contributor

@FloopCZ FloopCZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the code.

This is a little bit off-topic, sorry, but I have a design pattern question. Backend-specific MemoryManagers inherit from the common MemoryManager<T> and pass it their own type as the template parameter. This is done because you prefer to call backend-specific methods like nativeFree using static_cast:

inline void nativeFree(void *ptr)
{
    static_cast<T*>(this)->nativeFree(ptr);
}

instead of using a virtual nativeFree function.

I thought you wanted to avoid virtual table lookup for performance (I may be wrong, though, considering that in this PR you e.g., store the call to nativeFree in std::function that is performing a similar type of type erasure anyway.). However, the common MemoryManager has a virtual destructor, is there a reason for this? Because you cannot use it purely polymorphically anyway, it always has to be parametrized using the backend-specific manager type.

I am asking mostly out of curiosity.


for (auto &kv : current.free_map) {
size_t num_ptrs = kv.second.size();
//Free memory by popping the last element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not true anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

for(auto& p : kv.second) {
free_ptrs.push_back(p);
}
current.total_bytes -= num_ptrs * kv.first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(OCD - two spaces after -=)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

// Frees the pointer outside the lock.
std::unique_ptr<void, std::function<void(void*)>> freed_ptr(nullptr, [this](void* p) {
this->nativeFree(p);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(OCD - this lambda seems misindented)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

// Pointer not found in locked map
if (iter == current.locked_map.end()) {
// Probably came from user, just free it
freed_ptr.reset(ptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice trick with unique_ptr. 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed 👍

std::vector<void *> ptrs;
ptrs.push_back(ptr);
current.free_map[bytes] = ptrs;
}
Copy link
Contributor

@FloopCZ FloopCZ Apr 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this whole "regular mode" branch (L291-L301) can be replaced with this line:

current.free_map[bytes].push_back(ptr);

because operator[] on mapping containers constructs the empty vector (default constructor) for you if the key does not exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this ^ I was not aware of this quirk of C++ standard containers when I wrote this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

unique_ptr<void, function<void(void*)>> freed_ptr(nullptr,
[this](void* p) {
this->nativeFree(p);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this into the class definition of MemoryManager ? It could be useful elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// Pointer not found in locked map
if (iter == current.locked_map.end()) {
// Probably came from user, just free it
freed_ptr.reset(ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed 👍

std::vector<void *> ptrs;
ptrs.push_back(ptr);
current.free_map[bytes] = ptrs;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this ^ I was not aware of this quirk of C++ standard containers when I wrote this code.

@@ -424,6 +445,7 @@ class MemoryManager
bool checkMemoryLimit()
{
const memory_info& current = this->getCurrentMemoryInfo();
lock_guard_t lock(this->memory_mutex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why lock here ? Was getCurrentMemoryInfo locking previously ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't locking but you are accessing the memory_info struct and some of the tools were throwing warnings for these accesses. I can remove it if you think its unnecessary but I don't think it hurts.

@pavanky
Copy link
Member

pavanky commented Apr 12, 2018

I thought you wanted to avoid virtual table lookup for performance (I may be wrong, though, considering that in this PR you e.g., store the call to nativeFree in std::function that is performing a similar type of type erasure anyway.). However, the common MemoryManager has a virtual destructor, is there a reason for this? Because you cannot use it purely polymorphically anyway, it always has to be parametrized using the backend-specific manager type.

This is a result of MemoryManager starting off as a polymorphic base class and then being converted to the templated version because we could not guarantee the order of destruction of certain objects which resulted in sgefaults on exits (I don't recall the specifics).

I guess the virtual destructor is something that was missed when the conversion was done.

@umar456 umar456 force-pushed the sync_without_lock branch 2 times, most recently from 1810a93 to 031b3e2 Compare April 13, 2018 04:45
@9prady9
Copy link
Member

9prady9 commented Apr 13, 2018

@FloopCZ Yes, it was a miss as @pavanky pointed out and I missed removing that. virtual qualifier is not needed there and in fact we can just remove that destructor and let the compiler handle it since it doesn't anything specific over there. @umar456 Can you please remove it.

@FloopCZ
Copy link
Contributor

FloopCZ commented Apr 13, 2018

@pavanky @9prady9 Thank you for your answers and @umar456 for the code fixing this tricky bug.

pavanky
pavanky previously approved these changes Apr 13, 2018
Narrows the scope of the memory_mutex locks to just around the
modification and reading of the internal memory data members and
elements. This means that actual allocations happen outside of this
mutex's lock.

This was done because the CPU backend requires that the free operation
performs a sync before the pointer is freed. This caused deadlocks
when the main thread was waiting on the worker thread and the
worker thread called nativeFree. This scenario happens when a buffer
node is only referenced in the worker thread and it is removed.
@umar456 umar456 merged commit 1dfbbe3 into arrayfire:master Apr 13, 2018
@umar456 umar456 deleted the sync_without_lock branch May 13, 2018 09:36
syurkevi pushed a commit to syurkevi/arrayfire that referenced this pull request Jul 26, 2018
…rrayfire#2124)

Narrows the scope of the memory_mutex locks to just around the
modification and reading of the internal memory data members and
elements. This means that actual allocations happen outside of this
mutex's lock.

This was done because the CPU backend requires that the free operation
performs a sync before the pointer is freed. This caused deadlocks
when the main thread was waiting on the worker thread and the
worker thread called nativeFree. This scenario happens when a buffer
node is only referenced in the worker thread and it is removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants