Optimize memory management in Trilinos sparsity pattern accessors. #16406

Open: wants to merge 1 commit into master


bangerth (Member) commented Jan 2, 2024

No description provided.

peterrum (Member) commented Jan 3, 2024

The following tests FAILED:
	7340 - trilinos/sparsity_pattern_06.mpirun=3.debug (Failed)
Errors while running CTest

Comment on lines +60 to +64
  if (colnum_cache.use_count() > 1)
    colnum_cache = std::make_shared<std::vector<size_type>>(
      sparsity_pattern->row_length(this->a_row));
  else
    colnum_cache->resize(sparsity_pattern->row_length(this->a_row));

Member

Is this thread-safe? The use count could be increased on a different thread after the check but before resize, right?

Member Author

Hm, good question. My primary motivation was that when you do

  SparsityPattern::iterator p = sp.begin();
  ++p;

and if ++p moves you to the next row of the matrix, then the current design deallocates the std::vector only to allocate another one. That's wasteful. Note that here, only one iterator is ever around, so the use count of these shared pointers is always one.
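
To make the reuse-versus-reallocate behavior concrete, here is a minimal standalone sketch. `RowCache` and `visit_row` are hypothetical names standing in for the accessor and its row-change logic, `std::size_t` replaces deal.II's `size_type`, and the row length is passed in directly instead of being queried from a sparsity pattern. It is not the code in this PR, only an illustration of the `use_count()` check quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the accessor's cache handling (not the deal.II class).
struct RowCache
{
  std::shared_ptr<std::vector<std::size_t>> colnum_cache =
    std::make_shared<std::vector<std::size_t>>();

  // Prepare the cache for a row that has `row_length` entries.
  void visit_row(const std::size_t row_length)
  {
    assert(colnum_cache != nullptr);
    if (colnum_cache.use_count() > 1)
      // A copy shares the buffer: leave it untouched and take a fresh one.
      colnum_cache = std::make_shared<std::vector<std::size_t>>(row_length);
    else
      // Sole owner: resize in place, which reuses the existing capacity
      // whenever it is large enough.
      colnum_cache->resize(row_length);
  }
};
```

With a single `RowCache`, repeated `visit_row()` calls keep the same vector object. Once a copy exists, the next `visit_row()` on either object allocates a fresh vector and leaves the other object's data untouched.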

Your question pertains to the situation where another thread makes a copy of an iterator owned by the first thread (bumping the use count to two) at the very inopportune moment that the first thread has just passed the use_count() check. So perhaps something like this:

  SparsityPattern::iterator p = sp.begin();
  auto t = std::thread([&p]() { auto x = p; ++x; });
  ++p;
  t.join();

Note that here, the lambda function must capture p by reference -- if it had captured it by value, the capture would be a second copy, resulting in a use count of two, so the optimization would not have applied.

So I think you're right that the optimization is not thread-safe. Good catch!

The question is what we want to do about the situation. I would really like to optimize the use case here because typically one will only have one iterator object sitting around, no copies being made, and it seems silly to release and re-allocate the memory all the time. I could guard access to that variable with a mutex. What would you suggest?

Member

> The question is what we want to do about the situation. I would really like to optimize the use case here because typically one will only have one iterator object sitting around, no copies being made, and it seems silly to release and re-allocate the memory all the time. I could guard access to that variable with a mutex. What would you suggest?

Yeah, I would just use a static std::mutex and lock it for any access to colnum_cache.
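
A rough sketch of what that suggestion could look like, under assumptions: `GuardedRowCache`, `visit_row`, `cached_length`, and `cache_mutex` are hypothetical names, not deal.II API, and the row length is passed in directly. The point is that copying also touches `colnum_cache`, so the copy constructor takes the same lock as the check-then-resize step. The lock covers only the cache member; any other state the real iterator carries would need its own consideration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical accessor-like class; only the cache handling is modeled.
class GuardedRowCache
{
public:
  GuardedRowCache()
    : colnum_cache(std::make_shared<std::vector<std::size_t>>())
  {}

  // Copying reads the other object's shared_ptr and bumps the use count,
  // so it must hold the lock as well.
  GuardedRowCache(const GuardedRowCache &other)
  {
    std::lock_guard<std::mutex> lock(cache_mutex);
    colnum_cache = other.colnum_cache;
  }

  GuardedRowCache &operator=(const GuardedRowCache &) = delete;

  // The use_count() check and the resize/reallocation happen under one lock,
  // so no copy can slip in between them.
  void visit_row(const std::size_t row_length)
  {
    std::lock_guard<std::mutex> lock(cache_mutex);
    assert(colnum_cache != nullptr);
    if (colnum_cache.use_count() > 1)
      colnum_cache = std::make_shared<std::vector<std::size_t>>(row_length);
    else
      colnum_cache->resize(row_length);
  }

  std::size_t cached_length() const
  {
    std::lock_guard<std::mutex> lock(cache_mutex);
    return colnum_cache->size();
  }

private:
  // One mutex shared by all objects of this class, as suggested above.
  static std::mutex cache_mutex;
  std::shared_ptr<std::vector<std::size_t>> colnum_cache;
};

std::mutex GuardedRowCache::cache_mutex;
```

A single class-wide mutex is simple but serializes cache access across all iterators, including unrelated ones on different threads. Whether that cost is acceptable compared to the saved allocations would need measuring.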
