
Fixing #1722 #1728

Merged (18 commits, Aug 24, 2015)

Conversation

sithhell
Member

This PR fixes the problem described in #1722.

In order to be able to send parcels in batches, we need to serialize them right before sending. This serialization must not block the current sending thread, since that would unnecessarily hold back the other parcels that are to be sent in the same batch, which eventually leads to deadlocks in some situations.
To avoid this, a third serialization pass has been added which is able to wait on futures that aren't ready yet when put_parcel(s) is called. These might be futures that are themselves to be serialized, or cases where a GID exceeds its credits and an incref is needed.
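As a rough illustration of that flow, here is a minimal sketch under assumed, hypothetical names (put_parcels, dispatch_batch, and dispatch_when_ready are placeholders, not the actual HPX API): the extra pass counts the non-ready futures across the batch and only hands the batch to the wire once nothing can block the calling thread.

#include <cstddef>
#include <functional>
#include <vector>

// Minimal stand-in; a real HPX parcel carries far more state.
struct parcel
{
    std::size_t pending_futures;  // futures in this parcel not ready yet
};

// Hypothetical sketch of the extra pass: count the non-ready futures
// across the whole batch and dispatch immediately only when nothing
// could block the calling thread; otherwise defer the send.
void put_parcels(std::vector<parcel> const& parcels,
    std::function<void(std::vector<parcel> const&)> dispatch_batch,
    std::function<void(std::vector<parcel> const&)> dispatch_when_ready)
{
    std::size_t pending = 0;
    for (parcel const& p : parcels)
        pending += p.pending_futures;

    if (pending == 0)
        dispatch_batch(parcels);       // nothing can block: send now
    else
        dispatch_when_ready(parcels);  // wait for futures, then send
}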

sithhell and others added 9 commits August 18, 2015 11:06
When put_parcel(s) gets called, a special serialization pass is triggered to
extract any non-ready futures from the parcels.
    - call_for_each now takes the vector of parcels and calls each handler
      with its associated parcel
    - The MPI parcelport has been switched back to run with connection caches
Instead of capturing every future and storing it in a vector, we just
attach a continuation to each future and maintain a count to see when
all passed futures are ready (a sketch of this counter scheme follows
the conflict list below).
Conflicts:
	hpx/hpx_fwd.hpp
	hpx/plugins/parcelport/mpi/receiver.hpp
	hpx/plugins/parcelport/mpi/sender.hpp
	hpx/plugins/parcelport/tcp/sender.hpp
	hpx/runtime/naming/name.hpp
	hpx/runtime/parcelset/parcelport_impl.hpp
	plugins/parcelport/mpi/parcelport_mpi.cpp
	src/runtime/naming/name.cpp
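A minimal sketch of that counter scheme, assuming nothing about the real HPX types (wait_counter and the thread-based stand-ins below are hypothetical): each continuation decrements a shared atomic counter, and whichever continuation brings it to zero fires the completion handler, so no vector of futures needs to be kept around.

#include <atomic>
#include <cstddef>
#include <functional>
#include <iostream>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: every pending future registers a continuation
// that calls signal(); the last continuation to arrive runs on_done_.
class wait_counter
{
public:
    wait_counter(std::size_t expected, std::function<void()> on_done)
      : remaining_(expected), on_done_(std::move(on_done))
    {}

    void signal()
    {
        // fetch_sub returns the previous value: 1 means we were last.
        if (remaining_.fetch_sub(1, std::memory_order_acq_rel) == 1)
            on_done_();
    }

private:
    std::atomic<std::size_t> remaining_;
    std::function<void()> on_done_;
};

int main()
{
    auto counter = std::make_shared<wait_counter>(
        3, []{ std::cout << "all futures ready, sending batch\n"; });

    // Threads stand in for attaching a continuation to each future.
    std::vector<std::thread> work;
    for (int i = 0; i != 3; ++i)
        work.emplace_back([counter]{ counter->signal(); });

    for (auto& t : work)
        t.join();
}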
// boost::unique_lock<mutex_type> l(connections_mtx_);
// connections_.push_back(std::move(sender));
// }
// }
Member

Can this commented code block be removed?

Member Author

Yes.

{
    typedef gid_type::mutex_type::scoped_lock scoped_lock;
    scoped_lock ll(gid.get_mutex());
    gid_type new_gid = gid;
Member

Is it guaranteed that gid is still alive when the lambda is executed?

Member Author

In src/runtime/naming/name.cpp:

@@ -278,53 +296,68 @@ namespace hpx { namespace naming

                 util::unlock_guard<scoped_lock> ul(l);
                 boost::int64_t new_credit = (HPX_GLOBALCREDIT_INITIAL - 1) * 2;
-                agas::incref(new_gid, new_credit);
+                new_gid = gid;
+                HPX_ASSERT(new_gid != invalid_gid);
+                return agas::incref_async(new_gid, new_credit).then(
+                    [&gid](future<boost::int64_t>&&) mutable -> gid_type
+                    {
+                        typedef gid_type::mutex_type::scoped_lock scoped_lock;
+                        scoped_lock ll(gid.get_mutex());
+                        gid_type new_gid = gid;

> Is it guaranteed that gid is still alive when the lambda is executed?

Yes, the gid is a reference to the one that is supposed to be serialized.

The new GIDs are now stored in an external map by the archive.
The parcelport takes care that the GIDs and the map are kept alive long enough.
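As a hedged sketch of that lifetime arrangement (batch_send_state and split_gids are hypothetical names, not the actual HPX code): the archive writes split GIDs into a map owned by the sender, and the sender keeps both the map and the parcels alive until the asynchronous send completes, which is what makes the [&gid] capture in the continuation above safe.

#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// Minimal stand-ins; the real gid_type and parcel are HPX classes.
struct gid_type { std::uint64_t msb, lsb; };
struct parcel {};

// Hypothetical holder tying the split-GID map to the batch lifetime:
// while the sender owns this object, every gid_type referenced by a
// serialization continuation stays valid.
struct batch_send_state
{
    std::vector<parcel> parcels;                     // owns the parcels
    std::map<gid_type const*, gid_type> split_gids;  // and the split GIDs
};

void send_batch(std::shared_ptr<batch_send_state> state)
{
    // Capture the shared_ptr in the write-completion handler so that
    // 'state' outlives every continuation, e.g.:
    // async_write(..., [state](/*error, bytes*/) { /* done */ });
    (void) state;
}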
@sithhell
Member Author

The review comments have been addressed.

@sithhell force-pushed the fixing_1722 branch 2 times, most recently from e6c78c0 to fec1db3 on August 21, 2015 at 08:24
@hkaiser
Member

hkaiser commented Aug 23, 2015

LGTM! Thanks!
