Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove superfluous OpenMP barriers #2098

Merged
merged 1 commit into from
Mar 18, 2022

Conversation

stinebuu
Copy link
Contributor

@stinebuu stinebuu commented Jul 2, 2021

This PR fixes #1441.

Tested on Piz Daint and JUSUF.

@stinebuu stinebuu added S: Normal Handle this with default priority T: Maintenance Work to keep up the quality of the code and documentation. I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) labels Jul 2, 2021
@stinebuu stinebuu added this to PRs in progress in Kernel via automation Jul 2, 2021
@stinebuu stinebuu added this to the NEST 3.1 milestone Jul 2, 2021
@stinebuu
Copy link
Contributor Author

@suku248 @wschenck Would you have time to review this this week? Or should we try to find other reviewers?

@heplesser
Copy link
Contributor

@stinebuu I think we should wait for the original reviewers on this one, even if that should delay it to NEST 3.2.

@heplesser heplesser modified the milestones: NEST 3.1, NEST 3.2 Sep 3, 2021
@jougs
Copy link
Contributor

jougs commented Sep 17, 2021

@suku248, @wschenck: ping!

@jougs
Copy link
Contributor

jougs commented Nov 11, 2021

@suku248, @wschenck: could you please indicate if you are available for reviewing this or not? Thanks!

@heplesser heplesser removed this from the NEST 3.2 milestone Dec 20, 2021
@terhorstd
Copy link
Contributor

Last call for passengers @suku248 and @wschenck

Copy link
Contributor

@wschenck wschenck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the removal of these barriers is sound and reasonable with minimal changes to the measured runtime of deliver_spike_data.

@@ -446,7 +442,6 @@ EventDeliveryManager::gather_spike_data_( const thread tid,
decrease_buffer_size_spike_data_ = false;
}
}
#pragma omp barrier

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we don't need this barrier since gather_completed_checker_.any_false() also invokes a barrier.

@@ -706,7 +701,6 @@ EventDeliveryManager::gather_target_data( const thread tid )
if ( gather_completed_checker_.all_true() )
{
set_complete_marker_target_data_( assigned_ranks, send_buffer_position );
#pragma omp barrier
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here: Superfluous barrier because of preceding invocation of gather_completed_checker_.all_true().

@@ -726,7 +720,6 @@ EventDeliveryManager::gather_target_data( const thread tid )

const bool distribute_completed = distribute_target_data_buffers_( tid );
gather_completed_checker_[ tid ].logical_and( distribute_completed );
#pragma omp barrier

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here: Superfluous barrier because of following invocation of gather_completed_checker_.any_false().

@@ -736,7 +729,6 @@ EventDeliveryManager::gather_target_data( const thread tid )
buffer_size_target_data_has_changed_ = kernel().mpi_manager.increase_buffer_size_target_data();
}
}
#pragma omp barrier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here: Superfluous barrier because of preceding invocation of gather_completed_checker_.any_false().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with removing it, but I think the reason is that we will hit the following implicit barrier when checking for the while condition.

// Exit gather loop if all local threads and remote processes are
// done.
#pragma omp barrier

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With barrier, we measure for deliver_spike_data the runtime of the slowest thread; without barrier, we measure most likely the average runtime over all threads. I think, both variants are ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code provisions made for benchmarking should not reduce speed under production conditions, so I think it is sensible to remove this barrier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this fall into the category of performance instrumentation that was discussed some time ago? We could make this barrier dependent on a cmake flag, but it might also confuse more than it helps. In any case, for sure this important difference in measurement must also be expressed in the documentation of the kernel timers!
@stinebuu, could you check if this difference is clear enough in the docs?

Would this be a problem for comparing benchmarks between NEST versions with/without this barrier?

@wschenck
Copy link
Contributor

wschenck commented Jan 19, 2022

Sorry for not answering earlier, I completely missed that some of the large number of emails on the nest-simulator mailing list contained my name. I will change the settings of my email program to avoid this in the future.

@heplesser
Copy link
Contributor

Sorry for not answering earlier, I completely missed that some of the large number of emails on the nest-simulator mailing list contained my name. I will change the settings of my email program to avoid this in the future.

Hi @wschenck! Thank you for your review!

@jougs
Copy link
Contributor

jougs commented Jan 28, 2022

Honestly, I don't think that it does not get much better than it is with the review by @wschenck. Together with the tests that @stinebuu carried out already earlier on Piz Daint and JUSUF, I think we should once the question by @terhorstd above is resolved.

@wschenck: how bad would the difference between old and new measurements be for the timers you mention above?

@wschenck
Copy link
Contributor

@jougs: I expect slight differences only for the timer deliver_spike_data. From a max operation (before) to a mean operation (after code change in PR) over threads. The resulting difference depends on the variance between threads. I guess that this variance is very small so that the measurement differences would be neglible.

@stinebuu stinebuu requested review from jarsi and removed request for wschenck March 1, 2022 11:13
@heplesser heplesser removed the request for review from suku248 March 1, 2022 13:00
Copy link
Contributor

@jarsi jarsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work @stinebuu. I agree, it appears to be save to remove all these barriers.

@@ -736,7 +729,6 @@ EventDeliveryManager::gather_target_data( const thread tid )
buffer_size_target_data_has_changed_ = kernel().mpi_manager.increase_buffer_size_target_data();
}
}
#pragma omp barrier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with removing it, but I think the reason is that we will hit the following implicit barrier when checking for the while condition.

@jougs jougs changed the title Remove superfluous omp barriers Remove superfluous OpenMP barriers Mar 18, 2022
@jougs jougs merged commit 66269a2 into nest:master Mar 18, 2022
Kernel automation moved this from PRs in progress to Done (PRs and issues) Mar 18, 2022
@stinebuu stinebuu deleted the remove_superfluous_barriers branch March 22, 2022 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: Normal Handle this with default priority T: Maintenance Work to keep up the quality of the code and documentation.
Projects
Kernel
  
Done
Development

Successfully merging this pull request may close these issues.

Superfluous omp barriers in gather_spike_data_
6 participants