Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shares gossip thread-pool across gossip service threads #12402

Merged
merged 1 commit into from
Sep 24, 2020

Conversation

behzadnouri
Copy link
Contributor

@behzadnouri behzadnouri commented Sep 23, 2020

Problem

We want to parallelize the work in build_crds_filters using rayon thread
pools. But the gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
are not usable for this purpose, because they are created in the
ClusterInfo::listen thread:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54
whereas build_crds_filter is invoked in ClusterInfo::gossip thread:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L61

Summary of Changes

This commit creates the thread-pool in GossipService::new :
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L31
This commit moves the gossip-work ThreadPool from ClusterInfo::listen:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L248-L260
so that it can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

@codecov
Copy link

codecov bot commented Sep 23, 2020

Codecov Report

Merging #12402 into master will decrease coverage by 0.0%.
The diff coverage is 62.5%.

@@            Coverage Diff            @@
##           master   #12402     +/-   ##
=========================================
- Coverage    82.2%    82.2%   -0.1%     
=========================================
  Files         351      351             
  Lines       82279    82280      +1     
=========================================
- Hits        67710    67695     -15     
- Misses      14569    14585     +16     

@behzadnouri behzadnouri marked this pull request as ready for review September 23, 2020 02:18
@behzadnouri behzadnouri force-pushed the gossip-thread-pool branch 2 times, most recently from 94663d1 to 4895f2b Compare September 23, 2020 20:27
sakridge
sakridge previously approved these changes Sep 24, 2020
Copy link
Member

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@behzadnouri behzadnouri added the automerge Merge this Pull Request automatically once CI passes label Sep 24, 2020
@mergify mergify bot dismissed sakridge’s stale review September 24, 2020 17:24

Pull request has been modified.

@behzadnouri behzadnouri removed the automerge Merge this Pull Request automatically once CI passes label Sep 24, 2020
@behzadnouri behzadnouri merged commit 42f1ef8 into solana-labs:master Sep 24, 2020
mergify bot pushed a commit that referenced this pull request Sep 24, 2020
mergify bot added a commit that referenced this pull request Sep 24, 2020
(cherry picked from commit 42f1ef8)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Sep 27, 2020
@behzadnouri behzadnouri deleted the gossip-thread-pool branch September 28, 2020 22:03
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Sep 28, 2020
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Sep 29, 2020
solana-labs#12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing solana-labs#12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts solana-labs#12402
and instead add separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Sep 29, 2020
solana-labs#12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing solana-labs#12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts solana-labs#12402
and instead adds separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67
behzadnouri added a commit that referenced this pull request Sep 29, 2020
#12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing #12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts #12402
and instead adds separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67
mergify bot pushed a commit that referenced this pull request Sep 29, 2020
#12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing #12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts #12402
and instead adds separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

(cherry picked from commit 0d5258b)
jackcmay pushed a commit that referenced this pull request Sep 29, 2020
#12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing #12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts #12402
and instead adds separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

(cherry picked from commit 0d5258b)
mergify bot added a commit that referenced this pull request Sep 29, 2020
)

#12402
moved gossip-work threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/cluster_info.rs#L2330-L2334
to ClusterInfo::new as a new field in the ClusterInfo struct:
https://github.com/solana-labs/solana/blob/35208c5ee/core/src/cluster_info.rs#L249
So that they can be shared between listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

However, in testing #12360
it turned out this will cause breakage:
https://buildkite.com/solana-labs/solana/builds/31646
https://buildkite.com/solana-labs/solana/builds/31651
https://buildkite.com/solana-labs/solana/builds/31655
Whereas with separate thread pools all is good. It might be the case
that one thread is slowing down the other by exhausting the thread-pool
whereas with separate thread-pools we get fair scheduling guarantees
from the os.

This commit reverts #12402
and instead adds separate thread-pools for listen and gossip threads:
https://github.com/solana-labs/solana/blob/afd9bfc45/core/src/gossip_service.rs#L54-L67

(cherry picked from commit 0d5258b)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants