Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: support partition concurrent compute #177

Merged
merged 8 commits into from
May 19, 2022
Merged

Conversation

corgiboygsj
Copy link
Member

@corgiboygsj corgiboygsj commented Jan 4, 2022

implement #167

@codecov
Copy link

codecov bot commented Jan 4, 2022

Codecov Report

Merging #177 (5f4397f) into master (50b6147) will decrease coverage by 0.56%.
The diff coverage is 74.75%.

❗ Current head 5f4397f differs from pull request most recent head 89e92bc. Consider uploading reports for the commit 89e92bc to get more accurate results

@@             Coverage Diff              @@
##             master     #177      +/-   ##
============================================
- Coverage     87.32%   86.75%   -0.57%     
- Complexity     3151     3155       +4     
============================================
  Files           332      334       +2     
  Lines         11806    11897      +91     
  Branches       1053     1053              
============================================
+ Hits          10309    10321      +12     
- Misses          992     1069      +77     
- Partials        505      507       +2     
Impacted Files Coverage Δ
...graph/computer/core/network/DataClientManager.java 85.36% <ø> (ø)
.../baidu/hugegraph/computer/core/util/Consumers.java 50.00% <50.00%> (ø)
...graph/computer/core/sender/MessageSendManager.java 78.40% <91.30%> (+1.47%) ⬆️
...ugegraph/computer/core/compute/ComputeManager.java 86.51% <92.00%> (-0.33%) ⬇️
...raph/computer/core/compute/FileGraphPartition.java 87.55% <94.11%> (+0.57%) ⬆️
...ugegraph/computer/core/config/ComputerOptions.java 100.00% <100.00%> (ø)
...egraph/computer/core/input/WorkerInputManager.java 100.00% <100.00%> (ø)
...graph/computer/core/sender/MessageSendBuffers.java 94.73% <100.00%> (+0.98%) ⬆️
...aph/computer/core/sender/MessageSendPartition.java 100.00% <100.00%> (ø)
...aph/computer/core/sort/sorter/JavaInputSorter.java 100.00% <100.00%> (ø)
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 50b6147...89e92bc. Read the comment docs.

@corgiboygsj corgiboygsj force-pushed the concurrent-compute branch 7 times, most recently from 2250d87 to db82aa5 Compare January 11, 2022 04:48
int partitionId = this.partitioner.partitionId(vertex.id());
WriteBuffers buffer = this.buffers.get(partitionId);

synchronized (buffer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use a more fine-grained lock?

Copy link
Member Author

@corgiboygsj corgiboygsj Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any good ideas. Both writingBuffer and sortingBuffer may operated by this method, so I think WriteBuffers is the minimum granularity of lock.


private static List<KvEntry> threadSortList() {
List<KvEntry> list = SORT_LOCAL.get();
list.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may cause all KvEntry released

@corgiboygsj
Copy link
Member Author

@javeme MessageRecvManager#xxxPartitions can also merge files in parallel. Is it a good idea to pass the thread-pool to MessageRecvManager#xxxPartitions by ComputeManager?

@javeme
Copy link
Contributor

javeme commented Jan 14, 2022

@corgiboygsj Do you mean let receive-tasks and computing-tasks share the same thread pool?
Unified thread pool is beneficial in terms of management, but they may affect performance and stability with each other. I think you can have a try


public final class Consumers<V> {

private static final int CPU_CORE_NUM =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coderzc
coderzc previously approved these changes Jan 18, 2022
import com.baidu.hugegraph.computer.core.common.ComputerContext;
import com.baidu.hugegraph.computer.core.receiver.MessageStat;

public class MessageSendPartition {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer rename to ThreadWriteBuffers

this.buffers = new ConcurrentHashMap<>();
}

public WriteBuffers get() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to buffersForCurrentThread()?

Comment on lines 61 to 73
public void resetMessageWritten() {
for (WriteBuffers buffer : this.buffers.values()) {
buffer.resetMessageWritten();
}
}

public MessageStat messageWritten() {
MessageStat partitionStat = new MessageStat();
for (WriteBuffers buffer : this.buffers.values()) {
partitionStat.increase(buffer.messageWritten());
}
return partitionStat;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may access by multi threads?

@github-actions
Copy link

Due to the lack of activity, the current pr is marked as stale and will be closed after 180 days, any update will remove the stale label


private final ComputerContext context;
private final Managers managers;

private final Map<Integer, FileGraphPartition<M>> partitions;
private final Computation<M> computation;
private final Map<Integer, FileGraphPartition> partitions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it right: multi threads may handle one partition, one MessageSendPartition hold multi buffers, one buffer each thread

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

if (!buffer.isEmpty()) {
buffer.prepareSorting();
futures.add(this.sortThenSend(partitionId, type, buffer));
for (WriteBuffers buffer : partition.buffers()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we ensure no thread access partition.buffers any more?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method will only be called by main thread

@github-actions
Copy link

github-actions bot commented Apr 15, 2022

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request

Comment on lines 57 to 67
public void clear() {
this.buffers.clear();
}

public void resetMessageWritten() {
for (WriteBuffers buffer : this.buffers.values()) {
buffer.resetMessageWritten();
}
}

public MessageStat messageWritten() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add synchronized for the 3 methods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can do this, but they are only accessed by the main thread

@corgiboygsj corgiboygsj marked this pull request as ready for review April 25, 2022 09:07
@corgiboygsj corgiboygsj force-pushed the concurrent-compute branch 2 times, most recently from 2956516 to 09b0e51 Compare April 25, 2022 09:15
javeme
javeme previously approved these changes Apr 25, 2022
@javeme
Copy link
Contributor

javeme commented Apr 29, 2022

ci error:
beyondstorage/setup-hdfs@master is not allowed to be used in apache/incubator-hugegraph-computer.

@github-actions github-actions bot removed the inactive label Apr 29, 2022
@codecov
Copy link

codecov bot commented May 19, 2022

Codecov Report

Merging #177 (bca36db) into master (0d2e6df) will decrease coverage by 0.63%.
The diff coverage is 74.75%.

@@             Coverage Diff              @@
##             master     #177      +/-   ##
============================================
- Coverage     87.37%   86.73%   -0.64%     
- Complexity     3153     3167      +14     
============================================
  Files           332      334       +2     
  Lines         11806    11962     +156     
  Branches       1053     1068      +15     
============================================
+ Hits          10315    10375      +60     
- Misses          988     1073      +85     
- Partials        503      514      +11     
Impacted Files Coverage Δ
...graph/computer/core/network/DataClientManager.java 85.36% <ø> (ø)
.../baidu/hugegraph/computer/core/util/Consumers.java 50.00% <50.00%> (ø)
...graph/computer/core/sender/MessageSendManager.java 78.40% <91.30%> (+1.47%) ⬆️
...ugegraph/computer/core/compute/ComputeManager.java 86.51% <92.00%> (-0.33%) ⬇️
...raph/computer/core/compute/FileGraphPartition.java 87.55% <94.11%> (+0.57%) ⬆️
...ugegraph/computer/core/config/ComputerOptions.java 100.00% <100.00%> (ø)
...egraph/computer/core/input/WorkerInputManager.java 100.00% <100.00%> (ø)
...graph/computer/core/sender/MessageSendBuffers.java 94.73% <100.00%> (+0.98%) ⬆️
...aph/computer/core/sender/MessageSendPartition.java 100.00% <100.00%> (ø)
...aph/computer/core/sort/sorter/JavaInputSorter.java 100.00% <100.00%> (ø)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d2e6df...bca36db. Read the comment docs.

@coderzc coderzc merged commit bcaef85 into master May 19, 2022
@coderzc coderzc deleted the concurrent-compute branch May 19, 2022 06:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants