[FEATURE] support shuffle partition data write to multiple shuffle servers #1373

Closed
3 tasks done
xumanbu opened this issue Dec 15, 2023 · 10 comments · Fixed by #1445

@xumanbu
Contributor

xumanbu commented Dec 15, 2023

Code of Conduct

Search before asking

  • I have searched in the issues and found no similar issues.

Describe the feature

Currently, data from one partition can only be written to a designated shuffle server.

When a large partition occupies a significant amount of a shuffle server's disk space, this can, in extreme cases, crash the server.

Based on #825, we could support writing a single partition's data to multiple shuffle servers.

Motivation

https://docs.google.com/document/d/1FRGP82R7JBnr3IjmF8PQ6qWP9GnKQsk-N1OWk_f1kSY

Describe the solution

[Figure: shuffleServer dataflow diagram (drawio)]

Additional context

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
@xumanbu
Contributor Author

xumanbu commented Dec 15, 2023

When to trigger the next node write?

Two strategies I propose are:

  • Based on the application quota of a shuffle server.
  • Based on the shuffle server's health status.

@jerqi @zuston @leixm

@zuston
Member

zuston commented Dec 15, 2023

+1 for that.

I hope the trigger strategies can be designed as a pluggable mechanism.
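
The two proposed triggers and the request for a pluggable design could be sketched roughly as follows. This is only an illustration, not Uniffle code: `ServerState`, `ReassignTrigger`, `QuotaTrigger`, `HealthTrigger`, and `AnyTrigger` are hypothetical names, and the quota and health fields are assumed inputs that the client would have to obtain somehow.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical snapshot of what the client knows about one shuffle server. */
final class ServerState {
  final String serverId;
  final long appUsedBytes;   // bytes this application has written to the server (assumed metric)
  final long appQuotaBytes;  // per-application quota on the server (assumed metric)
  final boolean healthy;     // health status reported for the server (assumed signal)

  ServerState(String serverId, long appUsedBytes, long appQuotaBytes, boolean healthy) {
    this.serverId = serverId;
    this.appUsedBytes = appUsedBytes;
    this.appQuotaBytes = appQuotaBytes;
    this.healthy = healthy;
  }
}

/** Hypothetical plug-in point: decides whether writes should move to the next server. */
interface ReassignTrigger {
  boolean shouldSwitch(ServerState state);
}

/** Strategy 1: switch once the application has used up its quota on the server. */
class QuotaTrigger implements ReassignTrigger {
  @Override
  public boolean shouldSwitch(ServerState state) {
    return state.appUsedBytes >= state.appQuotaBytes;
  }
}

/** Strategy 2: switch when the server is reported as unhealthy. */
class HealthTrigger implements ReassignTrigger {
  @Override
  public boolean shouldSwitch(ServerState state) {
    return !state.healthy;
  }
}

/** Combines several triggers; a switch happens as soon as any one of them fires. */
class AnyTrigger implements ReassignTrigger {
  private final List<ReassignTrigger> triggers;

  AnyTrigger(List<ReassignTrigger> triggers) {
    this.triggers = new ArrayList<>(triggers);
  }

  @Override
  public boolean shouldSwitch(ServerState state) {
    for (ReassignTrigger trigger : triggers) {
      if (trigger.shouldSwitch(state)) {
        return true;
      }
    }
    return false;
  }
}
```

With an interface like this, a new trigger condition would only need another `ReassignTrigger` implementation, and the write path would not have to change.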

@jerqi changed the title from "[FEATURE] support shuffle partition data write to multiple shuffle" to "[FEATURE] support shuffle partition data write to multiple shuffle servers" on Dec 18, 2023
@connorlwilkes
Contributor

How would this differ from `rss.data.replica.write`?

@jerqi
Contributor

jerqi commented Dec 19, 2023

> When to trigger the next node write?
>
> Two strategies I propose are:
>
>   • Based on the application quota of a shuffle server.
>   • Based on the shuffle server's health status.
>
> @jerqi @zuston @leixm

+1 for this proposal.

@jerqi
Contributor

jerqi commented Dec 19, 2023

> How would this differ from rss.data.replica.write

With this feature, only one replica may be written, but that replica's data can be stored across different servers.

@xumanbu
Contributor Author

xumanbu commented Dec 27, 2023

@jerqi @zuston I added a dataflow diagram to the "Describe the solution" section to illustrate the implementation plan.

@zuston
Member

zuston commented Dec 29, 2023

> @jerqi @zuston added a dataflow diagram in describe the solution for implementation plan.

Looks good.

@jerqi
Contributor

jerqi commented Dec 29, 2023

> @jerqi @zuston added a dataflow diagram in describe the solution for implementation plan.

It would be better to give us a Google document like https://docs.google.com/document/d/1p1PksBN2LJ-OtGEHvdyEuH9b1Mv1aD_exMPl4TNaTs0/edit#heading=h.c7fauf9cicbi

@xumanbu
Contributor Author

xumanbu commented Dec 29, 2023

> > @jerqi @zuston added a dataflow diagram in describe the solution for implementation plan.
>
> You would better give us a google document like https://docs.google.com/document/d/1p1PksBN2LJ-OtGEHvdyEuH9b1Mv1aD_exMPl4TNaTs0/edit#heading=h.c7fauf9cicbi

OK, I added a Google doc: https://docs.google.com/document/d/1FRGP82R7JBnr3IjmF8PQ6qWP9GnKQsk-N1OWk_f1kSY

@zuston
Member

zuston commented Jan 3, 2024

LGTM. Please go ahead.

xumanbu added a commit to xumanbu/incubator-uniffle that referenced this issue Jan 6, 2024
xumanbu added a commit to xumanbu/incubator-uniffle that referenced this issue Jan 12, 2024
zuston pushed a commit that referenced this issue Feb 21, 2024
…ing from reassignment mechanism (#1445)

### What changes were proposed in this pull request?

Support writing a partition to multiple servers by leveraging the reassignment mechanism.

### Why are the changes needed?

For: #1373

### Does this PR introduce _any_ user-facing change?

1. Add the config `rss.server.dynamic.assign.enabled` to control whether a faulty shuffle server is reassigned.
2. Support reassigning a new shuffle server for blocks that failed to send.
3. ShuffleReader support for reading a partition from multiple servers will be implemented in the next PR.
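
As a rough illustration of the second and third points, the sketch below models how a client might track the servers assigned to a partition and redirect failed blocks to a replacement server. It is not the code from this PR: `PartitionAssignment` and `FailedBlockResender` are hypothetical names, and the in-memory maps stand in for whatever bookkeeping the real client does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side record of which servers have been assigned to each partition. */
class PartitionAssignment {
  // partitionId -> servers in the order they were assigned
  private final Map<Integer, List<String>> servers = new HashMap<>();

  void assign(int partitionId, String serverId) {
    servers.computeIfAbsent(partitionId, k -> new ArrayList<>()).add(serverId);
  }

  /** New writes go to the most recently assigned server. */
  String currentServer(int partitionId) {
    List<String> assigned = servers.get(partitionId);
    if (assigned == null || assigned.isEmpty()) {
      throw new IllegalStateException("No server assigned for partition " + partitionId);
    }
    return assigned.get(assigned.size() - 1);
  }

  /** A reader would need every server, because earlier blocks stay where they were written. */
  List<String> allServers(int partitionId) {
    List<String> assigned = servers.get(partitionId);
    return assigned == null ? new ArrayList<>() : new ArrayList<>(assigned);
  }
}

/** Hypothetical helper that plans where failed blocks should be sent after a reassignment. */
class FailedBlockResender {
  /** Adds the replacement server to the partition and maps each failed block id to it. */
  static Map<Long, String> reassign(
      PartitionAssignment assignment,
      int partitionId,
      List<Long> failedBlockIds,
      String replacementServer) {
    assignment.assign(partitionId, replacementServer);
    Map<Long, String> resendPlan = new LinkedHashMap<>();
    for (Long blockId : failedBlockIds) {
      resendPlan.put(blockId, assignment.currentServer(partitionId));
    }
    return resendPlan;
  }
}
```

In this model the partition's data ends up on both the original and the replacement server, which is why reading has to consult every assigned server.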

### How was this patch tested?

UTs

---------

Co-authored-by: jam.xu <jam.xu@vipshop.com>
dingshun3016 pushed a commit to dingshun3016/incubator-uniffle that referenced this issue Feb 26, 2024
dingshun3016 pushed a commit to dingshun3016/incubator-uniffle that referenced this issue Mar 14, 2024
dingshun3016 added a commit to dingshun3016/incubator-uniffle that referenced this issue Mar 14, 2024
dingshun3016 added a commit to dingshun3016/incubator-uniffle that referenced this issue Mar 14, 2024
dingshun3016 added a commit to dingshun3016/incubator-uniffle that referenced this issue Mar 14, 2024
dingshun3016 added a commit to dingshun3016/incubator-uniffle that referenced this issue Mar 14, 2024
dingshun3016 added a commit to dingshun3016/incubator-uniffle that referenced this issue Mar 14, 2024
zuston pushed a commit that referenced this issue Mar 15, 2024
…signment (#1580)

### What changes were proposed in this pull request?

Add the client type when requesting a shuffle assignment.

### Why are the changes needed?

Fix: (#1373)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Not necessary.
zuston pushed a commit that referenced this issue Mar 15, 2024
### What changes were proposed in this pull request?

Fix the incorrect partition id type.

### Why are the changes needed?
Fix: (#1373)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Not necessary.
zuston pushed a commit that referenced this issue Apr 1, 2024
…n partition data reassign is enabled (#1583)

### What changes were proposed in this pull request?
Fix the task failure retry parameter not taking effect, and add the task failure retry parameter in other locations.

### Why are the changes needed?
Fix: (#1373)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Not necessary.

---------

Co-authored-by: shun01.ding <shun01.ding@vipshop.com>
dingshun3016 pushed a commit to dingshun3016/incubator-uniffle that referenced this issue Apr 1, 2024
… range error when reassign faulty shuffle server for tasks
dingshun3016 pushed a commit to dingshun3016/incubator-uniffle that referenced this issue Apr 8, 2024
… range error when reassign faulty shuffle server for tasks
zuston pushed a commit that referenced this issue Apr 8, 2024
… after reassign (#1612)

### What changes were proposed in this pull request?

Fix the partition id inconsistency that occurs when a new shuffle server is reassigned.

For example:
When writing data to node a1, the registered partition id is 1003.
Node a1 fails, and node b1 is reassigned and registered as the shuffle server, but partitionNumPerRange is 1.
When writing data to node b1, a NO_REGISTER exception is thrown.

### Why are the changes needed?

Fix: (#1373)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

---------

Co-authored-by: shun01.ding <shun01.ding@vipshop.com>