
[FEATURE] Auto-balancing volumes between nodes & disks #4105

Open
iosifnicolae2 opened this issue Jun 12, 2022 · 6 comments
Labels: area/performance, area/replica, area/stability, area/volume-replica-scheduling, highlight, kind/feature, priority/0, require/auto-e2e-test, require/lep

Comments


iosifnicolae2 commented Jun 12, 2022

Is your feature request related to a problem?

  • The actual size of our volumes is very high, and some disks are running low on available space.
  • Performance is affected because, instead of distributing the volumes across multiple disks, a single disk ends up storing all of them (an even better option would be to balance volumes based on IOPS usage, but for now that's science fiction :D).

Describe the solution you'd like

It would be awesome if Longhorn could:

  1. Allocate new volumes based on available disk space and/or actual size usage (see the sketch after this list).
  2. Evict a volume from a disk.
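
To make item 1 concrete, here is a minimal sketch of what space-aware placement could look like. The Disk struct, its fields, and the selection rule are illustrative assumptions, not Longhorn's actual data model or scheduler:

```go
package main

import "fmt"

// Disk is a hypothetical model of a node disk; the field names do not
// mirror Longhorn's real CRD fields.
type Disk struct {
	Name             string
	StorageAvailable int64 // bytes free
	StorageMaximum   int64 // bytes total
}

// usedFraction returns how full a disk is, from 0.0 to 1.0.
func usedFraction(d Disk) float64 {
	if d.StorageMaximum == 0 {
		return 1.0
	}
	return 1.0 - float64(d.StorageAvailable)/float64(d.StorageMaximum)
}

// pickDisk selects the disk with the most free space that can still
// fit the requested replica size.
func pickDisk(disks []Disk, replicaSize int64) (Disk, bool) {
	var best Disk
	found := false
	for _, d := range disks {
		if d.StorageAvailable < replicaSize {
			continue // skip disks that cannot fit the replica
		}
		if !found || d.StorageAvailable > best.StorageAvailable {
			best, found = d, true
		}
	}
	return best, found
}

func main() {
	disks := []Disk{
		{"disk-1", 10 << 30, 100 << 30}, // 10 GiB free, nearly full
		{"disk-2", 80 << 30, 100 << 30}, // 80 GiB free
	}
	if d, ok := pickDisk(disks, 20<<30); ok {
		fmt.Printf("schedule new replica on %s (%.0f%% used)\n", d.Name, usedFraction(d)*100)
	}
}
```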

Describe alternatives you've considered

  1. Set data locality to best-effort and then constrain the deployment to different nodes (this does not fix the problem entirely because we can't configure which disk is used).
  2. To evict a volume: increase the replication factor to 2, wait a few hours for the new replica to rebuild, delete the old replica, then set the replication factor back to 1 (a scripted sketch of this follows).
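
For reference, workaround 2 can be scripted. The sketch below only models the sequence; volumeAPI and its methods are stand-ins invented for illustration, not a real Longhorn client API:

```go
// Package workaround sketches the manual "evict via rebuild" sequence.
package workaround

import "time"

// volumeAPI is a hypothetical client interface; none of these method
// names come from a real library.
type volumeAPI interface {
	SetReplicaCount(volume string, n int) error
	ReplicasHealthy(volume string) (bool, error)
	DeleteReplica(volume, replica string) error
}

// evictViaRebuild raises the replica count so a new replica is rebuilt
// on another disk, waits for it to become healthy, deletes the replica
// on the full disk, then restores the original replica count.
func evictViaRebuild(api volumeAPI, volume, oldReplica string) error {
	if err := api.SetReplicaCount(volume, 2); err != nil {
		return err
	}
	for {
		healthy, err := api.ReplicasHealthy(volume)
		if err != nil {
			return err
		}
		if healthy {
			break
		}
		time.Sleep(30 * time.Second) // poll until the rebuild finishes
	}
	if err := api.DeleteReplica(volume, oldReplica); err != nil {
		return err
	}
	return api.SetReplicaCount(volume, 1)
}
```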

Related: #3840, #3887

@iosifnicolae2 iosifnicolae2 added the kind/feature Feature request, new feature label Jun 12, 2022
@derekbit derekbit added this to New in Community Issue Review via automation Jun 13, 2022
derekbit (Member) commented

Hello @iosifnicolae2,
Could you elaborate on "evict a volume from a disk" a bit more?
What's the purpose of the feature?

@innobead innobead changed the title Auto-balancing volumes between nodes & disks [FEATURE] Auto-balancing volumes between nodes & disks Jun 16, 2022
iosifnicolae2 (Author) commented

> Could you elaborate on "evict a volume from a disk" a bit more? What's the purpose of the feature?

Hi, sure.

By evicting a volume from a disk I mean actually moving it to another disk, because the current disk is almost full and we have other disks with plenty of space.

derekbit (Member) commented

> By evicting a volume from a disk I mean actually moving it to another disk, because the current disk is almost full and we have other disks with plenty of space.

Got it. More precisely, you hope the replica can automatically move to another disk that has more free space.
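
This is roughly what the LEP referenced later in this thread (#7576, auto-balance pressured disks) formalizes. A toy sketch of the trigger side, assuming a hypothetical usage-percentage threshold (the eventual setting name and default may differ):

```go
// Package rebalance sketches a pressure-triggered replica migration check.
package rebalance

// Disk reuses the toy model from the earlier sketch; fields are hypothetical.
type Disk struct {
	Name             string
	StorageAvailable int64 // bytes free
	StorageMaximum   int64 // bytes total
}

// underPressure reports whether a disk's usage exceeds thresholdPct,
// e.g. 90 means "consider rebalancing once the disk is over 90% full".
func underPressure(d Disk, thresholdPct int) bool {
	if d.StorageMaximum == 0 {
		return false
	}
	used := d.StorageMaximum - d.StorageAvailable
	return used*100 > d.StorageMaximum*int64(thresholdPct)
}

// migrationTarget picks a disk with enough headroom to take a replica
// of the given size without itself crossing the pressure threshold.
func migrationTarget(disks []Disk, replicaSize int64, thresholdPct int) (Disk, bool) {
	for _, d := range disks {
		after := d
		after.StorageAvailable -= replicaSize
		if after.StorageAvailable >= 0 && !underPressure(after, thresholdPct) {
			return d, true
		}
	}
	return Disk{}, false
}
```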

joshimoo (Contributor) commented

Epic for tracking replica scheduling improvements: #3840.
I think we will be adding disk anti-affinity rules as part of the volume group feature (#2128).
The replica eviction request sounds reasonable, since we already allow evicting replicas from a node. (A toy anti-affinity filter is sketched below.)
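
For context on the anti-affinity part: such a rule would simply filter out disks that already host a replica of the same volume before any space-based selection runs. A toy filter, not Longhorn's scheduler code:

```go
// Package antiaffinity sketches a disk-level anti-affinity filter.
package antiaffinity

// filterDisks drops every candidate disk that already hosts a replica
// of the given volume, so two replicas never share a disk.
// replicasOnDisk maps disk name -> volumes with a replica on that disk.
func filterDisks(candidates []string, replicasOnDisk map[string][]string, volume string) []string {
	var out []string
	for _, disk := range candidates {
		hasReplica := false
		for _, v := range replicasOnDisk[disk] {
			if v == volume {
				hasReplica = true
				break
			}
		}
		if !hasReplica {
			out = append(out, disk)
		}
	}
	return out
}
```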

@joshimoo joshimoo added area/replica Volume replica where data is placed area/volume-replica-scheduling Volume replica scheduling related labels Jun 17, 2022
@derekbit derekbit moved this from New to Resolved/Scheduled in Community Issue Review Jun 17, 2022
@innobead innobead added this to the v1.5.0 milestone Aug 19, 2022
@innobead innobead added priority/0 Must be fixed in this release (managed by PO) area/performance System, volume performance area/stability System or volume stability highlight Important feature/issue to highlight labels Jan 9, 2023
innobead (Member) commented Jan 9, 2023

cc @c3y1huang

@innobead innobead assigned c3y1huang and unassigned PhanLe1010 Apr 13, 2023
@innobead innobead modified the milestones: v1.5.0, v1.6.0 Apr 13, 2023
@innobead innobead assigned mantissahz and unassigned c3y1huang Aug 16, 2023
@innobead innobead modified the milestones: v1.6.0, v1.7.0 Sep 14, 2023
@innobead innobead assigned c3y1huang and unassigned mantissahz Dec 10, 2023
@innobead innobead modified the milestones: v1.7.0, v1.6.0 Dec 18, 2023
@innobead innobead removed this from the v1.6.0 milestone Jan 2, 2024

longhorn-io-github-bot commented Feb 5, 2024

Pre Ready-For-Testing Checklist

  • Where are the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
    The PR for the YAML change is at:
    The PR for the chart change is at:

  • Has the backend code been merged (Manager, Engine, Instance Manager, BackupStore, etc.) (including backport-needed/*)?
    The PR is at:

  • Which areas/issues might this PR have potential impacts on?
    Area auto-balance, replica, scheduling
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR been submitted?
    The LEP PR is at feat(lep): auto-balance pressured disks #7576

  • If labeled: area/ui Has the UI issue been filed or is it ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at TBU

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test cases? If only the test case skeleton exists without implementation, have you created an implementation issue (including backport-needed/*)?
    The automation skeleton PR is at
    The automation test case PR is at test(integration): replica auto balance disk in pressure longhorn-tests#1703
    The issue for the automation test case implementation is at (please create it via the template)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces code for backward compatibility, has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@c3y1huang c3y1huang added the require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated label Feb 5, 2024
@innobead innobead added the require/lep Require adding/updating enhancement proposal label Feb 26, 2024