[FEATURE] Local volume for distributed data workloads #3957
Comments
@derekbit good job on the evaluation :)
@derekbit as we discussed, let's add this to 1.4.0.
This is great news! May I ask what brings IO down in comparison to the baseline?
It's more about improving latency. Basically, there are 3 primary things:
Pre Ready-For-Testing Checklist
longhorn/longhorn-manager#1562
Thanks for your answer. Perhaps my question was misleading: I asked about the difference between local-path-provisioner and Longhorn with data locality, a Unix socket, and no replication. The difference between 90,531 and 47,667, and also between 77,799 and 21,158, is still huge.
I see. The Longhorn local volume is not meant to match the performance of the local-path-provisioner you mentioned. It is still based on the existing data path, with some changes on top to ensure strict data locality between the engine and the replica, gaining some IO performance compared with volumes using best-effort or disabled data locality.
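For context, this is roughly how the feature surfaces to users. A minimal StorageClass sketch assuming the `strict-local` data-locality value this issue introduced; the class name is arbitrary, and strict-local only works with a single replica:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-strict-local        # arbitrary name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "1"              # strict-local requires exactly one replica
  dataLocality: "strict-local"       # keep the engine and replica on the same node
  staleReplicaTimeout: "2880"
```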
Latency is still somewhere around 500% of local-path.
The local volume's data path was deliberately not changed much in this improvement, in order to preserve existing functionality such as snapshotting, backup, restore, etc. We will continue improving the local volume, e.g. with pass-through, to squeeze out more performance. However, the performance difference will still be significant after these improvements because of the existing data path design.
Sure, 150% or so is a fine price for the overlay, but 500% on latency is a bit too much to buy; it makes this unusable for local-path use cases, which is exactly where you want fast, low-latency disk access for systems that take care of replication themselves (Cassandra, Redpanda, Postgres, Chronicle stores). It is also quite a reduction in available IOPS: looking at the stats above, writes drop from 97k down to 28k at best. This is not to dismiss the work done here; I think it's a great step in the right direction. It's just that, realistically, the performance is somewhat off for local PV use cases at the moment. Did anything get done with SPDK? In the last discussions the idea was that it could help reduce some of that.
What I raised, and what was closed in favor of this issue (#1965), was, and I quote, a use case for local PVs akin to OpenEBS's local PV offerings (e.g. their lvm-localpv or hostpath-localpv) or MinIO's DirectPV, without needing to switch vendor, and while still having unified management (e.g. UI, monitoring, backup). "Using K8s native LocalPVs is useful as no network-based storage can keep up with bare metal in write IOPS/latency/throughput when using NVMe/Optane disks. Giving Direct I/O: near-zero disk performance overhead."
Need to update the upgrade test image, because new options are added.
Hi @derekbit, when I attached a strict-local volume from one node to another node, the volume looped between attaching and detaching, and the only active button was Delete. Is this behavior expected? Thank you.
This is expected, because a strict-local volume requires its single replica to be on the same node as the engine. So, if you attach the volume to another node, the attach-detach loop happens because of the lack of a local replica.
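To illustrate why the loop happens, pinning the workload to the node that holds the volume's only replica avoids it. A hypothetical pod spec, where `node-1` and `strict-local-pvc` stand in for the replica's node and a PVC bound to a strict-local volume:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: strict-local-demo            # hypothetical pod name
spec:
  nodeSelector:
    kubernetes.io/hostname: node-1   # the node that owns the volume's single replica (assumed)
  containers:
    - name: app
      image: busybox
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: strict-local-pvc  # hypothetical PVC bound to a strict-local volume
```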
Verified in Longhorn master.
After discussing with @derekbit, we will see if we need a validation hook to avoid unnecessary, unintended reconciling in this situation.
We also need to check whether the local volume supports auto/manual salvage. cc @chriscchien @longhorn/qa
Verify the test case: node restart/down scenario.
Is your feature request related to a problem? Please describe
Longhorn is a highly available, replica-based storage system. It's good for fault tolerance, read performance, data protection, etc., but on the other hand it also incurs extra costs, such as requiring more disk space for replication.
In some cases, especially for distributed data workloads (StatefulSets) like databases (e.g. Cassandra, Kafka), the applications already have their own data replication, sharding, etc. We should provide a better-suited volume type for these use cases while still supporting existing volume functionality such as snapshotting and backup/restore.
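To make the use case concrete, here is a sketch of how such a workload might consume a local volume through a StatefulSet's volumeClaimTemplates, so that each pod gets its own node-local volume while the application handles replication itself. The `longhorn-strict-local` class name and the Cassandra image are illustrative, not part of this proposal:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra                    # illustrative workload with app-level replication
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:4.1
          volumeMounts:
            - name: data
              mountPath: /var/lib/cassandra
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn-strict-local   # hypothetical local-volume class
        resources:
          requests:
            storage: 10Gi
```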
Describe the solution you'd like
Add a `strict` or `enforced` data-locality mode to require that the volume's one replica be local, next to the workload.

Describe alternatives you've considered
N/A
Additional context
#1965