-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(k8s): Improve performance of kubeconfig module #6032
Conversation
d8c154a
to
0a7b94b
Compare
3b054e3
to
461f931
Compare
How can this be fixed? |
src/modules/kubernetes.rs
Outdated
.map(String::from), | ||
namespace: ctx_yaml["context"]["namespace"] | ||
.as_str() | ||
fn get_current_kube_context_name(document: &JsonOrYaml) -> Option<String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn get_current_kube_context_name(document: &JsonOrYaml) -> Option<String> { | |
fn get_current_kube_context_name(document: &JsonOrYaml) -> Option<&str> { |
This avoids cloning, and thanks to lifetime elision, there's no need to specify lifetime annotations (Rust infers that the return value borrows from document
, as there's no other option). The function return value seems to be used in places where read-only access/&str
suffices (except for one place, where you then need to clone).
So this change doesn't improve performance in the sense of avoiding clones (before: 1, after: 1), but it's neat to lean on the side of 'least capability necessary'. The effect would start kicking in if this function were called more than once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/modules/kubernetes.rs
Outdated
enum JsonOrYaml { | ||
Json(JsonValue), | ||
Yaml(Yaml), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enum JsonOrYaml { | |
Json(JsonValue), | |
Yaml(Yaml), | |
} | |
#[derive(Debug, Clone)] | |
enum Document { | |
Json(JsonValue), | |
Yaml(Yaml), | |
} |
AorB
is not a great abstraction, I feel likeDocument
is a better name; the variable bound to this value is often calleddocument
in this file as well, makes sense!- it's good practice to derive the most common traits. Downstream crates wouldn't be able to otherwise.
Debug
is important for a baseline string representation (for logging, ...). Cloning is convenient. More is possible, but better to implement as-needed (this is subjective). More context: https://rust-lang.github.io/api-guidelines/interoperability.html#types-eagerly-implement-common-traits-c-common-traits (that book is a great resource anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/modules/kubernetes.rs
Outdated
fn get_kube_ctx_components( | ||
document: &JsonOrYaml, | ||
current_ctx_name: &str, | ||
) -> Option<KubeCtxComponents> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also a great candidate for borrowing. Something along the lines of... (just as an example)
fn get_kube_ctx_components( | |
document: &JsonOrYaml, | |
current_ctx_name: &str, | |
) -> Option<KubeCtxComponents> { | |
fn get_kube_ctx_components<'doc>( | |
document: &'doc Document, | |
current_ctx_name: &str, | |
) -> Option<KubeCtxComponents<'doc>> { |
with something like
#[derive(Default)]
struct KubeCtxComponents<'doc> {
user: Option<&'doc str>,
namespace: Option<&'doc str>,
cluster: Option<&'doc str>,
}
This:
- gets rid of clones (
String::from
) - is unlikely to improve performance in meaningful ways, while complicating (== adding more constraints to) the code... but if this PR is about performance, why not give it a shot...
- might not work (depending on how this change affects other parts of the code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the tiny, likely not even measurable performance improvement is worth the added complexity
f2c6f78
to
50bafbd
Compare
@@ -97,6 +108,12 @@ fn get_aliased_name<'a>( | |||
} | |||
} | |||
|
|||
#[derive(Debug)] | |||
enum Document { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really happy that this ends up in two different codepath everywhere, just to get a bit more speed in an edge case (That's at least how I perceive this: how often do you have such big kube configs?) :-(
Personally, I would also have expected to get a common data structure from both parser.
If this is acceptable, then I would expect an explanation in the code why this solution was chosen, so one does not have to look up the original commits to understand why two different parsers are used, either here where the data structure is defined or in the parser function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, the yaml-rust2 crate doesn't support serde yet and serde-yaml is unmaintained, but there are plans for this at the more actively developed branch of yaml-rust2.
Nevertheless, it might be possible to reduce duplication a bit with macro_rules!
at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried to make this better by using generics to not need to duplicate the logic that finds the relevant fields for the yaml and json case and also left a comment as to why to have the two codepaths.
I do agree kubeconfigs of this size are unusual but it is not "a bit" more speed, its about 77% faster so this makes a really huge difference once you do have such a kubeconfig.
daaf7b0
to
f9b9f90
Compare
src/modules/kubernetes.rs
Outdated
Some(value) => match value.chars().next() { | ||
// Parsing as json is about an order of magnitude faster than parsing | ||
// as yaml, so do that if possible. | ||
Some('{') => serde_json::from_str(&value).ok().map(Document::Json), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to fall back to YamlLoader::load_from_str
if serde_json::from_str
fails, even if the first character is {
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a fallback and test, ptal
fc8b8ed
to
8ada2e7
Compare
This module currently takes about 200 ms when using our ~10MiB kubeconfig. This change improves its performance by: * Only parsing the file once * (Naively) checking if the content is yaml or json and potentially parse as the latter, as that seems to be much faster
@davidkna gentle ping, any chance you could give this another look, please? Really appreciate it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@davidkna what is the process to get a change like this actually merged? |
Thank you for your contribution @alvaroaleman and thanks for reviewing @davidkna, @jankatins and @alexpovel. |
This module currently takes about 200 ms when using our ~10MiB kubeconfig. This change improves its performance by:
All timings with warm page cache.
Description
Motivation and Context
Closes #
Screenshots (if appropriate):
How Has This Been Tested?
Checklist: