perf(k8s): Improve performance of kubeconfig module #6032

alvaroaleman · 2024-06-11T00:34:50Z

This module currently takes about 200 ms with our ~10 MiB kubeconfig. This change improves its performance by:

* Parsing the file only once, which reduces the runtime to around 130 ms.
* (Naively) checking whether the content is YAML or JSON and parsing it as JSON when possible, as that seems to be much faster. This reduces the runtime to ~30 ms.

All timings with warm page cache.

alvaroaleman · 2024-06-11T03:42:23Z

Main workflow / Check if config schema is up to date (pull_request) Failing after 33s

How can this be fixed?

alexpovel · 2024-06-11T16:21:39Z

src/modules/kubernetes.rs

-            .map(String::from),
-        namespace: ctx_yaml["context"]["namespace"]
-            .as_str()
+fn get_current_kube_context_name(document: &JsonOrYaml) -> Option<String> {


Suggested change

-fn get_current_kube_context_name(document: &JsonOrYaml) -> Option<String> {
+fn get_current_kube_context_name(document: &JsonOrYaml) -> Option<&str> {

This avoids cloning, and thanks to lifetime elision there is no need for lifetime annotations: Rust infers that the return value borrows from document, since that is the only reference parameter. The return value seems to be used only in places where read-only access (&str) suffices, except for one place, where you then need to clone.

So this change doesn't improve performance in the sense of avoiding clones (before: 1, after: 1), but it's neat to err on the side of 'least capability necessary'. The effect would start kicking in if this function were called more than once.
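To illustrate the borrowing point, here is a minimal sketch. The `Document` struct and `owned_context_name` are hypothetical stand-ins made up for this example (the real type in this PR wraps a parsed JSON or YAML value); only the function signature mirrors the suggestion.

```rust
/// Hypothetical stand-in for the parsed kubeconfig (an assumption for this
/// sketch, not the type used in the PR).
struct Document {
    current_context: Option<String>,
}

/// Returns a borrowed view of the context name. Lifetime elision ties the
/// returned `&str` to `document`, the only reference parameter, so no
/// explicit lifetime annotation is needed.
fn get_current_kube_context_name(document: &Document) -> Option<&str> {
    document.current_context.as_deref()
}

/// A caller that really needs ownership clones explicitly, at the one place
/// where that is required (hypothetical helper).
fn owned_context_name(document: &Document) -> Option<String> {
    get_current_kube_context_name(document).map(String::from)
}
```

Callers that only read the name pay nothing; the single clone moves to the caller that needs a `String`.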

alexpovel · 2024-06-11T16:31:21Z

src/modules/kubernetes.rs

+enum JsonOrYaml {
+    Json(JsonValue),
+    Yaml(Yaml),
+}


Suggested change

-enum JsonOrYaml {
-    Json(JsonValue),
-    Yaml(Yaml),
-}
+#[derive(Debug, Clone)]
+enum Document {
+    Json(JsonValue),
+    Yaml(Yaml),
+}

AorB is not a great abstraction; I feel like Document is a better name. The variable bound to this value is often called document in this file as well, so the name fits.

It's good practice to derive the most common traits eagerly, because downstream crates cannot implement them for your type themselves. Debug is important for a baseline string representation (for logging, ...). Clone is convenient. More is possible, but better to implement as needed (this is subjective). More context: https://rust-lang.github.io/api-guidelines/interoperability.html#types-eagerly-implement-common-traits-c-common-traits (that book is a great resource anyway)
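A small sketch of what the derives buy you. `JsonValue` and `Yaml` here are toy placeholders (assumptions for this example, not the real `serde_json` or YAML crate types), and `describe` is a hypothetical helper.

```rust
// Toy placeholders for the real parsed-value types (assumptions).
#[derive(Debug, Clone)]
struct JsonValue(String);

#[derive(Debug, Clone)]
struct Yaml(String);

// Deriving on the enum works because every variant's payload already
// implements `Debug` and `Clone`.
#[derive(Debug, Clone)]
enum Document {
    Json(JsonValue),
    Yaml(Yaml),
}

/// `Debug` gives a baseline string representation, e.g. for log output.
fn describe(document: &Document) -> String {
    format!("{:?}", document)
}
```

With `Clone` derived, `document.clone()` is available wherever an owned copy is needed, and `{:?}` formatting works in log statements without any hand-written code.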

alexpovel · 2024-06-11T16:58:44Z

src/modules/kubernetes.rs

+fn get_kube_ctx_components(
+    document: &JsonOrYaml,
+    current_ctx_name: &str,
+) -> Option<KubeCtxComponents> {


This is also a great candidate for borrowing. Something along the lines of... (just as an example)

Suggested change

-fn get_kube_ctx_components(
-    document: &JsonOrYaml,
-    current_ctx_name: &str,
-) -> Option<KubeCtxComponents> {
+fn get_kube_ctx_components<'doc>(
+    document: &'doc Document,
+    current_ctx_name: &str,
+) -> Option<KubeCtxComponents<'doc>> {

with something like

    #[derive(Default)]
    struct KubeCtxComponents<'doc> {
        user: Option<&'doc str>,
        namespace: Option<&'doc str>,
        cluster: Option<&'doc str>,
    }

This:

* gets rid of clones (String::from)
* is unlikely to improve performance in meaningful ways, while complicating (== adding more constraints to) the code... but if this PR is about performance, why not give it a shot...
* might not work (depending on how this change affects other parts of the code)
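For reference, a self-contained sketch of the borrowed variant. The `Document` struct backed by nested `HashMap`s is purely an assumption to keep the example runnable; the real code looks fields up in a parsed JSON or YAML value.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in: context name -> (field name -> value).
struct Document {
    contexts: HashMap<String, HashMap<String, String>>,
}

/// All fields borrow from the document, so building this struct allocates
/// nothing and clones nothing.
#[derive(Debug, Default, PartialEq)]
struct KubeCtxComponents<'doc> {
    user: Option<&'doc str>,
    namespace: Option<&'doc str>,
    cluster: Option<&'doc str>,
}

/// The explicit `'doc` lifetime is needed here because there are two
/// reference parameters; the result borrows from `document` only.
fn get_kube_ctx_components<'doc>(
    document: &'doc Document,
    current_ctx_name: &str,
) -> Option<KubeCtxComponents<'doc>> {
    let ctx = document.contexts.get(current_ctx_name)?;
    Some(KubeCtxComponents {
        user: ctx.get("user").map(String::as_str),
        namespace: ctx.get("namespace").map(String::as_str),
        cluster: ctx.get("cluster").map(String::as_str),
    })
}
```

The added constraint is visible in the signature: the returned components cannot outlive the parsed document, which is the complexity cost mentioned in the list above.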

I don't think the tiny, likely not even measurable performance improvement is worth the added complexity

jankatins · 2024-06-12T19:31:31Z

src/modules/kubernetes.rs

@@ -97,6 +108,12 @@ fn get_aliased_name<'a>(
    }
 }

+#[derive(Debug)]
+enum Document {


I'm not really happy that this ends up with two different code paths everywhere, just to get a bit more speed in an edge case (that's at least how I perceive this: how often do you have such big kubeconfigs?) :-(

Personally, I would also have expected to get a common data structure from both parsers.

If this is acceptable, then I would expect a comment in the code explaining why this solution was chosen, either where the data structure is defined or in the parser function, so that one does not have to look up the original commits to understand why two different parsers are used.

Sadly, the yaml-rust2 crate doesn't support serde yet and serde-yaml is unmaintained, but there are plans for this at the more actively developed branch of yaml-rust2.

Nevertheless, it might be possible to reduce duplication a bit with macro_rules! at least.

I've tried to make this better by using generics, so the logic that finds the relevant fields is not duplicated for the YAML and JSON cases, and I also left a comment explaining why there are two code paths.

I do agree kubeconfigs of this size are unusual, but it is not "a bit" more speed, it's about 77% faster, so this makes a really huge difference once you do have such a kubeconfig.
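One way the generics approach could look, as a rough sketch. The `DataValue` trait and the toy `JsonValue` / `Yaml` enums below are assumptions made up for illustration, not necessarily what the PR ended up with: each format gets a thin adapter, and the lookup logic is written once.

```rust
/// Hypothetical minimal interface that both parsed representations implement.
trait DataValue {
    fn get(&self, key: &str) -> Option<&Self>;
    fn as_str(&self) -> Option<&str>;
}

/// Toy stand-ins for the real JSON and YAML value types (assumptions).
enum JsonValue {
    Str(String),
    Map(Vec<(String, JsonValue)>),
}

enum Yaml {
    Str(String),
    Map(Vec<(String, Yaml)>),
}

impl DataValue for JsonValue {
    fn get(&self, key: &str) -> Option<&Self> {
        match self {
            JsonValue::Map(entries) => entries
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
            JsonValue::Str(_) => None,
        }
    }

    fn as_str(&self) -> Option<&str> {
        match self {
            JsonValue::Str(s) => Some(s.as_str()),
            JsonValue::Map(_) => None,
        }
    }
}

impl DataValue for Yaml {
    fn get(&self, key: &str) -> Option<&Self> {
        match self {
            Yaml::Map(entries) => entries
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
            Yaml::Str(_) => None,
        }
    }

    fn as_str(&self) -> Option<&str> {
        match self {
            Yaml::Str(s) => Some(s.as_str()),
            Yaml::Map(_) => None,
        }
    }
}

/// Written once, works for either representation.
fn current_context_name<T: DataValue>(document: &T) -> Option<&str> {
    document.get("current-context")?.as_str()
}
```

The per-format code shrinks to the two accessor methods; everything that knows about kubeconfig field names lives in generic functions like `current_context_name`.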

davidkna · 2024-06-13T06:43:58Z

src/modules/kubernetes.rs

+            Some(value) => match value.chars().next() {
+                // Parsing as json is about an order of magnitude faster than parsing
+                // as yaml, so do that if possible.
+                Some('{') => serde_json::from_str(&value).ok().map(Document::Json),


I would prefer to fall back to YamlLoader::load_from_str if serde_json::from_str fails, even if the first character is {.

Added a fallback and test, ptal
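The fallback behaviour can be sketched roughly like this. `parse_json` and `parse_yaml` are deliberately simplistic toy stand-ins (assumptions, not real parsers) for `serde_json::from_str` and `YamlLoader::load_from_str`, so the example runs without external crates; `parse_document` is a hypothetical name.

```rust
#[derive(Debug, PartialEq)]
enum Document {
    Json(String),
    Yaml(String),
}

/// Toy stand-in for a JSON parser: accepts only text shaped like an object
/// that is empty or whose first key is double-quoted. Not a real parser.
fn parse_json(content: &str) -> Option<String> {
    let trimmed = content.trim();
    let inner = trimmed.strip_prefix('{')?.strip_suffix('}')?.trim_start();
    if inner.is_empty() || inner.starts_with('"') {
        Some(trimmed.to_string())
    } else {
        None
    }
}

/// Toy stand-in for a YAML parser: accepts any non-empty text.
fn parse_yaml(content: &str) -> Option<String> {
    let trimmed = content.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Try the fast JSON path when the content starts with `{`, and fall back to
/// YAML if that fails, since `{key: value}` is a valid YAML flow mapping but
/// not valid JSON.
fn parse_document(content: &str) -> Option<Document> {
    if content.starts_with('{') {
        if let Some(json) = parse_json(content) {
            return Some(Document::Json(json));
        }
    }
    parse_yaml(content).map(Document::Yaml)
}
```

The first-character check is only a hint for picking the fast path; the YAML parser remains the source of truth for anything the JSON parser rejects.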

alvaroaleman · 2024-06-17T17:12:57Z

@davidkna gentle ping, any chance you could give this another look, please? Really appreciate it!

davidkna

LGTM

alvaroaleman · 2024-07-08T13:03:01Z

@davidkna what is the process to get a change like this actually merged?

andytom · 2024-07-16T20:26:22Z

Thank you for your contribution @alvaroaleman and thanks for reviewing @davidkna, @jankatins and @alexpovel.

alvaroaleman force-pushed the wip branch 2 times, most recently from d8c154a to 0a7b94b Compare June 11, 2024 00:53

alvaroaleman marked this pull request as draft June 11, 2024 02:18

alvaroaleman force-pushed the wip branch 2 times, most recently from 3b054e3 to 461f931 Compare June 11, 2024 03:38

alvaroaleman marked this pull request as ready for review June 11, 2024 03:42

alexpovel reviewed Jun 11, 2024

alvaroaleman force-pushed the wip branch 3 times, most recently from f2c6f78 to 50bafbd Compare June 12, 2024 01:11

jankatins reviewed Jun 12, 2024

Fix config schema

baa1f66

alvaroaleman force-pushed the wip branch 2 times, most recently from daaf7b0 to f9b9f90 Compare June 12, 2024 22:40

davidkna reviewed Jun 13, 2024

alvaroaleman force-pushed the wip branch 2 times, most recently from fc8b8ed to 8ada2e7 Compare June 13, 2024 20:41

alvaroaleman force-pushed the wip branch from 8ada2e7 to 789c5eb Compare June 14, 2024 02:31

andytom changed the title ~~perf: Improve performance of kubeconfig module~~ perf(k8s): Improve performance of kubeconfig module Jun 16, 2024

davidkna approved these changes Jul 7, 2024

andytom merged commit fae92b2 into starship:master Jul 16, 2024
21 checks passed

github-actions bot mentioned this pull request Jul 15, 2024

chore(master): release 1.20.0 #5990

Merged

alvaroaleman deleted the wip branch July 20, 2024 04:24

cesarcoatl mentioned this pull request Jul 26, 2024

starship 1.20.0 Homebrew/homebrew-core#178631

Merged
