Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from JSON to protobuf for kube-api requests #2570

Merged
merged 4 commits into from Feb 27, 2024

Conversation

r2k1
Copy link
Contributor

@r2k1 r2k1 commented Feb 27, 2024

What does this PR change?

Switch from JSON to protobuf for kube-api requests. Improve performance. I assume it should also reduce load on kube-api.

On my test it:

  • reduced memory allocation from 350MB to 300MB (for data fetching)
  • reduced object allocation from 1.700.000 to 1.100.000 objects (for data fetching)
  • max memory usage (in a test) reduced from 300MB to 240MB. (total for all app)
  • added a new frame for each object type to a profiler, previously all of them were bundled together. (Now it visible pods are number one, followed by a ReplicaSet on my cluster)
  • reduced CPU consumption for resource parsing from 520ms to 180

Memory allocation space diff:
Screenshot 2024-02-27 at 12 58 55

Related issue #2473

For the future references, my test setup:

AKS cluster with ~2500 pods.

package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"runtime"
	"testing"
	"time"

	"github.com/opencost/opencost/pkg/cmd"
	"github.com/stretchr/testify/require"
)

func TestMemory(t *testing.T) {
	os.Setenv("KUBECONFIG", "/Users/r2k1/p/playground/performance-test/tc1.yaml")
	os.Setenv("PROMETHEUS_SERVER_ENDPOINT", "http://localhost:9092")
	os.Setenv("KUBERNETES_PORT", "443")
	os.Setenv("PPROF_ENABLED", "true")
	os.Setenv("CACHE_WARMING_ENABLED", "false")
	os.Setenv("DISABLE_AGGREGATE_COST_MODEL_CACHE", "true")
	os.Setenv("MAX_QUERY_CONCURRENCY", "5")

	go func() {
		err := cmd.Execute(nil)
		require.NoError(t, err)
	}()

	maxUsage := uint64(0)
	go trackMaxMemoryUsage(100*time.Microsecond, &maxUsage)
	time.Sleep(30 * time.Second)

	end := time.Now()
	start := end.Add(-time.Hour * 24)
	testFetch(t, fmt.Sprintf("http://localhost:9003/metrics"))
	testFetch(t, fmt.Sprintf("http://localhost:9003/allocation/compute?aggregate=namespace,controllerKind,controller,label:app,label:team,label:pod_template_hash&idleByNode=true&includeIdle=true&includeProportionalAssetResourceCosts=true&step=window&window=%s,%s", start.Format("2006-01-02T15:04:05Z"), end.Format("2006-01-02T15:04:05Z")))
	testFetch(t, fmt.Sprintf("http://localhost:9003/assets?window=%s,%s", start.Format("2006-01-02T15:04:05Z"), end.Format("2006-01-02T15:04:05Z")))
	t.Log("Max memory usage, KiB:", maxUsage/1024)
}


func testFetch(t *testing.T, url string) {
	startTime := time.Now()
	resp, err := http.Get(url)
	require.NoError(t, err)
	require.Less(t, resp.StatusCode, 300)
	require.GreaterOrEqual(t, resp.StatusCode, 200)
	data, err := io.ReadAll(resp.Body)
	require.NoError(t, err)
	t.Logf("%s, %v MiB, %s", time.Since(startTime), len(data)/1024/1024, url)
}

func trackMaxMemoryUsage(interval time.Duration, maxUsage *uint64) {
	var memStats runtime.MemStats

	for {
		runtime.ReadMemStats(&memStats)

		// Update maxUsage if the current HeapAlloc is greater
		if memStats.HeapAlloc > *maxUsage {
			*maxUsage = memStats.HeapAlloc
		}

		time.Sleep(interval)
	}
}

Copy link

vercel bot commented Feb 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
opencost ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 27, 2024 7:20pm

Copy link
Contributor

@michaelmdresser michaelmdresser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super cool! Thanks for the optimization numbers to help put the change in context. The decreased CPU usage is particularly exciting.

return loader.ClientConfig()
config, err := loader.ClientConfig()
if err != nil {
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please wrap this error?

Comment on lines +25 to +26
config.AcceptContentTypes = "application/vnd.kubernetes.protobuf,application/json"
config.ContentType = "application/vnd.kubernetes.protobuf"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there somewhere in the K8s documentation we can link to that explains these config values? E.g. what acceptable inputs are and what setting ContentType = "application/vnd.kubernetes.protobuf" means? I don't have the library in front of me, so if its already documented in the Go libraries godoc then that is plenty for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very very cool! I wonder why they haven't switched to this by default. Backwards compatibility I guess?

Copy link
Contributor

@mbolt35 mbolt35 Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r2k1 Just want to make sure that older versions of kubernetes will still work appropriately with this change. Is there any reason to expect versions of kubernetes on GKS, EKS, AKS may not work properly?

Just want to make sure that we're covering our bases here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware about any potential reasons why it would not work. I would expect all cloud services to use the same binary.

kube-state-metrics sets same values.

Signed-off-by: r2k1 <yokree@gmail.com>
Signed-off-by: r2k1 <yokree@gmail.com>
Copy link
Contributor

@ameijer ameijer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat!

Copy link

sonarcloud bot commented Feb 27, 2024

@AjayTripathy AjayTripathy merged commit 7e71053 into opencost:develop Feb 27, 2024
6 checks passed
@AjayTripathy
Copy link
Contributor

Thank you Artur!

@r2k1 r2k1 deleted the optimization branch February 27, 2024 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants