
JSON Marshaling of large responses is excessively expensive #3601

Closed

jacksontj opened this issue Dec 19, 2017 · 12 comments

jacksontj (Contributor) commented Dec 19, 2017

What did you do?
Sent a query looking like metricname[3d] to Prometheus.

What did you expect to see?
metricname has ~5k labelsets, and from my math I was expecting a large data response (~2G).

What did you see instead? Under which circumstances?
Instead, I seemingly never get a response from Prometheus. More interestingly, I see a large increase in memory utilization, to the point that Prometheus stops scraping and eventually OOMs. From digging in more, I found that the issue comes down to how Prometheus marshals out the response on the HTTP API. The repro script below generates 3d worth of data (at a 15s period) for 5k timeseries. Generating that data takes ~500ms and ~1.2G of RAM; JSON-marshaling it takes ~2m and consumes ~11G of RAM.

Issues

  • json.Marshal uses ~11G of memory, significantly more than either the input data (1.2G) or the output (2G)
  • json.Marshal calls aren't cancelable, so if a request like this ever hits Prometheus it will either run to completion or kill the process
  • in addition to the large memory footprint, the marshaling (in the example below) takes ~2m on my desktop -- which is excessive, especially when you consider that the data generation took ~500ms

Suggestions
Suggestion 1
In an effort to alleviate both problems, I suggest making the JSON marshaling stream the data to the wire. There's no need to make a copy of it all in memory first, especially in the API case, where we literally just write to the buffer. A terrible-hacky example would be something like:

enc := json.NewEncoder(w)
w.Write([]byte{'['})

for i, item := range m {
	if err := enc.Encode(item); err != nil {
		fmt.Println(err)
		return
	}
	if i < len(m)-1 {
		w.Write([]byte{','})
	}
}
w.Write([]byte{']'})

In this example we would spin over every entry in the response and marshal it out. This means that each samplestream (in this example) would still need to be in memory, but once written to the wire it no longer needs to be kept there. In addition, this makes the request "cancelable" at each encode step (if the client disconnects, you get a stream-closed error). Of course, the "correct" implementation of this would require a bit of type switching.
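
For illustration, here is a minimal sketch of what that type switching might look like. respondStream is a hypothetical helper, not existing Prometheus code; the imports match the repro script below (plus io), and the non-Matrix cases fall back to a single Encode since those values are small:

func respondStream(w io.Writer, data model.Value) error {
	enc := json.NewEncoder(w)
	switch v := data.(type) {
	case model.Matrix:
		if _, err := w.Write([]byte{'['}); err != nil {
			return err // a client disconnect surfaces here, canceling the work
		}
		for i, ss := range v {
			if i > 0 {
				if _, err := w.Write([]byte{','}); err != nil {
					return err
				}
			}
			// Only one SampleStream is encoded at a time, so the full
			// response never exists as a single in-memory buffer.
			if err := enc.Encode(ss); err != nil {
				return err
			}
		}
		_, err := w.Write([]byte{']'})
		return err
	default:
		// Scalars, vectors, and strings are small; encode them in one shot.
		return enc.Encode(data)
	}
}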

Suggestion 2
Change the marshaling of the various structs to be codegen'd. Most of them are partially there, but there are some minor improvements that could be made that would give a ~2x boost in performance (mostly copying less, and reflecting less).
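
As a sketch of the kind of code such generation would emit (illustrative only, not the actual prometheus/common marshaler; the [timestamp,"value"] layout follows the API's wire format for a sample pair), a hand-rolled function can append straight into a byte slice and avoid reflection entirely:

// appendSamplePair marshals one model.SamplePair into buf.
// model.Time is milliseconds since the epoch; the wire format is seconds
// with millisecond precision, e.g. [1513728000.000,"42"].
func appendSamplePair(buf []byte, p model.SamplePair) []byte {
	buf = append(buf, '[')
	buf = strconv.AppendFloat(buf, float64(p.Timestamp)/1000, 'f', 3, 64)
	buf = append(buf, ',', '"')
	buf = strconv.AppendFloat(buf, float64(p.Value), 'f', -1, 64)
	return append(buf, '"', ']')
}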

For both of these I'd be more than happy to implement it (it's not that bad), but I wanted to get some feedback prior to implementation.

Repro Script

package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"

	"github.com/prometheus/common/model"
)

func generateData() model.Matrix {
	NUM_TIMESERIES := 5000
	NUM_DATAPOINTS := 17280

	// Create the top-level matrix
	m := make(model.Matrix, 0)

	for i := 0; i < NUM_TIMESERIES; i++ {
		lset := map[model.LabelName]model.LabelValue{
			model.MetricNameLabel: model.LabelValue("timeseries_" + strconv.Itoa(i)),
		}

		now := model.Now()

		values := make([]model.SamplePair, NUM_DATAPOINTS)

		for x := NUM_DATAPOINTS; x > 0; x-- {
			values[x-1] = model.SamplePair{
				Timestamp: now.Add(time.Second * -15 * time.Duration(x)), // Set the time back assuming a 15s interval
				Value:     model.SampleValue(float64(x)),
			}
		}

		ss := &model.SampleStream{
			Metric: model.Metric(lset),
			Values: values,
		}

		m = append(m, ss)
	}
	return m
}

func main() {
	start := time.Now()
	m := generateData()
	took := time.Since(start)

	fmt.Println("done generatingData took:", took)

	start = time.Now()
	if _, err := json.Marshal(m); err != nil {
		fmt.Println("marshal error:", err)
	}
	took = time.Since(start)
	fmt.Println("done marshaling took:", took)
}
brian-brazil (Contributor) commented:

See #3536, but in general if you request something that takes 1GB of memory you're likely to have problems.

brian-brazil (Contributor) commented:

Also, we have protections against this type of large query, so querying 3d of data at 15s resolution should immediately return an error. This is to encourage such requests to be broken up into smaller ones.

brancz (Member) commented Dec 19, 2017

Sorry if this is too far out of context, but what do we generally think of optionally marshaling as protobuf instead of json? (something I recently had a discussion with @DirectXMan12 about)

jacksontj (Contributor, Author) commented Dec 19, 2017 via email

jacksontj (Contributor, Author) commented Dec 19, 2017 via email

brian-brazil (Contributor) commented:

> Sorry if this is too far out of context, but what do we generally think of optionally marshaling as protobuf instead of json?

It doesn't make a difference; we still need to build up all the data in RAM before we can start writing it out.

jacksontj (Contributor, Author) commented:

I've spent some time working on this over the last day, and I've made significant progress. I have a PR open (prometheus/common#112) which speeds up marshaling with the current encoding/json marshaler by ~15-85% in the benchmarks.

Using this PR'd version of prometheus/common, my repro script goes from a 3m runtime to 27s. If I change the script to use easyjson.MarshalToWriter, the runtime drops even further, to ~16s.
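
For reference, that swap looks roughly like this (a sketch: it assumes easyjson-generated marshalers exist for the model types, which is what makes model.Matrix satisfy easyjson.Marshaler; needs "io/ioutil" and "github.com/mailru/easyjson" imported):

// Replaces the json.Marshal(m) call in main(). MarshalToWriter streams the
// encoded bytes to the writer instead of building one large []byte first.
if _, err := easyjson.MarshalToWriter(m, ioutil.Discard); err != nil {
	fmt.Println("marshal error:", err)
}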

smcquay commented Dec 27, 2017

In discussing a related PR on #prometheus-dev on Freenode, it came up that the other PR was a microbenchmark and missed some performance cases.

For completeness I replaced encoding/json with json-iter and ran the numbers; the easyjson version is still the fastest:

$ ./slowcli_easyjson
done generatingData took: 488.309301ms
done marshaling took: 15.330267979s

# with jsoniter
$ ./slowcli_jsoniter
done generatingData took: 494.28975ms
done marshaling took: 2m26.32011921s

# with encoding json
$ ./slowcli_encoding_json
done generatingData took: 478.830941ms
done marshaling took: 2m52.064084961s

brian-brazil (Contributor) commented:

That is quite different from the benchmarks jsoniter publish themselves, so I'm a bit suspicious. I'd really like to see an end-to-end test.

One way or the other, this is all getting assembled in memory before being marshaled, so smaller queries are needed for this volume of data.

jacksontj (Contributor, Author) commented:

I'm not sure what you are suspicious of. Neither of the benchmarks is black-boxed; they are open source, meaning you can look at them and determine whether something is wrong.

If you don't want to bother looking into it, I've added end-to-end data (hitting the /query/ endpoint) on #3608 (comment)

brian-brazil (Contributor) commented:

#3536 improved this. Your use case continues to not be sane.
