Streaming APIs #3690

Closed
brancz opened this issue Jan 16, 2018 · 20 comments

Comments

@brancz
Member

brancz commented Jan 16, 2018

Several of our endpoints would benefit from streaming, meaning that we send results in chunks rather than collecting all results and then rendering them in a single response. This is useful for large responses; foreshadowing a bit, it might also be useful for streaming new samples to a front end, although I would treat that as a separate issue.

For this path to be optimal, all the underlying parts need to be streaming as well, meaning the TSDB (to my knowledge this is already the case) as well as the query engine. My gut feeling also tells me that, depending on the implementation, isolation in TSDB would also be required.
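
For context, "streaming" at the storage layer means results are pulled through iterators rather than materialized as full slices. Below is a simplified sketch of what such iterator shapes look like; the names and signatures are approximate, loosely modeled on the prometheus/tsdb interfaces, not an exact copy:

package sketch

// A set of series, walked one series at a time rather than returned as a slice.
type SeriesSet interface {
	Next() bool // advance to the next series; false when exhausted
	At() Series // the series currently pointed at
	Err() error
}

// A single series whose samples are themselves pulled lazily.
type Series interface {
	Labels() map[string]string // simplified; the real code uses a dedicated labels type
	Iterator() SeriesIterator
}

// Samples of one series, yielded one (timestamp, value) pair at a time,
// so nothing forces the whole matrix into memory at once.
type SeriesIterator interface {
	Next() bool
	At() (t int64, v float64)
	Err() error
}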

(Somewhat) related: #3691 #3443 #3601 https://github.com/prometheus/tsdb/issues/260

@fabxc @gouthamve @brian-brazil

@brian-brazil
Contributor

I'm not sure about this one; large responses are something we should avoid. Anything too big to fit in RAM is going to take out a web browser too. #3601 is outside of sane usage, for example.

This would also prevent us from ever having a limit on the number of time series returned, as we'd only find that out after we've already returned a 200.
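
To make that ordering problem concrete: with Go's net/http the status line is committed on the first WriteHeader or body write, so once a handler has started streaming it can no longer turn the response into an error. A minimal, self-contained sketch; the endpoint and the series limit here are made up purely for illustration:

package main

import (
	"fmt"
	"net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK) // the 200 goes out on the wire right here

	for i := 0; i < 1000; i++ {
		// Imagine a limit of 500 series that is only discovered mid-stream.
		if i == 500 {
			// Too late to send a 4xx/5xx: the status is already sent,
			// so all we can do is truncate or abort the body.
			return
		}
		fmt.Fprintf(w, "series %d\n", i)
	}
}

func main() {
	http.HandleFunc("/stream", handler)
	http.ListenAndServe(":8081", nil)
}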

@krasi-georgiev
Contributor

PromQL data expansion has the same problem, and although I am not sure of a good solution yet, I think it would be nice if we could reduce the memory footprint.

@fabxc
Contributor

fabxc commented Jan 16, 2018 via email

@himanshub16

Hi,
I read about this on the GSoC'18 ideas page and would like to contribute to it.
I would like some guidance on which components would require changes for this to work.
Thanks.

@krasi-georgiev
Contributor

This would be a very long topic, but in general we would need to change
https://github.com/prometheus/prometheus/tree/master/promql
https://github.com/prometheus/prometheus/tree/master/web

promql is quite complex and I am still getting my head around it, but I'm willing to team up with someone who is willing to spend some time on this.

@brian-brazil
Contributor

PromQL is where the wins are. Web is much more involved, as that would mean creating brand new APIs for use cases which are unclear.

@himanshub16

@krasi-georgiev This sounds pretty cool to me. There seems to be a lot of complexity under the hood, and I would love to have some discussion about it.
Meanwhile, my exams end on Monday, and I will try to get my hands dirty. I would like suggestions on how to uncover the layers without getting overwhelmed.

@krasi-georgiev
Contributor

I can't give you any specific advice as I haven't spent that much time in that part of the code myself, but in general get familiar with

  • the code execution of promql (VS Code with breakpoints, fmt.Println(...));
  • Go profiling: run different queries and collect profiles to see where most of the CPU and memory is spent in promql (a minimal sketch follows below).

This should be enough to get us started.
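
To expand on the profiling point: here is a small, self-contained sketch of wrapping some work in a CPU profile with the standard runtime/pprof package; the loop at the end is just a stand-in for whatever query code you want to measure.

package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Everything between Start and Stop ends up in the profile; in practice
	// this would wrap the PromQL query execution you want to inspect.
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	sum := 0 // stand-in workload
	for i := 0; i < 100000000; i++ {
		sum += i
	}
	log.Println(sum)
}

The resulting cpu.prof can then be inspected with go tool pprof. A running Prometheus server also serves the standard /debug/pprof endpoints, which go tool pprof can read directly if profiling a live instance is easier.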

@Titanssword

I am just thinking that we could store compacted data rather than the original data to reduce memory usage. There is a paper named Optimal Quantile Approximation in Streams.

The paper proposes a new data structure called a compactor to store data. If we don't need an exact answer, it can use less space to maintain statistics over a stream. You can then use this data structure to answer queries about the quantile of a new data point. The data structure is also mergeable, so we could use this method to compact the data in every chunk and merge all chunks when we want to use them.

However, using this data structure loses some information, and it may not suit some metric types, such as counters. I am not sure whether this idea will help reduce memory usage.

I am looking forward to suggestions. By the way, I would like to help with this part.
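
For readers unfamiliar with the paper, the idea is roughly a KLL-style sketch: small buffers per level that, when full, sort themselves and promote every other element to the next level, where each element stands for twice as many original samples. Below is a toy, heavily simplified Go sketch of the concept; it is not the paper's actual algorithm or error guarantees, just an illustration of compaction and merging.

package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// sketch is a toy multi-level compactor: level h holds values that each
// stand in for 2^h original samples.
type sketch struct {
	k      int // capacity of each level before it compacts
	levels [][]float64
}

func (s *sketch) Add(v float64) {
	if len(s.levels) == 0 {
		s.levels = append(s.levels, nil)
	}
	s.levels[0] = append(s.levels[0], v)
	s.compact()
}

// compact sorts any over-full level and promotes every other element to the
// next level, discarding the rest; this is the lossy, space-saving step.
func (s *sketch) compact() {
	for h := 0; h < len(s.levels); h++ {
		if len(s.levels[h]) <= s.k {
			continue
		}
		sort.Float64s(s.levels[h])
		if h+1 == len(s.levels) {
			s.levels = append(s.levels, nil)
		}
		for i := rand.Intn(2); i < len(s.levels[h]); i += 2 {
			s.levels[h+1] = append(s.levels[h+1], s.levels[h][i])
		}
		s.levels[h] = s.levels[h][:0]
	}
}

// Merge folds another sketch in level by level; mergeability is what would
// let per-chunk sketches be combined at query time.
func (s *sketch) Merge(o *sketch) {
	for h, lvl := range o.levels {
		for len(s.levels) <= h {
			s.levels = append(s.levels, nil)
		}
		s.levels[h] = append(s.levels[h], lvl...)
	}
	s.compact()
}

// Quantile answers approximately: weight every stored value by 2^level and
// walk the sorted values until the target rank is covered.
func (s *sketch) Quantile(q float64) float64 {
	type wv struct {
		v float64
		w int
	}
	var items []wv
	total := 0
	for h, lvl := range s.levels {
		for _, v := range lvl {
			items = append(items, wv{v, 1 << h})
			total += 1 << h
		}
	}
	if total == 0 {
		return math.NaN()
	}
	sort.Slice(items, func(i, j int) bool { return items[i].v < items[j].v })
	target, cum := int(q*float64(total)), 0
	for _, it := range items {
		cum += it.w
		if cum > target {
			return it.v
		}
	}
	return items[len(items)-1].v
}

func main() {
	s := &sketch{k: 32}
	for i := 0; i < 10000; i++ {
		s.Add(float64(i))
	}
	fmt.Println("approximate median:", s.Quantile(0.5)) // roughly 5000
}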

@brian-brazil
Contributor

@Titanssword That doesn't really help in PromQL; there are only two functions that calculate quantiles like that, and they're not very commonly used.

@himanshub16

I studied a bit about creating streaming APIs, but there aren't many resources on the topic.
Having used Twitter's streaming API recently, I tried to figure out how it works from this code.
Twitter's streaming API keeps the connection initiated by the client open and keeps writing gzipped data to the stream.

An alternative to this would be WebSockets, which sometimes cause problems with proxies and firewalls.
The data can be gzipped or plain, and the decoding counterpart has to be done in the browser.

Here's a proof of concept for a simple streaming API.

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		// Flush after every write so each chunk reaches the client immediately.
		flusher := w.(http.Flusher)
		for i := 0; i < 10; i++ {
			fmt.Fprintln(w, "Hello world", i+1, time.Now().Unix())
			flusher.Flush()
			time.Sleep(2 * time.Second)
		}
	})
	http.ListenAndServe(":8080", nil)
}

Gist > https://gist.github.com/himanshub16/98f7c00a39256d58de838394a55682ff


I went through some code in web and found protobuf in use, so this will require special care.
Am I heading in the right direction?
I would like feedback on this.

@krasi-georgiev
Contributor

@himanshub16 maybe you should hang around the IRC dev channel and try to get some feedback there.

I am definitely interested in getting involved but my focus is on tsdb for the next few weeks.

@brian-brazil
Contributor

Based on work I've been doing on PromQL in the past while, streaming results is even less practical than I first thought. We still need to have the whole result in memory before we start to send anything, so changing our HTTP APIs from what they are today isn't going to buy us anything.

@brancz
Member Author

brancz commented Mar 7, 2018

@brian-brazil can you share your thoughts?

@brian-brazil
Contributor

Right now we need to evaluate the entire query to know whether it succeeded to determine the status code, and the status code is the first thing we send via HTTP.

PromQL currently evaluates range queries step by step; to stream that, you'd have to resend the metric names on every step, which would blow up response sizes by at least 5x. Making that better would require a non-trivial protocol, making ad-hoc usage of the APIs much harder.

A more efficient PromQL would evaluate each node fully for all steps at once, which offers no scope for streaming responses, as you would still need to build up the entire response first.

To me this seems like a solution in search of a problem. Everything I've seen talked about is already doable with our current API.
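
To illustrate the point about resending metric names: today's /api/v1/query_range response sends each series' label set once, with all of its per-step values grouped under it, roughly like this (abridged example data):

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": { "__name__": "up", "job": "prometheus", "instance": "localhost:9090" },
        "values": [ [ 1516112340, "1" ], [ 1516112355, "1" ], [ 1516112370, "1" ] ]
      }
    ]
  }
}

Streaming the evaluation step by step would instead emit one record per step, so the "metric" object would have to be repeated for every step of every series, which is where the response-size blow-up mentioned above comes from.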

@krasi-georgiev
Contributor

After the dev discussions at PromCon, there were no objections to the conclusion that there would be no benefit from streaming in remote read, and that it is too complicated to be worth it in promql.

@himanshub16

Wise discussions and not getting too excited saved us from technical debt.

@krasi-georgiev
Contributor

A short version of the discussion will be published soon.

#3443 (comment)
@brancz and @brian-brazil just corrected me, so it seems I missed some of the main points.

We did conclude that streaming in the remote read endpoint would be good, but we're unclear how/if that's practical to mix into the current implementation.

That was not the conclusion; rather, that streaming similar to how the Thanos store API does it would be more beneficial.

@lock

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
@prometheus prometheus unlocked this conversation Jun 7, 2019
@bwplotka
Member

bwplotka commented Jun 7, 2019

Just FYI, there is movement in this direction; follow-up discussion and announcement in #4517.

@lock lock bot locked and limited conversation to collaborators Dec 4, 2019