This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

built-in execution of graphite requests #575

Merged
merged 38 commits into from
Apr 19, 2017

Conversation

@Dieterbe (Contributor) commented Mar 22, 2017

This is a prototype. The basics seem to work, but it needs more testing and more polish. I also have yet to seriously start on adding a couple of functions.

In particular, the parsing and expression tree seem good, it responds with correct-looking output for basic (non-function) requests, and the proxying to graphite-api seems to work too. But:

  1. the proxying doesn't handle POST yet
  2. the output extends one second too far

@Dieterbe Dieterbe force-pushed the expr branch 2 times, most recently from f407997 to c5cc8bc on March 24, 2017 11:32
@Dieterbe (Contributor, Author) commented Mar 24, 2017

Now the proxy works. If you run launch.sh docker-dev-direct-single-tenant you'll have everything set up to test easily. You can load the metrictank dashboard directly from graphite-api or from MT (spoiler alert: they look identical).
To my pleasant surprise, I also saw http: proxy error: context canceled messages when loading the MT dashboard and reloading it while requests were still pending. I'm not sure whether cancellation propagates properly through the entire path (something independent of the proxy that we'll have to address at some point), but it looks like much of it already works out of the box.

I also created a dashboard to see the stats; it looks like so:
[screenshot: proxy-stats dashboard]
Two important notes here:

  1. Only the first unsupported function is reported, not all unsupported functions in a given request. This simplifies the request parsing.
  2. I added a provision to mark certain functions as stable or not. This lets us have code active to support certain functions and test them out with real data, while still marking them as unstable and proxying those requests to graphite.

Next up: testing functions, writing some more, and coming up with a nice solution for the slice allocations (copy on write vs. copy first and modify in place).

@Dieterbe Dieterbe force-pushed the expr branch 5 times, most recently from 92a8ffb to 7264244 on March 28, 2017 08:33
@Dieterbe (Contributor, Author) commented:
OK, I think I have everything working properly now. Supported functions are alias, sumSeries/sum and averageSeries/avg. The docker-dev-direct-single-tenant docker environment makes it easy to test MT and compare to graphite. I also added a bunch more unit tests.

Notes:

  1. I'm not happy with the addition of the QueryPatt, QueryFrom and QueryTo fields to models.Series.
    This could be cleaned up by embedding the type Req from expr/plan.go (and possibly changing models.Req to embed it as well). That would make the code more elegant, since in most places we deal with these 3 attributes as one (e.g. in mergeSeries).
    But we also have to think about how to turn this into a standalone library that grafana can consume: we'll either have to make models.Series reusable, or have the expr library take an interface (with a few methods such as Len, Next, etc.) instead of a concrete type like models.Series. The latter might have some performance overhead, though.
  2. Graphite treats from as exclusive and until as inclusive. MT's native render API (exposed to graphite-api) treats from as inclusive and to/until as exclusive. Now that we're extending that API and treating it as graphite, our from/to are off by one second.
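To make note 2 concrete, the conversion between the two conventions is a one-second shift on both bounds. A minimal sketch (the function name is illustrative, not the actual MT code):

```go
package main

import "fmt"

// graphiteToMT converts graphite-style render bounds (from exclusive,
// until inclusive) to MT's native convention (from inclusive, to
// exclusive). Both intervals cover the same set of timestamps.
func graphiteToMT(from, until uint32) (uint32, uint32) {
	return from + 1, until + 1
}

func main() {
	// graphite (60, 120] covers timestamps 61..120,
	// which MT expresses as [61, 121)
	from, to := graphiteToMT(60, 120)
	fmt.Println(from, to) // 61 121
}
```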

TODO:

  1. docs
  2. performance validation

@Dieterbe (Contributor, Author) commented Apr 12, 2017

I now did some performance testing as well; see the script in the last commit.
It looks good to me, although going through graphite was surprisingly slow and I have no idea why: graphite shouldn't be this slow. In any case, the benchmarks show MT is always faster and the proxy overhead is not significant.

~/g/s/g/r/m/c/mt-index-cat ❯❯❯ ../../docker/docker-dev-direct-single-tenant/bench.sh
run this from inside a directory that has the mt-index-cat binary
before running this, make sure you run the following command and you saw the 'now doing realtime' message
fakemetrics -carbon-tcp-address localhost:2003 -statsd-addr localhost:8125 -shard-org -keys-per-org 100 -speedup 200 -stop-at-now -offset 1h && echo "now doing realtime" && fakemetrics -carbon-tcp-address localhost:2003 -statsd-addr localhost:8125 -shard-org -keys-per-org 100
and verify http://localhost:3000/dashboard/db/fake-metrics-data
press any key to continue

>>>> 1A: MT simple series requests
Requests      [total, rate]            500, 10.02
Duration      [total, attack, wait]    49.90810306s, 49.899999825s, 8.103235ms
Latencies     [mean, 50, 95, 99, max]  8.495676ms, 8.506675ms, 10.064634ms, 22.706379ms, 52.404426ms
Bytes In      [total, mean]            34625372, 69250.74
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:

>>>> 1B: graphite simple series requests
Requests      [total, rate]            500, 10.02
Duration      [total, attack, wait]    49.946537572s, 49.899999912s, 46.53766ms
Latencies     [mean, 50, 95, 99, max]  30.297993ms, 28.507309ms, 48.112856ms, 65.825307ms, 68.895464ms
Bytes In      [total, mean]            38120154, 76240.31
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:500  
Error Set:

>>>> 2A MT sumSeries(patterns.*) (no proxying)
Requests      [total, rate]            6250, 250.04
Duration      [total, attack, wait]    25.00198183s, 24.995999831s, 5.981999ms
Latencies     [mean, 50, 95, 99, max]  5.483174ms, 4.315464ms, 18.445995ms, 30.602304ms, 43.466792ms
Bytes In      [total, mean]            347411119, 55585.78
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:6250  
Error Set:

>>>> 2B same but lower load since graphite cant do much
Requests      [total, rate]            125, 5.04
Duration      [total, attack, wait]    24.807791656s, 24.799999789s, 7.791867ms
Latencies     [mean, 50, 95, 99, max]  9.601537ms, 8.472735ms, 22.204463ms, 23.863076ms, 42.246818ms
Bytes In      [total, mean]            7663595, 61308.76
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:125  
Error Set:

>>>> 2C graphite sumSeries(patterns.*)
Requests      [total, rate]            250, 5.02
Duration      [total, attack, wait]    50.099044783s, 49.799999926s, 299.044857ms
Latencies     [mean, 50, 95, 99, max]  152.935257ms, 34.399036ms, 970.421343ms, 1.334126438s, 1.715109579s
Bytes In      [total, mean]            17044595, 68178.38
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:250  
Error Set:

>>>> 3A MT load needing proxying
Requests      [total, rate]            250, 5.02
Duration      [total, attack, wait]    49.83281577s, 49.799999888s, 32.815882ms
Latencies     [mean, 50, 95, 99, max]  34.041134ms, 31.943184ms, 53.609206ms, 68.950639ms, 71.411507ms
Bytes In      [total, mean]            19122578, 76490.31
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:250  
Error Set:

>>>> 3B graphite directly to see proxying overhead
Requests      [total, rate]            250, 5.02
Duration      [total, attack, wait]    49.834155561s, 49.799999746s, 34.155815ms
Latencies     [mean, 50, 95, 99, max]  33.519563ms, 31.67443ms, 51.330246ms, 70.272948ms, 71.695535ms
Bytes In      [total, mean]            19146048, 76584.19
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:250  
Error Set:

here's the corresponding snapshot https://snapshot.raintank.io/dashboard/snapshot/Jh5Zl91d5gCIYXoUtsCQ1Owu6mW0C0As

mostly just parsing of expressions
* track the query pattern used along with a req, so we can tie the results
  back to the query later.
* a series also needs to keep track of the query that caused it to be
  created, so that the processing functions can efficiently retrieve the
  series for the given input parameters
* adjust merging so that we only merge series if they correspond to the
  same request from the same functions.
similar to the change we made to graphite-raintank:
sometimes (especially in dev) it's nice to run MT such that it always
assumes orgId 1, without requiring something like tsdbgw in front to do
auth.
requests

we can now successfully use MT directly from the browser, with dynamic
proxying!
* this way all data gets saved back to the buffer pool, and only once:
  whether the data was in the input, whether it made it through to the
  end (e.g. individual inputs to a sum generally don't, unless there's
  only 1 input), or whether it was generated by a processing function
* in the future we'll be able to use this as a temporary result cache,
  so that steps with shared logic don't need to redo the processing.
out = mergeSeries(out)

// instead of waiting for all data to come in and then start processing everything, we could consider starting processing earlier, at the risk of doing needless work
// if we need to cancel the request due to a fetch error
@replay (Contributor) commented Apr 13, 2017:

Assuming that there shouldn't be too many fetch errors, I'd say that would probably make sense. But I don't think it should be a blocker for merging this.

@Dieterbe (Contributor, Author):

Yeah, but that would complicate things: building a pipelined processing system. More of a long-term / low-priority item IMHO.

(Contributor):

agree

graphiteProxy.ModifyResponse = func(resp *http.Response) error {
	// if kept, these would be duplicated, and duplicated headers are illegal
	resp.Header.Del("access-control-allow-credentials")
	resp.Header.Del("Access-Control-Allow-Origin")
@replay (Contributor):

It looks a little confusing that the header on :43 is lowercase while the one on :44 uses uppercase letters. .Del() should be case-insensitive anyway.

func (p Plan) Clean() {
	for _, series := range p.input {
		for _, serie := range series {
			pointSlicePool.Put(serie.Datapoints[:0])
@replay (Contributor) commented Apr 13, 2017:
I don't understand that [:0]. Isn't that always just creating an empty list?

@Dieterbe (Contributor, Author):

It creates an empty slice backed by (a pointer to) the same underlying array. So when we pull the value out of the pool next time, we have a slice of length zero that we can immediately append elements to without allocating, because we reuse the same backing array.
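As a standalone sketch of the pattern being described (names are illustrative; the real pool lives in the expr package):

```go
package main

import (
	"fmt"
	"sync"
)

type Point struct {
	Val float64
	Ts  uint32
}

// a pool of point slices; New controls the initial capacity
var pointSlicePool = sync.Pool{
	New: func() interface{} { return make([]Point, 0, 2000) },
}

func main() {
	pts := pointSlicePool.Get().([]Point)
	pts = append(pts, Point{Val: 1.5, Ts: 60}) // no alloc: capacity pre-exists

	// pts[:0] is a zero-length slice sharing the same backing array,
	// so the next consumer can append without reallocating
	pointSlicePool.Put(pts[:0])

	again := pointSlicePool.Get().([]Point)
	fmt.Println(len(again), cap(again) >= 2000) // 0 true
}
```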


"gopkg.in/raintank/schema.v1"
)

const defaultPointSliceSize = 2000
(Contributor):
The idea of the pool is very nice, but the limit of 2000 seems a little arbitrary. Would it make sense to base it on a setting like maxPointsPerReqHard?

@Dieterbe (Contributor, Author):

maxPointsPerReqHard is per request, but a point slice holds the points of a single series.
You're right that it's somewhat arbitrary. You want it low enough not to waste memory allocating slice capacity we don't use, and high enough to avoid expensive reallocation and copying as a slice fills up. That cost gets amortized anyway, since we reuse the slice (including its backing array) over time, so starting out a little low is not so problematic.
It's hard to generalize how many points will typically be returned for any given series across different requests, but I think this is a good enough value that we won't have to worry about it for a while.

@woodsaj (Member) commented Apr 18, 2017:
We currently keep meters of the number of series per request and the number of points per request (reqRenderSeriesCount, reqRenderPointsFetched). I think we should also keep counters of the points fetched and the number of series fetched.

The current metrics give us good insight into query complexity, but the counters would provide a good measure of overall request workload.

)

func CaptureBody(c *Context) {
	body, _ := ioutil.ReadAll(c.Req.Request.Body)
(Contributor):

This looks like a potential error that should be handled somehow, so we don't pass an "invalid" body into the request handlers.

@@ -132,6 +132,10 @@ key-file = /etc/ssl/private/ssl-cert-snakeoil.key
max-points-per-req-soft = 1000000
# limit of number of datapoints a request can return. Requests that exceed this limit will be rejected. (0 disables limit)
max-points-per-req-hard = 20000000
# require x-org-id authentication to auth as a specific org. otherwise orgId 1 is assumed")
(Contributor):
there's a ) without a matching (

expr/expr.go Outdated
case etString:
return fmt.Sprintf("%sexpr-string %q", space, e.valStr)
}
return "HUH-SHOULD-NEVER-HAPPEN"
(Contributor):
lol, "this is not a bug, it's a feature"

Datapoints: out,
Interval: series[0].Interval,
}
cache[Req{}] = append(cache[Req{}], output)
@replay (Contributor) commented Apr 14, 2017:

The idea of caching here is cool, but I can't see where this cache is read; am I missing something?
The only place I can see it accessed is in Plan.Clean(), but there it just extracts empty slices from it.

@replay (Contributor) commented Apr 14, 2017:

Ah, maybe the variable should be named something like pool instead of cache: the output elements are only appended to it so we can reuse the allocated memory later, not to reuse the generated values, right?

@Dieterbe (Contributor, Author):

See the comment for the generated property of type Plan, which is what gets passed in here. You're right that we don't have caching implemented yet, but I do foresee that being the main use in the future (as explained in that comment).

@Dieterbe (Contributor, Author):

Thanks for all the good feedback @replay. Now see the few commits I pushed.

Please let me know any final remarks; otherwise this will get merged tomorrow! cc @woodsaj

expr/funcs.go Outdated
)

type Func interface {
Signature() ([]argType, []argType)
@woodsaj (Member):

Why are there two argType slices? One for input, one for output?

expr/funcs.go Outdated
Signature() ([]argType, []argType)
// what can be assumed to have been pre-validated: len of args, and basic types (e.g. seriesList)
Init([]*expr) error // initialize and validate arguments (expressions given by user), for functions that have specific requirements
Depends(from, to uint32) (uint32, uint32) // allows a func to express its dependencies
@woodsaj (Member):

How are two uint32 numbers a dependency?

@Dieterbe (Contributor, Author):

I couldn't find a better way to phrase it, but the idea is that an invocation like movingAverage(foo, "5min") with from=x and to=y can declare that it "depends" on data from x-5min to y as input.
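A minimal sketch of that idea (a hypothetical type, not the actual expr implementation): a 5-minute moving average widens the fetch window backwards by its lookback.

```go
package main

import "fmt"

// movingAvg models a function like movingAverage(foo, "5min"):
// to produce output for [from, to) it needs input starting
// windowSecs earlier.
type movingAvg struct {
	windowSecs uint32
}

// Depends reports the input range required to answer the given output range.
func (m movingAvg) Depends(from, to uint32) (uint32, uint32) {
	return from - m.windowSecs, to
}

func main() {
	f := movingAvg{windowSecs: 300} // "5min"
	from, to := f.Depends(1000, 2000)
	fmt.Println(from, to) // 700 2000
}
```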

@woodsaj (Member):

So shouldn't the function just be called TsRange(originalFrom, originalTo uint32) (uint32, uint32)?

expr/funcs.go Outdated
// what can be assumed to have been pre-validated: len of args, and basic types (e.g. seriesList)
Init([]*expr) error // initialize and validate arguments (expressions given by user), for functions that have specific requirements
Depends(from, to uint32) (uint32, uint32) // allows a func to express its dependencies
Exec(map[Req][]models.Series, ...interface{}) ([]interface{}, error) // execute the function with its arguments. functions and names resolved to series, other expression types(constant, strings, etc) are still the bare expressions that were also passed into Init()
@woodsaj (Member):

What is this function supposed to return?

@Dieterbe (Contributor, Author):

I'll push a commit that explains all these functions better.

@Dieterbe (Contributor, Author) commented Apr 18, 2017

An update on the graphite perf trouble, for those who are interested:

  • The benchmark script calls mt-index-cat in a way that generates queries with * at random places. If you're unlucky, the * is at the end, which expands into 100 series that have to be summed. I consider this a fair, realistic and diverse workload, but graphite seems to handle it poorly, to the extent that many of the requests it needs to process are affected by one slow query.
  • However, even excluding that particular pattern and keeping to requests that only sum over 1 series, some requests are fast and some are slow: see
    gentle-load-for-graphite.txt
    in particular the 2nd run, which has a mean and median of 8s.
  • There was no performance increase from running the currently stable graphite-MT combo.

UPDATE: the same requests (with the ending wildcard removed) against MT give:

cat gentle-load-for-graphite.txt | vegeta attack -rate 10 -duration 10s | vegeta report
Requests      [total, rate]            100, 10.10
Duration      [total, attack, wait]    9.909619504s, 9.899999659s, 9.619845ms
Latencies     [mean, 50, 95, 99, max]  8.093534ms, 8.580528ms, 13.198168ms, 14.393088ms, 14.799109ms
Bytes In      [total, mean]            6019911, 60199.11
Bytes Out     [total, mean]            0, 0.00
Success       [ratio]                  100.00%
Status Codes  [code:count]             200:100  
Error Set:

@replay (Contributor) left a comment:

I looked at the commits you pushed after my batch of comments. It all looks good, but I still added one comment.

api/graphite.go Outdated
}
for _, def := range metric.Defs {
locatedDefs[r][def.Id] = locatedDef{def, s.Node}
for _, archive := range metric.Defs {
@replay (Contributor) commented Apr 19, 2017:

Do you not need to skip the non-leaf nodes?

@Dieterbe (Contributor, Author):

Not really: the leaf property is just shorthand for whether len(metric.Defs) == 0, so iterating over metric.Defs naturally skips nodes without defs.

@Dieterbe Dieterbe merged commit be64803 into master Apr 19, 2017
@Dieterbe Dieterbe mentioned this pull request May 9, 2017
@Dieterbe Dieterbe deleted the expr branch September 18, 2018 09:00
Dieterbe added a commit that referenced this pull request Apr 10, 2019
introduced via 929513c / #174
became obsolete in b3d3831 / #575 / 0.7.1-61-gbe64803e

we no longer really log requests ourselves anymore.
i imagine at some point we probably will, at which point we can
re-introduce this setting

(our qa tool `unused` notified that this variable wasn't being read
anywhere)
@Dieterbe Dieterbe mentioned this pull request Apr 10, 2019