This repository has been archived by the owner on Sep 2, 2020. It is now read-only.

Add in support for elasticsearch path store #35

Merged: 3 commits, merged May 7, 2014

Conversation

@addisonj (Contributor) commented May 5, 2014

This adds in support for an elasticsearch path store.

As far as I can tell, it's working pretty well. I am going to test this against my cluster today to see how much it helps performance, and I will update the PR with the results.

There is currently one minor issue: when you select a path for retrieval, any metric that has that path as a prefix is also included (selecting foo.bar.baz also returns foo.bar.baz2). This should be easy to fix, but I wanted to get feedback first.

Implementation notes:
Elastisch is used for the implementation, and both the native and REST interfaces are supported via different implementations of the Pathstore protocol.

Paths are stored in elasticsearch as:

{
  "path" : "foo.bar.baz",
  "depth" : 3,
  "leaf": false
}

All paths, both leaf and tree paths, are stored. This allows for quick querying, and a wildcard search is used that closely matches the current behavior.
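Under that scheme, a Graphite query such as foo.* can be answered with a single filtered search. The exact query body is not shown in the PR; a plausible sketch in the Elasticsearch 1.x query DSL, combining a depth filter with the wildcard, would be:

```json
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "and": [
          { "range": { "depth": { "from": 2, "to": 2 } } },
          { "wildcard": { "path": "foo.*" } }
        ]
      }
    }
  }
}
```

The depth filter is what rules out deeper paths that merely share the prefix.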

Also, this is by far the most substantial piece of Clojure I have written, so feel free to pull it apart; I am looking for that feedback as well!

scrollfn (partial esrd/scroll-seq conn)
queryfn  (partial esrd/search conn index)]
(if (not (esri/exists? conn "cyanite_paths"))
  (esri/create conn "cyanite_paths"
               :mappings {"path" {:properties {:path {:type "string"
                                                      :index "not_analyzed"}}}}))
Review comment (Contributor):
Looks like this doesn't use the configured index name if one is present.

(let [depth (path-depth path)
      p     (str/replace (str/replace path "." "\\.") "*" ".*")
      f     (vector
             {:range {:depth {:from depth :to depth}}}
Reply (Contributor, Author):

Now using a regexp filter with caching. As far as I can tell there is not much performance difference, but it seems to be working well!
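In the ES 1.x filter DSL, a regexp filter can be cached by setting _cache on the filter. Combined with the depth range above, the filter for a pattern like foo.bar.* (escaped dots, * rewritten to .*) might look like this; the exact body is an assumption, not copied from the patch:

```json
{
  "and": [
    { "range": { "depth": { "from": 3, "to": 3 } } },
    { "regexp": { "path": "foo\\.bar\\..*", "_cache": true } }
  ]
}
```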

@addisonj (Contributor, Author) commented May 6, 2014

@pyr addressed those changes and also fixed the bug with drawing multiple graphs.

I have been using this in my cluster since I opened the PR and the difference is pretty drastic. Iterating paths and drawing graphs is now much faster.

There are still some possible improvements that could be made here (such as using bulk updates for paths), but I think this is pretty solid.

One minor note that @brutasse might be interested in: brutasse/graphite-cyanite makes one call to get the paths for a query and then an individual call for each full path. The Elasticsearch store supports the same searches for the lookup operation as well, so if the memory store were refactored to do the same, graphite-cyanite could make just a single call, which should mean faster render/response times.

Edit: scratch that last bit; that is apparently how it already works, and I misinterpreted some logs.

@pyr (Owner) commented May 7, 2014

Thanks a lot @addisonj and @brutasse for the comments.

pyr added a commit that referenced this pull request May 7, 2014
Add in support for elasticsearch path store
@pyr pyr merged commit 4efce29 into pyr:master May 7, 2014
@brutasse (Contributor) commented May 7, 2014

👍

@brutasse (Contributor) commented May 8, 2014

I just deployed this and realized it is rather aggressive with Elasticsearch… For every datapoint received, it does a GET for every prefix of the path.

GET /cyanite-paths/path/collectd
GET /cyanite-paths/path/collectd.georges
GET /cyanite-paths/path/collectd.georges.memory
GET /cyanite-paths/path/collectd.georges.memory.memory-cached
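Those four GETs correspond to every dot-separated prefix of the incoming metric path. A minimal sketch of that expansion (the helper name is illustrative, not from the code):

```python
def path_prefixes(path):
    """Return every dot-separated prefix of a metric path, shortest first."""
    parts = path.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

prefixes = path_prefixes("collectd.georges.memory.memory-cached")
# One GET is issued per entry, so an n-segment path costs n requests.
```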

Cyanite went from 1.5% active CPU to 9%.

Not sure of the best way to solve this:

  • combine this with an in-memory cache
  • find another structure for paths to avoid O(n) ES queries
  • batch updates every X minutes or so

Ideas?
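For the first idea, an in-memory cache in front of the path writes could be as simple as a "seen paths" set, so Elasticsearch is only hit once per distinct path. A sketch (in Python for illustration rather than the project's Clojure; all names here are hypothetical):

```python
class PathCache:
    """Remember which paths were already written, skipping repeat ES writes."""

    def __init__(self, write_fn):
        self._seen = set()
        self._write_fn = write_fn  # called once per previously unseen path

    def register(self, path):
        if path not in self._seen:
            self._write_fn(path)   # only unseen paths reach Elasticsearch
            self._seen.add(path)

# Usage: count how many writes a stream of repeated datapoints triggers.
writes = []
cache = PathCache(writes.append)
for p in ["collectd", "collectd.georges", "collectd", "collectd.georges"]:
    cache.register(p)
# Only the two distinct paths are written.
```

The trade-off is that paths written by another Cyanite instance are re-written once per process, and the set grows with the number of distinct paths.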

@addisonj (Contributor, Author) commented May 8, 2014

I actually started implementing an in-memory cache for this same reason, but then decided to leave it as a later optimization. I don't think it would be overly complex, but I wanted to get the basics working first. I may take a crack at it tomorrow.

One thing to note: it only really seems to matter when the REST interface is used. With the native client, the overhead was much lower.

@brutasse (Contributor) commented May 8, 2014

Indeed, es-native performs better. Who would have thought :) It's similar to the in-memory store now.

Still, I don't want to put too much pressure on ES, so I'm back to the default path store.
