Add in support for elasticsearch path store #35
scrollfn (partial esrd/scroll-seq conn)
queryfn (partial esrd/search conn index)]
(if (not (esri/exists? conn "cyanite_paths"))
  (esri/create conn "cyanite_paths" :mappings {"path" {:properties {:path {:type "string" :index "not_analyzed"}}}}))
Looks like this doesn't take the configured index name if present
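One way to address this, as a minimal sketch: resolve the index name once and use it for the existence check and the create call as well as the search. The `:index` config key and the helper names `index-name` and `ensure-index!` are assumptions for illustration, not code from this PR; the Elasticsearch calls are passed in as functions so the sketch runs without a cluster.

```clojure
;; Default taken from the hard-coded name in the diff above.
(def default-index "cyanite_paths")

(defn index-name
  "Returns the configured index name, or the default when none is set.
  The :index key is a hypothetical config key."
  [config]
  (or (:index config) default-index))

(defn ensure-index!
  "Calls (create! index) only when (exists? index) is falsey.
  Returns the index name so callers can reuse it for queries."
  [exists? create! index]
  (when-not (exists? index)
    (create! index))
  index)

;; Possible wiring with the calls shown in the diff (not executed here):
;; (ensure-index! #(esri/exists? conn %)
;;                #(esri/create conn % :mappings mappings)
;;                (index-name config))
```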
(let [depth (path-depth path)
      p (str/replace (str/replace path "." "\\.") "*" ".*")
      f (vector
         {:range {:depth {:from depth :to depth}}}
Now using a regexp filter with caching. As far as I can tell there is not much performance difference, but it seems to be working well!
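For reference, here is a small local sketch of what the glob-to-regexp conversion plus the depth filter amount to, checked in plain Clojure rather than against Elasticsearch. The definition of `path-depth` (number of dot-separated segments) and the helper names `glob->regex` and `matches?` are assumptions; only the two `str/replace` calls come from the diff.

```clojure
(require '[clojure.string :as str])

(defn path-depth
  "Number of dot-separated segments in a path (assumed definition)."
  [path]
  (count (str/split path #"\.")))

(defn glob->regex
  "Escapes literal dots, then turns each * into .* (as in the diff)."
  [path]
  (-> path
      (str/replace "." "\\.")
      (str/replace "*" ".*")))

(defn matches?
  "True when candidate has the same depth as query and the whole
  candidate string matches the converted pattern."
  [query candidate]
  (and (= (path-depth query) (path-depth candidate))
       (some? (re-matches (re-pattern (glob->regex query)) candidate))))
```

Because `.*` also matches dots, the depth check is what keeps `foo.*` from matching `foo.bar.baz`. Matching the whole string, rather than a prefix, is also what keeps `foo.bar.baz` from pulling in `foo.bar.baz2`.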
@pyr addressed those changes and also fixed the bug with multiple graphs drawing. I have been using this in my cluster since I opened the PR and the difference is pretty drastic. Iterating paths and drawing graphs is now much faster. There is still some possible improvement that could be made here (such as using bulk updates for paths), but I think this is pretty solid.
Edit: scratch that last bit, apparently that is already how it works; I misinterpreted some logs.
👍
I just deployed this and realized it is rather aggressive with elasticsearch… For every datapoint received it does a GET for every subset of the path.
Cyanite went from 1.5% active CPU to 9%. Not sure of the best way to solve this:
Ideas?
I actually started on implementing an in-memory cache for this same reason. I don't think it would be overly complex, but I wanted to get this going first. I may try and take a crack at it tomorrow. One thing to note, it only really seems to matter when the rest interface
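A rough sketch of the kind of in-memory cache being discussed, assuming the store registers every dot-separated prefix of an incoming path. The names `path-prefixes` and `cached-register!` are hypothetical, and `store!` stands in for whatever call writes a path to elasticsearch; none of this is the PR's actual code.

```clojure
(require '[clojure.string :as str])

(defn path-prefixes
  "All dot-separated prefixes of path, shortest first."
  [path]
  (let [parts (str/split path #"\.")]
    (map #(str/join "." (take % parts))
         (range 1 (inc (count parts))))))

(defn cached-register!
  "Calls (store! p) for each prefix of path not yet recorded in the
  atom `seen` (a set), then records it. The check and the update are
  not atomic, so concurrent callers may occasionally store a prefix twice."
  [seen store! path]
  (doseq [p (path-prefixes path)]
    (when-not (contains? @seen p)
      (store! p)
      (swap! seen conj p))))
```

With this shape, a repeated datapoint for a known path costs only set lookups, and a new metric under a known tree costs one store call per unseen prefix. The set grows with the number of distinct paths, so a bounded or expiring cache may be preferable for large installations.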
Indeed, es-native performs better. Who would have thought :) It's similar to the in-memory store now. Still, I don't want to put too much pressure on ES, so I'm back to the default pathstore.
This adds in support for an elasticsearch path store.
As far as I can tell, it's working pretty well. I am going to test this against my cluster today to try and see just how much it helps performance. I will update the PR with the results.
There is currently one minor issue: when you select a path for retrieval, any metrics that have that path as a prefix are included too (selecting foo.bar.baz also returns foo.bar.baz2). This should be easy to fix, but I wanted to get feedback first.
Implementation notes:
Elastisch is used for the implementation and both the native and rest interfaces are supported via different implementations of the Pathstore protocol.
Paths are stored in elasticsearch as:
All paths, both leaf and tree paths, are stored. This allows for quick querying, and the lookup uses a wildcard search, which very closely matches the current behavior.
Also, this is by far the most substantial piece of Clojure I have written, so feel free to pull it apart; I'm looking for feedback on that as well!