Skip to content
This repository has been archived by the owner on Sep 2, 2020. It is now read-only.

/paths api doesn't return JSON object #119

Closed
alangibson opened this issue Aug 5, 2015 · 20 comments
Closed

/paths api doesn't return JSON object #119

alangibson opened this issue Aug 5, 2015 · 20 comments
Labels

Comments

@alangibson
Copy link

While trying to figure out why graphite-api was only ever returning HTTP 500 errors, I noticed that the /paths api only ever returns a list, not a JSON object.

[root@56959e8f3388 /]# curl http://127.0.0.1:8099/paths?query=56959e8f3388.cpu-6.cpu-idle
["56959e8f3388.cpu-6.cpu-idle"]

Adding '-H "Content-Type: application/json"' doesn't change anything.

However, the documents at cyanite.io/api.html say

/paths
Query available metric paths.
Input
        query: A valid path query
Output
    {
      "paths": [
        "path1",
        "path2",
        "pathN"
      ]
    }

This is a problem because the CyaniteFinder in the Cyanite python package (brutasse/graphite-cyanite) expects the JSON object from the documentation:

def find_nodes(self, query):
    paths = requests.get(urls.paths,
                         params={'query': query.pattern}).json()
    for path in paths:
        if path['leaf']:
            yield CyaniteLeafNode(path['path'],
                                  CyaniteReader(path['path']))
        else:
            yield BranchNode(path['path'])

I am using the latest code from the GitHub repo.

@pyr
Copy link
Owner

pyr commented Aug 5, 2015

Hi @alangibson,

This is a known issue and one of the blockers for the upcoming release.
I'll be closing #117 which is a duplicate.

Cheers.

@pyr pyr added this to the 0.5.1 milestone Aug 5, 2015
@pyr pyr added the bug label Aug 5, 2015
@TransactCharlie
Copy link

experiencing the same issue -- it's taken me a while to find out what is going on,

That'll teach me for trying the bleeding edge!. I think I might monkey patch graphite-cyanite and see if I can get any further.

@TransactCharlie
Copy link

gah -- trying to hack the graphite-cyanite lib isn't getting me very far. I've come up with this:

    def find_nodes(self, query):
        paths = requests.get(urls.paths,
                             params={'query': query.pattern}).json()

        root = {}
        for path in paths:
            cur = root
            for node in path.split("."):
                if node not in cur:
                    cur[node] = {}
                cur = cur[node]

        def walk(t, p=None):
            for k, v in t.items():
                p = ".".join((p, k)) if p else k
                if v == {}:
                    yield CyaniteLeafNode(p, CyaniteReader(p))
                else:
                    yield BranchNode(p)
                    for n in walk(v, p):
                        yield n

        for n in walk(root):
            yield n

to replace the find_nodes() method. But I can't seem to work out the proper yields. I thought it was like a directory walk.

I now see metrics branches in graphite-web but duplicated and always broken.

@TransactCharlie
Copy link

screen shot 2015-08-07 at 02 18 04

@TransactCharlie
Copy link

comparing the results from

curl http://127.0.0.1:8080/paths?query=*

on an older version of cyanite to the latest master the behaviour is very different......

the older version will only show you the top level branches.

and the master version will spit out every metric path as a flat list of leaf nodes)

its a shame as I'd love to use this and test out the pickle handling.

@TransactCharlie
Copy link

Bump - Pyr? you taking a break from this project? maybe a nice holiday somewhere?

I'd try and modify cyanite myself except I'm not sure what the problem is exactly. Is it the elastic search schema and retrieval or is it some post processing that has to happen inside cyanite after the paths are retrieved from elastic search?

@cberry777
Copy link

I upgraded to the latest cyanite to prepare for a bakeoff and this problem now plagues me too.
Any ETA on getting it fixed??

BTW: It is not the elasticsearch index storage. I tried using "Index: memory" and it is the same problem.

@pyr
Copy link
Owner

pyr commented Aug 24, 2015

Hi @TransactCharlie, I'll work on this this week. the index is incomplete as of today

Cheers!

@TransactCharlie
Copy link

Hi @pyr. Awesome news!

@cberry777
Copy link

yes. thank you very much!

From: Charlie Gildawie <notifications@github.commailto:notifications@github.com>
Reply-To: pyr/cyanite <reply@reply.github.commailto:reply@reply.github.com>
Date: Monday, August 24, 2015 at 4:48 AM
To: pyr/cyanite <cyanite@noreply.github.commailto:cyanite@noreply.github.com>
Cc: Chris Berry <cberry@homeaway.commailto:cberry@homeaway.com>
Subject: Re: [cyanite] /paths api doesn't return JSON object (#119)

Hi @pyrhttps://github.com/pyr. Awesome news!

Reply to this email directly or view it on GitHubhttps://github.com//issues/119#issuecomment-134118301.

@alangibson
Copy link
Author

Hi @pyr, just another friendly ping on this issue. I'm prepping an internal demo for metrics solutions for early next week and I'd really like to include Cyanite + Grafana.

@emanuelis
Copy link
Contributor

Are there any workarounds for this problem? I do really need to setup Cyanite+Grafana. Or are there any other compatible frontends?

@cberry777
Copy link

Unfortunately, as far as I know, it is broken for all graphite front-ends, including graphite-web.
I am hoping I do not have to roll back to a previous version, as this version has the new schema, and I'd rather not load all my metrics into the obsolete schema.
I don't know clojure, or I'd jump in and fix it...
Cheers,

  • Chris

From: Emanuelis <notifications@github.commailto:notifications@github.com>
Reply-To: pyr/cyanite <reply@reply.github.commailto:reply@reply.github.com>
Date: Wednesday, September 9, 2015 at 8:21 PM
To: pyr/cyanite <cyanite@noreply.github.commailto:cyanite@noreply.github.com>
Cc: Chris Berry <cberry@homeaway.commailto:cberry@homeaway.com>
Subject: Re: [cyanite] /paths api doesn't return JSON object (#119)

Are there any workarounds for this problem? I do really need to setup Cyanite+Grafana. Or are there any other compatible frontends?

Reply to this email directly or view it on GitHubhttps://github.com//issues/119#issuecomment-139088651.

@TransactCharlie
Copy link

Hi. @cberry777

When I was looking into this I got the impression that the way that the metrics paths are stored in cyanite (either in memory or in elastic search is currently a work in progress).

The older versions had a tree like schema where items were designated as leaves or nodes and a search retrieved much less entries that now.

That's why I don't think it's just a matter of fixing the code. I don't think its some bug that can be changed. It's ?new? functionality that doesn't yet exist. That's what's stopping me from forking and trying to 'make it work' - I think I'd just have to throw it all away when this gets done by @pyr.

@TransactCharlie
Copy link

@pyr don't suppose you have some design docs or a simple way you can post what you need to happen with the index? If so maybe someone can actually implement it for you if you are snowed under with other commitments.

I've finally got into clojure for another project (ironically making a dsl for people to define their metrics from arbitrary sources). So I think I might be able to make some progress if you do.

@cberry777
Copy link

I suppose I will have to roll back to a previous point in the git history, as we would really like to begin a large scale load test of cyanite very soon. Because cyanite doesn't tag releases or milestones, that is easier said than done. Can anyone point me at the last known, stable commit in github??

It is unfortunate that the current work on master (all of which looks like great improvements BTW) was not done on a User branch. And thus, the current work in master is not stable -- i.e. AFAIK, it cannot actually produce graphs in graphite or grafana -- which is, after all, the whole point in of gathering metrics :)

So I am going to approach my boss today to see if I can get a bounty assigned to this bug on BountySource.

@iain-buclaw-sociomantic

@TransactCharlie - this is what I came up with very briefly to get it working:

# From the list of returned metrics, only return those
# that match the immediate query pattern.
def truncate_nodes(query, paths):
    count = len(query.split('.'))
    seen = {}
    npaths = []

    for path in paths:
        # Get the first 'count' parts of the path name.
        qpath = '.'.join(path.split('.')[:count])
        # Skip over duplicates.
        if qpath in seen:
            continue
        # Add to the new paths list
        seen[qpath] = 1
        npaths.append({'leaf': (qpath == path), 'path': qpath })

    return npaths

Then in find_nodes:

    def find_nodes(self, query):
        paths = requests.get(urls.paths,
                             params={'query': query.pattern}).json()
        paths = truncate_nodes(query.pattern, paths)    # <-- Add this line after the request
        for path in paths:
            if path['leaf']:
                yield CyaniteLeafNode(path['path'],
                                      CyaniteReader(path['path']))
            else:
                yield BranchNode(path['path'])

It would have been better had it just worked out the box though. ;-)

@TransactCharlie
Copy link

thanks for that @iain-buclaw-sociomantic

Have you tested with some different filter permutations -- such as the { } ones? I'm not sure if cyanite successfully retrieves metric paths at this point

@iain-buclaw-sociomantic

The { } permutations look to be working just fine, eg: this retrieves what I expect to see:

http://localhost:8080/paths?query=eu-*.cpu-{0,3}.cpu-{system,user,wait,idle}.value

And I can walk through the grafana metrics interface selecting each individual step.

pyr added a commit that referenced this issue Oct 15, 2015
Improve performance and Graphite compatibility.

- [X] Refactor search interface
- [X] Cassandra search implementation
- [X] Graphite query parser
- [X] Load test procedure

Refactor search interface
-------------------------

The search functionality now puts a lot less responsibility in
the hands of implementers. Three functions are now expected from
implementations:

```clojure
(defprotocol MetricIndex
  (push-segment! [this pos segment path length])
  (by-pos        [this pos])
  (by-segment    [this pos segment]))
```

Based on these primitives, cyanite now builds a simple inverted
index of the following structure, given the input paths:
`collectd.web01.cpu` and `collectd.web02.cpu`:

```json
{
  "segments": {
    0: [ "collectd" ],
    1: [ "web01", "web02" ],
    2: [ "cpu" ]
  },
  "paths": {
    [0, "collectd"]: [["collectd.web01.cpu", 3], ["collectd.web02.cpu", 3]],
    [1, "web01"]:    [["collectd.web01.cpu", 3]],
    [1, "web02"]:    [["collectd.web02.cpu", 3]],
    [2, "cpu"]:      [["collectd.web01.cpu", 3], ["collectd.web02.cpu", 3]]
  }
}
```

Given this structure, cyanite will now split paths into segments, and perform
globbing queries on segments.

- `push-segment!`: Register a new path.
- `by-pos`: Yield all segments at a given position
- `by-segment`: Yield paths for a position and segment tuple

The globbing implementation is somewhat naive and leaves room for improvement,
implementations should aim to sort segments. Subsequent commits will bypass
`by-pos` whenever possible and perform prefix lookups up to the first wildcard
to further reduce lookup times.

Two implementations of this protocol are provided:

- `AgentIndex` stores segments and paths in memory, updates go through an agent.
- `CassandraIndex` provides Cassandra-backed storage for paths and segments.

The ElasticSearch-backed index is now gone, but a compatible implementation will
be provided as a subsequent improvement.

Graphite Query Parser
---------------------

A tokenizer for the Graphite syntax has been living in the tree for a while.
A subset of the syntax is now handled, and may be translated to an AST.

A `run-query!` function is also provided which will walk tokens to
extract paths, query paths, handling globs and will then be reduced
to the result of the operation if successful.

The following operations are already implemented:

- `sumSeries`
- `divideSeries`
- `scale`
- `absolute`

Globbing is handled by https://github.com/pyr/globber and adheres to
globbing rules as available in common shells.

Load testing procedure
----------------------

Cyanite now integrates https://github.com/feangulo/graphite-stresser for
development, the baseline against which it is tested is a workload
of 200000 metrics per second, flushed at at 5 second interval with
a maximum heap-size of 512m.

Remaining work
--------------

These changes fix & improve the following on-going issues:

- #119
- The path indexing part of #121.
- Most of the work needed for #136.
@pyr pyr removed this from the 0.5.1 milestone Oct 16, 2015
@pyr
Copy link
Owner

pyr commented Oct 16, 2015

#137 should fix this issue.

@pyr pyr closed this as completed in 71a6ad7 Oct 16, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants