
List series should have option to return shard space mapping #867

Closed
pauldix opened this issue Aug 22, 2014 · 16 comments
Comments

@pauldix
Member

pauldix commented Aug 22, 2014

So users can be sure they've set things up properly, they should be able to see which shard space a given series will be mapped to.

It seems the easiest way to do this is to have an option on list series that will have it return the shard space name each series will be mapped to. Like

list series, space
list series, space /&stats.*/

The , space would be optional. If included, the result would look something like:

Name Space
seriesA one_week_space
seriesB 30_days_space

I'm open to other potential query syntax, this was just my first idea.

@sahilthapar

👍

@schmurfy
Contributor

Why not just add the space column to the data returned by "list series"? Is it expensive?
Other than that, what would your second example do, list the series stored in the spaces matching a regexp?

@jvshahid
Contributor

@schmurfy agreed, I don't see a reason to make the query more complicated than just list series returning the extra column.

@schmurfy
Contributor

schmurfy commented Sep 9, 2014

Is there any news on this feature? I currently have a test system running fine, but I would like to check that everything is going where I think it is before pushing it to production.

@pauldix
Member Author

pauldix commented Sep 9, 2014

@jvshahid @schmurfy Returning the space mapping per series would definitely be more expensive, unless we cached that mapping. Basically, for each series name we'd loop through the shard spaces and see if it matches the regex.

It's certainly easier to do this feature without the updates to the query grammar. I could run a few tests to see how much it slows down the query.
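
A minimal sketch of the lookup described above, using hypothetical names (ShardSpace, spaceFor) rather than InfluxDB's real types: walk the spaces in order and take the first regex match, falling back to the default space.

```go
package main

import (
	"fmt"
	"regexp"
)

// ShardSpace is an illustrative stand-in for a shard space definition:
// a name plus the regex that series names are matched against.
type ShardSpace struct {
	Name string
	Re   *regexp.Regexp
}

// spaceFor walks the spaces in order and returns the name of the first
// space whose regex matches the series name.
func spaceFor(series string, spaces []ShardSpace) string {
	for _, s := range spaces {
		if s.Re.MatchString(series) {
			return s.Name
		}
	}
	return "default"
}

func main() {
	spaces := []ShardSpace{
		{Name: "one_week_space", Re: regexp.MustCompile(`^stats\.`)},
		{Name: "30_days_space", Re: regexp.MustCompile(`^events\.`)},
	}
	for _, series := range []string{"stats.cpu", "events.login", "misc.temp"} {
		// This per-series loop is the extra work the space column costs.
		fmt.Printf("%s -> %s\n", series, spaceFor(series, spaces))
	}
}
```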

@pauldix
Member Author

pauldix commented Sep 9, 2014

Just tested on my laptop with 500k series. Without the space names, list series took about 2s to run; with space names it was about 2.7s. Obviously it'll have a greater impact when going over a network, but gzip will help quite a bit with that.

What do you think @jvshahid, good enough?

@pauldix
Member Author

pauldix commented Sep 9, 2014

I forgot to mention that was with 3 shard spaces defined.

@Dieterbe
Contributor

Dieterbe commented Sep 9, 2014

Alternatively, we could keep 'list series' as just the list of series, and have a command like inspect shard <shard-name> to see which series match it.

@schmurfy
Contributor

@pauldix As a temporary measure, could you share the modified list series? If you can make it available on a branch somewhere, I could use it for my current goal, and it would not impact the discussion on how to really implement it in an InfluxDB release.

I think the real question for implementing this feature is: how is list series currently used?
As I see it, you would not run it until needed and would probably keep a cached version. Aside from the admin tools, which need to show this information, I don't see any real use for it in production code; if that's really the case, adding more data and slightly slowing it down is not an issue.
Do you have any use case for list series in production code?
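
A rough sketch of the kind of client-side caching described above, assuming production code only needs a periodically refreshed copy of the series list; seriesCache and its fetch hook are hypothetical, not part of any InfluxDB client.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// seriesCache keeps the result of a "list series" call for a short TTL so
// production code does not have to re-run the query on every request.
type seriesCache struct {
	mu        sync.Mutex
	ttl       time.Duration
	fetchedAt time.Time
	series    []string
	fetch     func() ([]string, error) // hypothetical hook that would run "list series"
}

func (c *seriesCache) Get() ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.series != nil && time.Since(c.fetchedAt) < c.ttl {
		return c.series, nil // served from memory, no query issued
	}
	s, err := c.fetch()
	if err != nil {
		return nil, err
	}
	c.series, c.fetchedAt = s, time.Now()
	return s, nil
}

func main() {
	cache := &seriesCache{
		ttl: 30 * time.Second,
		fetch: func() ([]string, error) {
			// In real use this would issue "list series" against the database.
			return []string{"seriesA", "seriesB"}, nil
		},
	}
	names, _ := cache.Get() // first call runs the (stubbed) query
	fmt.Println(names)
	names, _ = cache.Get() // second call within the TTL hits the cache
	fmt.Println(names)
}
```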

pauldix added a commit that referenced this issue Sep 10, 2014
Fixes #867. Updated lexer and parser to work, added code to coordinator to insert spaces if requested.
@Dieterbe
Contributor

> I think the real question for implementing this feature is: how is list series currently used?

Great point!
People who use InfluxDB as a Graphite backend do so via graphite-influxdb. Let me explain that use case, because I think it's important and pretty common.
The Graphite server receives a request such as target=someFunc(foo.bar.*.something.{match1,match2}.blah), and it needs to convert this into the appropriate queries for InfluxDB. That goes as follows (see the sketch after this comment for step 1):

1. Figure out which series are matched.
2. Query the data of those series.
3. Apply someFunc on the data (in the Graphite process).
4. Convert the results into a PNG graph and return that to the user.

All of this, of course, needs to be as fast as possible. Certain things can be cached, but even on the first hit it should still be fast, and we can't cache for long anyway because new series should become visible quickly. That's why the ability to retrieve the list of series (filtered by regex) as fast as possible is so important for the graphite-api use case (see also #884)
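
A hedged sketch of step 1: turning a Graphite glob pattern into a regex that could be handed to a list series /.../ query. The conversion rules (and globToRegex itself) are a simplification for illustration, not graphite-influxdb's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegex converts a Graphite-style pattern such as
// foo.bar.*.something.{match1,match2}.blah into a regular expression that
// could be used in a "list series /.../" query. Simplified for illustration.
func globToRegex(pattern string) string {
	var b strings.Builder
	b.WriteString("^")
	for _, c := range pattern {
		switch c {
		case '*':
			b.WriteString("[^.]*") // * only matches within one path segment
		case '{':
			b.WriteString("(")
		case '}':
			b.WriteString(")")
		case ',':
			b.WriteString("|") // alternatives inside {...}
		default:
			b.WriteString(regexp.QuoteMeta(string(c)))
		}
	}
	b.WriteString("$")
	return b.String()
}

func main() {
	re := globToRegex("foo.bar.*.something.{match1,match2}.blah")
	fmt.Println(re)
	// The series-lookup step would then be roughly: list series /<re>/
	fmt.Println(regexp.MustCompile(re).MatchString("foo.bar.x.something.match2.blah")) // true
	fmt.Println(regexp.MustCompile(re).MatchString("foo.bar.x.other.match2.blah"))     // false
}
```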

@Civil

Civil commented Sep 10, 2014

Also, my 5c on the previous comment:
The ability to know about the retention scheme and space mapping would also be useful, because it's better to get this info from InfluxDB than to force the user to specify it in a config file. The speed of this query is more important though, because even now 'list series' is slow, very slow. For dashboards (a typical dashboard in my practice is around 10-15 graphs), Graphite executes at least one 'list series /regex/' per graph (and that's with out-of-tree patches; with the patches it's 1 query per line, so it can easily be 50 queries for a dashboard). Even now you can see how the dashboard redraws; if it gets 30% slower it'll be totally unacceptable for displaying data with Graphite (and, I think, for any other graphing system with InfluxDB as storage).

For 1M series (250k series with 4 spaces), it could take a really long time. 10 queries at, say, 500ms each is 5 seconds just to make sure the graph can be plotted, and the update period is 30-60s. And what if there are more graphs? In my experience there can be 700k series without spaces (and more than 2M with them); how fast would that work?

Though it's only my opinion as a user.

@schmurfy
Contributor

thanks :)

@schmurfy
Contributor

Are you sure the result is reliable?
It shows me everything going into default, which is a bit odd since my default space is supposed to retain 7d worth of data, yet I can get points from 2 weeks ago.

Can anyone confirm it can return something other than default?

@schmurfy
Contributor

@pauldix On a somewhat related topic, would it be possible to add a similar way to return the shard space when doing a select? It would return the actual space the point was read from, instead of where it would end up being stored; I think both make sense for tracking down configuration errors.

@pauldix
Member Author

pauldix commented Sep 12, 2014

The test checks mappings to other spaces: https://github.com/influxdb/influxdb/blob/master/integration/single_server_test.go#L146-L181

I'm wondering if there's a problem here where you created shard spaces but didn't have a catch-all space (like the default). Then you wrote data in, it fell through, and that created the default space.

The problem is that when spaces are created, they get put at the front of the list, so they are evaluated first. After that, everything gets assigned to the default. Not sure, just a guess.

Can you post a gist of http://localhost:8086/cluster/configuration?u=root&p=root?
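
A small sketch of the ordering issue guessed at above, with hypothetical types: when a catch-all space sits at the front of the list it shadows the more specific spaces, so every series appears to map to default.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative types, not InfluxDB's actual configuration structs.
type space struct {
	name string
	re   *regexp.Regexp
}

// firstMatch returns the name of the first space whose regex matches.
func firstMatch(series string, spaces []space) string {
	for _, s := range spaces {
		if s.re.MatchString(series) {
			return s.name
		}
	}
	return "<none>"
}

func main() {
	statsSpace := space{name: "one_week_space", re: regexp.MustCompile(`^stats\.`)}
	catchAll := space{name: "default", re: regexp.MustCompile(`.*`)}

	// Catch-all evaluated last: the more specific space wins.
	fmt.Println(firstMatch("stats.cpu", []space{statsSpace, catchAll})) // one_week_space

	// Catch-all created later and placed at the front of the list: it
	// shadows everything, so every series appears to map to "default".
	fmt.Println(firstMatch("stats.cpu", []space{catchAll, statsSpace})) // default
}
```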

@schmurfy
Contributor

It seems to work now; what I did in the interval was remove the database and recreate it with a master build (to get the "include spaces" option).
Now if I run list series with space, it shows what I expect. My old configuration had one regexp wrong, but the others were right, so showing everything going into default was still wrong. Anyway, I don't have the faulty database anymore, so I suppose it's fine if I was the only one with this issue ^^
