Directory : provide a last_seen hint #183

Closed
StevenLeRoux opened this issue May 4, 2017 · 8 comments

@StevenLeRoux
Contributor

To know whether a series is still active, you can fetch its last available datapoint and check whether its timestamp matches your criteria.

If automatic data eviction is enabled, you won't even be able to look up this last datapoint.

It would be very useful to have a last_seen hint per GTS that gives the timestamp (second resolution is enough) of the last update on that GTS.

There are two possible places to implement this:

  • Ingress
  • Store

The Store component is CPU bound and uses less memory than Ingress, since Ingress maintains the metadata cache, so in a distributed deployment it would be better to implement this on the Store side.

It would maintain a structure, such as a concurrent hash map, mapping each TS ID to the last timestamp seen for it.

The data could be sampled and published on a best-effort basis, either directly to Directory (standalone) or to a dedicated Kafka topic.

The IndexSpec struct could carry a last_seen field, so that we could ultimately perform a LASTSEEN call with an optional parameter:
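A minimal sketch of such a structure, written in Python for brevity rather than in the project's Java. `LastSeenTracker`, `touch` and `snapshot` are hypothetical names, and a lock-protected dict stands in for a concurrent hash map:

```python
import threading
import time


class LastSeenTracker:
    """Maps a GTS id to the wall-clock second of its last update (hypothetical helper)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._lock = threading.Lock()
        self._last_seen = {}

    def touch(self, gts_id):
        # Record "now" (second resolution) as the last update time of this GTS.
        now = int(self._clock())
        with self._lock:
            self._last_seen[gts_id] = now

    def snapshot(self):
        # Copy the map so it can be published (best effort) without holding the lock.
        with self._lock:
            return dict(self._last_seen)
```

The snapshot is what a sampler would periodically publish; losing one snapshot only delays the hint, which is acceptable for a best-effort value.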

[ 'RTOKEN' 'class_pattern' { labels } ] LASTSEEN

The result would be:

[{
    "c": "class",
    "l": {
        "label0": "value0",
        "label1": "value1"
    },
    "a": {
        "attr0": "value0"
    },
    "v": [
        [0, last_seen]
    ]
}]

This way, we can use the existing frameworks (FILTER, ...) to manipulate the result and easily get the series whose last update is older than n days.
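As an illustration of that kind of post-processing, here is a small Python sketch that works on the decoded JSON result. It assumes that last_seen is expressed in seconds and stored as the value of a single datapoint at timestamp 0, as in the result above; `stale_series` is a hypothetical helper, not an existing function:

```python
def stale_series(result, now, max_age_days):
    """Return the entries whose last_seen is more than max_age_days before now.

    `now` and last_seen are assumed to be Unix timestamps in seconds.
    """
    limit = max_age_days * 86400
    return [gts for gts in result if now - gts["v"][0][1] > limit]


result = [
    {"c": "active", "l": {}, "a": {}, "v": [[0, 1501110625]]},
    {"c": "idle", "l": {}, "a": {}, "v": [[0, 1500000000]]},
]
old = stale_series(result, now=1501110625, max_age_days=7)
```

Here only the "idle" series is returned, since its hint is more than 7 days older than `now`.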

This would be very helpful to manage the Directory.

@hbs
Collaborator

hbs commented May 4, 2017

Which component would handle the LASTSEEN query?

@StevenLeRoux
Contributor Author

It would be the Directory (through egress).

LASTSEEN would just be a FIND that exports the lastseen field as a value.

It could also be FIND directly, since this is not really intrusive and fits into the current output format.

/find could handle it with a &lastseen=true query string parameter (defaulting to false), appending the lastseen timestamp at the end of each output line:
class{labels}{attributes} lastseen

@StevenLeRoux
Contributor Author

StevenLeRoux commented May 4, 2017

I stated that the Stores would be a better place to implement this, but Ingress already has a metadataCache that could be used for it. Today the assigned value is just null, but we could set it to the current timestamp whenever an update query is processed.

A periodic scan over the map would then compare each value to 'now' and trigger pushMetadataMessage if now - lastseen > Configuration.DIRECTORY_GTS_FRESHNESS_LIMIT.

Alternatively, metadataCache could simply be an LRU: on eviction, a pushEvictionMetadataMessage is sent and consumed by Directory to update its index entry. In this case, IndexSpec would just flag the GTS as "evicted".
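A rough sketch of the LRU variant, in Python rather than the project's Java. `EvictingCache` and the `on_evict` callback are hypothetical; the callback stands in for the proposed pushEvictionMetadataMessage:

```python
from collections import OrderedDict


class EvictingCache:
    """LRU keyed by GTS id that reports each eviction through a callback."""

    def __init__(self, capacity, on_evict):
        self._capacity = capacity
        self._on_evict = on_evict
        self._entries = OrderedDict()

    def touch(self, gts_id, now):
        # Store the update time and mark the entry as most recently used.
        self._entries[gts_id] = now
        self._entries.move_to_end(gts_id)
        # Evict least recently used entries until the cache fits again.
        while len(self._entries) > self._capacity:
            evicted_id, last_seen = self._entries.popitem(last=False)
            self._on_evict(evicted_id, last_seen)
```

With this design, Directory only hears about a GTS once it stops being updated for long enough to fall out of the cache, so the message volume depends on the eviction rate rather than on the update rate.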

@hbs
Collaborator

hbs commented May 5, 2017

This behavior cannot be implemented as you suggest, because this would basically be the equivalent of a massive periodic cache flush of all the series which are actively updated. We will need to introduce some jitter so the pushes to the metadata topic are spread in time.
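One possible way to spread the pushes, sketched in Python: each GTS gets a random delay within the flush period instead of all entries being pushed at the same instant. `jittered_offsets` is a hypothetical helper and the uniform distribution is an assumption, not something decided in this issue:

```python
import random


def jittered_offsets(gts_ids, period_ms, rng=None):
    """Assign each GTS a random delay in [0, period_ms) within the flush period."""
    if rng is None:
        rng = random.Random()
    return {gts_id: rng.random() * period_ms for gts_id in gts_ids}
```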

The notion of 'lastseen' also needs to be defined more precisely. You seem to imply that it only tracks datapoints being pushed to the GTS, but what about deletes and/or attribute updates (meta)?

Let's think about it more thoroughly and update this issue as ideas become less blurry.

@StevenLeRoux
Contributor Author

StevenLeRoux commented May 9, 2017

I agree on a jitter to manage the cache flush.

'lastseen' corresponds to the last time we saw a datapoint for the GTS. It does not inherit the datapoint's own timestamp.

Deletes would remove the GTS from both Directory and MetadataCache, regardless of the lastseen value.

Attribute updates only operate at the Directory level and do not affect 'lastseen', which is only updated by Update queries.

'lastseen' is just a hint (no atomicity guarantee) that shows the last Update activity for a given GTS. If we use a standard pushMetadataMessage() for the cache flush, then whenever deletes are waiting to be consumed, the flush needs to be stopped, reset, and delayed for a while. This prevents the cache flush from sending back a GTS that was deleted from Directory.

Otherwise, pushMetadataMessage() could be leveraged for this, which would avoid overly intrusive modifications. Basically, it would work as if Ingress had no MetadataCache.

This feature could be controlled through:

//
// Enable metadata cache flush to update lastseen
//
ingress.metadata.cache.flush.enable = boolean

//
// Set metadata cache flush period (in ms, default to 60000)
//
ingress.metadata.cache.flush.period = long

//
// Set metadata cache flush batch
//
ingress.metadata.cache.flush.batch = long

Example :

ingress.metadata.cache.flush.enable = true
ingress.metadata.cache.flush.period = 10000
ingress.metadata.cache.flush.batch = 1000

=> 24h to update 8.64M GTS

ingress.metadata.cache.flush.period = 1000
ingress.metadata.cache.flush.batch = 1000

=> 24h to update 86.4M GTS (implies not receiving Delete queries)

ingress.metadata.cache.flush.period = 100
ingress.metadata.cache.flush.batch = 1000

=> 24h to update 864M GTS (36M/h)
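The figures above follow from simple arithmetic: one batch is flushed per period, so the number of GTS updated in 24h is the number of periods in a day times the batch size. A short Python check, where `gts_flushed_per_day` is a hypothetical helper name:

```python
def gts_flushed_per_day(period_ms, batch):
    """Number of GTS pushed in 24h when `batch` entries are flushed every `period_ms`."""
    flushes_per_day = 24 * 3600 * 1000 // period_ms
    return flushes_per_day * batch


for period_ms in (10000, 1000, 100):
    print(period_ms, gts_flushed_per_day(period_ms, 1000))
```

This gives 8,640,000, 86,400,000 and 864,000,000 GTS per day respectively; the last figure divided by 24 is the 36M/h quoted above.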

@StevenLeRoux
Contributor Author

We can add a third parameter to cache flush policy :

ingress.metadata.cache.flush.enable = true
ingress.metadata.cache.flush.ttl = 24 (in hours)
ingress.metadata.cache.flush.every = 1 (in hours)
ingress.metadata.cache.flush.sleep = 100
ingress.metadata.cache.flush.batch = 1000

This means that while iterating over the map, we compare the 'lastseen' value with the current timestamp. If (now - lastseen < TTL), we push. If (now - lastseen > TTL), we do nothing.

This way we avoid pushing older GTS that have not seen any activity, which can dramatically reduce the number of messages.
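The selection step could look like the following Python sketch. It assumes 'lastseen' values and `now` are in seconds and the TTL is given in hours, as in the example configuration; `entries_to_push` is a hypothetical name:

```python
def entries_to_push(last_seen, now, ttl_hours):
    """Return the ids of GTS updated within the TTL; older entries are skipped."""
    ttl = ttl_hours * 3600
    return [gts_id for gts_id, seen in last_seen.items() if now - seen < ttl]


cache = {"recent": 1501110000, "old": 1500000000}
to_push = entries_to_push(cache, now=1501110625, ttl_hours=24)
```

Only "recent" is selected here, because "old" was last updated more than 24 hours before `now`.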

@StevenLeRoux
Contributor Author

Summary of the latest proposal.

New configuration keys :

  • ingress.metadata.cache.lastseen.enable = true (default)
  • ingress.metadata.cache.lastseen.freshness = 24 (in hours)

The idea is to leverage the Ingress metadata cache to advertise updates on series.
The Metadata thrift structure will be modified to add a "lastseen" field.
The Ingress metadata cache will use the modified thrift structure, so the Ingress component will be able to get the current 'lastseen' value, subtract it from the current timestamp and compare the result to ingress.metadata.cache.lastseen.freshness.

If (now - current_lastseen) < freshness: do nothing
Else: emit a Metadata message to Kafka with now as the lastseen value

Lastseen should be enabled by default. Since it will be documented, it should be the normal behaviour, and disabling it should be considered a custom deployment.
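The rule above can be sketched in Python as follows. `on_update` and `emit` are hypothetical names (`emit` stands in for the Kafka push), timestamps are assumed to be in seconds, and treating a GTS with no cached value as stale is an assumption not stated above:

```python
def on_update(cache, gts_id, now, freshness_hours, emit):
    """Handle an update for gts_id; emit only when the cached hint is stale."""
    freshness = freshness_hours * 3600
    current = cache.get(gts_id)
    if current is not None and now - current < freshness:
        return False  # hint is fresh enough: do nothing
    # Hint is missing (assumed stale) or too old: refresh it and advertise it.
    cache[gts_id] = now
    emit(gts_id, now)
    return True
```

With a freshness of 24 hours, a series that is updated continuously produces at most one metadata message per day.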

From a user perspective :

FIND: this could be integrated into FIND, returning a single datapoint with timestamp 0 and lastseen as its value:

[{
    "c": "class",
    "l": {
        "label0": "value0",
        "label1": "value1"
    },
    "a": {
        "attr0": "value0"
    },
    "v": [
        [0, lastseen]
    ]
}]

LASTSEEN : if we don't want to break the FIND output, we can introduce a LASTSEEN function that is a FIND with the lastseen hint.

FETCH / FIND: filtering on lastseen is possible by passing PARAM_LASTSEEN (TS) and PARAM_LASTSEEN_FRESH (boolean) when a MAP parameter is used instead of the default LIST parameter.

DELETE : $WTOKEN 'selector' NULL NULL -n DELETE
If the GTS count argument is -value or +value, the DELETE selector is applied to metadata filtered on lastseen. "-value" means "NOW - lastseen < value" and "+value" means "NOW - lastseen > value".
For example:
-24h (-1 24 h *): NOW - lastseen < 24 h => selects the series that were active during the last 24h
+24h (24 h): NOW - lastseen > 24 h => selects the series that were inactive during the last 24h

API /api/v0/find: the URL query parameter &lastseen=on would append the last seen timestamp as a value, as in the GTS input format:
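The sign convention can be expressed as a small predicate. This Python sketch assumes all quantities are in seconds; `matches_lastseen_filter` is a hypothetical name:

```python
def matches_lastseen_filter(last_seen, now, value):
    """Negative value: series active within |value|; positive value: inactive for longer."""
    age = now - last_seen
    if value < 0:
        return age < -value
    return age > value


DAY = 24 * 3600
now = 1501110625
active = matches_lastseen_filter(now - 3600, now, -DAY)
inactive = matches_lastseen_filter(now - 2 * DAY, now, DAY)
```

Both calls return True: the first series was updated one hour ago and matches "-24h", the second was last updated two days ago and matches "+24h".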

class1{label1=val1}{attr1=val1} 1501110625
class2{label=value} 1501110625
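A client could split such lines as in the Python sketch below. It assumes the timestamp is the last space-separated token of each line and is expressed in seconds; `parse_find_line` is a hypothetical helper:

```python
def parse_find_line(line):
    """Split a /find output line into (selector, lastseen)."""
    selector, _, last_seen = line.strip().rpartition(" ")
    return selector, int(last_seen)


parsed = parse_find_line("class1{label1=val1}{attr1=val1} 1501110625")
```

`parsed` is then the pair ("class1{label1=val1}{attr1=val1}", 1501110625).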

@StevenLeRoux
Contributor Author

The current proposal is in PR #200.
