Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use part-time-simple in fixed-time-window #728

Closed

Conversation

mcorbin
Copy link
Contributor

@mcorbin mcorbin commented Oct 7, 2016

fix #563
I tried to use part-time-simple in fixed-time-window, so fixed-time-window can fire the vector of events by itself. It seems to work. I have also updated the test suite.
I am new to Riemann and his codebase, so maybe i missed something.

with part-time-simple, fixed-time-window can fire the vector of events
by itself
@mcorbin mcorbin force-pushed the part-time-simple-fixed-time-window branch from 06ab432 to 12cf43a Compare October 7, 2016 20:57
@mcorbin
Copy link
Contributor Author

mcorbin commented Oct 9, 2016

I have a few interrogations.

If i understand correctly the actual code:

  • The start-time of the window is the time of the first event. here
  • Events are added to the buffer if they have no :time or if :time < start-time + window-size here
  • If one event have :time > start-time + window-size, we are in a new window. The beginning of the new window is calculated. We now send to the children the buffer of events and n empty buffers, when n is the number of window we "missed". here

My code add events to the buffer if the event has no :time or if :time > start-time. I don't check if the event :time < start-time + window-size. Is this necessary ?
Also, if there is no event in a window i don't generate empty buffer to the children. part-time-simple indicate "When no events arrive in a given time window, no functions are called". But i generate empty buffer if i drop events (part-time-simple is called, and i drop the event because his :time is too old)

What is the desired behavior for fixed-time-window ? There are several possibilities regarding :time (OK if event :time in [start-time, start-time + window-size], or maybe only if :time < start-time + window-size, or if :time > start-time etc...) and regarding empty buffers send to children (and the part-time-simple behavior)

@mcorbin
Copy link
Contributor Author

mcorbin commented Oct 20, 2016

can someone answer my questions ? thx

@aphyr
Copy link
Collaborator

aphyr commented Oct 26, 2016

I like this idea. It is going to change the semantics, but we've received consistent feedback that having the windowing functions be lazy (instead of triggering every n seconds) is confusing, so I think the change is supportable--and it will simplify the codebase some. I suggest making this change to all fixed-windows (and cleaning up any unused fixed-window functions).

There is a critical problem, though: because this uses local time as opposed to stream time, this will break windowing functions that operate on delayed events. Part-time-simple's windows cut over on local time, e.g. when the system clock ticks 10, 20, 30 seconds, etc. However, there's no guarantee the events arriving at time 30 will be from time 30--they could be from time 20. The current fixed-window implementation doesn't care, because it only depends on event times. However, if you reject events by comparing event times to the current times, people who feed fixed windows with, say, rollups, rates, or percentiles--any stream that delays--might find their events silently dropped.

So I'd say that if we make this change, you should either a.) write a variant of part-time-simple that uses event times to infer the current clock (this might be hard; the riemann.time scheduler isn't really built for it) or b.) ignore event times altogether. Or c.) leave things the way they are. None of these are really... ideal, so I'll leave the call up to @pyr and @jamtur01. What do y'all think?

@Anvil
Copy link

Anvil commented Oct 27, 2016

OK, for the record, I'm positively sure that the problem described by @aphyr will hit the configuration w have here. I'm all for keeping the current fixed-time-window implementation as is.

The setup we have here is one or several front "level-1" riemann instance(s) processing and forwarding events to a "level-2" riemann instance running some fixed-time-window to collect and process events from "level-1". With the current fixed-time-window implementation, whatever the load "level-2" riemann, all events can be processed. A part-time-simple based solution would not allow that, as far as I understand the new implementation.

And I cannot ignore :time, because it is information, not noise.

My position here is : "I dont want to miss events however late they are as far as they arrive in pseudo-order that allow fixed-time-window to efficiently gather 99%-ish of them."

@Anvil
Copy link

Anvil commented Oct 27, 2016

(please, note that, despite of what I need, I realize the advantages of your solution. Maybe the 2 implementations can coexist with different names?)

@mcorbin
Copy link
Contributor Author

mcorbin commented Nov 1, 2016

I think the actual code can drop events. The docstring of fixed-time-window-fn contains : "Once an event is emitted, all events older or equal to that emitted event are silently dropped."

And in the code :

                        ; Too old
                        (< (:time event) @start-time)
                        nil

@mcorbin
Copy link
Contributor Author

mcorbin commented Dec 12, 2016

I close this PR for now.

@boernd
Copy link
Contributor

boernd commented Dec 6, 2017

@mcorbin I think it would be really nice to have a windowed time stream where you do not care about event time but only about the arrival time of the event. I took your code from the PR and just removed the if condition of the add function and it seems to work. Maybe we could call it fixed-system-time-window or something similiar?

@mcorbin
Copy link
Contributor Author

mcorbin commented Dec 6, 2017

@boernd Yes we could if @jamtur01 and @aphyr are ok with it. Btw, it's maybe better to use part-time-fast instead of part-time-simple.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fixed-time-window doesn't seem to fire the vector of events by itself
4 participants