Pipeline "route_to_stream()" function does not work with custom index sets #4954

Closed
lennartkoopmann opened this issue Jul 25, 2018 · 8 comments · Fixed by #6788

@lennartkoopmann lennartkoopmann (Member) commented Jul 25, 2018

The route_to_stream() pipeline function does not route the message into the index set of the underlying stream.

Expected Behavior

When routing a message into a stream that has a custom index set configured, I expect the route_to_stream() function to write the message into that index set.

Current Behavior

The message is written to the default index set.

Steps to Reproduce (for bugs)

  1. Create a stream with a new custom index set.
  2. Route a message into that stream, using the route_to_stream() function.
  3. Open the index set and you'll see that it has not stored any messages.
  4. Open the stream. You will not find any messages.
  5. Change the stream setting to use the default index set and you'll find the routed messages, because Graylog is now searching in the default index set.

Your Environment

  • Graylog Version: 3.0.0-snapshot-from-monday
@lennartkoopmann lennartkoopmann added this to the 3.0.0 milestone Jul 25, 2018
@bernd bernd added the needs-input label Jul 26, 2018
@bernd bernd (Member) commented Jul 26, 2018

@lennartkoopmann I am unable to reproduce that. Can you please show us your pipeline rule code?

@lennartkoopmann lennartkoopmann (Member, Author) commented Aug 15, 2018

I've had another issue very much related to this one. Will try to find time today or tomorrow to write up the exact steps to reproduce. (I think the processor order might be involved.)

Stay tuned.

@no-response no-response bot removed the needs-input label Aug 15, 2018
@jalogisch jalogisch (Member) commented Aug 20, 2018

I tried to reproduce this in my lab and on a clean OVA installation, and I can't reproduce it.
We would need to know how this can happen, or how exactly you configured things.

@lennartkoopmann lennartkoopmann (Member, Author) commented Aug 20, 2018

so....... :) I tried to reproduce this again and now it just works. Not sure what I did differently the last time.

Thanks!

@no-response no-response bot removed the needs-input label Aug 20, 2018
@no-response no-response bot reopened this Aug 20, 2018
@jrunu jrunu (Contributor) commented May 17, 2019

I hope it's alright that I bump this closed issue. I was seeing exactly the behaviour described in the Steps to Reproduce in the original post. The only difference is that I'm running Graylog 3.0.2+1686930.

While investigating I determined the following behaviour to be consistent:

  1. Setting a different index set while creating the stream leads to the expected behaviour.
  2. Changing the index set after creation shows the unexpected behaviour.
  3. Pausing and unpausing the stream after changing the index set leads to the expected behaviour.

So I would assume that, unlike when editing and saving inputs, for example, the trigger to restart/reload/propagate the change is not fired.

@bernd bernd removed this from the 3.0.0 milestone May 22, 2019
@bernd bernd (Member) commented May 22, 2019

@jrunu Thank you for the updated steps to reproduce this! 👍 I will reopen the issue.

@bernd bernd reopened this May 22, 2019
@bernd bernd added the #S label Oct 28, 2019
@thll thll self-assigned this Nov 11, 2019
thll added a commit that referenced this issue Nov 13, 2019
Previously we were storing the stream itself in the nameToStream
MultiMap but were using only the id of the stream as the comparator for
the underlying set. This would lead to stale stream objects in that
cache if the cache was updated with a changed stream which was already
cached.

By using a real cache to load the streams by name on demand and
invalidating if a stream changes, we impose a penalty on first lookup by
name after a change but avoid the stale items.

fixes #4954
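
The stale-object effect described in this commit message can be reproduced with plain JDK collections. The following is only a minimal sketch, not Graylog's code: the `Stream` class, its fields, and the `TreeSet` standing in for the value set of the `nameToStream` MultiMap are all hypothetical. It shows that a set ordered by id alone keeps the old object when a changed stream with the same id is added.

```java
import java.util.Comparator;
import java.util.TreeSet;

class StaleStreamDemo {
    // Hypothetical stand-in for a stream; not Graylog's actual Stream class.
    static final class Stream {
        final String id;
        final String title;
        final String indexSetId;

        Stream(String id, String title, String indexSetId) {
            this.id = id;
            this.title = title;
            this.indexSetId = indexSetId;
        }
    }

    // Returns the index set id the id-ordered set holds after an "update".
    static String cachedIndexSetAfterUpdate() {
        // The set considers two streams equal when their ids are equal.
        TreeSet<Stream> cached = new TreeSet<>(Comparator.comparing((Stream s) -> s.id));
        cached.add(new Stream("s1", "firewall", "default-index-set"));

        // Same stream id, but the index set was changed after creation.
        // TreeSet.add leaves the set unchanged if an equal element exists,
        // so the old object (with the old index set) stays in place.
        cached.add(new Stream("s1", "firewall", "custom-index-set"));

        return cached.first().indexSetId;
    }

    public static void main(String[] args) {
        System.out.println(cachedIndexSetAfterUpdate());
    }
}
```

In this sketch the lookup still yields `default-index-set` after the update, which matches the symptom reported above: messages routed by stream name end up in the default index set.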
@thll thll (Contributor) commented Nov 13, 2019

I was finally able to reproduce the issue. It only happens if the route_to_stream() pipeline function routes to the stream by name. If it routes to the stream by ID, everything works as expected.
This is a bug in the internal caching of streams used by the function. We will provide a fix for that.

@lennartkoopmann lennartkoopmann (Member, Author) commented Nov 15, 2019

That's great news! Thank you.

@bernd bernd closed this in #6788 Nov 25, 2019
bernd added a commit that referenced this issue Nov 25, 2019
Previously we were storing the stream itself in the nameToStream
MultiMap but were using only the id of the stream as the comparator for
the underlying set. This would lead to stale stream objects in that
cache if the cache was updated with a changed stream which was already
cached.

The implementation felt way too complicated for what it was doing. For
the benefit of having clearer semantics we simply reload all streams
when a stream changes. This will put additional load on the database for
cache updating but cache lookup times will stay stable.

Fixes #4954
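
The "reload all streams when a stream changes" approach can be sketched as follows. This is an illustration under stated assumptions rather than the code merged in #6788: `ReloadingStreamCache`, its `Stream` class, and the `Supplier` standing in for the database query are hypothetical names. The point is that the name index is rebuilt from scratch on every change, so it cannot hold a stale object, at the cost of one full load per change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class ReloadingStreamCache {
    // Hypothetical stand-in for a stream; not Graylog's actual Stream class.
    static final class Stream {
        final String id;
        final String title;
        final String indexSetId;

        Stream(String id, String title, String indexSetId) {
            this.id = id;
            this.title = title;
            this.indexSetId = indexSetId;
        }
    }

    // Stands in for "load all streams from the database" (an assumption).
    private final Supplier<List<Stream>> loadAll;
    private volatile Map<String, List<Stream>> byName = Collections.emptyMap();

    ReloadingStreamCache(Supplier<List<Stream>> loadAll) {
        this.loadAll = loadAll;
        reload();
    }

    // Call this whenever any stream changes: rebuilds the whole name index.
    void reload() {
        Map<String, List<Stream>> fresh = new HashMap<>();
        for (Stream s : loadAll.get()) {
            fresh.computeIfAbsent(s.title, k -> new ArrayList<>()).add(s);
        }
        byName = fresh; // swap in the new index in one step
    }

    // Lookups only read the prebuilt map, so lookup cost stays stable.
    List<Stream> findByName(String name) {
        return byName.getOrDefault(name, Collections.emptyList());
    }
}
```

A caller would construct the cache once and invoke `reload()` from whatever change notification is available; how Graylog wires that notification is not shown in this thread.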
linuspahl added a commit that referenced this issue Nov 27, 2019
bernd added a commit that referenced this issue Nov 28, 2019
(cherry picked from commit e9dc74a)