This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

ModularInput: Events in the EventWriter queue are lost when Splunk shuts down. #37

Closed
brattonc opened this issue Apr 29, 2015 · 6 comments

@brattonc
Contributor

I ran into this issue while working on a modular input that processes a large amount of data and therefore runs for an extended period of time. If Splunkd shuts down while a modular input is running and events are sitting in the EventWriter queue, those events are lost and never logged to Splunk. Additionally, the events are still reported to the IProgress<EventWrittenProgressReport> as if they had been written successfully. This makes it impossible to create a checkpoint for the file being processed, because there's no way to tell how many events actually made it to Splunk.

Steps to reproduce:

  1. Create a modular input that continuously logs events.
  2. Run the modular input using Splunk and let the EventWriter queue fill up with at least 100,000 events.
  3. Stop the Splunk Windows service while the modular input is running.

At this point Splunk will send a ctrl+break signal to the modular input. I have the modular input trap this signal and stop writing events. The EventWriter continues writing the queued events to stdout and reporting them to the progress reporter, but none of these events reach Splunkd. In fact, events written to stdout up to 500 ms before the ctrl+break is received (the exact window varies from run to run) are lost as well.

@brattonc
Contributor Author

A fix I'd suggest is...

  1. Have Splunkd send a signal to the modular input via its stdin requesting that it shut down.
  2. The modular input framework would then discard everything in the event writer queue (without signaling the progress reporter) and signal back to splunkd that event writing is complete.
  3. And then the modular input framework fires an event so that checkpoint data can be saved.
  4. If the modular input doesn't terminate within 5 seconds, then have splunkd send the ctrl+break signal to it.

This isn't a simple fix as it requires changes to Splunkd, but I think that's unavoidable given the symptoms I've outlined in the issue (losing some events that are written even before the modular input receives the ctrl+break).
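
For illustration only, steps 2 and 3 of the suggestion above could be sketched roughly as follows. This is a language-neutral sketch written in Python, not the C# SDK's API: `SketchEventWriter`, `request_shutdown`, and both callbacks are hypothetical names, and the Splunkd-side signalling from steps 1 and 4 is not modelled at all.

```python
import io
import queue
import threading


class SketchEventWriter:
    """Hypothetical queue-backed writer; NOT the Splunk SDK's EventWriter."""

    def __init__(self, out, on_written, on_shutdown):
        self._out = out                  # stream standing in for stdout
        self._on_written = on_written    # progress callback, once per written event
        self._on_shutdown = on_shutdown  # called once with the written-event count
        self._queue = queue.Queue()
        self._stopping = threading.Event()
        self.written = 0

    def enqueue(self, event):
        self._queue.put(event)

    def request_shutdown(self):
        # Assumed entry point for the shutdown request described in step 1.
        self._stopping.set()

    def run(self):
        # Write events until the queue is empty or shutdown is requested.
        while not self._stopping.is_set():
            try:
                event = self._queue.get_nowait()
            except queue.Empty:
                break
            self._out.write(event + "\n")
            self.written += 1
            self._on_written(event)
        # Step 2: drop whatever is still queued WITHOUT reporting progress.
        discarded = 0
        while True:
            try:
                self._queue.get_nowait()
            except queue.Empty:
                break
            discarded += 1
        # Step 3: let the input save a checkpoint matching what was written.
        self._on_shutdown(self.written)
        return discarded


# Demo: a shutdown request arrives after the third event has been written.
out = io.StringIO()
reported = []
checkpoints = []


def on_written(event):
    reported.append(event)
    if len(reported) == 3:
        writer.request_shutdown()


writer = SketchEventWriter(out, on_written, checkpoints.append)
for i in range(10):
    writer.enqueue(f"event-{i}")
discarded = writer.run()
```

The point of the sketch is that the progress callback and the checkpoint only ever see events that were actually written, so the checkpoint count stays consistent with the output even though the rest of the queue is thrown away.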

@itay
Contributor

itay commented Apr 29, 2015

@brattonc thanks for filing the bug. As you noted, there is a larger issue with the control mechanisms from splunkd, and that's something we will look into but won't be fixed in the immediate future.

Do you think there is a bug in the C# Mod Input framework itself to fix, or is this just a symptom of the larger problem?

@brattonc
Contributor Author

Imo, the EventWriter sending events to the progress reporter when it could know that Splunkd is trying to terminate it (via ctrl+break) is a bug. Trapping this and dumping the event queue wouldn't fix the issue, but would at least mitigate some of the damage. I'd rather lose a couple hundred events than 100k+ as it is right now.

I understand if this isn't a desirable change, as it's a stopgap and not a true fix. As an alternative, a mechanism for injecting some other implementation of an EventWriter into the ModularInput class would let me implement the ctrl+break trap and queue dumping myself.
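
As a rough sketch of what such an injection point might look like (again in Python as a language-neutral illustration, not the C# SDK): `SketchModularInput`, `ListWriter`, and `StoppableWriter` are hypothetical names, and the `stop()` call stands in for whatever a real ctrl+break trap would do.

```python
class ListWriter:
    """Default stand-in writer: records every event it is given."""

    def __init__(self):
        self.written = []
        self.dropped = 0

    def write(self, event):
        self.written.append(event)


class StoppableWriter(ListWriter):
    """Custom writer that silently drops events once stop() has been called."""

    def __init__(self):
        super().__init__()
        self.stopped = False

    def stop(self):
        self.stopped = True

    def write(self, event):
        if self.stopped:
            self.dropped += 1  # counted, but never written or reported
            return
        super().write(event)


class SketchModularInput:
    """Hypothetical input class that accepts an injected writer."""

    def __init__(self, writer=None):
        # Fall back to the default writer when none is injected.
        self.writer = writer if writer is not None else ListWriter()

    def stream_events(self, events):
        for event in events:
            self.writer.write(event)


custom = StoppableWriter()
modular_input = SketchModularInput(writer=custom)
modular_input.stream_events(["a", "b"])
custom.stop()  # stands in for the ctrl+break trap
modular_input.stream_events(["c", "d", "e"])
```

With a hook like this, the stopgap behaviour lives entirely in the injected writer and the framework's default writer is left unchanged.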

@itay
Contributor

itay commented Apr 29, 2015

@brattonc we'll look into this and see what we can do. Pull requests as always are welcome.

@glennblock
Contributor

Thank you @brattonc. We will definitely look into this.

@ncanumalla-splunk
Contributor

This SDK is deprecated and no longer under active development.
