Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plugins/filestat: Skip missing files #7316

Merged
merged 2 commits into from
Feb 17, 2021

Conversation

antifuchs
Copy link
Contributor

This PR attempts to fix #7315, the same problem as #6940, but without adding a config flag.

Previously, the filestat plugin would check for a file's existence, and then try to compute all sorts of things for the missing file, which failed and resulted in a log message about possible permission errors.

Now, the plugin just skips the file if it exists. If an error is encountered probing an existing file, the plugin now logs that error, too, instead of hazarding a guess at permission problems.

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated. (no update needed)
  • Has appropriate unit tests: Added a test.

Previously, the filestat plugin would check for a file's existence,
and then try to compute all sorts of things for the missing file,
which failed and resulted in a log message about possible permission
errors.

Now, the plugin just skips the file if it exists. If an error is
encountered probing an existing file, the plugin now logs that error,
too, instead of hazarding a guess at permission problems.
@sjwang90 sjwang90 added the fix pr to fix corresponding bug label Nov 23, 2020
Copy link
Contributor

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to ignore the files silently. What if the user misspelled the filename? He probably will notice quite late (when analyzing the data). So my guts-feeling is we should log at least once.
How about adding a map[string]bool to the struct for remembering which filename was logged already and output a warning on the first encounter of the missing file. Then set the map entry for the file to true and skip any further warning-outputs...

How does that sound to you?

@srebhan srebhan self-assigned this Dec 4, 2020
@sjwang90 sjwang90 added this to the Planned milestone Dec 9, 2020
@srebhan
Copy link
Contributor

srebhan commented Jan 6, 2021

@antifuchs any update?

@antifuchs
Copy link
Contributor Author

Ah, sorry I missed this! I will look into getting this implemented.

@antifuchs
Copy link
Contributor Author

@srebhan Updated - I hope this reflects what you had in mind.

Comment on lines 40 to 42
missingFiles map[string]struct{}
// files that had an error in Stat - we only log the first error.
filesWithErrors map[string]struct{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Suggested change
missingFiles map[string]struct{}
// files that had an error in Stat - we only log the first error.
filesWithErrors map[string]struct{}
missingFiles map[string]bool
// files that had an error in Stat - we only log the first error.
filesWithErrors map[string]bool

Comment on lines 96 to 101
if _, ok := f.missingFiles[fileName]; !ok {
f.Log.Warnf("File %q not found", fileName)
f.missingFiles[fileName] = struct{}{}
}
continue
} else {
delete(f.missingFiles, fileName)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can simplify it to

Suggested change
if _, ok := f.missingFiles[fileName]; !ok {
f.Log.Warnf("File %q not found", fileName)
f.missingFiles[fileName] = struct{}{}
}
continue
} else {
delete(f.missingFiles, fileName)
}
if ! f.missingFiles[fileName] {
f.Log.Warnf("File %q not found", fileName)
f.missingFiles[fileName] = true
}
continue
}
f.missingFiles[fileName] = false

Comment on lines 106 to 112
if _, ok := f.filesWithErrors[fileName]; !ok {
f.filesWithErrors[fileName] = struct{}{}
f.Log.Errorf("Unable to get info for file %q: %v",
fileName, err)
}
} else {
delete(f.filesWithErrors, fileName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. Use a string/bool map.

Copy link
Contributor

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a map[string]bool instead. See the comments in the code. Otherwise looks good!

If we encounter an error (file not found or otherwise), only log that
error once, instead of every time the file gets stat'ed.

* Introduce two new maps, for holding information on whether the last
  stat resulted in "file not found" or another error
* Probe and set that value accordingly
* Also fix a bug where the actual error encountered in Stat didn't get
  logged.
@antifuchs
Copy link
Contributor Author

Restructured to use string/bool maps, hope that works for you

@sjwang90 sjwang90 removed this from the Planned milestone Jan 29, 2021
Copy link
Contributor

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Feb 1, 2021
Copy link
Contributor

@ssoroka ssoroka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that the memory use of this grows infinitely on directories with a lot of files going in/out. I guess I'm okay merging it, but I think we should come back and fix that up.

Comment on lines +39 to +42
// files that were missing - we only log the first time it's not found.
missingFiles map[string]bool
// files that had an error in Stat - we only log the first error.
filesWithErrors map[string]bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi, struct{} uses no memory.

Suggested change
// files that were missing - we only log the first time it's not found.
missingFiles map[string]bool
// files that had an error in Stat - we only log the first error.
filesWithErrors map[string]bool
// files that were missing - we only log the first time it's not found.
missingFiles map[string]struct{}
// files that had an error in Stat - we only log the first error.
filesWithErrors map[string]struct{}

Copy link
Contributor Author

@antifuchs antifuchs Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that a struct{} map would use less memory, but then again it'll use about as much memory as the number of files configured to be checked, whose config structure already uses memory ... it comes out a wash, I feel like.

@ssoroka ssoroka merged commit b991aab into influxdata:master Feb 17, 2021
ssoroka pushed a commit that referenced this pull request Feb 17, 2021
arstercz pushed a commit to arstercz/telegraf that referenced this pull request Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix pr to fix corresponding bug ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
4 participants