Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes Redmine #1855: don't close file descriptors we use #621

Merged

Conversation

jooooooon
Copy link
Contributor

I've been studying the Redmine #1855 bug, which causes promise analysis strings (mostly HTML, sometimes TEXT) to be output to syslog at random times.

I've included a full explanation on https://cfengine.com/dev/issues/1855, but basically this patch gets the file descriptors used to write the HTML/TEXT promise analysis strings to files (actually, /dev/null usually) and makes sure the ActAsDaemon function doesn't close them. I believe that closing them, then writing to them, is what's causing this issue.

I apologize if this patch doesn't match CFEngine coding style or best practices - I did my best to match what was around in the file. Also, if I'm screwing up in my logic here, please let me know. Basically just tell me how you feel about this and I'll rework it if necessary.

@jooooooon
Copy link
Contributor Author

Hmmm, the bugtracker seems to be down, so I couldn't post the details there. Here's a copy/paste:

OK, I've spent some time debugging this, and I think I've figured out what's happening. I can reproduce this consistently on one machine just by changing my promises. Here's a bit of a long-winded description, in the hope CFEngine developers can build a fix for this into the next release.

The HTML outputs occur when cf-serverd re-reads promises from disk, which happens automatically after they've changed. The outputs to syslog happen for some calls to ShowPromiseInReportHtml, not all. This function calls WriterWriteF, which is essentially a wrapper around the standard vfprintf C function that writes to a file. Again, not all calls to this function cause the outputs to syslog, and the "syslog()" call is never used directly.

When cf-serverd starts up, the main function in src/cf-serverd.c does basically this:

  • Create a "ReportContext" with a call to function OpenReports. This runs fopen() on 2 or 3 files, and stores the FILE pointers for use later by WriterWriteF (actually, these are /dev/null usually but still valid file descriptors).
  • Some other non-relevant stuff
  • Calls function StartServer (from src/cf-serverd-functions.c), which if "NO_FORK" is true (ie, cf-serverd is not run with -d, -v or --no-fork) calls the function ActAsDaemon (from src/exec_tools.c). This function, ActAsDaemon, loops through all file descriptors that may exist, and closes them. Yes, that's right, including the ones OpenReports just created, and stored for use every time the ShowPromiseInReportHtml function is called.

So basically, unless I'm mistaken (and that is of course quite possible), what's happening is we're opening a file descriptor (to /dev/null, but still) and then closing it, and then writing lots of HTML to the closed file descriptor. I suspect that the vfprintf function is then reading random memory instead of the real FILE pointer, and some pattern in memory causes this to end up on syslog (which must be a valid file descriptor somewhere, so given enough HTML output, we end up hitting that file descriptor), but only randomly. I don't really know why this lands up in syslog, but the random pattern this occurs at suggests some explanation of the sort. Either way, it doesn't seem to make sense to close a file descriptor and then use it.

I've submitted a Pull Request that contains a patch I have tested successfully on the machine I used to reproduce this. Since implementing it, I've never reproduced the bug. The patch may not be logical, or in accordance with CFEngine coding styles - please let me know! See #621.

I'm hoping that this effort will enable this to be fixed in 3.4.5.

@jooooooon
Copy link
Contributor Author

Bugtracker being back up, I put my previous comment there. Sorry for the noise.

@tzz
Copy link
Contributor

tzz commented May 6, 2013

This is really cool, thank you for tracking down this heisenbug :)

sigurdteigen pushed a commit that referenced this pull request May 6, 2013
…utputs

Fixes Redmine #1855: don't close file descriptors we use
@sigurdteigen sigurdteigen merged commit bc09566 into cfengine:3.4.x May 6, 2013
@sigurdteigen
Copy link
Contributor

Thank you. Let's pray this works.

tzz pushed a commit to tzz/core that referenced this pull request Oct 25, 2016
Add missing override_classes to control/def.cf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants