
Jobs with multiple files don't complete when backend fails #5359

Closed
bmasonrh opened this issue Jul 27, 2018 · 3 comments

@bmasonrh

When the following conditions are met:

  • A print job is submitted with multiple files (e.g. lp -d pq /etc/services /etc/services)
  • The print job uses at least one filter (i.e. not a raw queue)
  • The backend fails (due to, for example, a Broken Pipe error, which is a common occurrence)

Then the job stays in the queue: the backend fails, but the filter never receives SIGPIPE and never exits, and the scheduler doesn't kill the job.

If the print job has only one file, then the filter gets SIGPIPE when the backend fails and the job is aborted (or retried depending on the Error Policy).

I've reproduced this in CUPS 1.4.2, 1.6.3 and 2.2.6.

I've been trying (unsuccessfully) to discover a difference in the way the scheduler starts filters for multi-file jobs vs. single-file jobs that would account for this behavior. Suggestions for where to look in the code would be appreciated (as would a patch to fix this, but I'm happy to work on the patch myself if I can get a push in the right direction).

Or would it make sense to call stop_job() somewhere when the backend fails? There's not much point in continuing to process a job after the backend has failed, is there?

Thanks.

@michaelrsweet (Collaborator) commented Jul 27, 2018

@bryan-mason "Broken pipe" should not be a common occurrence for a backend, particularly when all of the standard backends block/handle it.

The difference with single- vs. multi-document jobs is that the backend (and its associated pipes) remains active for all of the documents in the job; basically we reuse them for every document, e.g.:

filters for document 1 \
filters for document 2  | backend
...                     |
filters for document N /

Anyways, we should probably be closing the pipes when the backend fails, which will allow the filters to see that the backend has gone away (rather than block on IO) and allow the job to abort.

@bmasonrh (Author)

> "Broken pipe" should not be a common occurrence for a backend, particularly when all of the standard backends block/handle it.

It's been my experience supporting enterprise customers that:

D [25/Jul/2018:17:24:05 -0700] [Job 42] Error reading back-channel data: Connection reset by peer
E [25/Jul/2018:17:24:05 -0700] [Job 42] Unable to write print data: Broken pipe

is one of the more common failure modes that people report. The cause always seems to be some sort of network equipment problem, and the socket backend handles it gracefully and exits cleanly, but it's still a problem for the customer (usually because ErrorPolicy is stop-printer and the customer wants to know why their print queue stopped).

@michaelrsweet (Collaborator)

[master 72a2134] Fix stuck multi-file jobs (Issue #5359, Issue #5413)

[branch-2.2 e7e33bf] Fix stuck multi-file jobs (Issue #5359, Issue #5413)
