Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: The "window-status" is causing the "process.waitFor()" to block indefinitely. #131

Merged
merged 5 commits into from Dec 7, 2023
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Expand Up @@ -55,6 +55,12 @@ public class Pdf {
*/
private int timeout = 10;

/**
* wkHTMLtoPDF uses a rather old version of WebKit, which often has trouble with up-to-date JavaScript code.
* When using window-status and a SyntaxError: Parse error occurs in JavaScript, it can result in the failure of setting the window-status. The wkhtmltopdf process will remain in a waiting state indefinitely. In such cases, a timeout is needed to handle this situation.
* */
private int windowStatusTimeout = 30;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the already existing timeout field for this purpose (right above this field)? Do you think they are different in some sense?

If we can keep only one "timeout" field would be better, and may suggestion would be using the same existing timeout field for this. We can also increase the default timeout field value to 30s, if you think 10s is too low for a default value. Regarding the 30 seconds, is it just a best guess? Could we keep 10s? This is configurable anyway via setTimeout method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very sorry, I had a busy day at work today, and I now have time to respond. 🙏🙏🙏

Regarding windowStatusTimeout, I think I need to explain the scenario in which this issue arises:

When you add the --window-status parameter in the command, wkhtmltopdf considers the current page to be loaded only after you have completed the assignment window.status = "complete" in JavaScript. It typically occurs in situations where the page content is generated asynchronously. Failure to add this parameter can result in incomplete rendering of the page content.

If you have added this parameter in the command but, for some reason, the page has not completed the assignment, wkhtmltopdf will think that the current page has not loaded and will continue to wait. This is the reason why process.waitFor() is blocked.

This issue is often caused by developers unfamiliar with the features of wkhtmltopdf. When new features or dependencies are added, developers may use ES6 or syntax not supported by wkhtmltopdf's WebKit, leading to a SyntaxError: Parse error warning during the parsing of JavaScript. This interruption in normal JavaScript execution prevents the completion of the assignment. Since it is a warning rather than an error, wkhtmltopdf does not exit but continues to wait.

Therefore, I believe it is a timeout caused by the window.status parameter, but it does fall within the range of a runtime timeout.

Regarding the 30s, in our use case, there are often concurrent processing situations, and 10s may not be sufficient. However, it is configurable, so I think there is no need to modify it.

You are right, and I will remove this parameter. I apologize for any confusion it may have caused you. 🙏🙏🙏

My English is not very good; it was generated by OpenAI's translation. Please forgive any inaccuracies.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @xweiba no need to apologize! I'm from Brazil, so my English is not good neither =D
Anyway, I really appreciate your contribution and explanation of the issue. Now I fully understand the issue and I'm confident we can merge it. Thank you for this!


private File tempDirectory;

/**
Expand Down Expand Up @@ -220,6 +226,15 @@ public void setTimeout(final int timeout) {
this.timeout = timeout;
}

/**
* Sets the timeout to wait windowStatus while generating a PDF, in seconds
*
* @param windowStatusTimeout the windowStatusTimeout
*/
public void setWindowStatusTimeout(final int windowStatusTimeout) {
this.windowStatusTimeout = windowStatusTimeout;
}

/**
* wkhtmltopdf often returns 1 to indicate some assets can't be found,
* this can occur for protocol less links or in other cases. Sometimes you
Expand Down Expand Up @@ -320,7 +335,17 @@ public byte[] getPDF() throws IOException, InterruptedException, PDFExportExcept
Future<byte[]> inputStreamToByteArray = executor.submit(streamToByteArrayTask(process.getInputStream()));
Future<byte[]> outputStreamToByteArray = executor.submit(streamToByteArrayTask(process.getErrorStream()));

process.waitFor();
if (command.contains("window-status")) {
logger.debug("Waiting for window-status waitFor: {}s...", this.windowStatusTimeout);
if (!process.waitFor(this.windowStatusTimeout, TimeUnit.SECONDS)) {
// 关闭进程
process.destroy();
logger.error("Error generating pdf by window-status timeout, command: {}", command);
throw new RuntimeException("window-status timeout");
}
} else {
process.waitFor();
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (command.contains("window-status")) {
logger.debug("Waiting for window-status waitFor: {}s...", this.windowStatusTimeout);
if (!process.waitFor(this.windowStatusTimeout, TimeUnit.SECONDS)) {
// 关闭进程
process.destroy();
logger.error("Error generating pdf by window-status timeout, command: {}", command);
throw new RuntimeException("window-status timeout");
}
} else {
process.waitFor();
}
if (!process.waitFor(this.timeout, TimeUnit.SECONDS)) {
process.destroy();
logger.error("PDF generation failed by defined timeout of {}s, command: {}", timeout, command);
throw new RuntimeException("PDF generation timeout by user. Try to increase the timeout.");
}

Following the idea of using the already existing timeout field, we can use directly process.waitFor(this.timeout, TimeUnit.SECONDS). As we already have the concept of having a timeout, we should use it wherever possible to keep usage expectations consistent.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we will need some unit tests to cover this behavior, could you add it please? We could set a really low timeout time to force the error and check if the exception is thrown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I've rewritten it and added a unit test to cover this behavior.


if (!successValues.contains(process.exitValue())) {
byte[] errorStream = getFuture(outputStreamToByteArray);
Expand Down