Fix broken serialization of binary http uploads #241

Closed. Conal-Tuohy wants to merge 2 commits into base: saxon96

2 participants
@Conal-Tuohy
Contributor

Conal-Tuohy commented Sep 19, 2016

This commit fixes a bug which can corrupt the entity body of binary http uploads.

Previously, binary content was treated as text with a particular character encoding, but in fact a base64-encoded c:body does not have a 'character encoding' since it represents a sequence of bytes, not characters.

I discovered this while trying to upload a PDF file. The file was read successfully from a file: URI using p:http-request, and could be saved with p:store to produce an identical file, but it was corrupted when PUT to an http: URI using p:http-request.

I tested this using curl, a web service that accepts PUT, and a logging TCP tunnel; by comparing the HTTP request made by curl with the request made by my pipeline I was able to identify the underlying problem in the http-request step's implementation. However, I struggled to build a test case for this using Calabash's test framework.

@Conal-Tuohy Conal-Tuohy reopened this Sep 19, 2016

try {
if ("base64".equals(encoding)) {
String charset = body.getAttributeValue(_charset);


@Conal-Tuohy

Conal-Tuohy Sep 20, 2016

Contributor

The base64-encoded data would typically be non-textual in nature. Even if it is in fact text (which has been base64-encoded for some reason) and therefore does have a charset, that charset is still irrelevant to the task at hand, which is just to upload the raw byte stream.
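To illustrate why the charset must not be applied, here is a minimal standalone sketch using only the JDK. It is not the Calabash code; the class and method names (`Base64BodyDemo`, `decodeAsBytes`, `decodeViaString`) are hypothetical, and the sample bytes merely imitate the start of a PDF file. It compares decoding base64 straight to bytes with a round trip through a `String` in UTF-8, which replaces byte sequences that are not valid UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Base64BodyDemo {
    // Decode the base64 text directly to the bytes that should be uploaded.
    // The MIME decoder is used here because it tolerates line breaks in the text.
    public static byte[] decodeAsBytes(String base64) {
        return Base64.getMimeDecoder().decode(base64);
    }

    // Hypothetical illustration of the problem: treat the decoded bytes as
    // characters in some charset, then encode them again. Byte sequences that
    // are malformed in that charset are replaced, so the result can differ.
    public static byte[] decodeViaString(String base64, Charset charset) {
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four high-bit bytes, as binary files commonly contain.
        byte[] original = {0x25, 0x50, 0x44, 0x46,
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        String base64 = Base64.getEncoder().encodeToString(original);

        boolean bytesIntact = Arrays.equals(original, decodeAsBytes(base64));
        boolean stringIntact = Arrays.equals(original,
                decodeViaString(base64, StandardCharsets.UTF_8));

        System.out.println("direct byte decode intact: " + bytesIntact);
        System.out.println("UTF-8 string round trip intact: " + stringIntact);
    }
}
```

With these sample bytes the direct decode reproduces the input, while the UTF-8 round trip does not. A single-byte charset such as ISO-8859-1 would happen to survive the round trip in Java, which may be why this kind of bug can go unnoticed.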


@@ -561,28 +563,25 @@ private void doPutOrPostSinglepart(HttpEntityEnclosingRequest method, XdmNode bo
if (encoding != null && !"base64".equals(encoding)) {
throw XProcException.stepError(52);
}
HttpEntity requestEntity = null;


@Conal-Tuohy

Conal-Tuohy Sep 20, 2016

Contributor

The requestEntity variable's type is now declared to be the interface HttpEntity rather than a StringEntity. The concrete instance will be either a ByteArrayEntity, if encoding="base64", or a StringEntity otherwise.
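The shape of that change can be sketched without the HttpClient dependency. In the sketch below, `Entity`, `BytesEntity` and `TextEntity` are hypothetical stand-ins for `HttpEntity`, `ByteArrayEntity` and `StringEntity`, and `entityFor` is an invented helper, not a method from the patch. It only shows the pattern of declaring the variable by its interface and choosing the concrete type from the `encoding` value.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class EntitySelectionSketch {
    // Hypothetical stand-in for the HttpEntity interface.
    public interface Entity {
        byte[] content();
    }

    // Stand-in for ByteArrayEntity: carries raw bytes unchanged.
    public static final class BytesEntity implements Entity {
        private final byte[] bytes;
        public BytesEntity(byte[] bytes) { this.bytes = bytes; }
        @Override public byte[] content() { return bytes; }
    }

    // Stand-in for StringEntity: characters serialized with a charset.
    public static final class TextEntity implements Entity {
        private final String text;
        private final Charset charset;
        public TextEntity(String text, Charset charset) {
            this.text = text;
            this.charset = charset;
        }
        @Override public byte[] content() { return text.getBytes(charset); }
    }

    // The variable is typed as the interface; the concrete class depends on
    // whether the body is base64-encoded binary or plain text.
    public static Entity entityFor(String encoding, String bodyText, Charset charset) {
        Entity requestEntity;
        if ("base64".equals(encoding)) {
            // The charset plays no part in this branch: only bytes are sent.
            requestEntity = new BytesEntity(Base64.getMimeDecoder().decode(bodyText));
        } else {
            requestEntity = new TextEntity(bodyText, charset);
        }
        return requestEntity;
    }

    public static void main(String[] args) {
        Entity binary = entityFor("base64", "AP8=", StandardCharsets.UTF_8);
        Entity text = entityFor(null, "hi", StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(binary.content()));
        System.out.println(Arrays.toString(text.content()));
    }
}
```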



ndw added a commit that referenced this pull request Dec 16, 2016

@ndw ndw closed this in f091455 Dec 16, 2016

