Pastes get truncated #328

hasufell · 2018-06-19T11:37:59Z

Steps to reproduce

Create a paste of ~1.2mb

What happens

When copy pasting the link into a new browser window, the paste is truncated to a length of 604K.

What should happen

The paste should not be truncated.

Additional information

Basic information

Server OS: Gentoo, but PrivateBin runs in docker

Webserver: jwilder/nginx-proxy as front proxy with the docker container of PrivateBin

Browser: tried with firefox 60.0.2 and chrome 67.0.3396.87

PrivateBin version: privatebin/nginx-fpm-alpine:1.1.1

The size limit in the config is set to a high value (8mb) and the nginx front proxy is configured with client_max_body_size 0; to allow arbitrarily large request bodies.

elrido · 2018-06-19T11:54:47Z

I could not reproduce this one. With the latest container, I could submit a paste of just a bit under 2 MiB and it was stored completely and could fully be read. I am using big.txt as a source for large paste texts.

hasufell · 2018-06-19T12:22:24Z

Still the same with the latest images.

hasufell · 2018-06-19T14:27:40Z

I can even reproduce it on privatebin.net: https://privatebin.net/?e8a9c53420b88e94#/4RAjO3ThEO73SP35jZ3tLD6Ye39/v2QfycMRGE0uJU=

The input paste is: https://gist.githubusercontent.com/hasufell/7f0d19bcd5db37a5dd9c13b65d21b17e/raw/5a144797189636e96dab1261e91465a02d102236/gistfile1.txt

elrido · 2018-06-19T15:02:07Z

Great find! It's something in that particular input. The data is already truncated when it gets sent to the server, so it must be some issue in the JS logic. I'll try to find a smaller part that triggers the issue and turn it into a unit test to track it down more easily.

elrido · 2018-06-19T15:27:13Z

Haven't been able to nail it down, but it looks like the content gets truncated by the DOMPurify library. Something in there makes it look like risky HTML to that tool.

@rugk any idea what it might not like? Is it the ANSI escape sequences?

hasufell · 2018-06-19T15:28:15Z

That's pretty bad. Why is it not treated as bytes?

elrido · 2018-06-19T15:40:17Z

Because we don't want to just display JS exploits to someone visiting a paste? You always have to sanitize user input and since the server can't do it, we have to at least try to do it on the JS end.

PS: Also in JS there are only strings and no "raw byte" type. It's not a strongly typed language. JS treats its strings as UTF-16, while we use UTF-8 in the user input and content encoding from and to the server.
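To make the UTF-16 versus UTF-8 point concrete, here is a minimal sketch in Python (not PrivateBin code; the helper names are made up for illustration). It counts the same text as UTF-16 code units, which is what a JS string's length reports, and as UTF-8 bytes, which is what travels to and from the server.

```python
# Illustrative sketch only: the helper names are hypothetical and this is
# not taken from the PrivateBin code base.

ZWSP = "\u200b"  # zero-width space, U+200B
SAMPLE = "a" + ZWSP + "\U0001F600"  # ASCII letter, ZWSP, one emoji outside the BMP


def utf16_code_units(text: str) -> int:
    """Number of UTF-16 code units (what a JS string's .length counts)."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text: str) -> int:
    """Number of bytes once the text is encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    print("code points:      ", len(SAMPLE))
    print("UTF-16 code units:", utf16_code_units(SAMPLE))
    print("UTF-8 bytes:      ", utf8_bytes(SAMPLE))
```

The three counts differ for the same text, so any size comparison between stages has to state which unit it is using.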

hasufell · 2018-06-19T16:14:50Z

You always have to sanitize user input

Uhm. I don't follow. How can it be an exploit if the data pasted into the input field is not executed and just treated as bytes?

And even if you sanitize, why is it not rejected if it's malicious?

A paste service that cannot guarantee that input = output is a problem.

rugk · 2018-06-19T17:14:35Z

No, it would be executed when attached to the DOM. Well… but that ("why") does not matter anyway, let's rather tackle the actual issue.

I could reproduce it and it gets truncated here (the selected text is the last one, which is included):

So it is not really a particular symbol. And when I do not use the whole file, it truncates the input at a different position, so uggh… 😣

In any case, I'd also guess it is a bug in DOMPurify:

And, BTW, a new version has been released some days ago, so we might upgrade and also test it with this one.

But here something similar seems to have happened, where it recognized a tag it should not… mhh… (cannot reproduce this example anymore, however)

However, maybe it's actually not DOMPurify's problem, because on their test page I cannot reproduce it with that input.

rugk · 2018-06-19T17:21:16Z

As for @hasufell's argument, I guess #330 should cover it.

elrido · 2018-06-25T19:58:06Z

Another update on this: I finally got the content into a unit test for the CryptTool. This shows me that en-/decryption and compression don't truncate it; the results are identical.

The issue I had with this string is that it contains all types of quotes, necessitating that I add escaping for one of them. There are also Unicode U+200B zero-width space characters in there, not ANSI escape sequences as I had assumed at first glance. These cause "Unexpected token ILLEGAL" syntax errors in node if not used in template string quoting (backticks).
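One way to tell zero-width spaces apart from ANSI escape sequences without relying on how an editor renders them is to list the invisible characters by code point. The following is a rough sketch in Python; the function name is hypothetical and it is not part of the PrivateBin test suite.

```python
import unicodedata

# Illustrative sketch only: invisible_characters() is a hypothetical helper.
# It reports format characters (category Cf, e.g. U+200B) and control
# characters (category Cc, e.g. ESC U+001B), skipping ordinary whitespace.

ORDINARY_WHITESPACE = "\n\r\t"


def invisible_characters(text: str) -> list:
    """Return (index, code point, name) for each invisible character found."""
    found = []
    for index, char in enumerate(text):
        if char in ORDINARY_WHITESPACE:
            continue
        if unicodedata.category(char) in ("Cf", "Cc"):
            # Control characters have no Unicode name, hence the default.
            name = unicodedata.name(char, "UNKNOWN")
            found.append((index, f"U+{ord(char):04X}", name))
    return found


if __name__ == "__main__":
    for entry in invisible_characters("a\u200bb\x1b[0m"):
        print(entry)
```

A U+001B entry would point to ANSI escape sequences, while U+200B entries point to zero-width spaces.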

I'll now change this test to use DOMPurify on it, something we haven't tested so far, since they run their own unit tests.

elrido · 2018-06-25T20:18:18Z

Nope, it's not DOMPurify either. Will have to test the full stack then.

rugk · 2018-06-26T14:11:28Z

So what else would it be? Some simple display error? Did you look into the source code that is actually added to the DOM?

elrido · 2018-06-26T18:57:47Z

That is my next step. So far I focused on the obvious targets.

I found out that we have had this issue since at least ZeroBin 0.18. That doesn't excuse it, but it is further evidence that it is something in the core of the application, something that hasn't changed (much) since.

What I hadn't considered but found out during testing this morning was that this sample compresses really well, from 1.2 MiB down to 30 KiB when deflated; encryption and base64 then inflate this to about 90 KiB. This was the POST size that made me initially think that it was already truncated before being stored. I'm not so sure about that now and will probably have to dump the contents at various stages and unpack them to be sure where exactly it happens.
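The effect of a small POST body for a large but repetitive input can be reproduced with a rough sketch like the one below. It uses Python's zlib in raw deflate mode rather than the rawdeflate JS library, leaves out the encryption step entirely, and the function names are made up for illustration, so the numbers will not match the ones quoted above.

```python
import base64
import zlib

# Illustrative sketch only: Python's zlib stands in for the JS compression
# library, and the encryption step is omitted.


def raw_deflate(data: bytes) -> bytes:
    """Compress with raw deflate (negative wbits: no zlib header or checksum)."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def stage_sizes(text: str) -> dict:
    """Byte size of the text as UTF-8, after deflate, and after base64."""
    plain = text.encode("utf-8")
    deflated = raw_deflate(plain)
    encoded = base64.b64encode(deflated)
    return {"plain": len(plain), "deflated": len(deflated), "base64": len(encoded)}


if __name__ == "__main__":
    # A highly repetitive input, standing in for a log-like paste.
    print(stage_sizes("some repeated log line\n" * 50000))
```

A transmitted size far below the input size is therefore not, on its own, evidence that the data was cut off before it was sent.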

rugk · 2018-06-26T19:31:06Z

Or just use the debugger somehow to step through the different stages? I mean, at some point it has to be truncated?

elrido · 2018-06-26T20:08:32Z

That was what I was talking about.

Now, bad news: The truncation happens during the rawdeflate/rawinflate compression. I am not sure if it happens during the compression or the decompression. Probably the latter, since the compression gives me about the same sized output as when I use gzip on the sample on the command line.

I hadn't found this yesterday because, as described, I had to escape some characters to get that string into a unit test. This apparently made it safe for rawdeflate/inflate. I'll rewrite the test to read the string out of a file instead and hope that I can finally create a test case that way.

With 1.3 we planned to switch to the pako library anyway, since rawdeflate isn't 100% standards-compliant and hence its output can't be inflated by standard libraries. If we can't find a workaround for this (maybe we can escape whatever causes this?) we will have to leave this in the 1.2 release and fix it with the switch to pako in 1.3.
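The shape of such a file-based round-trip test can be sketched as follows. This is written in Python against the standard zlib module, not against rawdeflate/rawinflate, and the function names are hypothetical; it only illustrates reading the sample as bytes, so no escaping is needed, and reporting where the restored data first diverges.

```python
import zlib
from pathlib import Path

# Illustrative sketch only: zlib stands in for the JS compression library,
# and all function names here are hypothetical.


def raw_deflate(data: bytes) -> bytes:
    """Compress with raw deflate (negative wbits: no header or checksum)."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(data: bytes) -> bytes:
    """Decompress a raw deflate stream."""
    return zlib.decompress(data, -15)


def first_mismatch(original: bytes, restored: bytes):
    """Index of the first differing byte, or None if both are identical."""
    for index, (left, right) in enumerate(zip(original, restored)):
        if left != right:
            return index
    if len(original) != len(restored):
        # One side is a prefix of the other: this is what truncation looks like.
        return min(len(original), len(restored))
    return None


def check_round_trip(path) -> object:
    """Read a sample file as bytes and compare it with its round-tripped copy."""
    original = Path(path).read_bytes()
    restored = raw_inflate(raw_deflate(original))
    return first_mismatch(original, restored)
```

A result of None means input and output match; any integer is the byte offset at which the output starts to differ or ends early.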

elrido · 2018-06-30T15:28:47Z

Let's revisit this issue once we have switched to pako.

rugk mentioned this issue Jun 19, 2018

Do not sanitize text-only content, but display text #330

Closed

elrido self-assigned this Jun 26, 2018

elrido added a commit that referenced this issue Jun 26, 2018

adding unit test for truncation issue #328

40a3717

elrido added a commit that referenced this issue Jun 26, 2018

adding unit test for truncation issue #328

c76957b

elrido added this to the Release 1.3 - Review & refactor paste format milestone Jun 30, 2018

elrido added this to In progress in Release 1.3 - Review & refactor paste format Jul 22, 2018

rugk mentioned this issue Aug 29, 2018

Malformed raw text #358

Closed

elrido added a commit that referenced this issue May 15, 2019

integrating compression test case that failed in rawdeflate in webcry…

5779d87

…pto + zlib testing, proving this fixes #328

elrido mentioned this issue May 15, 2019

Webcrypto, v2 paste format, zlib compression #431

Merged

4 tasks

elrido closed this as completed in #431 May 26, 2019

Release 1.3 - Review & refactor paste format automation moved this from In progress to Done May 26, 2019
