New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Break submissions table into files and messages #229
Comments
I started implementing this and realized we are opening ourselves up to both security vulnerabilities and usability shortcomings. There is nothing in the backend that prevents large messages from being sent. There is a 500mb cap on the size of the request, but we typically think of this as files. Someone could repeatedly send 500mb messages. These would be loaded into the database which could degrade performance (especially with migrations). Additionally, this could break the UI by causing excessive memory usage or extreme sluggishness as massive messages are read from disk and loaded into widgets. My proposal would be to keep them on disk and read the fist X unicode characters and store them in a "preview" column. If there are additional characters, there would we a "see full message" button that would open the message in a disposable VM like all other documents. |
Hmm. In user testing journalists consistently asserted that "legit" Sources send long messages. Of course, there's always exceptions to the rule and we don't want to tweak the product in such a way that prevents a journalist from seeing a very important leak. WRT "See full message" button... I feel that'd be a lot of work for a very marginal likelihood use-case. I also don't see it being evident to a user why they couldn't just view the full message in the message pane, even if it is multiple window-length scrolls long. My primary concern for right now, though, would be to set-up thing on the backend to enable multiple file uploads for Sources. I'm not sure if the submissions table config would have anything to do with that, but only allowing Sources 1 file at a time to upload feels tedious and like a poor experience. I also question if it encourages Source users to .zip multiple files into a single upload, which would keep Journalists from being able to preview/triage individual files as we're now enabling them to do in the Workstation—while also putting the whole thing at risk to not upload via being too big. |
If it is possible to crash the app by sending say, 20 500mb messages (10GB of text), then some trolls might do it for teh lulz. Or a government that is worried about an impeding leak might DOS all SecureDrops as a disruptive measure. This isn't something a source would do under normal conditions. This is specifically a guard against malicious content. |
Good thoughts @heartsucker, I think we should (unordered list follows because both can be done independently):
What do you think? |
I love the character-limit suggestion, and feel it's a great way to mitigate potential necessity of CAPTCHAs. My concern with it is around user messaging in execution, however, and already there are a lot of things that are either potentially confusing or not clearly explained in the Source experience. If it were to not be exposed as an up-front caveat and rather only presented as an error returned on a submission in excess of 10,000 characters, that to me seems like a good interim solution. Tangential consideration is that some languages have much longer common words than English. Tagalog and Nordic languages come to mind. 10,000 characters is a lot, tho. Before studying this issue, I'd rather study how the Source interface currently looks in other alphabets—and how Journalists might receive those communications in the Workstation (like: if a Source submits in Hindi but the Workstation language is set to English, will the message show-up to journalists in the Workstation in Hindi characters, or as English alphabet garble matching ISO keys?). |
@redshiftzero Should we still consider only storing a partial amount of the files in the sqlite db? My reasoning being that if a SD was compromised, someone could put arbitrarily large messages into it so that all clients download them and break. Once the clients are clogged up with large messages, there might not be an easy way for a user to go in and delete them. This might require an admin to execute sqlite commands or blow away the whole database. |
Part of #231.
This ticket is for breaking up the
submissions
table into two tables:files
andmessages
without other schema changes (other non-schema code changes will need to occur elsewhere in the application code). Note for the purpose of rapid development, let's not write alembic migrations since we'll do #48 before the next release.Motivation: Ticket #217 describes adding a content column to the submission table. However, this column would only be populated for the messages submission type (not files). @joshuathayer and I were chatting yesterday, and were thinking it would make sense for us to first break up the submissions table into
files
andmessages
. This is because then in #217 we'd only add thecontent
column to themessages
table, and it would resolve the common confusion for people new to the client code between the distinction between submissions vs messages vs files.The text was updated successfully, but these errors were encountered: