data: URIs are incorrectly stripped #80
Comments
Thanks. I'm not sure that we want to support data URIs in WriteFreely -- at least not for all instances. But I'm open to your input. First, some things to consider: I know WF doesn't have photo hosting built-in, so this could be an attractive feature. But some admins, especially of multi-user instances, might not be expecting to host large chunks of data inside each post, as supporting data URIs would enable. Also, as far as I understand it, adding support would make the site vulnerable to XSS attacks by post authors. Again, this is most important for multi-user instances where the admin can't trust user input, and malicious users could have a wider reach with the Reader feature. Due to these issues, I wouldn't want to support data URIs for everyone, including on Write.as. But one thing we could do, for those that really need this, would be to only allow data URIs on single-user instances. What do you think? To add support, we'd actually need to change this func: https://github.com/writeas/writefreely/blob/32e99d00415c6e86a9536d9b824dcdf0b119270d/postrender.go#L157-L167 to include this line: policy.AllowDataURIImages() |
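To make the trade-off above concrete, here is a minimal sketch of a scheme allowlist with an opt-in for data: image URIs. It uses only the Go standard library and is not WriteFreely's actual sanitizer code; `allowedSchemes`, `keepImageSrc`, and the `allowDataImages` flag are hypothetical names standing in for the policy and the proposed setting.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowedSchemes is a hypothetical allowlist, not WriteFreely's real one.
var allowedSchemes = map[string]bool{"http": true, "https": true, "mailto": true}

// keepImageSrc reports whether a sanitizer following this sketch would keep
// an <img> src value. allowDataImages stands in for an opt-in setting.
func keepImageSrc(src string, allowDataImages bool) bool {
	u, err := url.Parse(strings.TrimSpace(src))
	if err != nil {
		return false
	}
	if u.Scheme == "" {
		return true // relative URL, no scheme to check
	}
	if allowedSchemes[u.Scheme] {
		return true
	}
	// For "data:image/png;base64,...", url.Parse puts everything after
	// "data:" into u.Opaque, so the media type can be checked there.
	if u.Scheme == "data" && allowDataImages {
		return strings.HasPrefix(strings.ToLower(u.Opaque), "image/")
	}
	return false
}

func main() {
	dot := "data:image/png;base64,iVBORw0KGgo="
	fmt.Println(keepImageSrc(dot, false)) // opt-in off: src is dropped
	fmt.Println(keepImageSrc(dot, true))  // opt-in on: src is kept
}
```

With the flag off, the data: value is rejected the same way any unknown scheme is, which matches the missing src attribute reported in this issue. A stricter media type check than the `image/` prefix would be advisable in practice.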
Thanks for posting via Fediverse about this Issue - I want to voice my support for it; I have these sort of But I agree that this should be something that is clearly - explicitly - disabled by default, for the reasons y'all say. But I think it should be really clear in the documentation that this sort of protection is in place - as I see it (semantically, not necessarily in implementation), limiting what kinds of URIs can be used is a deviation from the "standard," as much as there can be one in this situation, yeah? Like, HTML doesn't normally block (or fail to account for?) data URIs, so if y'all do, you should be explicit about that. tl;dr: I support parsing data URIs as an off-by-default option. |
I appreciate the input, @emsenn. I agree the documentation is lacking on this subject right now -- if you wouldn't mind, could you create a quick issue for this on the documentation repo so we can make sure it gets addressed? Otherwise, would you be okay if that optionality is tied to the single-/multi-user config setting, like I mentioned? That is: data URIs are disabled for multi-user instances but enabled for single-user instances? |
I think that could be a sensible way to do it, though if it is as simple as adding the specified line to the specified function, could the solution be as simple as documenting that one could add that line to allow that parsing? I'm concerned that parsing data URI images is such a niche feature that it's not worth introducing a new deviation between single- and multi-user instances' behavior, even if it only affects posters, not readers. |
Yeah, I would say that tying it to that setting only makes sense if we also relax other parts of the allowed HTML policy. But actually, then that would affect mobility: writers' posts would break if they move from a single-user blog to a multi-user instance. So yeah, this kind of deviation might not make sense. As far as post rendering, it should probably be consistent across configurations, above all else. |
To confirm I understand, you're saying that, in general, you want to stay away from options that affect how the source gets rendered, to preserve easy migration? And so in the specific, this should not be an option because it'd require converting posts to migrate between an instance with it to one without? I don't know if I agree with prioritizing easy migration, but the logic seems reasonable. (To be clear, I don't disagree with the priority, I just haven't thought about it.) Would that mean this issue can be closed just by adding documentation so that users know what the expected behavior is? Seems fair to me. As a user, I'm already posting to writefreely using an Emacs script, which I suspect could be modified to parse or flag data URIs if I wanted. (I mention only to say that the problem being "unsolved" within your software does not mean the problem is unsolvable, for me. I'm generally just trying to include as much info as I can think of since this seems a pretty "philosophical"
Right. Data portability is important, but also: the way that the software renders user input is core to the entire platform. If we support these minor differences in core functionality, in the same application but with different configurations across different servers, it means users will inevitably end up more confused, need to spend more time reading docs, getting support from instance admins, etc. Even if you're not moving your data, I think the end-user expectation will be that one WF instance behaves, at its core, in the same way as on any other instance.
That's good to know. I think the best solution will be to get photo hosting built-in (as is planned) so clients can upload photos and just use image URLs instead. I think we've arrived at a pretty good conclusion, but I'll leave the issue open for a bit longer, in case anyone else has any input (including @nikclayton). |
Maybe this could be enabled only for a subset of data payloads (only base64-encoded png, jpg, gif) and only up to a certain small maximum size. It is a useful feature to enable small images inside the document itself. |
Whitelisting image MIME types was the approach chosen in https://snyk.io/vuln/npm:markdown-it:20160912, FWIW. |
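The restriction suggested above (a short list of base64-encoded image types plus a size cap) could look roughly like the following. This is a standard-library sketch under stated assumptions, not code from WriteFreely or markdown-it; `checkDataURIImage`, `allowedImageTypes`, and the 32 KiB limit are all hypothetical.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// maxImageBytes is an arbitrary, hypothetical cap on the decoded size.
const maxImageBytes = 32 * 1024

// allowedImageTypes is a hypothetical allowlist of media types.
var allowedImageTypes = map[string]bool{
	"image/png":  true,
	"image/jpeg": true,
	"image/gif":  true,
}

// checkDataURIImage returns nil only for base64-encoded data URIs whose media
// type is allowlisted and whose decoded payload is at most maxBytes long.
// It expects a lowercase "data:" prefix and no extra media type parameters.
func checkDataURIImage(uri string, maxBytes int) error {
	if !strings.HasPrefix(uri, "data:") {
		return errors.New("not a data URI")
	}
	rest := uri[len("data:"):]
	comma := strings.IndexByte(rest, ',')
	if comma < 0 {
		return errors.New("missing comma separator")
	}
	meta, payload := rest[:comma], rest[comma+1:]
	if !strings.HasSuffix(meta, ";base64") {
		return errors.New("payload is not base64-encoded")
	}
	mediaType := strings.ToLower(strings.TrimSuffix(meta, ";base64"))
	if !allowedImageTypes[mediaType] {
		return fmt.Errorf("media type %q is not allowed", mediaType)
	}
	// Cheap length check first, so oversized payloads are never decoded.
	if len(payload) > base64.StdEncoding.EncodedLen(maxBytes) {
		return errors.New("image is too large")
	}
	decoded, err := base64.StdEncoding.DecodeString(payload)
	if err != nil {
		return fmt.Errorf("invalid base64 payload: %w", err)
	}
	if len(decoded) > maxBytes {
		return errors.New("image is too large")
	}
	return nil
}

func main() {
	fmt.Println(checkDataURIImage("data:image/png;base64,aGVsbG8=", maxImageBytes))
	fmt.Println(checkDataURIImage("data:image/svg+xml;base64,aGVsbG8=", maxImageBytes))
}
```

The check only validates the declared media type and the size; it does not confirm that the bytes really are a PNG, JPEG, or GIF, which a real implementation would probably also want to do.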
Thanks @nikclayton, good to know. We'll need to see if I am still concerned that supporting this (and publicizing it) will encourage people to encode and embed large images in posts while there's no other built-in way to add media to posts. Again, the problems arise both for admins not expecting large chunks of non-text data and for readers who have to deal with un-cache-able images. Plus, limiting the size of images embedded this way will be difficult, and likely sub-optimal from a UX perspective (e.g. how do I know ahead of time if I'm embedding an image that's too large?). |
@thebaer I still don't understand why posts are not cacheable. At very least they should use etags, no? |
@dpc I think the chance of a single post being re-viewed by the average reader is pretty low, so client-side caching would have minimal impact. But if we did add it, as things are right now it would just mean that page views aren't counted. |
@thebaer With etags the client does send the request every time, so it seems to me it is possible to count it. And it seems to me that re-viewing the same post is actually quite common. Authors especially tend to re-read their posts multiple times (I do, at least). |
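The point about etags can be illustrated with a small handler that counts every request, including revalidations answered with 304 Not Modified. This is only a sketch using Go's net/http and httptest, not WriteFreely's handler; `postHandler`, `fetch`, and the `views` counter are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// postHandler serves one post body and counts every request it receives.
type postHandler struct {
	views int64 // hypothetical page-view counter
	body  []byte
}

func (h *postHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Count first: revalidation requests still reach the server.
	atomic.AddInt64(&h.views, 1)

	sum := sha256.Sum256(h.body)
	etag := `"` + hex.EncodeToString(sum[:8]) + `"`
	w.Header().Set("ETag", etag)
	// "no-cache" lets clients store the response but requires them to
	// revalidate with the server before reusing it.
	w.Header().Set("Cache-Control", "no-cache")

	if r.Header.Get("If-None-Match") == etag {
		w.WriteHeader(http.StatusNotModified) // no body is re-sent
		return
	}
	w.Write(h.body)
}

// fetch issues an in-memory GET, optionally with an If-None-Match header.
func fetch(h http.Handler, ifNoneMatch string) *http.Response {
	req := httptest.NewRequest(http.MethodGet, "/post", nil)
	if ifNoneMatch != "" {
		req.Header.Set("If-None-Match", ifNoneMatch)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Result()
}

func main() {
	h := &postHandler{body: []byte("post with a large embedded image")}
	first := fetch(h, "")
	second := fetch(h, first.Header.Get("ETag"))
	fmt.Println(first.StatusCode, second.StatusCode, atomic.LoadInt64(&h.views))
}
```

The second request is counted as a view even though the body, including any embedded image data, is not transferred again.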
@dpc Ah, good point -- if you don't mind, let's continue this discussion on the forum, since it isn't related to this particular issue. |
Hi guys, I was wondering if there have been any ideas on how to embed images into posts since this issue was opened? |
@dangom We've had a small discussion about this on the forum -- hopefully some resources there can help out! On another note, I'm closing this issue since it turned into more of a discussion, and we haven't made any further progress here. If anyone is interested in continuing this, we'd be happy to look at a pull request that adds support for |
Sure, as long as we had an API to upload photos too, that'd be great. The data: URI would've been just a workaround. This issue was originally opened because OP and I were publishing to writefreely from Emacs, but there was no way to get images uploaded transparently. |
Describe the bug
The markup
Should show a red dot as an image.
It doesn't. Inspecting the generated HTML from the Markdown shows that the generated img element has no src attribute.
I think, but have not confirmed, that this is because https://github.com/writeas/writefreely/blob/32e99d00415c6e86a9536d9b824dcdf0b119270d/posts.go#L1368 does not include data: as a valid protocol.
Expected behavior
The img element should be created with the correct src attribute.