Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support additional file name and path characters in media manager #4564

Merged
merged 2 commits into from Dec 5, 2019

Conversation

LarryBarker
Copy link
Contributor

When working with abstract file names that may contain additional characters, such as quotes or ampersands, the media manager would throw an error. This PR adds two additional characters to the character whitelist.

This will help support complex file names and paths, especially those that may contain international characters or file naming conventions.

When working with abstract file names that may contain additional characters, such as quotes or ampersands, the media manager would throw an error. This PR adds two additional characters to the character whitelist.
@LukeTowers
Copy link
Contributor

@LArbearrr could you provide some examples of the filenames you're having issues with?

@w20k
Copy link
Contributor

w20k commented Sep 2, 2019

@LArbearrr ping-ping!)

@LarryBarker
Copy link
Contributor Author

@w20k @LukeTowers Sorry for the delay... Here's one file that was causing issues:
BG中国通讯期刊(Blend'r)创刊号.pdf

@LukeTowers
Copy link
Contributor

@bennothommo @w20k do we have unit tests for attempting to perform XSS with Media Manager filenames?

@LukeTowers LukeTowers added this to the v1.0.459 milestone Sep 3, 2019
@w20k
Copy link
Contributor

w20k commented Sep 3, 2019

@LukeTowers just a simple name check based on the rules, nothing fancy like XSS.

@bennothommo
Copy link
Contributor

@w20k
Copy link
Contributor

w20k commented Sep 4, 2019

@bennothommo thought of something more brutal, for me personally those are simple checks that you've pointed out 😉

Like:
GIF89a/*<svg/onload=alert(1)>*/=alert(document.domain)//; or
helloworld.<RLO>cod.exe (Right-to-Left Override) or
?file=secret.doc%00.pdf (byte logic)

@LukeTowers
Copy link
Contributor

@w20k are those examples of malicious filenames that we should be blocking?

@w20k
Copy link
Contributor

w20k commented Sep 4, 2019

@LukeTowers I've used this XSS payload (https://github.com/payloadbox/xss-payload-list) for few White Hacking jobs that I did.
Will do the quick test run, and report back if we really need to add more tests.

@w20k
Copy link
Contributor

w20k commented Sep 5, 2019

@bennothommo @LukeTowers luckily not much, only a few from the list.
First is quite common pitfall, if you google for it 😄. Also unicode and hex chars (U+003C < Less-than sign, \x3C is hexadecimal for <) :

['žscriptualert(EXSSE)ž/scriptu'],
['[color=red width=expression(alert(123))][color]'],
['\x3c'],
['\x3C'],
['\u003c'],
['\u003C'],

Screenshot:
Screenshot 2019-09-05 at 11 04 14

@LukeTowers
Copy link
Contributor

hmm, sticking with regex validation seems like an exercise in futility. Can't we adjust the code to escape filenames / paths so that we don't have to worry about XSS no matter what gets input and only validate against path inclusion attacks?

@LukeTowers LukeTowers modified the milestones: v1.0.459, v1.0.460 Sep 11, 2019
@github-actions
Copy link

This pull request will be closed and archived in 3 days, as there has been no activity in the last 30 days. If this is still being worked on, please respond and we will re-open this pull request.

@LukeTowers LukeTowers reopened this Oct 25, 2019
@bennothommo
Copy link
Contributor

bennothommo commented Nov 21, 2019

@LukeTowers Following up on this, the only way I can think of to circumvent having to sanitise a filename for the Media library would be to store the filename on server as a hash and store the original filename somewhere else (like in the DB) and route to the correct file when retrieving the file through URL, like what is happening with attached files in the storage/app/uploads folder.

@LukeTowers
Copy link
Contributor

@bennothommo file name sanitization happens to protect against two attack vectors: Path manipulation attacks (and just an unacceptable path for the storage disk to handle) and XSS injections. Storing the filename verbatim isn't adequately protecting against XSS (although we should re-evaluate our current solution and make sure all output of stored file paths are escaped).

Once we've ensured that the filename / path isn't a vector for XSS attacks, we can revisit our regex to make it more permissive by just preventing path manipulation attacks and unsupported characters (blacklist instead of whitelist).

@bennothommo
Copy link
Contributor

bennothommo commented Nov 22, 2019

@LukeTowers The reason I suggested the hash filename method is we'd be able to apply more thorough XSS-prevention methods (such as htmlentities and the like, and correctly sanitising URLs) without worrying about it not mapping to the correct file path. It would also allow us to not have to worry about path manipulation, as effectively we'd control where the file goes "on disk".

The only downsides would be we'd have to create a "virtual" filesystem in wherever we store the file hash map (ie. in the database), and people wouldn't be able to simply copy their files from somewhere else into the media folder - they'd have to be uploaded through the Media Manager so all necessary steps can be taken in hashing and mapping the file.

@LukeTowers
Copy link
Contributor

@bennothommo that sounds too much like a breaking change to me unfortunately. On further thought I realized that filename sanitization to prevent XSS is also required to be able to protect end users of the files (the themes) from malicious file names in their media library. This is a pretty tricky issue and right now it's looking like we'll have to keep refining the sanitization regex.

@daftspunk
Copy link
Member

The original changes were implemented as a result of a Common Vulnerabilities and Exposures (CVE) report.

I have tested this against the original report, that used the following folder name to store an XSS possibility in its proof-of-concept:

<svg onload=alert(1)>

The original issue reported that you could create a valid folder, then rename to a poisonous one, like the one provided above. The strict regex was a rushed solution so its good these rules are now relaxed. The proposed additions in this PR also appear safe

Thanks!

@daftspunk daftspunk merged commit 2b05d01 into octobercms:develop Dec 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

5 participants