Support additional file name and path characters in media manager #4564
When working with abstract file names that may contain additional characters, such as quotes or ampersands, the media manager would throw an error. This PR adds two additional characters to the character whitelist.
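As a rough illustration of what a character whitelist check looks like, here is a minimal Python sketch. The pattern, the function name `is_allowed_name`, and the choice of `'` and `&` as the two added characters are assumptions for illustration only; the project's actual pattern is not shown in this thread and may differ.

```python
import re

# Hypothetical whitelist: word characters (including international letters),
# whitespace, and a small set of punctuation. The apostrophe and ampersand
# stand in for the two characters this PR adds.
ALLOWED_NAME = re.compile(r"[\w@.\s\-+()=,&']+")


def is_allowed_name(name: str) -> bool:
    """Return True if `name` is non-empty and every character is whitelisted."""
    return bool(name) and ALLOWED_NAME.fullmatch(name) is not None
```

With this sketch, a name such as `Tom & Jerry's photo.jpg` passes, while a name containing `<` or `>` is rejected.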
@LArbearrr could you provide some examples of the filenames you're having issues with?
@LArbearrr ping-ping!)
@w20k @LukeTowers Sorry for the delay... Here's one file that was causing issues:
@bennothommo @w20k do we have unit tests for attempting to perform XSS with Media Manager filenames?
@LukeTowers just a simple name check based on the
@LukeTowers @w20k Looks like there's some: https://github.com/octobercms/october/blob/master/tests/unit/system/classes/MediaLibraryTest.php.
@bennothommo thought of something more brutal, for me personally those are simple checks that you've pointed out 😉 Like:
@w20k are those examples of malicious filenames that we should be blocking?
@LukeTowers I've used this XSS payload (https://github.com/payloadbox/xss-payload-list) for few
@bennothommo @LukeTowers luckily not much, only a few from the list.
Hmm, sticking with regex validation seems like an exercise in futility. Can't we adjust the code to escape filenames / paths so that we don't have to worry about XSS no matter what gets input and only validate against path inclusion attacks?
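A minimal Python sketch of the idea in the comment above: escape file names when they are output, and validate input only against path traversal. The function names, and the assumption that media paths are stored relative to the media root with `/` separators, are hypothetical and not taken from October CMS.

```python
import html


def render_name(name: str) -> str:
    """Escape a stored file name before placing it in HTML text or attributes."""
    return html.escape(name, quote=True)


def is_safe_relative_path(path: str) -> bool:
    """Reject paths that could point outside the media root.

    Assumes paths are relative to the media root and use '/' separators.
    """
    if not path or "\x00" in path or "\\" in path:
        return False
    if path.startswith("/"):
        return False
    # Empty, '.', and '..' segments are refused outright.
    return all(part not in ("", ".", "..") for part in path.split("/"))
```

Escaping only covers output that goes through `render_name`; any place a name is emitted unescaped remains exposed, which is the concern raised later in the thread.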
This pull request will be closed and archived in 3 days, as there has been no activity in the last 30 days. If this is still being worked on, please respond and we will re-open this pull request.
@LukeTowers Following up on this, the only way I can think of to circumvent having to sanitise a filename for the Media library would be to store the filename on server as a hash and store the original filename somewhere else (like in the DB) and route to the correct file when retrieving the file through URL, like what is happening with attached files in the
@bennothommo file name sanitization protects against two attack vectors: path manipulation attacks (including paths that the storage disk simply cannot handle) and XSS injections. Storing the filename verbatim doesn't adequately protect against XSS (although we should re-evaluate our current solution and make sure all output of stored file paths is escaped). Once we've ensured that the filename / path isn't a vector for XSS attacks, we can revisit our regex and make it more permissive by only preventing path manipulation attacks and unsupported characters (a blacklist instead of a whitelist).
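To make the whitelist versus blacklist distinction concrete, here is a hedged Python sketch of a blacklist check for a single path segment. The forbidden set (control characters, path separators, and characters reserved on Windows file systems) and the name `is_permitted_segment` are assumptions chosen for illustration, not the project's rule set.

```python
import re

# Hypothetical blacklist: control characters, path separators, and
# characters that are reserved on Windows file systems.
FORBIDDEN = re.compile(r'[\x00-\x1f\x7f<>:"|?*\\/]')


def is_permitted_segment(segment: str) -> bool:
    """Return True unless the segment is empty, a dot segment, or contains a forbidden character."""
    if segment in ("", ".", ".."):
        return False
    return FORBIDDEN.search(segment) is None
```

Under a blacklist, anything not explicitly forbidden is accepted, so quotes, ampersands, and international characters pass without having to be listed one by one.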
@LukeTowers The reason I suggested the hash filename method is that we'd be able to apply more thorough XSS-prevention methods (such as The only downsides would be that we'd have to create a "virtual" filesystem wherever we store the file hash map (i.e. in the database), and people wouldn't be able to simply copy their files from somewhere else into the media folder; they'd have to upload them through the Media Manager so that all the necessary hashing and mapping steps can be taken.
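The hash-and-map idea can be sketched in Python as follows. This is an in-memory stand-in under stated assumptions: the class name `HashedMediaStore` is hypothetical, two dictionaries take the place of the storage disk and the database table, and the stored name is derived from a SHA-256 hash of the file contents, which the thread does not specify.

```python
import hashlib


class HashedMediaStore:
    """In-memory stand-in for hashed file storage plus a name lookup table."""

    def __init__(self) -> None:
        self._blobs = {}  # stored (hashed) name -> file contents
        self._names = {}  # original file name -> stored (hashed) name

    def put(self, original_name: str, data: bytes) -> str:
        """Store `data` under a hash-derived name and remember the original name."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        self._names[original_name] = key
        return key

    def get(self, original_name: str) -> bytes:
        """Look up the stored name for `original_name` and return the contents."""
        return self._blobs[self._names[original_name]]
```

The name used on disk contains only hexadecimal characters regardless of what the user typed, but the original name still has to be escaped wherever it is displayed, and files copied directly into the media folder would have no entry in the map.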
@bennothommo that sounds too much like a breaking change to me unfortunately. On further thought I realized that filename sanitization to prevent XSS is also required to be able to protect end users of the files (the themes) from malicious file names in their media library. This is a pretty tricky issue and right now it's looking like we'll have to keep refining the sanitization regex.
The original changes were implemented as a result of a Common Vulnerabilities and Exposures (CVE) report. I have tested this against the original report, that used the following folder name to store an XSS possibility in its proof-of-concept:
The original issue reported that you could create a valid folder and then rename it to a poisonous one, like the one provided above. The strict regex was a rushed solution, so it's good that these rules are now relaxed. The proposed additions in this PR also appear safe. Thanks!
These additions help support complex file names and paths, especially those that contain international characters or follow other file naming conventions.