Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle spaces in filenames of Media Manager uploads #9608

Merged
merged 2 commits into from
Mar 29, 2016
Merged

Handle spaces in filenames of Media Manager uploads #9608

merged 2 commits into from
Mar 29, 2016

Conversation

ralain
Copy link
Contributor

@ralain ralain commented Mar 26, 2016

Summary of Changes

Currently when uploading a file through the Media Manager with a space in its name an error is thrown. All other unwanted chars are already removed by JFile::makeSafe() but this method explicitly allows spaces.

Errors tend to confuse inexperienced users, so this PR ensures that spaces are replaced with dashes instead of causing a failed upload.

Testing Instructions

  1. In the Media Manager attempt to upload a file with one or more spaces in its name. Upload should fail.
  2. Apply patch.
  3. Repeat step 1. Upload should succeed and spaces in the filename be replaced with dashes.

@brianteeman
Copy link
Contributor

I have tested this item ✅ successfully on a7709ed


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9608.

@MATsxm
Copy link

MATsxm commented Mar 26, 2016

I have tested this item ✅ successfully on a7709ed

Able to reproduce then #9608 works as described - Thanks


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9608.

@brianteeman
Copy link
Contributor

Maybe we need to update the error message as well
COM_MEDIA_ERROR_WARNFILENAME="File name must only contain alphanumeric characters and no spaces."


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9608.

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label Mar 26, 2016
@ralain
Copy link
Contributor Author

ralain commented Mar 26, 2016

The error message is triggered by JHelperMedia::canUpload() so I think the message would still be accurate if someone used that method directly without sanitizing the filename first.

@cyrezdev
Copy link
Contributor

This reminds me this issue #7841 ( @brianteeman still open but could be closed as issue seems already fixed, even if i don't know when, and in which PR...) and my comment about spaces in file upload : #7841 (comment)

So i agree to accept space on upload 👍 , but i prefer dash - as replacement.

In Google’s search engine algorithm, the underscore isn’t a recognized word separator.

For example:
my-new-black-kitten.jpg will be seen as 4 (key)words my new black kitten
my_new_black_kitten.jpg will be seen as 1 word mynewblackkitten

Another link with simple and clear explanation : http://searchengineland.com/9-seo-quirks-you-should-be-aware-of-146465

So... dash or underscore ?

@ralain
Copy link
Contributor Author

ralain commented Mar 27, 2016

I actually prefer the dash myself but went with underscore as I (mistakenly) thought that was the more accepted word separator. Happy to change it.

@brianteeman
Copy link
Contributor

brianteeman commented Mar 27, 2016 via email

@cyrezdev
Copy link
Contributor

After many search since months, underscore seems to be used when it's an icon, a design element... in fact when you don't want and need to list this image in search engines.
Unfortunately, nothing definitely stated as a norm.

As image is included in a url (<img src="url-to-image"/>, or <a href="url-to-image">, metadata img...),
i am personally for a url safe output...

But this would mean to replace this:
$file['name'] = JFile::makeSafe($file['name']);

By this (the first 3 lines. Last ones are what i use in my own extension for empty sanitized $filename (eg. Asian characters filename)):

$filename = JFilterOutput::stringURLSafe(JFile::stripExt($file['name']));
$ext = strtolower(JFile::getExt($file['name']));
$file['name'] = $filename . '.' . $ext;

// If filename empty (for example, if you have a filename like this 代替品.jpg, it is currently returning the same error as with spaces in current J version), filename based on current date/time
if ( ! $filename)
{
    $file['name'] = JFactory::getDate()->format("YmdHis") . '.' . $ext;
}

This will return a lowcase dash separated string, the same way alias are generated.
For example, Spread the Joomla! Love.JPG will be spread-the-joomla-love.jpg
(today, with this PR, it would be Spread_the_Joomla_Love.JPG, not the best url SEO-friendly filename...)

More discussion here (and not having a clear position from all...): #7841 (comment)

In fact, a decision from PLT and translators, would help to decide the good way to go!

@brianteeman
Copy link
Contributor

Why are underscore not URL safe?
On 27 Mar 2016 2:26 pm, "Cyril Rezé" notifications@github.com wrote:

After many search since months, underscore seems to be used when it's an
icon, a design element... in fact when you don't want and need to list this
image in search engines.
Unfortunately, nothing definitely stated as a norm.

As image is included in a url, i am personally for a url safe output...
But this would mean to replace this:
$file['name'] = JFile::makeSafe($file['name']);

By this (the first 3 lines. Last ones are what i use in my own extension
for empty sanitized $filename (eg. Asian characters filename)):

$filename = JFilterOutput::stringURLSafe(JFile::stripExt($file['name']));
$ext = strtolower(JFile::getExt($file['name']));
$file['name'] = $filename . '.' . $ext;

// If filename empty (for example, if you have a filename like this 代替品.jpg, it is currently returning the same error as with spaces in current J version), filename based on current date/time
if ( ! $filename)
{
$file['name'] = JFactory::getDate()->format("YmdHis") . '.' . $ext;
}

This will return a lowcase dash separated string, the same way alias are
generated.
For example, Spread the Joomla! Love.JPG will be
spread-the-joomla-love.jpg
(today, with this PR, it would be Spread_the_Joomla_Love.JPG, not the
best url safe filename...)

More discussion here (and not having a clear position from all...): #7841
(comment)
#7841 (comment)

In fact, a decision from PLT and translators, would help to decide the
good way to go!


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#9608 (comment)

@cyrezdev
Copy link
Contributor

@brianteeman they are safe, but not SEO friendly.

@cyrezdev
Copy link
Contributor

@brianteeman direct link to video of Matt Cutts, in case you didn't watch it already in one of the previous links i posted ;-)
https://www.youtube.com/watch?v=AQcSFsQyct8

@brianteeman
Copy link
Contributor

I was only querying this statement of yours

(today, with this PR, it would be Spread_the_Joomla_Love.JPG, not the
best url safe filename...)

On 27 March 2016 at 15:23, Cyril Rezé notifications@github.com wrote:

@brianteeman https://github.com/brianteeman direct link to video of
Matt Cutts, in case you didn't watch it already in one of the previous
links i posted ;-)
https://www.youtube.com/watch?v=AQcSFsQyct8


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#9608 (comment)

Brian Teeman
Co-founder Joomla! and OpenSourceMatters Inc.
http://brian.teeman.net/

@cyrezdev
Copy link
Contributor

@brianteeman Yes, an error, and i understood after your comment why, and so i just edited my message ;-)

@ralain
Copy link
Contributor Author

ralain commented Mar 27, 2016

@JoomliC The additional sanitation you suggest is probably beyond the scope of this PR. I suggest opening a new one referencing issue #7841.

I will note that filenames based entirely on a datetime could be problematic, as one can upload multiple files at the same time.

From the links you have provided it's evident that dashes are the way to go and I'll update the PR to reflect that.

@joomla-cms-bot
Copy link

This PR has received new commits.

CC: @brianteeman, @MATsxm


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9608.

@cyrezdev
Copy link
Contributor

I will note that filenames based entirely on a datetime could be problematic, as one can upload multiple files at the same time.

I understand, but is there a multilple upload function on Joomla media component today ?...
And the same will apply if multiple uploads of files named for example joomla.jpg ;-)

Note: in this case, an auto-increment if file exists is not difficult to add for media manager too, as the same process as for alias.

In all cases, this is not to be included until feedback from translators.

I mention PR #7841, as i think "Why not to have norms that will suit all languages, in one shot!"...
Ok, so it seems that we have to process step by step...

  • accented characters in uploaded filename: recently fixed (so JFile::makeSafe creates invalid filenames and prevents upload #7841 is FIXED!)
  • spaces in filename: fixed by this PR
  • Asian characters (Kanji...): not yet! (Asian users will have to wait a bit more to not have the notice "This file type is not supported.", which is of course not the appropriate alert message!). I will see to create a PR for this...

About dash to replace space: 👍

@infograf768
Copy link
Member

accented characters in uploaded filename: recently fixed (so #7841 is FIXED!)

By which PR?

@ralain
Copy link
Contributor Author

ralain commented Mar 27, 2016

I understand, but is there a multilple upload function on Joomla media component today ?...

Indeed, the Media Manager already allow you to select multiple files to upload in the same request.

@cyrezdev
Copy link
Contributor

@infograf768 this is exactly the question i had just above... The fact is in 3.5, no more issue, but #7841 is still open... and i don't find where, and when this was fixed... :-|

@ralain oh! you're right, didn't notice it before 😶
Will test this PR, as it is already an improvement for users, and a more advanced solution is not for tomorrow ;-)

@cyrezdev
Copy link
Contributor

I have tested this item ✅ successfully on 11c1b75


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/9608.

@Kubik-Rubik Kubik-Rubik added this to the Joomla 3.5.2 milestone Mar 29, 2016
@Kubik-Rubik
Copy link
Member

Another question is whether spaces make sense at all in file names on the server? IMO it would be better to change the behavior globally in the makeSafe function but this could lead to a b/c break...

I agree with @JoomliC, thank you for the examples. In one of my plugins I also use underscores for the replacement of whitespaces but will change it to dashes in the next version. Additionally I also make the string lowercase.

We need another test, then it can go into the 3.5.2 version.

@Kubik-Rubik Kubik-Rubik modified the milestones: Joomla 3.5.1, Joomla 3.5.2 Mar 29, 2016
@Kubik-Rubik
Copy link
Member

Okay, this is an easy one and I will merge it for 3.5.1. (Tests by Marc-Antoine and Brian included since only small change).
Thanks @ralain!

@Kubik-Rubik Kubik-Rubik merged commit 4d8edcc into joomla:staging Mar 29, 2016
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Mar 29, 2016
@cyrezdev
Copy link
Contributor

Another question is whether spaces make sense at all in file names on the server?
Of course, no space on server! ;-)

But users mainly have spaces in their images, pdf... on their computer. That's where this PR is really useful for a better UX.

IMO it would be better to change the behavior globally in the makeSafe function but this could lead to a b/c break...

I would better think of a new function (globalizeFilenameOnUpload...), for file renaming on upload only, which could be used for com_media, and taking care of all possible filename depending of culture. A globalized naming.
On Asian sites i've visited, i mainly found numerics as the file name, often something like this : 20160329-000010.jpg (where the last 6 digits seems to be an increment).
But i'm not Asian, so i wonder if there are norms or recommendations for such languages...

Thanks @ralain 👍

@Kubik-Rubik
Copy link
Member

Sorry, I had to revert the merge because we are already in RC state and I should not have merged it.

@ralain Could you please do the same PR once again, then we can merge it into 3.5.2. Sorry for the inconvenience!

@ralain
Copy link
Contributor Author

ralain commented Mar 29, 2016

@Kubik-Rubik No problem, will do.

@ZoFx
Copy link

ZoFx commented Sep 13, 2016

infograf768 this is exactly the question i had just above... The fact is in 3.5, no more issue, but #7841 is still open... and i don't find where, and when this was fixed... :-|

@JoomliC I just tested it in J3.6.2 and it's not fixed at all. Accented characters still get stripped entirely and filenames only containing accented characters result in the "filetype not supported" error thrown.

@Kubik-Rubik
Copy link
Member

@ZoFx Can you please try my plugin "EIR" (https://joomla-extensions.kubik-rubik.de/eir-easy-image-resizer) with the option "Safe Upload Names" enabled and check whether these images are uploaded with a readable image name?

@cyrezdev
Copy link
Contributor

cyrezdev commented Sep 14, 2016

@ZoFx when testing since 3.5 with åäö.jpg, this one works, and file is renamed aao.jpg (just tested again on Joomla 3.6.2, and still works for me).
That's why i supposed this fixed, but didn't see where the change was applied...
Tested with åäöéèû.jpg and it returns aaoeeu.jpg, and file successfully uploaded.
(note: tested in com_media by using the upload button)

What does not work yet is a filename like this : 代替品.jpg
This why i proposed this in comment : #9608 (comment) (i can do a PR for asian, arabic ... characters to generate a datetime-based filename if accepted as a possible workaround ? IMO to be solved for all people from those cultures and languages!)

@ZoFx
Copy link

ZoFx commented Sep 14, 2016

A similar discussion is going on in #7841 ...

@JoomliC maybe I'm completely barking up the wrong tree, but when I'm uploading files through media manager, there's no transliteration at all and åäö.jpg is refused ("filetype not supported"). I have everything updated to the latest stable (J3.6.2, com_media 3.0.0). If it matters (it proably does not ...) I'm using FF48.0.2 on Win 10. Either you or I seem to have installed an extension which influences com_media upload. I might need to test it on a complete clean install ...

@Kubik-Rubik with EIR installed in version 3.4.2 it works as described by JoomliC. åäö.jpg gets uploaded and renamed to aaeoe.jpg.

@ZoFx
Copy link

ZoFx commented Sep 14, 2016

I already commented in #7841
Just to add it to this discussion as well: I just confirmed that it's not fixed in 3.6.2 stable (did a complete fresh installation). Uploading åäö.jpg through media manager results in "Notice: This file type is not supported."

@cyrezdev
Copy link
Contributor

@ZoFx commented here #7841 (comment) to keep messages on the right place ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants