Escape file names in the Collection process #196

Closed
benjaminvialle opened this Issue Nov 25, 2010 · 7 comments

Comments

Projects
None yet
4 participants
@benjaminvialle
Member

benjaminvialle commented Nov 25, 2010

When adding a pdf file with an accent or with a special caracter like ° or a space, the name is not escaped when given to imageconverter and the conversion process fails silentely.

@benjaminvialle

This comment has been minimized.

Show comment
Hide comment
@benjaminvialle

benjaminvialle Nov 25, 2010

Member

Here is Severin's advice on this point :

Hi,

I'm a bit speculating here, but those lines in our code makes we wonder
if the accented character had something to do with it (Aside: maybe we
should reconsider sanitizing filenames of submission files again):

 # ImageMagick can only save images with heights not exceeding 65500
    pixels.
 # Larger images result in a conversion failure.
    begin
 #This is an ImageMagick command, see
    http://www.imagemagick.org/script/convert.php for documentation
    `convert -limit memory
    #{MarkusConfigurator.markus_config_pdf_conv_memory_allowance} 
    -limit map
    0 -density 150 -resize 66% #{storage_path}/#{self.filename} -append
    #{file_path}`
     m_logger.log(I18n.t("markus_logger.pdf_convert_success"))
    rescue Exception => e
     m_logger.log(I18n.t("markus_logger.pdf_convert_failed"))
    end

More food of thought here:
http://hans.fugal.net/blog/2007/11/03/backticks-2-0/

If that's something related, you (Benjamin) may be able to reproduce
some kind of error using a simple small ruby script.

Maybe something along those lines:

    #!/usr/bin/ruby
    storage_path=something
    filename=something
    file_path=something
    `convert -limit memory 100 -limit map 0 -density 150 -resize 66%
    #{storage_path}/#{filename} -append #{file_path}

Maybe worth a shot. What do you think? I might be on the wrong track,
though.

@benjamin: What is the output of the command "locale" (run on the
markus-hosting server)?
The output is fr_FR.UTF-8

Cheers,
Severin

I didn't had time to investigate this now, but I don't forget this issue ;-)

Member

benjaminvialle commented Nov 25, 2010

Here is Severin's advice on this point :

Hi,

I'm a bit speculating here, but those lines in our code makes we wonder
if the accented character had something to do with it (Aside: maybe we
should reconsider sanitizing filenames of submission files again):

 # ImageMagick can only save images with heights not exceeding 65500
    pixels.
 # Larger images result in a conversion failure.
    begin
 #This is an ImageMagick command, see
    http://www.imagemagick.org/script/convert.php for documentation
    `convert -limit memory
    #{MarkusConfigurator.markus_config_pdf_conv_memory_allowance} 
    -limit map
    0 -density 150 -resize 66% #{storage_path}/#{self.filename} -append
    #{file_path}`
     m_logger.log(I18n.t("markus_logger.pdf_convert_success"))
    rescue Exception => e
     m_logger.log(I18n.t("markus_logger.pdf_convert_failed"))
    end

More food of thought here:
http://hans.fugal.net/blog/2007/11/03/backticks-2-0/

If that's something related, you (Benjamin) may be able to reproduce
some kind of error using a simple small ruby script.

Maybe something along those lines:

    #!/usr/bin/ruby
    storage_path=something
    filename=something
    file_path=something
    `convert -limit memory 100 -limit map 0 -density 150 -resize 66%
    #{storage_path}/#{filename} -append #{file_path}

Maybe worth a shot. What do you think? I might be on the wrong track,
though.

@benjamin: What is the output of the command "locale" (run on the
markus-hosting server)?
The output is fr_FR.UTF-8

Cheers,
Severin

I didn't had time to investigate this now, but I don't forget this issue ;-)

@jerboaa

This comment has been minimized.

Show comment
Hide comment
@jerboaa

jerboaa Nov 25, 2010

Member

Looking at this now we HAVE TO sanitize at least self.filename, which is user input, before passing this to convert.

Member

jerboaa commented Nov 25, 2010

Looking at this now we HAVE TO sanitize at least self.filename, which is user input, before passing this to convert.

@benjaminvialle

This comment has been minimized.

Show comment
Hide comment
@benjaminvialle

benjaminvialle Feb 10, 2011

Member

I just add a label 1.0.0 because this bug is very tedious in Centrale Nantes. A lot of students are sending files with accents in file name.

Member

benjaminvialle commented Feb 10, 2011

I just add a label 1.0.0 because this bug is very tedious in Centrale Nantes. A lot of students are sending files with accents in file name.

@hoboman313

This comment has been minimized.

Show comment
Hide comment
@hoboman313

hoboman313 Feb 11, 2012

I am looking at this bug as a part of #338.
First of all, I have not been able to successfully convert a .pdf to an image in the latest master ( may be a problem with my environment or the code ). However, I believe that this issue has already been fixed. The file name is now strictly enforced to certain characters and I am unable to submit a file with accents or any other non-alphanumeric characters.
The implementation for file sanitation can be found on line 384 of submissions_controller and line 98 of submission_helper.

If you feel that my findings are not correct, please provide steps to reproduce this issue.

I am looking at this bug as a part of #338.
First of all, I have not been able to successfully convert a .pdf to an image in the latest master ( may be a problem with my environment or the code ). However, I believe that this issue has already been fixed. The file name is now strictly enforced to certain characters and I am unable to submit a file with accents or any other non-alphanumeric characters.
The implementation for file sanitation can be found on line 384 of submissions_controller and line 98 of submission_helper.

If you feel that my findings are not correct, please provide steps to reproduce this issue.

@ghost ghost assigned ghigt May 4, 2013

@benjaminvialle

This comment has been minimized.

Show comment
Hide comment
@benjaminvialle

benjaminvialle May 4, 2013

Member

@ghigt please check if this issue still happens.

Member

benjaminvialle commented May 4, 2013

@ghigt please check if this issue still happens.

@ghigt

This comment has been minimized.

Show comment
Hide comment
@ghigt

ghigt May 6, 2013

Member

I don't have any problems to submit a file named "éà& ê|%". The filename is renamed with "__".
But like @hoboman313 I haven't been able to convert my pdf file. I don't know if it's very slow or broken, I stay on the "Now converting pdf files...0/2" message and no errors on term.

Member

ghigt commented May 6, 2013

I don't have any problems to submit a file named "éà& ê|%". The filename is renamed with "__".
But like @hoboman313 I haven't been able to convert my pdf file. I don't know if it's very slow or broken, I stay on the "Now converting pdf files...0/2" message and no errors on term.

@benjaminvialle

This comment has been minimized.

Show comment
Hide comment
@benjaminvialle

benjaminvialle May 8, 2013

Member

Ok. Thank you guys, I'm closing this issue.

We will update the documentation with the installation of ghoscript needed to convert pdf file.

Member

benjaminvialle commented May 8, 2013

Ok. Thank you guys, I'm closing this issue.

We will update the documentation with the installation of ghoscript needed to convert pdf file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment