New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fread may lead to vulnerabilities #1702

Closed
edwardabraham opened this Issue May 12, 2016 · 11 comments

Comments

Projects
None yet
8 participants
@edwardabraham

edwardabraham commented May 12, 2016

If the argument to fread is a path, it opens the file and reads it. If the argument has a space, it interprets it as a system command and runs it.

I am concerned that filenames are the kind of information that people pass in from websites. As Rserve and shiny are being used more widely, fread may open up a vulnerability. By passing in a filename with a space, fread can be triggered to switch from reading in files to running arbitrary code, including deleting files. This is reminiscent of imagetragick, which allowed a crafted image upload to cause arbitrary code execution.

For this to be an issue, there would need to be a way to get a string from a web form or API to fread. I have no idea how common this may be. And a careful person might protect against this. Nevertheless, this handy fread behaviour creates a potential vulnerability. Ideally reading files and executing code are kept clearly separate.

@edwardabraham

This comment has been minimized.

Show comment
Hide comment
@edwardabraham

edwardabraham May 12, 2016

Here is an example of this kind of thing in the wild (well, in github code)

edwardabraham commented May 12, 2016

Here is an example of this kind of thing in the wild (well, in github code)

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki May 12, 2016

Member

Adding new argument file which accept only file path seems pretty easy and sufficient solution.

Member

jangorecki commented May 12, 2016

Adding new argument file which accept only file path seems pretty easy and sufficient solution.

@jangorecki jangorecki added the fread label May 12, 2016

@mattdowle

This comment has been minimized.

Show comment
Hide comment
@mattdowle

mattdowle May 12, 2016

Member

Interesting. Different ways to do it then :
fread(file="rm -rf *") would fail with 'file not found'. (First argument is called input=, not used here.)
fread("rm -rf *", file=TRUE) where file='auto' by default
and/or
freadFile("rm -rf *") could call either of above
and/or
options(datatable.webSafeMode = TRUE) would turn off running system commands from fread.
and/or
by default only allow a restricted list of safe commands e.g. grep, cut, gunzip etc.

Member

mattdowle commented May 12, 2016

Interesting. Different ways to do it then :
fread(file="rm -rf *") would fail with 'file not found'. (First argument is called input=, not used here.)
fread("rm -rf *", file=TRUE) where file='auto' by default
and/or
freadFile("rm -rf *") could call either of above
and/or
options(datatable.webSafeMode = TRUE) would turn off running system commands from fread.
and/or
by default only allow a restricted list of safe commands e.g. grep, cut, gunzip etc.

@mattdowle mattdowle added this to the v1.9.8 milestone May 12, 2016

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki May 12, 2016

Member

by default only allow a restricted list of safe commands e.g. grep, cut, gunzip etc.

I like it, could be well managed as option, so any application can add own if needed, or set to NULL to disable system calls, and allow file path only.

Member

jangorecki commented May 12, 2016

by default only allow a restricted list of safe commands e.g. grep, cut, gunzip etc.

I like it, could be well managed as option, so any application can add own if needed, or set to NULL to disable system calls, and allow file path only.

@mgahan

This comment has been minimized.

Show comment
Hide comment
@mgahan

mgahan May 24, 2016

by default only allow a restricted list of safe commands e.g. grep, cut, gunzip etc.

@mattdowle @jangorecki I understand the appeal of this, but it seems like it would be a shame to give up the flexibility that fread currently provides. For example, combining fread with the aws command line interface allows users to read data into R straight from Amazon S3 (not many languages can do this).

dat <- fread(paste0("aws s3 cp s3://model-data/data.csv.gz - | gunzip"))

This is pretty awsome IMHO.

mgahan commented May 24, 2016

by default only allow a restricted list of safe commands e.g. grep, cut, gunzip etc.

@mattdowle @jangorecki I understand the appeal of this, but it seems like it would be a shame to give up the flexibility that fread currently provides. For example, combining fread with the aws command line interface allows users to read data into R straight from Amazon S3 (not many languages can do this).

dat <- fread(paste0("aws s3 cp s3://model-data/data.csv.gz - | gunzip"))

This is pretty awsome IMHO.

@eantonya

This comment has been minimized.

Show comment
Hide comment
@eantonya

eantonya May 24, 2016

Contributor

I agree with @mgahan - restricting it by default would have a severe negative impact for me. I do all kinds of crazy things now with fread and shell commands.

Contributor

eantonya commented May 24, 2016

I agree with @mgahan - restricting it by default would have a severe negative impact for me. I do all kinds of crazy things now with fread and shell commands.

@edwardabraham

This comment has been minimized.

Show comment
Hide comment
@edwardabraham

edwardabraham May 24, 2016

I like the idea of adding an argument which means that fread will only read
files (rather than evaluate system commands). This would be compatible with
other uses of fread, but would provide a way of protecting against
malicious input in a context where fread was processing user input.

Of course, the person writing the code would need to know that they should
this argument ...

On 25 May 2016 at 08:49, eduard notifications@github.com wrote:

I agree with @mgahan https://github.com/mgahan - restricting it by
default would have a severe negative impact for me. I do all kinds of crazy
things now with fread and shell commands.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#1702 (comment)

edwardabraham commented May 24, 2016

I like the idea of adding an argument which means that fread will only read
files (rather than evaluate system commands). This would be compatible with
other uses of fread, but would provide a way of protecting against
malicious input in a context where fread was processing user input.

Of course, the person writing the code would need to know that they should
this argument ...

On 25 May 2016 at 08:49, eduard notifications@github.com wrote:

I agree with @mgahan https://github.com/mgahan - restricting it by
default would have a severe negative impact for me. I do all kinds of crazy
things now with fread and shell commands.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#1702 (comment)

jangorecki added a commit to jangorecki/data.table that referenced this issue Jun 2, 2016

@franknarf1

This comment has been minimized.

Show comment
Hide comment
@franknarf1

franknarf1 Jul 6, 2016

Apart from the security concerns mentioned here, I would also like to see the distinction between input and file args, so I can give a one-line character input without it being interpreted as a command-line call.

franknarf1 commented Jul 6, 2016

Apart from the security concerns mentioned here, I would also like to see the distinction between input and file args, so I can give a one-line character input without it being interpreted as a command-line call.

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Jul 14, 2016

Member

I really like sprintf api, i.e.

fread(sprintf("hadoop fs -cat %s", hdfs.url))

It has nicer api for more complex substitution in piping string. Maybe that could be handled somewhere along current options?

Member

jangorecki commented Jul 14, 2016

I really like sprintf api, i.e.

fread(sprintf("hadoop fs -cat %s", hdfs.url))

It has nicer api for more complex substitution in piping string. Maybe that could be handled somewhere along current options?

@jangorecki

This comment has been minimized.

Show comment
Hide comment
@jangorecki

jangorecki Jul 24, 2016

Member

any feedback on jangorecki@6e16d26 ?

Member

jangorecki commented Jul 24, 2016

any feedback on jangorecki@6e16d26 ?

@MichaelChirico

This comment has been minimized.

Show comment
Hide comment
@MichaelChirico

MichaelChirico Jul 25, 2016

Contributor

@jangorecki perhaps put more emphasis in the man page about why this is useful/when developers should strictly be using file instead of input

Contributor

MichaelChirico commented Jul 25, 2016

@jangorecki perhaps put more emphasis in the man page about why this is useful/when developers should strictly be using file instead of input

@arunsrinivasan arunsrinivasan modified the milestones: v2.0.0, v1.9.8 Aug 26, 2016

@jangorecki jangorecki modified the milestones: v1.9.8, v2.0.0 Sep 15, 2016

jangorecki added a commit that referenced this issue Sep 18, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment