Permalink
Browse files

range: reject char globs with missing end like '[L-]'

... which previously would lead to out of boundary reads.

Reported-by: Luật Nguyễn
  • Loading branch information...
bagder committed Oct 4, 2016
1 parent 269a889 commit ee4f76606cfa4ee068bf28edd37c8dae7e8db317
Showing with 19 additions and 15 deletions.
  1. +19 −15 src/tool_urlglob.c
View
@@ -188,32 +188,36 @@ static CURLcode glob_range(URLGlob *glob, char **patternp,
/* character range detected */
char min_c;
char max_c;
char end_c;
int step=1;
pat->type = UPTCharRange;
rc = sscanf(pattern, "%c-%c", &min_c, &max_c);
rc = sscanf(pattern, "%c-%c%c", &min_c, &max_c, &end_c);
if((rc == 2) && (pattern[3] == ':')) {
char *endp;
unsigned long lstep;
errno = 0;
lstep = strtoul(&pattern[4], &endp, 10);
if(errno || (*endp != ']'))
step = -1;
else {
pattern = endp+1;
step = (int)lstep;
if(step > (max_c - min_c))
if(rc == 3) {
if(end_c == ':') {
char *endp;
unsigned long lstep;
errno = 0;

This comment has been minimized.

Show comment
Hide comment
@jay

jay Nov 1, 2016

Member

This will be a problem, we don't have a portable way to set errno on Windows at the moment see #895. I think for the time being we could check if endp == &pattern[4] and lstep == 0 then problem, else lstep == ULONG_MAX then assume overflow.

@jay

jay Nov 1, 2016

Member

This will be a problem, we don't have a portable way to set errno on Windows at the moment see #895. I think for the time being we could check if endp == &pattern[4] and lstep == 0 then problem, else lstep == ULONG_MAX then assume overflow.

This comment has been minimized.

Show comment
Hide comment
@bagder

bagder Nov 1, 2016

Member

So...

Always clear errno by calling _set_errno(0) immediately before a call that may set it, and check it immediately after the call

From https://msdn.microsoft.com/en-us/library/t3ayayh1.aspx isn't good enough?

I understand errno is generally a problem with Windows but I figured maybe the standard libc functions were different as they are defined to work with errno like this as part of their standard functionality.

However, the strtoul docs says it will return ULONG_MAX on overflow which certainly will make it bail out in the check below anyway, right? Maybe that is good enough?

@bagder

bagder Nov 1, 2016

Member

So...

Always clear errno by calling _set_errno(0) immediately before a call that may set it, and check it immediately after the call

From https://msdn.microsoft.com/en-us/library/t3ayayh1.aspx isn't good enough?

I understand errno is generally a problem with Windows but I figured maybe the standard libc functions were different as they are defined to work with errno like this as part of their standard functionality.

However, the strtoul docs says it will return ULONG_MAX on overflow which certainly will make it bail out in the check below anyway, right? Maybe that is good enough?

This comment has been minimized.

Show comment
Hide comment
@jay

jay Nov 2, 2016

Member

we don't have a portable way to set errno on Windows at the moment see #895.

Disregard what I said about that, I addressed it in the bug just now.

There is one issue though you should be aware of. POSIX says EINVAL may be returned if no conversion could be performed. Microsoft does not do that. I'm not familiar with what you're doing here but basically what I'm saying is if &pattern[4] can't be converted you could end up with errno may or may not be 0. Therefore if that if statement depends errno being set when a conversion could not be performed it is incorrect and should instead be something like this:

if(errno || &pattern[4] == endp || (*endp != ']'))

@jay

jay Nov 2, 2016

Member

we don't have a portable way to set errno on Windows at the moment see #895.

Disregard what I said about that, I addressed it in the bug just now.

There is one issue though you should be aware of. POSIX says EINVAL may be returned if no conversion could be performed. Microsoft does not do that. I'm not familiar with what you're doing here but basically what I'm saying is if &pattern[4] can't be converted you could end up with errno may or may not be 0. Therefore if that if statement depends errno being set when a conversion could not be performed it is incorrect and should instead be something like this:

if(errno || &pattern[4] == endp || (*endp != ']'))

This comment has been minimized.

Show comment
Hide comment
@jay

jay Nov 8, 2016

Member

@bagder ping. this is ambiguous, errno may or may not be set on a failed conversion. it should be clear one way or the other.

@jay

jay Nov 8, 2016

Member

@bagder ping. this is ambiguous, errno may or may not be set on a failed conversion. it should be clear one way or the other.

This comment has been minimized.

Show comment
Hide comment
@bagder

bagder Nov 8, 2016

Member

I didn't change that logic now so this is old code that we've had no trouble with before AFAIK so it feels like this is a theoretical discussion on what can be a problem somewhere. But sure, we should fix those as well.

@bagder

bagder Nov 8, 2016

Member

I didn't change that logic now so this is old code that we've had no trouble with before AFAIK so it feels like this is a theoretical discussion on what can be a problem somewhere. But sure, we should fix those as well.

This comment has been minimized.

Show comment
Hide comment
@jay

jay Nov 9, 2016

Member

here's an example

http://example.com/file[a-z:].txt

step ends up either 0 or -1 depending on the implementation of strtoul. I don't have an opinion which way is more correct (they both seem to have the same bad glob result), but can't we agree that step should be consistent regardless of implementation? It just seems like bad practice even if any potential side effects are theoretical -- at the moment. Are you ok with the proposed change?

@jay

jay Nov 9, 2016

Member

here's an example

http://example.com/file[a-z:].txt

step ends up either 0 or -1 depending on the implementation of strtoul. I don't have an opinion which way is more correct (they both seem to have the same bad glob result), but can't we agree that step should be consistent regardless of implementation? It just seems like bad practice even if any potential side effects are theoretical -- at the moment. Are you ok with the proposed change?

This comment has been minimized.

Show comment
Hide comment
@jay

jay Nov 25, 2016

Member

I'm assuming you're ok with it, landed in a6618b5.

@jay

jay Nov 25, 2016

Member

I'm assuming you're ok with it, landed in a6618b5.

lstep = strtoul(&pattern[4], &endp, 10);
if(errno || (*endp != ']'))
step = -1;
else {
pattern = endp+1;
step = (int)lstep;
if(step > (max_c - min_c))
step = -1;
}
}
else if(end_c != ']')
/* then this is wrong */

This comment has been minimized.

Show comment
Hide comment
@falconindy

falconindy Nov 15, 2016

Contributor

This seems totally fine since it's the char that comes after the end of the range. Should this perhaps instead of comparing max_c and not end_c? I ask because globbing is broken in 7.51.0:

$ curl www.example.com/[a-c]
curl: (3) [globbing] unmatched close brace/bracket in column 21
@falconindy

falconindy Nov 15, 2016

Contributor

This seems totally fine since it's the char that comes after the end of the range. Should this perhaps instead of comparing max_c and not end_c? I ask because globbing is broken in 7.51.0:

$ curl www.example.com/[a-c]
curl: (3) [globbing] unmatched close brace/bracket in column 21

This comment has been minimized.

Show comment
Hide comment
@bagder

bagder Nov 15, 2016

Member

Great catch! It fails to advance the pointer in this case. I suggest this fix and I'll work on writing a test case for this.

--- a/src/tool_urlglob.c
+++ b/src/tool_urlglob.c
@@ -211,10 +211,13 @@ static CURLcode glob_range(URLGlob *glob, char **patternp,
         }
       }
       else if(end_c != ']')
         /* then this is wrong */
         rc = 0;
+      else
+        /* end_c == ']' */
+        pattern += 4;
     }
 
     *posp += (pattern - *patternp);
 
     if((rc != 3) || (min_c >= max_c) || ((max_c - min_c) > ('z' - 'a')) ||
@bagder

bagder Nov 15, 2016

Member

Great catch! It fails to advance the pointer in this case. I suggest this fix and I'll work on writing a test case for this.

--- a/src/tool_urlglob.c
+++ b/src/tool_urlglob.c
@@ -211,10 +211,13 @@ static CURLcode glob_range(URLGlob *glob, char **patternp,
         }
       }
       else if(end_c != ']')
         /* then this is wrong */
         rc = 0;
+      else
+        /* end_c == ']' */
+        pattern += 4;
     }
 
     *posp += (pattern - *patternp);
 
     if((rc != 3) || (min_c >= max_c) || ((max_c - min_c) > ('z' - 'a')) ||
rc = 0;
}
else
pattern += 4;
*posp += (pattern - *patternp);
if((rc != 2) || (min_c >= max_c) || ((max_c - min_c) > ('z' - 'a')) ||
if((rc != 3) || (min_c >= max_c) || ((max_c - min_c) > ('z' - 'a')) ||
(step <= 0) )
/* the pattern is not well-formed */
return GLOBERROR("bad range", *posp, CURLE_URL_MALFORMAT);

0 comments on commit ee4f766

Please sign in to comment.