
Conversation

@MannyClicks
Contributor

Hello Yuji,

this PR fixes a bug that occurs when the Content-Length exceeds max_int.
The crash was verified and reproduced on macOS.

Thanks for creating this nice library, very helpful for us. ❤️

@yhirose
Owner

yhirose commented Apr 12, 2019

@MannyClicks, thank you for the really good bug report!

After pondering over your PR, I came to the conclusion that it should convert the Content-Length to size_t, because the value will eventually be used as size_t with basic_string& assign( size_type count, CharT ch ) at this line.

So I am thinking of removing get_header_value_int and adding the following method instead.

inline size_t get_header_value_size_t(const Headers &headers, const char *key, size_t def = 0)

When it comes to the conversion from string to size_t, we could follow the suggestion in the following Stack Overflow post.
https://stackoverflow.com/questions/34043894/convert-string-to-size-t#answer-34044121
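A minimal sketch of such a conversion (using std::strtoull with overflow detection rather than the stringstream approach; the function name is illustrative, not from the library):

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Parse a Content-Length header value into a uint64_t.
// Returns `def` on malformed input or overflow (hypothetical helper).
inline uint64_t parse_content_length(const char *s, uint64_t def = 0) {
  if (!s || !*s) return def;
  errno = 0;
  char *end = nullptr;
  unsigned long long v = std::strtoull(s, &end, 10);
  // ERANGE signals overflow; end == s means no digits were consumed;
  // trailing garbage after the digits is also rejected.
  if (errno == ERANGE || end == s || *end != '\0') return def;
  return static_cast<uint64_t>(v);
}
```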

I'll let you know when it's ready. Thanks a lot!

@yhirose
Owner

yhirose commented Apr 12, 2019

Also, I am thinking of adding a length check against the actual size of the size_t type on the user's environment.

@MannyClicks
Contributor Author

Thanks for the kind words.

I recommend not using size_t directly. That choice would clamp the range of Content-Length to 32 bits on 32-bit platforms; any upload larger than 4GB would break the server. Also, I assume the STL throws an exception if you try to convert a string value exceeding the size_t range to a size_t.
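For reference, the standard conversion functions do throw on overflow: std::stoull raises std::out_of_range when the value exceeds the range of unsigned long long. A small sketch (the helper name is illustrative):

```cpp
#include <stdexcept>
#include <string>

// Returns true if `s` holds a number too large for unsigned long long.
// std::stoull throws std::out_of_range in that case (hypothetical helper).
inline bool overflows_ull(const std::string &s) {
  try {
    (void)std::stoull(s);
    return false;
  } catch (const std::out_of_range &) {
    return true;
  }
}
```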

My PR guarantees an unsigned 64-bit type on all platforms.
You can still handle the result gracefully on 32-bit platforms, if that is what you need to do.

@yhirose
Owner

yhirose commented Apr 12, 2019

@MannyClicks, thanks for the clear information. I now understand that the size_t conversion shouldn't be done. I have two more questions to fully understand this PR.

(1) Can we use uint64_t instead of unsigned long long for consistency? (It's because uint64_t is already used in various places in httplib.h, like here.)

(2) Should we fix read_content_with_length to work properly on 32-bit platforms? (I am using httplib.h in 32-bit Windows applications...)

Here is my thought regarding it:

The return value from get_header_value_uint64 is passed to the size_t len parameter of read_content_with_length. So if the value exceeds 4GB on a 32-bit platform, it will be truncated, since size_t is only 32 bits there. The following code will then read the content only up to the truncated length, and the rest of the content will just remain on the socket. That would cause problems.

Also, out cannot hold the entire content, because the maximum std::string buffer size is 32 bits on a 32-bit platform.

cpp-httplib/httplib.h

Lines 822 to 838 in 07ed076

inline bool read_content_with_length(Stream &strm, std::string &out, size_t len,
                                     Progress progress) {
  out.assign(len, 0);
  size_t r = 0;
  while (r < len) {
    auto n = strm.read(&out[r], len - r);
    if (n <= 0) { return false; }
    r += n;
    if (progress) {
      if (!progress(r, len)) { return false; }
    }
  }
  return true;
}

So I am thinking of changing the type of the size_t len parameter in read_content_with_length to uint64_t. Also, if a request comes with huge content (> 4GB) on a 32-bit platform, it should read the whole content anyway, but read_content_with_length should return false, because out cannot hold the entire content.
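That proposal could be sketched roughly as follows. This is a minimal illustration, not the actual httplib.h code: the Stream interface and MemStream here are simplified stand-ins, and the progress callback is omitted.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>

// Simplified stand-in for httplib's Stream interface (assumption).
struct Stream {
  virtual int read(char *ptr, size_t size) = 0;
  virtual ~Stream() = default;
};

// Sketch of the proposed change: len is uint64_t. If the content cannot
// fit in a std::string on this platform, drain the stream but fail.
inline bool read_content_with_length(Stream &strm, std::string &out,
                                     uint64_t len) {
  if (len > static_cast<uint64_t>(std::numeric_limits<size_t>::max())) {
    char buf[4096];
    uint64_t r = 0;
    while (r < len) {
      auto chunk =
          static_cast<size_t>(std::min<uint64_t>(sizeof(buf), len - r));
      auto n = strm.read(buf, chunk);
      if (n <= 0) break;
      r += static_cast<uint64_t>(n);
    }
    return false; // out cannot hold the entire content
  }
  auto slen = static_cast<size_t>(len);
  out.assign(slen, 0);
  size_t r = 0;
  while (r < slen) {
    auto n = strm.read(&out[r], slen - r);
    if (n <= 0) { return false; }
    r += static_cast<size_t>(n);
  }
  return true;
}

// Tiny in-memory stream for demonstration (hypothetical).
struct MemStream : Stream {
  std::string data;
  size_t pos = 0;
  explicit MemStream(std::string d) : data(std::move(d)) {}
  int read(char *ptr, size_t size) override {
    size_t n = std::min(size, data.size() - pos);
    if (n == 0) return 0;
    std::memcpy(ptr, data.data() + pos, n);
    pos += n;
    return static_cast<int>(n);
  }
};
```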

Does it make sense to you?

@MannyClicks
Contributor Author

Sure, happy to help the project 👍

RE 1: yes, I would have, but from looking at the project style it didn't seem to me that you use stdint.h types. uint64_t would be the correct type, as per my explanation above.

RE 2: I think it should load only a user-set maximum into the buffer on 32-bit platforms and keep the rest on the socket. However, you'd need to inform the user that another read is necessary. Keep in mind that 32-bit processes have a 4GB memory limit, which many people forget these days. It makes handling the stream much more difficult than on a 64-bit platform. I think an OK-ish alternative for 32-bit platforms is to just reject such a big request (which happens naturally if what I propose next is implemented). A better alternative would be to load the content from the stream only in the request path handler set up in the server, if the user wants to. Then the user can build their own chunked stream.

Another thing I noticed in our stress testing today: we have to reject streams whose content length exceeds a maximum size specified in the server instance. Otherwise, bad clients can create 40GB requests, effectively killing our process.
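Such a guard could be as simple as checking the declared Content-Length against a configured limit before reading the body. A sketch under that assumption (ServerLimits and should_reject are illustrative names, not library API):

```cpp
#include <cstdint>

// Illustrative server-side limit configuration (assumption).
struct ServerLimits {
  uint64_t max_content_length = 8ULL * 1024 * 1024; // e.g. 8 MB default
};

// Returns true if the request should be rejected before reading the body
// (the caller would respond with 413 Payload Too Large and close).
inline bool should_reject(const ServerLimits &limits, uint64_t content_length) {
  return content_length > limits.max_content_length;
}
```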

BTW, one thing I dislike about the API is that we use std::string everywhere for data buffers. I would prefer std::vector<uint8_t>.

@yhirose
Owner

yhirose commented Apr 12, 2019

Thanks for the further explanation. I removed get_header_value_int, which is no longer used, changed the return type of get_header_value_uint64 from unsigned long long to uint64_t, and merged the PR. Thanks for your contribution!

As for RE 2, I'll ponder over those options.

Another thing I noticed in our stress testing today: we have to reject streams whose content length exceeds a maximum size specified in the server instance. Otherwise, bad clients can create 40GB requests, effectively killing our process.

BTW, one thing I dislike about the API is that we use std::string everywhere for data buffers. I would prefer std::vector<uint8_t>.

Thanks for the suggestions, too. I made corresponding tickets. (Of course, you are welcome to implement these. 😄)

@MannyClicks
Contributor Author

Sounds good. Expect another PR soon! 😄
