Skip to content

Latest commit

 

History

History
419 lines (343 loc) · 18 KB

CVE-2022-38667.md

File metadata and controls

419 lines (343 loc) · 18 KB

Summary:

Crow version 1.0+4 and older are vulnerable to a use-after-free condition due to incorrect Connection object lifetime management (though partial HTTP pipelining support is the deeper underlying problem). On successful exploitation this can potentially lead to either Remote Code Execution or Information Disclosure (a failed exploitation will lead to a Denial of Service as the server will crash). See sections below for details.

Affected: Crow version 1.0+4 and older

https://github.com/CrowCpp/Crow - maintained version (fork)

https://github.com/ipkn/crow - original version

CVE ID: CVE-2022-38667

CVSS: 9.8 (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)

Discovered by:
 hebi
 Gynvael Coldwind

Method of discovery:
 Fuzzing, Manual Analysis (reading the code)

"Crow is a C++ framework for creating HTTP or Websocket web services. It uses routing similar to Python's Flask which makes it easy to use. It is also extremely fast, beating multiple existing C++ frameworks as well as non C++ frameworks."
(source: project's README.md)

IMPORTANT: This vulnerability is reported under the 90-day policy (version 2021), i.e. this report will be shared publicly with the defensive community on 16th November 2022 if a patch/fix is not available by that time, or 30 days after the fix becomes available. For details please see: https://googleprojectzero.blogspot.com/2021/04/policy-and-disclosure-2021-edition.html

Crow execution model and life of a request

Note: If you understand Crow's execution model you can safely skip this section.

Note: The description below is a bit simplified and contains only the information relevant to the vulnerability.

Crow is using an asynchronous queued task execution model (implemented using the Asio C++ library - https://think-async.com/Asio/), which is not unlike typical JavaScript model.

Technically there are actually N+1 task queues (where N denotes the number of threads chosen by the application developer) - one per each worker thread and an extra "main" queue for some timers, signals, etc.

When a new client is connected, the connection (represented by a Connection object) is bound to a certain thread and its task queue. This means that all processing done by that connection is going to happen within the constrains of the same thread.

In case the server handles more simultaneous connections than there are threads, each thread will handle tasks related to multiple connections. On the flip side, if there are very few connections, then basically a connection has a task queue (and thread) for itself.

After the connection is established (and a Connection object made) a task tasked with receiving some data is added to the task queue. Once it completes, the completion callback is called (that's the lambda function inside the Connection::do_read() method in http_connection.h).

When data is received this callback passes the data to the parser (call to parser_.feed() method; the implementation is split between parser.h and http_parser_merged.h).

The parser (still operating within the same task) can then invoke a set of callbacks on various parsing events like method received, URL received, header name / value received, and so on. While all of these callbacks are technically processed within parser.h, some of them invoke methods of the Connection object.

Sooner (error) or later (no error) one of these called methods will end up with a prepared ready-to-send response (e.g. a 404 response on error or 200 response with some app-generated data on success). In such case the response will either be immediately (synchronously) sent out or a new task will be added to the task queue to send the response out later and call a completion callback (see the lambda function inside the Connection::do_write()'s method).

Once the parser returns, it might either require more data (in which case a new task to receive more data will be added to the task queue) or decide it's time to close the connection and finish the task.

Next, if any data was asynchronously being scheduled to be sent out, a completion callback (do_write()'s lambda) might be called. The completion callback clears the response buffers and might close the connection as well if needed.

The last task related to the given connection will delete the Connection object.

To summarize, the list of tasks executed while processing a connection might look like this:

  async_accept →
    async_accept's completion cb (creates Connection object) →
      async_receive (receive HTTP request) →
        async_receive's completion cb (invokes parser/schedules sending) →
          async_write (send HTTP response) →
            aync_write's completion cb (deletes Connection object if needed).

There might be more async_receive related tasks if data is trickling in slowly (multiple receives per HTTP request), and multiple async_write tasks if there are multiple responses to be sent out (Connection: keep-alive).

As mentioned, the last task is responsible for deleting the Connection object. To control and decide when that should happen a set of two flags is used:

  Connection::is_reading
  Connection::is_writing

If any receiving task is scheduled, is_reading flag is set. If any sending task is scheduled, is_writing flag is set. Once the tasks are complete and the connection is being closed, respective flags are cleared.

In various places a method called Connection::check_destroy() is called, which checks the status of both these flags. If neither is set (i.e. both are cleared), then the Connection object is deemed to be no longed needed and is deleted.

High-level root cause

In essence, the vulnerability in this report boils down to incomplete HTTP pipelining support.

When using HTTP pipelining the client sends multiple HTTP requests as part of the same connection one after another without waiting to receive HTTP responses for any of the previous requests (in opposition to the "normal" way of doing it, where only one request is sent, then the client waits for the complete response, and only then it proceeds with the next request). In theory this allows HTTP servers to process requests in parallel and then just send replies in order. Historically though HTTP pipelining proved to be tricky to implement correctly on both the client and server side. As such none of popular modern browsers have it enabled by default.

In case of Crow the HTTP parser does support HTTP pipelining, or rather it continues to invoke callbacks for all HTTP packets that are present in the received TCP packet (perhaps it's more fair to say it's HTTP pipelining agnostic). That being said, the asynchronous Connection layer is unaware of pipelining, which leads to various logic errors born from more than one HTTP request being handled at the same time, while being limited to a state tailored to a single request and response.

The simplest (and benign) way to demonstrate this is to create two endpoints that return 512KB of "1" and "2", as shown below:

  CROW_ROUTE(app, "/one")
  ([] {
    return std::string(1024 * 512, '1');
  });

  CROW_ROUTE(app, "/two")
  ([] {
    return std::string(1024 * 512, '2');
  });

And then request data from both endpoints in the same TCP packet:

GET /one HTTP/1.1
Host: example.com

GET /two HTTP/1.1
Host: example.com

In the response we would expect (in this order):

  1. HTTP response header for endpoint /one.
  2. 512KB of "1"s.
  3. HTTP response header for endpoint /two.
  4. 512KB of "2"s.
Here's what we actually receive:
$ cat packet.http - | nc -v 127.0.0.1 18080 | hexdump -C
Connection to 127.0.0.1 18080 port [tcp/*] succeeded!
00000000  48 54 54 50 2f 31 2e 31  20 32 30 30 20 4f 4b 0d  |HTTP/1.1 200 OK.|
00000010  0a 43 6f 6e 74 65 6e 74  2d 4c 65 6e 67 74 68 3a  |.Content-Length:|
00000020  20 35 32 34 32 38 38 0d  0a 53 65 72 76 65 72 3a  | 524288..Server:|
00000030  20 43 72 6f 77 2f 6d 61  73 74 65 72 0d 0a 44 61  | Crow/master..Da|
00000040  74 65 3a 20 54 75 65 2c  20 31 36 20 41 75 67 20  |te: Tue, 16 Aug |
00000050  32 30 32 32 20 31 39 3a  33 36 3a 34 38 20 47 4d  |2022 19:36:48 GM|
00000060  54 0d 0a 43 6f 6e 6e 65  63 74 69 6f 6e 3a 20 4b  |T..Connection: K|
00000070  65 65 70 2d 41 6c 69 76  65 0d 0a 0d 0a 31 31 31  |eep-Alive....111|
00000080  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31  |1111111111111111|
*
00010000  48 54 54 50 2f 31 2e 31  20 32 30 30 20 4f 4b 0d  |HTTP/1.1 200 OK.|
00010010  0a 63 6f 6e 6e 65 63 74  69 6f 6e 3a 20 4b 65 65  |.connection: Kee|
00010020  70 2d 41 6c 69 76 65 0d  0a 43 6f 6e 74 65 6e 74  |p-Alive..Content|
00010030  2d 4c 65 6e 67 74 68 3a  20 30 0d 0a 53 65 72 76  |-Length: 0..Serv|
00010040  65 72 3a 20 43 72 6f 77  2f 6d 61 73 74 65 72 0d  |er: Crow/master.|
00010050  0a 44 61 74 65 3a 20 54  75 65 2c 20 31 36 20 41  |.Date: Tue, 16 A|
00010060  75 67 20 32 30 32 32 20  31 39 3a 33 36 3a 34 38  |ug 2022 19:36:48|
00010070  20 47 4d 54 0d 0a 43 6f  6e 6e 65 63 74 69 6f 6e  | GMT..Connection|
00010080  3a 20 4b 65 65 70 2d 41  6c 69 76 65 0d 0a 31 31  |: Keep-Alive..11|
00010090  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31  |1111111111111111|
*
00020080  31 31 31 31 31 31 31 31  31 31 31 31 31 31 0d 0a  |11111111111111..|
00020090  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31  |1111111111111111|
*
00080100  31 31 31 31 31 31 31 31  31 31 31 31 31           |1111111111111|
0008010d

Note: The * sign above denotes a repeated row.

To present the response in a form of the list, we get:

  1. HTTP response header for endpoint /one.
  2. 64KB (minus response header size) of "1"s.
  3. HTTP response header for endpoint /two with Content-Length set to 0.
  4. The remaining amount of "1"s.

Pretty obviously responses to both packets were mixed, both in sending (HTTP response header of /two injected in the middle of /one's response) and in processing (Content-Length: 0 in /two's response).

Note: As mentioned before, thankfully modern browsers disabled HTTP pipelining. Otherwise the above behavior under certain circumstances would allow for a UXSS (or straight-up JS injection).

To summarize, the vulnerability discussed below exploits the fact that the Connection layer isn't aware it processes multiple requests before ever finalizing previous ones.

Vulnerability (use-after-free)

As mentioned before, if both the is_reading and is_writing flags are cleared, then the next call to Connection::check_destroy() will delete the Connection object. In case there would still be another task related to the Connection scheduled to be executed, this would lead to a use-after-free vulnerability (and potentially also a double-free in case Connection::check_destroy() is invoked again).

One of the discovered ways to achieve this is to send two HTTP requests in one TCP packet:

  1. The first requests any dynamic endpoint whose response is smaller than 1MB (required to fall into a specific code path in Connection::do_write_general()).
  2. The second is both an invalid HTTP request (e.g. missing Host header) and is referring to a non-existing endpoint (to get a 404 error response).

For example:

GET /endpoint HTTP/1.1
Host: 127.0.0.1

GET /idontexist HTTP/1.1

The execution for the above example will flow as follows:

async_accept →
  async_accept's completion cb →
    async_receive (receive HTTP request) →
    - The is_reading flag is set.
      async_receive's completion cb (lambda in do_read())
      - The method parser_::feed() is invoked with the received data.
        - ...
        - Parser calls on_url callback, which in turn calls
          Connection::handle_url(), which establishes a route for /endpoint.
        - ...
        - Parser calls on_message_complete callback, which in turn invokes the
          endpoint's handler.
          - After the endpoint's handler finishes, the data is scheduled to be
            sent (i.e. a new "write" task is added to the task queue). This also
            causes the is_writing flag to be set.
        - ... parser continues with the second packet ...
        - Parser calls on_url callback, which in turn calls
          Connection::handle_url(), which establishes that there is no route
          specified for /idontexist.
          - This in turn causes a 404 packet to be scheduled to be sent to the
            client (i.e. a second "write" task is added to the task queue). This
            also causes the is_writing to be set (yes, it was already set).
        - The Parser notices there are no headers, which is invalid for HTTP/1.1
          as at least Host: is required, and exists with an error.
      - Given Parser's error, the is_reading is cleared, the socket is closed,
        and this tasks finishes (note: Connection object is still alive, since
        is_writing is still set). →
        async_write (try to send first HTTP response) →
          aync_write's completion cb (lambda in do_write())
          - The is_writing flag is cleared.
          - Since the socket was closed (and therefore sending failed),
            Connection::check_destroy() is called.
            - Since both is_reading and is_writing flags are cleared, the
              Connection object is deleted.
          - The task finishes. →
          async_write (try to send 404 HTTP response) →
            async_write's completion cb (lambda in do_write())
            - Attempts to access the deleted Connection object, which by this
              point is already freed. Application crashes.

In case of the example packet Crow will likely crash at that point, though it can potentially already crash when trying to execute the second scheduled async_write (since the buffers with the response data are already deleted).

While we haven't attempted to exploit this vulnerability beyond triggering it, we believe successful use-after-free exploitation cannot be ruled out at this time. Exploitation would likely involve either using two Connections in the same thread (i.e. using the same task queue) with carefully chosen order of data transmission. Likely winning a race condition might be requires as well.

Successful exploitation of this use-after-free vulnerability would lead to Remote Code Execution, i.e. executing arbitrary code chosen by the attacker in the context of the Crow app's process.

Proposed fix:

Given that HTTP pipelining support is not needed (browsers have it blocked, there is HTTP/2 which supports multiplexing), the best solution is to block HTTP pipelining altogether.

This would likely require adding a mechanism to make sure that any evidence of multiple HTTP requests within a single parser_.feed() call (as indicated by the on_... callbacks being invoked) will result in the connection being dropped after the first reply, and no extra HTTP request will be processed.

Other worrying patterns

As mentioned before, the Connection::check_destroy() function potentially deletes the Connection object. Given this, if the object is deleted, no other operations can be performed on it (i.e. any other operation is technically a use-after-free).

That said, there are 3 additional places where check_destroy() is called, but operations are still made on the object regardless.

It must be said that we were not able to find a way to actually trigger these code paths, but these do look like accidents bound to happen some day, so perhaps some refactoring in the area might be in order.

Place 1: End of do_write_static() function

      if (close_connection_)
      {
          adaptor_.shutdown_readwrite();
          adaptor_.close();
          CROW_LOG_DEBUG << this << " from write (static)";
          check_destroy();
      }

      res.end();  // Connection is potentially already deleted here.
      res.clear();
      buffers_.clear();
  }

Place 2: End of do_write_general() for large (over 1MB) responses

          if (close_connection_)
          {
              adaptor_.shutdown_readwrite();
              adaptor_.close();
              CROW_LOG_DEBUG << this << " from write (res_stream)";
              check_destroy();
          }

          res.end();  // Connection is potentially already deleted here.
          res.clear();
          buffers_.clear();
      }
  }

Place 3: do_write_sync()'s completion lambda (executed before function returns)

Note: In this case the issue lies in functions calling do_write_sync() not being aware that the Connection object might be potentially deleted in this method.

      else
      {
          CROW_LOG_ERROR << ec << " - happened while sending buffers";
          CROW_LOG_DEBUG << this << " from write (sync)(2)";
          check_destroy();
          return true;
      }
  });

Perhaps making check_destroy() return a boolean value indicating whether the object was or wasn't deleted might be a solution. That way the calling functions would know whether to immediately exit, or continue execution.

Timeline

  • 2022-08-14: Vulnerability discovered.
  • 2022-08-17: Vulnerability reported.
  • 2022-08-21: Public fix was proposed.
  • 2022-08-22: Public fix was merged in.
  • 2022-08-22: CVE requested and assigned.
  • 2022-09-23: Details were published.

Links

Gynvael's blog post:

https://gynvael.coldwind.pl/?lang=en&id=753