Chunked Decoding hangs forever if source Stream stops sending data #37

@ccrawford

Description

Is there a way to use a ChunkDecodingStream with a timeout on the stream read? I don't see one, and I've been hitting rare production failures when a chunked stream stops responding mid-stream. I've set timeout values on the stream, but looking at the implementation, they don't appear to be used.

I'm seeing a rare run-time condition when calling an API on a public server that I do not control. The server requires HTTP/1.1, so I'm using a ChunkDecodingStream. If the server stops responding mid-stream, serial debugging just shows
[386273][W][NetworkClient.cpp:506] readBytes(): Timeout waiting for data on fd 49
every http.setTimeout() seconds, in an endless loop. I do check that bytes are available in the stream before calling deserializeJson with the chunk stream as the source.

Tracing the issue: ChunkDecodingStream tries to read, the HTTP readBytes() timeout fires and returns 0 (bytes read), and ChunkDecodingPolicy doReadBytes() goes around for another (useless) try because (ChunkDecodingPolicy.hpp, line 72) size is still > 0, no error is set, the chunk never ended, and we're still in the chunk body. The result is an endless loop with no timeout.
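To make the loop concrete, here is a simplified Python model of that retry loop (not the actual C++ code; `read_chunk_body` and `deadline_s` are names I made up). A zero-byte read is simply retried, so a stalled peer spins forever; an overall deadline, like the one proposed below for line 72, breaks the cycle:

```python
import time

def read_chunk_body(read_some, size, deadline_s=None):
    # Simplified model of the doReadBytes() loop: keep calling the
    # underlying read until `size` bytes of chunk body have arrived.
    # read_some(n) returns b"" on a transport timeout, mirroring
    # NetworkClient readBytes() returning 0.
    received = b""
    start = time.monotonic()
    while size > 0:
        data = read_some(size)
        if not data:
            if deadline_s is None:
                continue  # current behavior: retry forever
            if time.monotonic() - start >= deadline_s:
                raise TimeoutError("chunk body stalled mid-stream")
            continue  # proposed behavior: retry only until the deadline
        received += data
        size -= len(data)
    return received
```

With `deadline_s=None` and a source that always returns nothing, this never returns, which is exactly the observed hang; with a deadline it fails fast instead.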

Here's a python server that can force the error (credit: Claude helped me with the server code):

#!/usr/bin/env python3
# enhanced_problematic_server.py
# Extended test server to reproduce timeout issues
# Run: python3 enhanced_problematic_server.py
import socketserver
import time
import traceback
from urllib.parse import urlparse

HOST = "0.0.0.0"
PORT = 8000
SLEEP_FOREVER = 3600  # used to simulate long stalls

class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        try:
            line = self.rfile.readline().decode('utf-8', errors='ignore')
            if not line:
                return
            # very simple request-line parse
            method, path, _ = line.split()
            # read & discard headers
            while True:
                h = self.rfile.readline()
                if not h or h == b'\r\n':
                    break

            path = urlparse(path).path
            print(f"[{self.client_address}] {method} {path}")
            if path == "/chunked_incomplete_stall":
                # REPLICATES WEATHERAPI.COM HANG
                # Send proper chunked transfer with several complete chunks
                # Then send chunk size but INCOMPLETE chunk data
                # Connection stays open causing repeating timeouts
                # This exactly mimics the [NetworkClient.cpp:506] readBytes() timeout pattern
                resp = b"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\n\r\n"
                self.wfile.write(resp)
                self.wfile.flush()
                print("-> sent chunked response headers")

                # Send 3 complete chunks successfully
                chunks = [
                    b'{"location":{"name":"St. Charles"},"current":',
                    b'{"temp_f":65.0,"condition":{"text":"Partly cloudy"}},',
                    b'"forecast":{"forecastday":['
                ]

                for i, chunk in enumerate(chunks):
                    chunk_size = f"{len(chunk):X}\r\n".encode()
                    self.wfile.write(chunk_size)
                    self.wfile.write(chunk)
                    self.wfile.write(b"\r\n")
                    self.wfile.flush()
                    print(f"-> sent complete chunk {i+1}: {len(chunk)} bytes")
                    time.sleep(0.1)

                # Now send a chunk size but INCOMPLETE data:
                # declare 100 bytes but send only 18
                incomplete_data = b'{"date":"2025-11-0'
                declared_size = 100
                chunk_header = f"{declared_size:X}\r\n".encode()

                self.wfile.write(chunk_header)
                self.wfile.flush()
                print(f"-> sent chunk header declaring {declared_size} bytes")

                time.sleep(0.5)
                self.wfile.write(incomplete_data)
                self.wfile.flush()
                print(f"-> sent only {len(incomplete_data)} bytes of {declared_size} declared")
                print("-> NOW HANGING - client expects 80 more bytes that will never arrive")
                print("-> ChunkDecodingStream will timeout repeatedly trying to read remaining bytes")
                print("-> This causes the infinite loop: [NetworkClient.cpp:506] readBytes(): Timeout")

                # Keep connection open but send nothing more
                time.sleep(SLEEP_FOREVER)
                return
        except Exception:
            traceback.print_exc()

if __name__ == "__main__":
    print("Starting enhanced problematic test server on port", PORT)
    print("\nAvailable test endpoints:")
    print("  /chunked_incomplete_stall   - **REPLICATES WEATHERAPI HANG** - incomplete chunk data")
    print()

    # allow_reuse_address must be set on the class before the
    # constructor binds the socket, or it has no effect
    socketserver.ThreadingTCPServer.allow_reuse_address = True
    server = socketserver.ThreadingTCPServer((HOST, PORT), Handler)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print("\nShutting down")
        server.shutdown()
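For reference, each chunk the server writes is framed as `<size in hex>\r\n<data>\r\n`, with a final `0\r\n\r\n` terminator. A minimal decoder sketch (illustrative Python, not the StreamUtils implementation; `decode_chunked` is a made-up name) shows why a truncated chunk leaves the reader owed bytes that never arrive:

```python
def decode_chunked(buf):
    # Decode complete chunks from buf; return (payload, bytes_still_owed).
    # bytes_still_owed > 0 means the sender declared a chunk size but did
    # not deliver all of it -- exactly the stall the server above forces.
    payload = b""
    while True:
        sep = buf.find(b"\r\n")
        if sep < 0:                        # no complete size line yet
            return payload, 0
        size = int(buf[:sep], 16)          # chunk size is hex-encoded
        if size == 0:                      # terminal "0\r\n\r\n" chunk
            return payload, 0
        body = buf[sep + 2 : sep + 2 + size]
        if len(body) < size:               # truncated chunk body
            return payload, size - len(body)
        payload += body
        buf = buf[sep + 2 + size + 2:]     # skip data and trailing CRLF
```

Against the stalled endpoint, a decoder like this ends up owed 82 of the declared 100 bytes, and a blocking implementation waits for them indefinitely.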

Using deserializeJson with a ChunkDecodingStream as the source, with the above server as the target of the HTTP GET, will result in a hang as described above.

If there's a better way to avoid this by manipulating timeouts, I'm all ears; however, ChunkDecodingPolicy doesn't appear to check any timeout. Adding one in the while loop at line 72 would be beneficial.
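To illustrate the shape of the fix (Python for illustration only; the Arduino-side equivalent would be an overall deadline around the stream read, not this exact code, and `recv_with_deadline` is a hypothetical helper), a per-read timeout plus an overall deadline lets the caller abort a stalled transfer instead of retrying forever:

```python
import socket
import time

def recv_with_deadline(sock, n, per_read_s=0.2, overall_s=1.0):
    # Read exactly n bytes or raise TimeoutError if the peer stalls.
    # per_read_s plays the role of http.setTimeout(): each individual
    # recv() may time out, but only the overall deadline aborts the
    # transfer. A closed connection is treated like a stall here.
    sock.settimeout(per_read_s)
    buf = b""
    deadline = time.monotonic() + overall_s
    while len(buf) < n:
        try:
            part = sock.recv(n - len(buf))
        except socket.timeout:
            part = b""
        if not part:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"peer stalled after {len(buf)} bytes")
            continue
        buf += part
    return buf
```

The key point is that the per-read timeout alone reproduces the endless retry; only the outer deadline converts the stall into a reportable error.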
