v1.3.1

@geoff-davis released this 14 Apr 21:58 · 98 commits to main since this release

Performance

  • Replace character-by-character Python loop in text newline decoding with C-speed string operations (str.find, in, str.replace), with a fast path that skips translation entirely when no \r is present.
  • Eliminate per-line string slice copies in readline() and async for iteration by searching directly in the text buffer with an offset instead of creating a slice copy.
  • Track already-scanned buffer positions in readline() / __anext__() to avoid O(n²) re-scanning when lines span multiple chunks.
  • Add a dedicated read(-1) fast path that bypasses per-chunk buffer append/consume overhead and performs a single binary read and decode instead.
  • Use memoryview for zero-copy buffer slicing in binary read(), read1(), and readline(), eliminating an intermediate bytearray allocation per call.
  • Add binary readline() fast path for the common case where the line fits in the current buffer, avoiding list allocation and b"".join().
  • Add binary read(size) early exit when the buffer already satisfies the request.
  • Make text _capture_buffer_origin() synchronous by accessing binary position directly, eliminating coroutine creation overhead per chunk.
  • Inline small helper methods in hot paths to eliminate method call overhead.
  • Pre-compute _universal_newlines flag at init to replace per-chunk set membership tests.
  • Defer text buffer compaction until dead space exceeds a threshold, matching the binary layer's strategy.
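
Three of the ideas above (the no-\r fast path, offset-based line search with scan-position tracking, and deferred compaction) can be sketched in a few lines of Python. This is an illustrative sketch only, not the library's code: translate_newlines, LineBuffer, and compact_threshold are hypothetical names, and the sketch ignores the case of a \r\n pair split across two chunks, which a real implementation has to handle.

```python
from typing import Optional


def translate_newlines(text: str) -> str:
    """Universal-newline translation using built-in string operations."""
    # Fast path: with no "\r" present there is nothing to translate,
    # so the original string is returned without copying.
    if "\r" not in text:
        return text
    # Replace "\r\n" first so it does not become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


class LineBuffer:
    """Hypothetical text buffer that yields lines without re-scanning."""

    def __init__(self, compact_threshold: int = 65536) -> None:
        self._buf = ""
        self._start = 0    # offset of the first unconsumed character
        self._scanned = 0  # everything before this offset has no "\n"
        self._compact_threshold = compact_threshold  # assumed value

    def feed(self, chunk: str) -> None:
        self._buf += chunk

    def next_line(self) -> Optional[str]:
        # Search in place from the last scanned offset instead of
        # slicing the buffer or starting again at the line start.
        idx = self._buf.find("\n", self._scanned)
        if idx == -1:
            # Remember how far we looked so the next call skips it.
            self._scanned = len(self._buf)
            return None
        line = self._buf[self._start:idx + 1]
        self._start = self._scanned = idx + 1
        # Deferred compaction: drop consumed text only once the dead
        # space at the front of the buffer exceeds the threshold.
        if self._start > self._compact_threshold:
            self._buf = self._buf[self._start:]
            self._start = self._scanned = 0
        return line
```

Feeding "hel" and then "lo\n" returns None on the first next_line() call and "hello\n" on the second, with the second search starting at offset 3 rather than 0. That is what keeps a line spanning many chunks from costing O(n²) in total.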

Benchmark improvements vs 1.3.0

| Benchmark | 1.3.0 | 1.3.1 | Improvement |
|---|---|---|---|
| Text I/O throughput | 17.4 MB/s | 37.1 MB/s | 2.1x |
| readline() 10K lines | 17.3ms | 6.3ms | 2.8x |
| Line iteration 10K | 18.2ms | 6.5ms | 2.8x |
| JSONL processing | 0.78x vs gzip | 1.4x vs gzip | Now beats stdlib |