Skip to content

fix: prevent sticky progress bar ghost lines from terminal wrapping#565

Merged
andreatgretel merged 5 commits intomainfrom
andreatgretel/fix/sticky-progress-display
Apr 22, 2026
Merged

fix: prevent sticky progress bar ghost lines from terminal wrapping#565
andreatgretel merged 5 commits intomainfrom
andreatgretel/fix/sticky-progress-display

Conversation

@andreatgretel
Copy link
Copy Markdown
Contributor

📋 Summary

Fixes ghost line accumulation in StickyProgressBar during async dataset generation. When a bar line exceeded the terminal width (due to eta/rate field overflow with slow generation), it wrapped to multiple physical lines but _drawn_lines only counted 1 per bar. Cursor-up clearing missed the wrapped portions, leaving ghost copies that compounded every redraw cycle.

🔗 Related Issue

N/A - reported by Nabin on Slack

🔄 Changes

🐛 Fixed

  • sticky_progress_bar.py: Count physical terminal lines in _redraw() by comparing visible line width (ANSI-stripped) against terminal width, so cursor-up clearing removes all wrapped portions
  • sticky_progress_bar.py: Cap rate display at 9999.9 rec/s and eta at 999s to prevent stats field overflow that triggers wrapping
  • sticky_progress_bar.py: Remove max(10, ...) bar_width floor that forced overflow on narrow terminals; degrade gracefully (stats-only, no visual bar) when terminal is too narrow

✨ Added

  • sticky_progress_bar.py: update_many() method for batch updates with a single clear+redraw cycle instead of N individual redraws
  • async_progress_reporter.py: Use update_many() in _update_bar() to reduce O(N²) redraw overhead to O(N)
  • test_sticky_progress_bar.py: 10 tests covering TTY guard, cursor management, drawn_lines tracking, log interleaving, line wrapping, batch updates, and AsyncReporter integration

🧪 Testing

  • make test passes (engine suite)
  • Unit tests added: 10 tests in test_sticky_progress_bar.py
  • E2E tests: N/A - visual terminal behavior verified manually with pyte terminal emulator

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated: N/A

When a formatted bar line exceeded the terminal width, it wrapped to
multiple physical lines but _drawn_lines only counted 1 per bar.
Subsequent cursor-up clears missed the wrapped portions, leaving ghost
lines that accumulated every redraw cycle.

- Count physical lines in _redraw based on visible width vs terminal width
- Cap rate at 9999.9 and eta at 999s to prevent stats field overflow
- Remove max(10, ...) bar_width floor; degrade gracefully on narrow terminals
- Add update_many() for batch updates with a single redraw cycle
- Use update_many() in AsyncProgressReporter to reduce N redraws to 1
@andreatgretel andreatgretel requested a review from a team as a code owner April 21, 2026 19:27
@github-actions
Copy link
Copy Markdown
Contributor

Code Review: PR #565

Title: fix: prevent sticky progress bar ghost lines from terminal wrapping
Author: @andreatgretel
Files changed: 3 (+207 / -5)

Summary

Fixes a display regression in StickyProgressBar where bar lines that wrapped past the terminal width left ghost copies because _drawn_lines counted 1 per bar regardless of wrapping. The cursor-up clear loop then missed the wrapped physical lines, so each redraw compounded the residue.

The fix has three complementary parts:

  1. Count physical lines in _redraw() by stripping ANSI escapes and comparing visible width against shutil.get_terminal_size().columns, incrementing _drawn_lines by the ceiling of visible / width.
  2. Cap stats-field values (rate ≤ 9999.9 rec/s, eta ≤ 999s) so the stats segment matches its pre-computed stats_width and stops forcing wraps on slow runs.
  3. Graceful narrow-terminal fallback — drop the max(10, …) floor that was forcing overflow; when bar_width < 1, render stats-only truncated to the terminal width.

A batching update_many() method was added and wired into AsyncProgressReporter._update_bar() to collapse per-column redraws into a single clear+redraw cycle. 10 new tests cover TTY gating, cursor management, drawn-line invariants, log interleaving, narrow wrapping, and reporter integration.

Findings

Correctness

  • Wrapping math is correct. (visible + width - 1) // width is proper ceiling division, guarded by width > 0 to avoid division edge cases on detached streams (sticky_progress_bar.py:179-182).
  • ANSI regex scope is adequate. _ANSI_RE only matches SGR (m) sequences, which is fine here because the only ANSI in rendered bar lines comes from the color escapes in _format_bar (:205). Control escapes like \033[A, \033[2K, \033[?25l are written separately, not through _format_bar, so the regex doesn't need to handle them. Worth a one-line comment noting this assumption so a future contributor doesn't accidentally embed a non-SGR escape in _format_bar and silently miscount.
  • Stale _drawn_lines on terminal resize remains a pre-existing limitation (not caused by this PR): if the user resizes the terminal between a redraw and the next clear, the stored line count reflects the old width. Not worth blocking on, but could be worth a follow-up that re-queries shutil.get_terminal_size() in _clear_bars or uses \033[J to clear-to-end.
  • eta still uses uncapped rate in the division. min(remaining / rate, 999) is bounded by the outer min, so no format overflow, but when rate is clamped to 9999.9 the eta is slightly wrong (floor-divided by a smaller denominator). Fine for display — just worth noting that the cap is cosmetic, not numerically faithful.

Style / Project conventions

  • from __future__ import annotations, modern PEP 604/585 typing (dict[str, tuple[int, int, int]], TextIO | None), absolute imports — all consistent with AGENTS.md invariants.
  • Module layout (engine-internal util) respects the interface → engine → config dependency direction.
  • SPDX header present on the new test file.
  • Tuple-typed batch API is terse. dict[str, tuple[int, int, int]] in update_many is opaque at call sites. Given the only caller is async_progress_reporter._update_bar, not a blocker, but a NamedTuple or dataclass (or just keyword kwargs matching update()'s signature) would read better and match the "declare, don't orchestrate" ethos.

Test coverage

  • test_drawn_lines_stable_across_many_updates and test_narrow_terminal_wrapping_counts_physical_lines both assert stability of _drawn_lines, not that the wrap count is numerically correct. The latter would be stronger if it stubbed shutil.get_terminal_size to a small width (e.g., 40) and asserted _drawn_lines == <expected wrapped count>. As written, on a runner with a wide default terminal width (80+), the bars likely don't wrap and the test silently degrades into "1-per-bar holds," not exercising the new code path.
  • test_narrow_terminal_graceful_degradation has no assertions beyond "doesn't crash." Consider asserting that the output line length is ≤ width - 1 and contains the stats snippet but no color escapes.
  • test_update_many_single_redraw asserts CURSOR_UP_CLEAR count == 2, which is exactly the clear for the 2 drawn lines before the single redraw — this is a good behavioral check and cleanly distinguishes from the N-redraw path.
  • No test for the bar_width < 1 truncation branch specifically slicing to width - 1.
  • Tests reach into bar._drawn_lines (private). Acceptable for a utility module's unit tests, but brittle if the internal representation changes.

Performance

  • Replacing N update() calls with one update_many() is the documented O(N²) → O(N) win, and the call site in async_progress_reporter.py:104-108 is clean.
  • Regex strip + ceil-div per bar line is negligible (a few bars per redraw, not in a hot loop).

Security

  • No new I/O, subprocess, or deserialization surface.
  • Regex is a fixed literal, no user input compiled as a pattern.

Verdict

Looks good to merge with two small follow-up suggestions:

  1. Strengthen the wrapping test by monkeypatching shutil.get_terminal_size to return a narrow width and asserting the exact _drawn_lines count — otherwise the wrap-counting code path may not be exercised in CI.
  2. Add an assertion to test_narrow_terminal_graceful_degradation (output length, shape) so it actually guards the degradation path.

Optional polish: one-line comment on _ANSI_RE noting it targets SGR only and assumes control escapes are never embedded in _format_bar output; consider a NamedTuple for the update_many value type.

None of these block the fix, which is a clean, well-scoped correction to a real user-visible bug.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR fixes ghost line accumulation in StickyProgressBar by correctly counting physical terminal lines during wrapping, capping rate/eta to prevent overflow, and removing the max(10, …) bar-width floor. It also adds update_many() to collapse per-column redraws into a single clear+redraw cycle, reducing O(N²) terminal writes to O(N) per report tick. The implementation is correct, the caps are consistent with the _compute_stats_width sample string, and the 10 new tests cover the key invariants thoroughly.

Confidence Score: 5/5

Safe to merge — no P0/P1 issues found; all remaining observations are P2 or lower.

The wrapping-count logic is correct (ceiling division, ANSI-stripped visible width), the rate/eta caps match the stats_width sample string exactly, the narrow-terminal fallback is safe, and update_many() correctly batches state updates before a single redraw. Tests cover all changed code paths. No logic errors or data integrity concerns identified.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/dataset_builders/utils/sticky_progress_bar.py Core fix: counts physical terminal lines via ANSI-stripped visible width, caps rate/eta to prevent overflow, removes bar_width floor, adds update_many() — all correct and well-scoped.
packages/data-designer-engine/src/data_designer/engine/dataset_builders/utils/async_progress_reporter.py Switches _update_bar() from N individual update() calls to a single update_many(), reducing per-record redraw cost from O(N²) to O(N); logic is correct.
packages/data-designer-engine/tests/engine/dataset_builders/utils/test_sticky_progress_bar.py 10 focused tests covering TTY guard, cursor lifecycle, drawn_lines tracking, log interleaving, line wrapping counts, narrow-terminal degradation, batch updates, and AsyncReporter integration.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant AsyncProgressReporter
    participant StickyProgressBar

    Caller->>AsyncProgressReporter: record_success(col)
    AsyncProgressReporter->>AsyncProgressReporter: _maybe_report()
    AsyncProgressReporter->>AsyncProgressReporter: _update_bar() — snapshot all trackers
    AsyncProgressReporter->>StickyProgressBar: update_many({col: (completed, success, failed), …})
    StickyProgressBar->>StickyProgressBar: update all _BarState fields
    StickyProgressBar->>StickyProgressBar: _redraw()
    StickyProgressBar->>StickyProgressBar: _clear_bars() — N cursor-ups (physical lines)
    StickyProgressBar->>StickyProgressBar: _format_bar() × N bars
    Note over StickyProgressBar: visible=len(ANSI-stripped line)<br/>drawn_lines += ceil(visible/width)
    StickyProgressBar-->>Caller: terminal updated once
Loading

Reviews (5): Last reviewed commit: "Merge branch 'main' into andreatgretel/f..." | Re-trigger Greptile

- Monkeypatch shutil.get_terminal_size to force narrow terminal in tests
- Inject oversized lines via _format_bar patch to exercise physical line
  counting (ceiling division) code path
- Assert output line width <= width-1 in graceful degradation test
- Assert _drawn_lines == 1 in degradation mode (no false wrapping)
Address review feedback: use flat pytest functions per DEVELOPMENT.md
conventions instead of class-based test suites. Move inline imports
to module level.

Replace most _drawn_lines and _bars assertions with public output
proxies (counting CURSOR_UP_CLEAR sequences and checking rendered
bar content). Keep _drawn_lines access only where no clean public
proxy exists (multi-checkpoint add/remove test, zero-bars-remaining
after log_final).
Replace _drawn_lines access in tests with a public property,
consistent with the existing is_active pattern.
@andreatgretel andreatgretel merged commit f612822 into main Apr 22, 2026
49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants