fix: prevent sticky progress bar ghost lines from terminal wrapping by andreatgretel · Pull Request #565 · NVIDIA-NeMo/DataDesigner

andreatgretel · 2026-04-21T19:27:00Z

📋 Summary

Fixes ghost line accumulation in StickyProgressBar during async dataset generation. When a bar line exceeded the terminal width (due to eta/rate field overflow with slow generation), it wrapped to multiple physical lines but _drawn_lines only counted 1 per bar. Cursor-up clearing missed the wrapped portions, leaving ghost copies that compounded every redraw cycle.

🔗 Related Issue

N/A - reported by Nabin on Slack

🔄 Changes

🐛 Fixed

sticky_progress_bar.py: Count physical terminal lines in _redraw() by comparing visible line width (ANSI-stripped) against terminal width, so cursor-up clearing removes all wrapped portions
sticky_progress_bar.py: Cap rate display at 9999.9 rec/s and eta at 999s to prevent stats field overflow that triggers wrapping
sticky_progress_bar.py: Remove max(10, ...) bar_width floor that forced overflow on narrow terminals; degrade gracefully (stats-only, no visual bar) when terminal is too narrow

✨ Added

sticky_progress_bar.py: update_many() method for batch updates with a single clear+redraw cycle instead of N individual redraws
async_progress_reporter.py: Use update_many() in _update_bar() to reduce O(N²) redraw overhead to O(N)
test_sticky_progress_bar.py: 10 tests covering TTY guard, cursor management, drawn_lines tracking, log interleaving, line wrapping, batch updates, and AsyncReporter integration

🧪 Testing

make test passes (engine suite)
Unit tests added: 10 tests in test_sticky_progress_bar.py
E2E tests: N/A - visual terminal behavior verified manually with pyte terminal emulator

✅ Checklist

Follows commit message conventions
Commits are signed off (DCO)
Architecture docs updated: N/A

When a formatted bar line exceeded the terminal width, it wrapped to multiple physical lines but _drawn_lines only counted 1 per bar. Subsequent cursor-up clears missed the wrapped portions, leaving ghost lines that accumulated every redraw cycle. - Count physical lines in _redraw based on visible width vs terminal width - Cap rate at 9999.9 and eta at 999s to prevent stats field overflow - Remove max(10, ...) bar_width floor; degrade gracefully on narrow terminals - Add update_many() for batch updates with a single redraw cycle - Use update_many() in AsyncProgressReporter to reduce N redraws to 1

github-actions · 2026-04-21T19:29:19Z

Code Review: PR #565

Title: fix: prevent sticky progress bar ghost lines from terminal wrapping
Author: @andreatgretel
Files changed: 3 (+207 / -5)

Summary

Fixes a display regression in StickyProgressBar where bar lines that wrapped past the terminal width left ghost copies because _drawn_lines counted 1 per bar regardless of wrapping. The cursor-up clear loop then missed the wrapped physical lines, so each redraw compounded the residue.

The fix has three complementary parts:

Count physical lines in _redraw() by stripping ANSI escapes and comparing visible width against shutil.get_terminal_size().columns, incrementing _drawn_lines by the ceiling of visible / width.
Cap stats-field values (rate ≤ 9999.9 rec/s, eta ≤ 999s) so the stats segment matches its pre-computed stats_width and stops forcing wraps on slow runs.
Graceful narrow-terminal fallback — drop the max(10, …) floor that was forcing overflow; when bar_width < 1, render stats-only truncated to the terminal width.

A batching update_many() method was added and wired into AsyncProgressReporter._update_bar() to collapse per-column redraws into a single clear+redraw cycle. 10 new tests cover TTY gating, cursor management, drawn-line invariants, log interleaving, narrow wrapping, and reporter integration.

Findings

Correctness

Wrapping math is correct. (visible + width - 1) // width is proper ceiling division, guarded by width > 0 to avoid division edge cases on detached streams (sticky_progress_bar.py:179-182).
ANSI regex scope is adequate. _ANSI_RE only matches SGR (m) sequences, which is fine here because the only ANSI in rendered bar lines comes from the color escapes in _format_bar (:205). Control escapes like \033[A, \033[2K, \033[?25l are written separately, not through _format_bar, so the regex doesn't need to handle them. Worth a one-line comment noting this assumption so a future contributor doesn't accidentally embed a non-SGR escape in _format_bar and silently miscount.
Stale _drawn_lines on terminal resize remains a pre-existing limitation (not caused by this PR): if the user resizes the terminal between a redraw and the next clear, the stored line count reflects the old width. Not worth blocking on, but could be worth a follow-up that re-queries shutil.get_terminal_size() in _clear_bars or uses \033[J to clear-to-end.
eta still uses uncapped rate in the division. min(remaining / rate, 999) is bounded by the outer min, so no format overflow, but when rate is clamped to 9999.9 the eta is slightly wrong (floor-divided by a smaller denominator). Fine for display — just worth noting that the cap is cosmetic, not numerically faithful.

Style / Project conventions

from __future__ import annotations, modern PEP 604/585 typing (dict[str, tuple[int, int, int]], TextIO | None), absolute imports — all consistent with AGENTS.md invariants.
Module layout (engine-internal util) respects the interface → engine → config dependency direction.
SPDX header present on the new test file.
Tuple-typed batch API is terse. dict[str, tuple[int, int, int]] in update_many is opaque at call sites. Given the only caller is async_progress_reporter._update_bar, not a blocker, but a NamedTuple or dataclass (or just keyword kwargs matching update()'s signature) would read better and match the "declare, don't orchestrate" ethos.

Test coverage

test_drawn_lines_stable_across_many_updates and test_narrow_terminal_wrapping_counts_physical_lines both assert stability of _drawn_lines, not that the wrap count is numerically correct. The latter would be stronger if it stubbed shutil.get_terminal_size to a small width (e.g., 40) and asserted _drawn_lines == <expected wrapped count>. As written, on a runner with a wide default terminal width (80+), the bars likely don't wrap and the test silently degrades into "1-per-bar holds," not exercising the new code path.
test_narrow_terminal_graceful_degradation has no assertions beyond "doesn't crash." Consider asserting that the output line length is ≤ width - 1 and contains the stats snippet but no color escapes.
test_update_many_single_redraw asserts CURSOR_UP_CLEAR count == 2, which is exactly the clear for the 2 drawn lines before the single redraw — this is a good behavioral check and cleanly distinguishes from the N-redraw path.
No test for the bar_width < 1 truncation branch specifically slicing to width - 1.
Tests reach into bar._drawn_lines (private). Acceptable for a utility module's unit tests, but brittle if the internal representation changes.

Performance

Replacing N update() calls with one update_many() is the documented O(N²) → O(N) win, and the call site in async_progress_reporter.py:104-108 is clean.
Regex strip + ceil-div per bar line is negligible (a few bars per redraw, not in a hot loop).

Security

No new I/O, subprocess, or deserialization surface.
Regex is a fixed literal, no user input compiled as a pattern.

Verdict

Looks good to merge with two small follow-up suggestions:

Strengthen the wrapping test by monkeypatching shutil.get_terminal_size to return a narrow width and asserting the exact _drawn_lines count — otherwise the wrap-counting code path may not be exercised in CI.
Add an assertion to test_narrow_terminal_graceful_degradation (output length, shape) so it actually guards the degradation path.

Optional polish: one-line comment on _ANSI_RE noting it targets SGR only and assumes control escapes are never embedded in _format_bar output; consider a NamedTuple for the update_many value type.

None of these block the fix, which is a clean, well-scoped correction to a real user-visible bug.

greptile-apps · 2026-04-21T19:30:29Z

Greptile Summary

This PR fixes ghost line accumulation in StickyProgressBar by correctly counting physical terminal lines during wrapping, capping rate/eta to prevent overflow, and removing the max(10, …) bar-width floor. It also adds update_many() to collapse per-column redraws into a single clear+redraw cycle, reducing O(N²) terminal writes to O(N) per report tick. The implementation is correct, the caps are consistent with the _compute_stats_width sample string, and the 10 new tests cover the key invariants thoroughly.

Confidence Score: 5/5

Safe to merge — no P0/P1 issues found; all remaining observations are P2 or lower.

The wrapping-count logic is correct (ceiling division, ANSI-stripped visible width), the rate/eta caps match the stats_width sample string exactly, the narrow-terminal fallback is safe, and update_many() correctly batches state updates before a single redraw. Tests cover all changed code paths. No logic errors or data integrity concerns identified.

No files require special attention.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/dataset_builders/utils/sticky_progress_bar.py	Core fix: counts physical terminal lines via ANSI-stripped visible width, caps rate/eta to prevent overflow, removes bar_width floor, adds update_many() — all correct and well-scoped.
packages/data-designer-engine/src/data_designer/engine/dataset_builders/utils/async_progress_reporter.py	Switches _update_bar() from N individual update() calls to a single update_many(), reducing per-record redraw cost from O(N²) to O(N); logic is correct.
packages/data-designer-engine/tests/engine/dataset_builders/utils/test_sticky_progress_bar.py	10 focused tests covering TTY guard, cursor lifecycle, drawn_lines tracking, log interleaving, line wrapping counts, narrow-terminal degradation, batch updates, and AsyncReporter integration.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant AsyncProgressReporter
    participant StickyProgressBar

    Caller->>AsyncProgressReporter: record_success(col)
    AsyncProgressReporter->>AsyncProgressReporter: _maybe_report()
    AsyncProgressReporter->>AsyncProgressReporter: _update_bar() — snapshot all trackers
    AsyncProgressReporter->>StickyProgressBar: update_many({col: (completed, success, failed), …})
    StickyProgressBar->>StickyProgressBar: update all _BarState fields
    StickyProgressBar->>StickyProgressBar: _redraw()
    StickyProgressBar->>StickyProgressBar: _clear_bars() — N cursor-ups (physical lines)
    StickyProgressBar->>StickyProgressBar: _format_bar() × N bars
    Note over StickyProgressBar: visible=len(ANSI-stripped line)<br/>drawn_lines += ceil(visible/width)
    StickyProgressBar-->>Caller: terminal updated once

_{Reviews (5): Last reviewed commit: "Merge branch 'main' into andreatgretel/f..." | Re-trigger Greptile}

- Monkeypatch shutil.get_terminal_size to force narrow terminal in tests - Inject oversized lines via _format_bar patch to exercise physical line counting (ceiling division) code path - Assert output line width <= width-1 in graceful degradation test - Assert _drawn_lines == 1 in degradation mode (no false wrapping)

Address review feedback: use flat pytest functions per DEVELOPMENT.md conventions instead of class-based test suites. Move inline imports to module level. Replace most _drawn_lines and _bars assertions with public output proxies (counting CURSOR_UP_CLEAR sequences and checking rendered bar content). Keep _drawn_lines access only where no clean public proxy exists (multi-checkpoint add/remove test, zero-bars-remaining after log_final).

Replace _drawn_lines access in tests with a public property, consistent with the existing is_active pattern.

andreatgretel requested a review from a team as a code owner April 21, 2026 19:27

andreatgretel temporarily deployed to agentic-ci April 21, 2026 19:27 — with GitHub Actions Inactive

nabinchha reviewed Apr 21, 2026

View reviewed changes

Comment thread packages/data-designer-engine/tests/engine/dataset_builders/utils/test_sticky_progress_bar.py Outdated

nabinchha reviewed Apr 22, 2026

View reviewed changes

Comment thread packages/data-designer-engine/tests/engine/dataset_builders/utils/test_sticky_progress_bar.py Outdated

refactor: expose drawn_lines as public read-only property

b521b3f

Replace _drawn_lines access in tests with a public property, consistent with the existing is_active pattern.

nabinchha approved these changes Apr 22, 2026

View reviewed changes

Merge branch 'main' into andreatgretel/fix/sticky-progress-display

867224b

andreatgretel merged commit f612822 into main Apr 22, 2026
49 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prevent sticky progress bar ghost lines from terminal wrapping#565

fix: prevent sticky progress bar ghost lines from terminal wrapping#565
andreatgretel merged 5 commits intomainfrom
andreatgretel/fix/sticky-progress-display

andreatgretel commented Apr 21, 2026

Uh oh!

github-actions Bot commented Apr 21, 2026

Uh oh!

greptile-apps Bot commented Apr 21, 2026 •

edited

Loading

Confidence Score: 5/5

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andreatgretel commented Apr 21, 2026

📋 Summary

🔗 Related Issue

🔄 Changes

🐛 Fixed

✨ Added

🧪 Testing

✅ Checklist

Uh oh!

github-actions Bot commented Apr 21, 2026

Code Review: PR #565

Summary

Findings

Correctness

Style / Project conventions

Test coverage

Performance

Security

Verdict

Uh oh!

greptile-apps Bot commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented Apr 21, 2026 •

edited

Loading