Chart performance: lift loop throttle and de-pandas render path #424
Merged
The main event loop's queue.Empty fallback was time.sleep(0.1), pinning every UI module (including the chart) to ~10 Hz max regardless of how fast its update() completed. Swapping to state_utils.sleep_for_framerate raises the awake cap to ~30 Hz, matching what other UI modules already use directly. That triples the ceiling for chart updates (measured 8.9 -> 22.3 Hz effective, about 2.5x, with the current chart render cost on dev hardware).

Also removes the now-redundant 0.2s sleep in PowerManager.update for the asleep state: sleep_for_framerate already sleeps 0.5s when power_state is 0, which is strictly more power-saving than the old 0.3s combined (0.1s loop sleep + 0.2s PowerManager sleep). Net asleep-state change is ~3.3 Hz -> 2 Hz.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
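A minimal sketch of the loop shape this commit describes, not the actual main.py code. SharedState, its power_state() accessor, run_loop, and the injectable sleep parameter are hypothetical stand-ins; the assumption that update() runs once per loop iteration is inferred from the description above. The 1/30 s and 0.5 s values are the ones quoted in the commit message.

```python
import queue
import time
from typing import Callable

AWAKE_SLEEP_S = 1 / 30  # ~30 Hz cap while awake (value quoted in the commit message)
ASLEEP_SLEEP_S = 0.5    # ~2 Hz while asleep (value quoted in the commit message)


class SharedState:
    """Hypothetical stand-in for the real shared-state object."""

    def __init__(self, power_state: int = 1) -> None:
        self._power_state = power_state

    def power_state(self) -> int:
        return self._power_state


def sleep_for_framerate(
    shared_state: SharedState, sleep: Callable[[float], None] = time.sleep
) -> float:
    """Sleep longer when asleep (power_state == 0), shorter when awake."""
    duration = ASLEEP_SLEEP_S if shared_state.power_state() == 0 else AWAKE_SLEEP_S
    sleep(duration)
    return duration


def run_loop(
    events: "queue.Queue",
    shared_state: SharedState,
    update: Callable[[], None],
    iterations: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Drain one event per iteration; throttle only when the queue is empty."""
    handled = 0
    for _ in range(iterations):
        try:
            events.get(block=False)  # a real loop would dispatch this event
            handled += 1
        except queue.Empty:
            # Previously a fixed time.sleep(0.1), which capped the loop at ~10 Hz.
            sleep_for_framerate(shared_state, sleep)
        update()  # assumed: the active UI module updates once per iteration
    return handled
```

Passing a recording function as sleep (for example a list's append method) makes the throttle observable without actually waiting, which is convenient when checking the awake and asleep durations.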
Now that the main event loop calls sleep_for_framerate itself, the per-module sleeps at the top of each update() are a second 1/30 s sleep per iteration, pinning the affected screens to ~13-15 Hz instead of the intended ~30 Hz.

Audited PiFinder/ui/ for sleep_for_framerate and time.sleep(1/30) calls inside update() methods. Removed:
- equipment.py: sleep_for_framerate(self.shared_state)
- gpsstatus.py: state_utils.sleep_for_framerate(self.shared_state)
- sqm.py: sleep_for_framerate(self.shared_state)
- software.py: time.sleep(1 / 30)
- status.py: time.sleep(1 / 30)

Also dropped the now-unused state_utils / sleep_for_framerate / time imports from those files.

Other time.sleep() calls in ui/ are intentional state-waits (camera settling, exposure sweeps, marking-menu flashes, calibration steps) and were left in place.

Measured menu-screen rate: ~13.4 Hz -> ~26.7 Hz (matches the user-reported ~15 fps menu observation; ~2x faster after the fix).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
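The rate ceilings quoted in these commit messages follow from a simple model in which each loop iteration pays every sleep plus the time spent in update(). The sketch below is only that model; effective_hz is a hypothetical helper, not a function in the codebase, and it ignores event handling and scheduler jitter.

```python
def effective_hz(sleeps_s, work_s=0.0):
    """Loop rate when every iteration pays all sleeps plus the work time."""
    return 1.0 / (sum(sleeps_s) + work_s)


FRAME_S = 1 / 30

# Awake, zero-cost update(): the ceilings for each configuration.
old_loop_cap = effective_hz([0.1])                  # ~10 Hz, fixed 0.1 s loop sleep
double_sleep_cap = effective_hz([FRAME_S, FRAME_S])  # ~15 Hz, loop sleep + module sleep
single_sleep_cap = effective_hz([FRAME_S])           # ~30 Hz, loop sleep only

# Asleep: old 0.1 s loop sleep + 0.2 s PowerManager sleep vs. the new 0.5 s.
old_asleep = effective_hz([0.1, 0.2])                # ~3.3 Hz
new_asleep = effective_hz([0.5])                     # 2 Hz

print(round(double_sleep_cap, 1), round(single_sleep_cap, 1))
```

Any real update() cost lowers the rate below the ceiling, which is consistent with the measured ~13.4 Hz sitting just under the 15 Hz double-sleep cap.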
…o_xy)
The chart's per-frame work was dominated by pandas: every frame ran
several DataFrame.assign() chains over the star catalog and the
constellation edges, with each .assign() building a brand-new
DataFrame and each column access dispatching through pandas. cProfile
on release showed ~138 pandas.Series.__init__ calls and 6+ .assign()s
per frame; the math underneath was negligible.
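To illustrate the cost pattern described above, here is a toy comparison, assuming a three-row catalog with made-up column names (x, y) and a plain scaling step in place of the real projection math. It shows that DataFrame.assign() returns a new DataFrame on every call, whereas arrays extracted once can be reused for per-frame arithmetic.

```python
import numpy as np
import pandas as pd

# Toy catalog; the column names and values are illustrative only.
catalog = pd.DataFrame({"x": [0.0, 0.5, -0.5], "y": [0.25, -0.25, 0.0]})


def scale_with_pandas(df, factor):
    # Each .assign() call builds and returns a new DataFrame;
    # the input frame is left untouched.
    return df.assign(x_scaled=df["x"] * factor, y_scaled=df["y"] * factor)


# Extract the columns once (for example at construction time)...
cached_x = catalog["x"].to_numpy()
cached_y = catalog["y"].to_numpy()


def scale_with_numpy(factor):
    # ...so the per-frame math touches only plain arrays.
    return cached_x * factor, cached_y * factor


frame_df = scale_with_pandas(catalog, 2.0)
frame_x, frame_y = scale_with_numpy(2.0)
```

Both paths produce the same numbers; the difference is how many intermediate pandas objects are created along the way.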
This commit moves the per-frame projection / rotation / screen-space /
visibility blocks to numpy arrays cached on the Starfield instance.
Pandas DataFrames are kept only at the boundaries where they buy us
something:
- Star.from_dataframe() is the documented skyfield API for building
a Star object from N rows, so plot_markers / radec_to_xy still
build a tiny DataFrame just for that call.
- render_starfield_pil still returns visible_stars as a DataFrame
(sliced from self.stars at the very end) because align.py treats
it as a pandas object (.iloc, .sort_values, .assign, ra_degrees /
dec_degrees column access).
Changes per function:
Starfield.__init__:
- Cache self._star_magnitudes (numpy) for the per-frame mag filter.
- Drop self.const_edges_df; constellation x/y arrays live as four
separate numpy attributes refreshed each frame.
plot_starfield:
- Project stars + constellation endpoints into numpy arrays
(self._stars_x/y, self._const_sx/sy/ex/ey) instead of writing
columns into the DataFrames.
render_starfield_pil:
- All rotate/screen-space/visibility math runs on numpy arrays.
- Iterate visible edges/stars with np.flatnonzero, not pandas zip.
- Rebuild visible_stars DataFrame at the end via self.stars.iloc[
visible_idx].copy() so align.py keeps its catalog columns.
plot_markers:
- Skyfield call still uses a tiny DataFrame; after .observe(), all
rotation / screen-space / visibility runs in numpy.
- Preserve the pre-existing tautological off-screen-pointer
condition verbatim (separate concern; flagged in a comment).
radec_to_xy:
- Same shape as plot_markers but for a single point; the body is
now scalar numpy/python after the skyfield observe.
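The rotate / screen-space / visibility pattern listed above can be sketched with plain numpy as follows. This is an illustration under stated assumptions, not the code in plot.py: visible_screen_points, the roll_deg and scale parameters, and the y-up screen convention are all hypothetical, and the inputs are assumed to be already-projected coordinates.

```python
import numpy as np


def visible_screen_points(x, y, roll_deg, resolution=128, scale=1.0):
    """Rotate projected points, map them to pixels, and index the visible ones.

    x, y: already-projected coordinates, with [-scale, +scale] spanning the screen.
    Returns (px, py, visible_idx), where visible_idx holds integer positions.
    """
    roll = np.radians(roll_deg)
    cos_r, sin_r = np.cos(roll), np.sin(roll)

    # 2D rotation applied to whole arrays at once.
    xr = x * cos_r - y * sin_r
    yr = x * sin_r + y * cos_r

    # Map to pixel coordinates; y is flipped so +y points up (assumed convention).
    px = (xr / scale + 1.0) * 0.5 * resolution
    py = (1.0 - yr / scale) * 0.5 * resolution

    on_screen = (px >= 0) & (px < resolution) & (py >= 0) & (py < resolution)
    return px, py, np.flatnonzero(on_screen)


x = np.array([0.0, 0.5, 2.0])
y = np.array([0.0, 0.0, 0.0])
px, py, visible_idx = visible_screen_points(x, y, roll_deg=0.0)
```

Because np.flatnonzero returns integer positions, the same index array can drive both the drawing loop and a single positional slice of the catalog at the end, as in the self.stars.iloc[visible_idx].copy() step mentioned above.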
Measured on Mac (release vs after this commit):
plot_starfield total: 3870 us -> 541 us (~7.2x)
plot_markers (3): 1625 us -> 355 us (~4.6x)
full chart frame: 5750 us -> 1084 us (~5.3x)
cProfile per-frame pandas writes: 138 -> 2 (just the two visible_stars
column assigns at the end). The new dominant per-frame cost is
skyfield's projections.py:project() and PIL's ImageChops -- both C
code, at the floor for pure-Python optimization.
Extrapolating with the user-reported ~30x Mac->Pi slowdown that
produced 5 fps charts before, this should land the Pi chart frame at
~30 ms -- within the 33 ms sleep_for_framerate budget.
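The extrapolation above is simple arithmetic and can be checked directly. The sketch below takes the measured Mac timings and the user-reported ~30x slowdown at face value; the constant names are illustrative, and the slowdown factor is an estimate rather than a measurement.

```python
# Mac timings in microseconds, as measured above: (before, after).
TIMINGS_US = {
    "plot_starfield total": (3870, 541),
    "plot_markers (3)": (1625, 355),
    "full chart frame": (5750, 1084),
}
MAC_TO_PI_FACTOR = 30  # user-reported slowdown, taken at face value
TARGET_HZ = 30

speedups = {name: before / after for name, (before, after) in TIMINGS_US.items()}

pi_frame_ms = TIMINGS_US["full chart frame"][1] * MAC_TO_PI_FACTOR / 1000
budget_ms = 1000 / TARGET_HZ

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
print(f"{pi_frame_ms:.1f} ms estimated vs {budget_ms:.1f} ms budget")
```

With these inputs the estimate comes out at about 32.5 ms against a 33.3 ms budget, so the margin is under a millisecond and depends heavily on how accurate the 30x factor is.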
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Three changes that together take the chart from the user-reported ~5 fps on a Pi to an estimated ~30 fps, plus a related 2× win on the main-menu screens. The PR is intentionally three commits so each piece can be reverted independently.
1. b501a62d: Use sleep_for_framerate in the main loop

main.py:560 was time.sleep(0.1) in the queue.Empty fallback, pinning every UI module at ~10 Hz max. Replaced with state_utils.sleep_for_framerate(shared_state) (the same throttle other UI modules already use directly): ~30 Hz when awake, ~2 Hz when asleep.

Also dropped the redundant time.sleep(0.2) from PowerManager.update(): the new throttle already sleeps longer when asleep, so the explicit sleep was unnecessary and slightly less power-saving than the new behavior.

2. 9c9ec779: Remove redundant per-module framerate sleeps

Now that the main loop throttles, the per-module sleep_for_framerate / time.sleep(1/30) calls at the top of various UI update() methods are a second 33 ms sleep per iteration, capping those screens at ~13–15 Hz instead of 30 Hz.

Removed from equipment.py, gpsstatus.py, sqm.py, software.py, status.py. Other time.sleep(...) calls in ui/ (camera settling in calibration, marking-menu flash, exposure-sweep delays, etc.) are intentional state-waits and left in place.

3. 229fa49e: De-pandas the chart hot path

plot.py:plot_starfield / plot_markers / radec_to_xy were doing several DataFrame.assign() chains per frame over the star catalog and the constellation edges. Each .assign() builds a brand-new DataFrame and each column access dispatches through pandas. cProfile showed ~138 pandas.Series.__init__ calls per frame, dominating the per-frame cost.

Per-frame projection / rotation / screen-space / visibility math now runs on numpy arrays cached on the Starfield instance. Pandas DataFrames are kept only at the boundaries where they buy us something:
- Star.from_dataframe() is the documented skyfield API for building Stars in bulk; plot_markers and radec_to_xy still build a tiny DataFrame just for that call.
- render_starfield_pil still returns visible_stars as a DataFrame (sliced from self.stars once at the end) because ui/align.py treats it as a pandas object (.iloc, .sort_values, .assign, plus ra_degrees / dec_degrees column access).

Measured effect (Mac, 128×128 chart at FOV 10.2°)
(figures as reported in the 229fa49e commit message)
plot_starfield total: 3870 us -> 541 us (~7.2x)
plot_markers (3): 1625 us -> 355 us (~4.6x)
full chart frame: 5750 us -> 1084 us (~5.3x)

cProfile after this PR: per-frame pandas writes drop from 138 → 2 (the two visible_stars["x_pos"/"y_pos"] = ... assigns at the end of render_starfield_pil). The remaining dominant per-frame costs are skyfield.projections.project() and PIL's ImageChops, both C code, at the floor for pure-Python optimization.

Pi extrapolation
The user observed ~5 fps chart on the Pi before any of this work, implying ~167 ms/frame on Pi vs ~5.7 ms on Mac (~30× scaling). Applying that factor to the new 1,084 µs Mac figure suggests ~30 ms/frame on Pi, which fits within the 33 ms sleep_for_framerate budget, so the chart should now sustain close to 30 Hz instead of 5 Hz.

Behavioral diffs to be aware of
Wake-from-sleep latency goes from ~300 ms to ~500 ms (one asleep-loop iteration). If that ever feels too slow in practice, the lever is state_utils.sleep_for_framerate's asleep value, but it affects all UI modules, so I left it alone.

Test plan
- nox -s lint: clean
- nox -s format: clean
- nox -s smoke_tests: 2/2 pass
- nox -s unit_tests: 98/98 pass

🤖 Generated with Claude Code