Skip to content

feat: torque-inertia profiler & optimal P estimation (squash-rebase onto #143)#145

Merged
nerdCopter merged 18 commits into
masterfrom
feature/tip-squash-rebase
May 21, 2026
Merged

feat: torque-inertia profiler & optimal P estimation (squash-rebase onto #143)#145
nerdCopter merged 18 commits into
masterfrom
feature/tip-squash-rebase

Conversation

@nerdCopter
Copy link
Copy Markdown
Owner

@nerdCopter nerdCopter commented May 20, 2026

AI Generated Pull-request


Binaries: https://github.com/nerdCopter/BlackBox_CSV_Render/actions/runs/26246804179?pr=145


Summary

Physics-aware optimal P gain estimation added as an experimental optional feature (--estimate-optimal-p), complementing the existing P:D ratio recommendations. Inspired by Brian White's (@bw1129, PIDtoolbox) insight that optimal response timing is aircraft-specific and governed by the power-to-rotational-inertia ratio. Where PIDtoolbox derives targets from known prop size, this implementation measures the torque-inertia ratio directly from throttle-punch events in the log — no prop-size input required.

Clean squash-rebase replacement for the closed #141 (84-commit branch → single diff rebased onto master at 56c4aa4 post-#143), with all subsequent fixups applied on top.


Lineage: #132#141#145

#132 is the original development PR for this feature. It carries the full pre-squash commit history, all CodeRabbit AI review passes, pilot testing, binary test artifacts, and the iterative change of functional direction — from a prop-size-aware empirical table approach to the current log-derived torque-inertia profiler. #132 is the primary reference for the reasoning behind each design decision. It was kept open throughout development as a working branch and was subsequently superseded by the squash-rebase.

#141 was the first squash attempt, closed in favour of this PR when the branch was found to still carry excess commits.

This PR (#145) is the clean single-diff squash of all #132 work, rebased onto master, with all post-merge fixups applied on top.


Feature: Torque-Inertia Profiler & Optimal P Estimation

How it works

Phase 1 — Aircraft profiling (per group, before per-file processing)

  • Logs are grouped by aircraft key: extract_aircraft_key() strips _YYYYMMDD_HHMMSS timestamps so sessions from the same aircraft share one profile. When the filename prefix ends with BLACKBOX_LOG (Betaflight/EmuFlight generic naming), the craft name from the post-date suffix is appended (BTFL_BLACKBOX_LOG_CRAFTNAME, EMUF_BLACKBOX_LOG_CRAFTNAME) to prevent distinct aircraft collapsing into one group.
  • extract_punch_ratios() detects throttle-punch events (setpoint[3] step ≥ THROTTLE_PUNCH_MIN_DELTA within THROTTLE_PUNCH_WINDOW_MS) and measures peak angular acceleration in the gyro response window, divided by the normalised throttle delta → torque_inertia_ratio per axis.
  • Ratios are aggregated into AircraftProfile (median + half-IQR spread, Roll and Pitch only). Yaw is collected for diagnostic completeness but excluded from optimal-P analysis — Yaw dynamics differ structurally from Roll/Pitch and the formula is not calibrated for Yaw.
  • Requires ≥ TORQUE_PROFILER_MIN_EVENTS punch events; insufficient events produce a skip reason in both console and PNG.

Phase 2 — Per-file optimal-P analysis

  • Physics formula: Td_ms = TORQUE_PROFILER_TD_CALC_K / sqrt((P / TORQUE_PROFILER_P_SCALE) × torque_inertia_ratio) × TORQUE_PROFILER_ACHIEVABILITY_FACTOR
  • TORQUE_PROFILER_ACHIEVABILITY_FACTOR bridges the physics formula to real-world flight (ESC lag, motor startup, prop wash); empirically calibrated on a 5" 6S freestyle build.
  • Individual Td samples are collected from all valid step-response windows and compared against the physics-derived target.
  • HF D-term spectral energy informs whether noise limits further P increase.
  • OptimalPAnalysis classifies as Increase, Optimal, Decrease, or Investigate.

Requirements

  • A .headers.csv metadata file must be present alongside each input CSV (produced by blackbox_decode or the modern alternative bbl_parser). Without it, P gain values are unavailable and optimal P estimation is skipped with a skip reason shown in console and PNG. All other analyses (step response, spectrums, P:D recommendations) remain unaffected.

Output (console + PNG legend overlay)

  • Optimal P (Experimental, log-derived) section header — always the same label whether analysis succeeded or was skipped
  • Td mean with windows= count, physics-derived target, deviation %
  • Noise level (LOW / MODERATE / HIGH)
  • Consistency % — always shown; orange with CV value and threshold when poor (> TD_COEFFICIENT_OF_VARIATION_MAX)
  • [LOW AUTHORITY] warning when max setpoint across valid windows is below LOW_AUTHORITY_SETPOINT_THRESHOLD_DEG_S (hover/slow-cruise logs produce noise-dominated Td measurements, making recommendations unreliable regardless of CV)
  • P recommendation with calculated D adjustment — D falls back to current P/D ratio when step response is already optimal (no P:D change needed), so D is always shown when D gain is known
  • Skip reason when profiling could not complete

Relationship to existing P:D recommendations

The two features are complementary and appear simultaneously:

  • P:D recommendations (existing): peak overshoot → adjust D relative to P
  • Optimal P (new): response timing (Td) → adjust P magnitude

Fixes & Refinements Applied Post-Rebase

  • BTFL/EMUF craft name grouping: extract_aircraft_key() now appends craft name for BLACKBOX_LOG-prefixed files so logs from different aircraft don't collapse into one profile
  • Settle window time-based: replaced TORQUE_PROFILER_SETTLE_SAMPLES (a fixed sample count, correct only at 1 kHz) with TORQUE_PROFILER_SETTLE_MS = 5.0 converted to samples at runtime from the actual log sample rate. At high loop rates (BMI270 3.2 kHz, 4 kHz, 8 kHz) the old constant measured before ESCs responded, underestimating torque_inertia_ratio and biasing recommendations. Behaviour at 1 kHz is unchanged.
  • CodeRabbit review items: array sizes use AXIS_COUNT constant throughout; magic 1000.0 divisors replaced with THROTTLE_COMMAND_SCALE and OPTIMAL_P_SECONDS_TO_MS_MULTIPLIER; 1e-9 dt guard replaced with TORQUE_PROFILER_MIN_DT_S; response window loop off-by-one corrected
  • Orphaned test file removed: tests_optimal_p.rs covered FrameClass / TdTargetSpec / for_frame_inches — all removed with the prop-size approach; file was never declared in mod.rs and never compiled
  • Recommendation label verbiage: normalised throughout plot_step_response.rs to match console ("Recommendation (conservative):" / "Recommendation (moderate):")
  • Duplicate legend entry removed: "Recommendation: None (Optimal)" no longer appears alongside "Recommendation (none): No obvious tuning adjustments needed" for optimal-response axes
  • PNG skip reason: shown even when analysis was not performed (not only on success)
  • Uniform PNG section label: skip and success branches both use "Optimal P (Experimental, log-derived)" — the skip-reason line already explains the outcome
  • Consistency always shown in PNG: Consistency: X% emitted for every result (grey when good, orange with CV and threshold when poor), matching console parity
  • windows= replaces n=: more intuitive for pilots — number of valid step-response windows contributing to the Td mean
  • CV threshold in warning: unreliable (>40%) where 40 is derived from TD_COEFFICIENT_OF_VARIATION_MAX, so the displayed value tracks the constant

Docs

  • OVERVIEW.md: Torque-Inertia Profiler section with aircraft grouping, key constants, recommendation types, CV/consistency/low-authority reliability interpretation guide, and output signal summary
  • OVERVIEW.md: Windowing section clarified — FRAME_LENGTH_S (deconvolution resolution) is intentionally longer than RESPONSE_LENGTH_S (display/analysis); only the first RESPONSE_LENGTH_S of each estimated step response is retained
  • OVERVIEW.md: Over-P limitation — when P is too high and the craft oscillates, the profiler may report "Optimal" because the oscillatory step response produces a short Td that satisfies the physics formula for the current (excessive) P. High CV without [LOW AUTHORITY] is the indirect signal; experienced pilots should start from a lower P if gains feel high
  • OVERVIEW.md: High CV without [LOW AUTHORITY] documented as an investigative signal — rules out hover/low-energy inputs as the scatter cause; remaining causes include propwash, inconsistent maneuvers, and oscillation from over-P
  • OVERVIEW.md + README.md + --help: .headers.csv requirement documented — P gain is unavailable without it; skip reason is shown; all other analyses unaffected
  • README.md: --prop-size removed (no longer exists); --estimate-optimal-p description and examples updated; optimal P added to console output section

Remaining Open Items (post-merge)


Test Plan

  • cargo build --release passes cleanly
  • cargo clippy zero warnings
  • Run --step --estimate-optimal-p on logs with deliberate throttle-punch sequences (≥ TORQUE_PROFILER_MIN_EVENTS per aircraft group)
  • Verify PNG and console show matching stats: Td, windows=, noise, consistency %, LOW AUTHORITY (when applicable), recommendation with D
  • Verify PNG consistency line always present (grey = good, orange = poor with CV and threshold)
  • Verify skip reason appears in both PNG and console when punch events are insufficient
  • Verify BTFL/EMUF files with craft name after timestamp group correctly per aircraft
  • Verify no duplicate "none" recommendation on optimal-response axes
  • Verify .headers.csv absent → skip reason shown in console and PNG, all other plots unaffected

🤖 Generated with Claude Code

nerdCopter and others added 4 commits May 20, 2026 14:18
…ctor (squash)

Squash of all feature/torque-inertia-profiler commits for clean rebase onto master.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- extract_aircraft_key: include craft name from post-date suffix when
  prefix ends with BLACKBOX_LOG (fixes BTFL files where craft name
  follows _YYYYMMDD_HHMMSS grouping all into BTFL_BLACKBOX_LOG)
- TORQUE_PROFILER_SETTLE_SAMPLES: 3 → 5 samples to better cover
  ESC/motor lag at 1 kHz (5 ms vs 3 ms)
- TORQUE_PROFILER_ACHIEVABILITY_FACTOR: document empirical calibration
  origin (5" 6S freestyle / HELIO H7)
- extract_punch_ratios: comment that Yaw (axis 2) is collected for
  diagnostics but excluded from optimal-P analysis

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All three labels now use consistent "Recommendation (X):" form matching
master: conservative, moderate, and aggressive — eliminating the
"Conservative recommendation:" / "Moderate recommendation:" variants
that leaked in from the feature branch during conflict resolution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When step response is optimal the conservative block already emits
'Recommendation (none): No obvious tuning adjustments needed'.
The moderate else-if fallback then emitted a second redundant
'Recommendation: None (Optimal)' entry. Remove the redundant block;
the conservative branch always covers the no-recommendation case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

This PR introduces a physics-based optimal P gain recommendation system that derives target time-to-50% (Td) from torque-inertia ratios measured in throttle-punch flight dynamics. It implements two-phase aircraft grouping and per-file analysis, adds extensive tuning constants, refactors the plot infrastructure with structured configurations, and integrates an experimental --estimate-optimal-p CLI mode.

Changes

Optimal P Estimation with Physics-Derived Td Targets

Layer / File(s) Summary
Feature Documentation
OVERVIEW.md, README.md
Documentation for the new --estimate-optimal-p CLI option, aircraft grouping design, profiling phases, classification outputs, and windowing clarifications.
Tuning Constants & Axis Infrastructure
src/constants.rs, src/axis_names.rs
Defines low-authority setpoint threshold, Td QC thresholds, P headroom/reduction multipliers, torque-inertia profiler constants, and refactors axis name lookup with compile-time invariants. Introduces ROLL_PITCH_AXIS_COUNT.
Torque-Inertia Profiling & Physics Models
src/data_analysis/torque_inertia_profiler.rs
Implements AxisProfile and AircraftProfile for aggregating per-axis torque ratios from throttle-punch detection. Provides extract_punch_ratios to scan logs, compute peak angular acceleration, derive torque-inertia ratios, and enable Td target derivation.
Optimal P Estimation & Classification Logic
src/data_analysis/optimal_p_estimation.rs
Core analysis engine: TdStatistics computes mean and coefficient-of-variation consistency; classification enums (NoiseLevel, TdDeviation) and PRecommendation enum; OptimalPAnalysis::analyze validates targets, classifies deviation/noise, and generates recommendations with human-readable console output.
High-Frequency Energy Ratio for Noise Classification
src/data_analysis/spectral_analysis.rs
Adds calculate_hf_energy_ratio to compute high-frequency spectral energy above a cutoff using Welch PSD, enabling optional HF noise assessment in optimal P analysis.
Two-Phase Aircraft Grouping & Analysis Orchestration
src/main.rs (partial)
Introduces AnalysisOptions struct, aircraft key extraction, file grouping by aircraft, optional group-level profiling, refactored process_file signature, and optimal P estimation analysis block with Td sample extraction, HF energy computation, and result collection.
Plot Infrastructure & Optimal P Legend Integration
src/plot_functions/plot_step_response.rs, src/main.rs (plot wiring)
Introduces structured configuration types (CurrentPeakAndRatios, PlotDisplayConfig, OptimalPConfig, PdRecommendations variants) replacing many discrete parameters. Refactors signature, computes combined response once for reuse, adds low-authority detection, and expands optimal-P legend section with Td, deviation, noise, consistency, recommendation summary, or skip-reason blocks.
Unit Tests for Optimal P Estimation
src/data_analysis/tests_optimal_p.rs
Tests validating Td target derivation from frame sizes and FrameClass variants.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The PR introduces substantial new functionality across multiple modules (torque-inertia profiling, optimal P analysis, two-phase orchestration, plot refactoring) with heterogeneous logic density. Key review areas: physics constant validity, Td statistics and classification correctness, aircraft grouping logic, HF energy computation, plot legend rendering with optional overlays, and main.rs pipeline orchestration. The refactored plot signature and structured configs add surface complexity but improve maintainability.

Possibly related PRs

  • nerdCopter/BlackBox_CSV_Render#141: Directly related—both transition optimal P from empirical tables to physics-derived torque-inertia models with achievability factor and skip-reason reporting; this PR implements the full feature.
  • nerdCopter/BlackBox_CSV_Render#143: Both modify step-response constants, P headroom/reduction multipliers, and plot_step_response.rs legend/recommendation rendering logic.
  • nerdCopter/BlackBox_CSV_Render#144: Directly conflicts—PR #144 reverts the entire --estimate-optimal-p feature including this PR's optimal_p_estimation module, CLI option, constants, and plot sections.

Suggested labels

enhancement, DO NOT MERGE

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the primary feature: torque-inertia profiler & optimal P estimation, and accurately reflects the main code changes.
Linked Issues check ✅ Passed The PR implements all coding objectives from issue #141: physics-derived Td targets, torque-inertia profiling, achievability factor, skip-reason reporting on PNG, and autonomous target derivation without hardcoded tables.
Out of Scope Changes check ✅ Passed All code changes are directly related to implementing the torque-inertia profiler and optimal P estimation feature; refactorings like plot restructuring support the core objective.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/tip-squash-rebase

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

nerdCopter and others added 7 commits May 20, 2026 15:22
Replace the stale prop-size / TD_TARGETS table section with accurate
documentation of the new architecture: two-phase profiler (punch-event
detection → aircraft grouping → OptimalPAnalysis), key constants, the
aircraft grouping fix for generic BTFL filenames, and the relationship
between P:D recommendations and optimal-P estimation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ec; doc cleanup

- plot_step_response: show optimal-P section (including skip reason) whenever
  estimate_optimal_p was requested, not only when a full analysis succeeded.
  Fixes blank PNG legend when profiler data is insufficient.
- optimal_p_estimation + plot_step_response: fall back to current P:D when
  recommended_pd_conservative is None (optimal step response) so D is always
  shown alongside the recommended P, not only when overshoot correction exists.
- OVERVIEW.md: remove history/change language; replace constant values with
  constant names; clarify Yaw exclusion rationale and skip-reason output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ount, Experimental label, skip in PNG

- Add LOW_AUTHORITY_SETPOINT_THRESHOLD_DEG_S constant and detect hover/slow-cruise logs where
  max setpoint < threshold; print warning in both PNG legend and console optimal-P section
- Add n= sample count to Td line in PNG and console for statistical-weight context
- Label Optimal P section as "Optimal P (Experimental, log-derived)" in PNG
- Show skip reason in PNG even when analysis was not performed (not just on success)
- D recommendation falls back to current P/D ratio when step response is optimal (no P:D
  change computed), ensuring D is always shown alongside optimal-P recommendation
- OVERVIEW.md: add CV/consistency/low-authority reliability interpretation section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rity

- PNG skip branch: use same 'Optimal P (Experimental, log-derived)' label as
  success branch — skip reason line already conveys the skip; no duplicate marker needed
- PNG consistency: always emit Consistency: X% line (orange with CV when poor,
  grey when good) so PNG matches console parity; previously only warning shown
- Rename n= → windows= in PNG and console Td line — more intuitive for pilots
  (windows = valid step-response windows contributing to the Td mean)
- OVERVIEW.md: update windows= description accordingly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both console and PNG now report: 'unreliable (>40%)' where 40 is derived
from TD_COEFFICIENT_OF_VARIATION_MAX so the displayed value tracks the
constant without manual updates. Verbiage aligned between outputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
README:
- Remove --prop-size (no longer exists; torque-inertia ratio is log-derived)
- Update --estimate-optimal-p description to match actual CLI
- Remove --prop-size from example command
- Add optimal P estimation to console output section

OVERVIEW:
- Windowing section: clarify FRAME_LENGTH_S vs RESPONSE_LENGTH_S design
  (longer deconvolution window for frequency resolution; only first
  RESPONSE_LENGTH_S retained for display and analysis)
- Optimal P Output bullet: enumerate all current dependability signals
  (windows=, consistency %, LOW AUTHORITY warning, skip reason)
  and reference the CV/reliability section for details

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove '(no prop-size required)' — the flag description stands on its own.
Trim console output bullet to remove redundant LOW AUTHORITY qualifier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
src/main.rs (1)

467-468: ⚡ Quick win

Replace the new per-axis 3s with AXIS_COUNT.

These arrays are part of the axis pipeline, so hardcoding 3 here can drift from src/axis_names.rs and the rest of the codebase.

Proposed cleanup
+use crate::axis_names::AXIS_COUNT;
 use crate::data_analysis::torque_inertia_profiler::{extract_punch_ratios, AircraftProfile};

 fn profile_aircraft_group(files: &[String], debug_mode: bool) -> AircraftProfile {
-    let mut all_axis_ratios: [Vec<f64>; 3] = [Vec::new(), Vec::new(), Vec::new()];
+    let mut all_axis_ratios: [Vec<f64>; AXIS_COUNT] = std::array::from_fn(|_| Vec::new());
     let mut files_profiled: usize = 0;
-    let mut optimal_p_analyses: [Option<
-        crate::data_analysis::optimal_p_estimation::OptimalPAnalysis,
-    >; 3] = [None, None, None];
-    let mut optimal_p_skip_reasons: [Option<String>; 3] = [None, None, None];
+    let mut optimal_p_analyses: [Option<
+        crate::data_analysis::optimal_p_estimation::OptimalPAnalysis,
+    >; AXIS_COUNT] = std::array::from_fn(|_| None);
+    let mut optimal_p_skip_reasons: [Option<String>; AXIS_COUNT] =
+        std::array::from_fn(|_| None);

As per coding guidelines, All constants go in src/constants.rs — no hardcoded magic numbers in function code and Axis indices (0=Roll, 1=Pitch, 2=Yaw) must use the src/axis_names.rs module with AXIS_COUNT, AXIS_NAMES, and axis_name() function instead of creating redundant axis index constants.

Also applies to: 1120-1123

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main.rs` around lines 467 - 468, Replace hardcoded axis-size literals
with the shared AXIS_COUNT constant: in the function profile_aircraft_group
(where all_axis_ratios is declared) change [Vec::new(); 3] / [Vec::new(),
Vec::new(), Vec::new()] style initializations to use AXIS_COUNT (e.g., using
iterators or array initialization sized by AXIS_COUNT) so the array length
matches the axis_names module; do the same for the other similar declarations
referenced in the review (the other per-axis arrays around the later block) and
ensure code imports/uses AXIS_COUNT (and axis_name()/AXIS_NAMES where
appropriate) instead of any hardcoded 3.
src/plot_functions/plot_step_response.rs (1)

576-744: ⚡ Quick win

Hoist the new legend colors into src/constants.rs.

The experimental Optimal P legend adds several fresh inline color literals. Keeping these named in one place will match the repo rule and make the new section easier to tune consistently.

As per coding guidelines, All constants go in src/constants.rs — no hardcoded magic numbers in function code.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/plot_functions/plot_step_response.rs` around lines 576 - 744, The new
legend uses several inline RGBColor literals (e.g., RGBColor(40,40,40),
RGBColor(0,100,200), RGBColor(80,80,80), RGBColor(200,100,0), RGBColor(0,150,0),
RGBColor(120,60,60)) inside the plot_step_response rendering (around the Optimal
P header, Td lines, consistency warning, recommendation, and skip-reason); hoist
these into the central constants module as named constants (for example
OPTIMAL_SECTION_DIVIDER_COLOR, OPTIMAL_P_HEADER_COLOR, LEGEND_TEXT_COLOR,
CONSISTENCY_WARNING_COLOR, RECOMMENDATION_COLOR, SKIP_REASON_COLOR), export them
from the constants module, and replace the inline RGBColor(...) usages in the
PlotSeries constructions (and any related helper closures like
append_d_recommendation if they reference colors) with the new named constants
so the palette is centralized and tunable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/data_analysis/optimal_p_estimation.rs`:
- Around line 145-156: The function from_samples currently computes stats over
td_samples_ms without rejecting non-finite values; filter td_samples_ms for
finite f64 values (use f64::is_finite) and operate on that filtered
slice/iterator, returning None if the filtered set is empty, and then compute n,
mean and downstream statistics from the filtered values; apply the same
finite-value filtering fix to the other similar blocks referenced (around the
regions corresponding to lines 163-170 and 269-280) so no NaN/±inf can propagate
into mean/deviation/classification logic.

In `@src/data_analysis/torque_inertia_profiler.rs`:
- Around line 113-114: The struct field and hot-path code use hardcoded
axis/count and numeric literals (e.g., the array size [AxisProfile; 3] and magic
numbers 1000.0 and 1e-9); replace the literal 3 with AXIS_COUNT from
src/axis_names.rs and switch any manual axis-index usage to use
axis_name()/AXIS_NAMES as appropriate, and move numeric literals into
src/constants.rs (e.g., TIME_SCALE_MS, NANOSECONDS_TO_SECONDS or the existing
constant names used project-wide) and reference those constants in the profiler
computations (including the array declaration, loops, and any
scaling/multiplication operations). Update references around symbols AxisProfile
and the profiling functions that do scaling so they import and use AXIS_COUNT
and the constants instead of literal numbers.
- Line 200: The for-loop range resp_start..resp_end.saturating_sub(1)
incorrectly excludes the final adjacent sample pair; change the loop to iterate
up to resp_end - 1 inclusive so the last j/j+1 pair is processed (e.g. use an
inclusive range resp_start..=resp_end.saturating_sub(1) or otherwise compute the
end index as resp_end.saturating_sub(1) and use a standard exclusive range that
reaches that value) — update the loop that references resp_start and resp_end in
torque_inertia_profiler.rs accordingly.

In `@src/main.rs`:
- Around line 1260-1262: The stdout-only branch that prints "P gain not
available..." must also persist that reason into optimal_p_skip_reasons so the
PNG overlay legend can show it; update the else branch handling for
axis_name/axis_index to assign optimal_p_skip_reasons[axis_index] =
Some(format!("P gain not available for {}", axis_name)) (or similar descriptive
string) instead of leaving it None, ensuring any existing Option<String> is set
so the plotting code can render the skip reason.

---

Nitpick comments:
In `@src/main.rs`:
- Around line 467-468: Replace hardcoded axis-size literals with the shared
AXIS_COUNT constant: in the function profile_aircraft_group (where
all_axis_ratios is declared) change [Vec::new(); 3] / [Vec::new(), Vec::new(),
Vec::new()] style initializations to use AXIS_COUNT (e.g., using iterators or
array initialization sized by AXIS_COUNT) so the array length matches the
axis_names module; do the same for the other similar declarations referenced in
the review (the other per-axis arrays around the later block) and ensure code
imports/uses AXIS_COUNT (and axis_name()/AXIS_NAMES where appropriate) instead
of any hardcoded 3.

In `@src/plot_functions/plot_step_response.rs`:
- Around line 576-744: The new legend uses several inline RGBColor literals
(e.g., RGBColor(40,40,40), RGBColor(0,100,200), RGBColor(80,80,80),
RGBColor(200,100,0), RGBColor(0,150,0), RGBColor(120,60,60)) inside the
plot_step_response rendering (around the Optimal P header, Td lines, consistency
warning, recommendation, and skip-reason); hoist these into the central
constants module as named constants (for example OPTIMAL_SECTION_DIVIDER_COLOR,
OPTIMAL_P_HEADER_COLOR, LEGEND_TEXT_COLOR, CONSISTENCY_WARNING_COLOR,
RECOMMENDATION_COLOR, SKIP_REASON_COLOR), export them from the constants module,
and replace the inline RGBColor(...) usages in the PlotSeries constructions (and
any related helper closures like append_d_recommendation if they reference
colors) with the new named constants so the palette is centralized and tunable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1221348f-c631-439f-b0bc-93f90e826402

📥 Commits

Reviewing files that changed from the base of the PR and between 56c4aa4 and 2cb8f88.

📒 Files selected for processing (11)
  • OVERVIEW.md
  • README.md
  • src/axis_names.rs
  • src/constants.rs
  • src/data_analysis/mod.rs
  • src/data_analysis/optimal_p_estimation.rs
  • src/data_analysis/spectral_analysis.rs
  • src/data_analysis/tests_optimal_p.rs
  • src/data_analysis/torque_inertia_profiler.rs
  • src/main.rs
  • src/plot_functions/plot_step_response.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: All constants go in src/constants.rs — no hardcoded magic numbers in function code
Axis indices (0=Roll, 1=Pitch, 2=Yaw) must use the src/axis_names.rs module with AXIS_COUNT, AXIS_NAMES, and axis_name() function instead of creating redundant axis index constants
ALWAYS use Firmware revision header to detect firmware type, NEVER use Firmware type
Firmware revision patterns: Betaflight uses Betaflight 4.x.x or Betaflight YYYY.MM, Emuflight uses EmuFlight 0.x.x, INAV uses INAV x.x.x

Files:

  • src/data_analysis/spectral_analysis.rs
  • src/data_analysis/tests_optimal_p.rs
  • src/axis_names.rs
  • src/data_analysis/mod.rs
  • src/constants.rs
  • src/data_analysis/optimal_p_estimation.rs
  • src/data_analysis/torque_inertia_profiler.rs
  • src/plot_functions/plot_step_response.rs
  • src/main.rs
OVERVIEW.md

📄 CodeRabbit inference engine (AGENTS.md)

OVERVIEW.md: For new features, add documentation in OVERVIEW.md at proper, appropriate locations with information that is balanced or concise, never overly verbose, aligned with existing documentation style
Update Table of Contents in OVERVIEW.md when adding new sections

Files:

  • OVERVIEW.md
🔇 Additional comments (6)
OVERVIEW.md (1)

12-12: LGTM!

Also applies to: 41-41, 193-253

README.md (1)

51-51: LGTM!

Also applies to: 80-82, 107-107

src/constants.rs (1)

52-55: LGTM!

Also applies to: 266-353

src/axis_names.rs (1)

16-21: LGTM!

Also applies to: 27-38

src/data_analysis/mod.rs (1)

9-11: LGTM!

src/data_analysis/tests_optimal_p.rs (1)

3-3: ⚡ Quick win

Missing input: Provide the original <review_comment> content (and any referenced diff/snippet) so it can be rewritten with the required fixes/context.

Comment thread src/data_analysis/optimal_p_estimation.rs
Comment thread src/data_analysis/torque_inertia_profiler.rs Outdated
Comment thread src/data_analysis/torque_inertia_profiler.rs Outdated
Comment thread src/main.rs
nerdCopter and others added 2 commits May 21, 2026 11:21
Actionable (4):
- optimal_p_estimation: filter non-finite (NaN/inf) Td samples before
  computing statistics in TdStats::from_samples to prevent propagation
  into mean/deviation/classification
- torque_inertia_profiler: replace magic 1000.0 with THROTTLE_COMMAND_SCALE,
  1e-9 with TORQUE_PROFILER_MIN_DT_S; both added to constants.rs
- torque_inertia_profiler: fix off-by-one in derivative scan loop —
  resp_start..resp_end.saturating_sub(1) excluded the last valid j/j+1 pair;
  corrected to resp_start..resp_end
- main.rs: persist skip reason into optimal_p_skip_reasons when P gain is
  unavailable so PNG overlay renders the reason instead of nothing

Nitpicks (2):
- main.rs + torque_inertia_profiler: replace all hardcoded [T; 3] axis-count
  literals with AXIS_COUNT (from axis_names.rs); use std::array::from_fn for
  initialisation
- plot_step_response: hoist all inline RGBColor literals in the Optimal P
  legend section into named constants in constants.rs
  (COLOR_OPTIMAL_P_DIVIDER/HEADER/TEXT/WARNING/RECOMMENDATION/SKIP);
  also applies COLOR_OPTIMAL_P_WARNING to the LOW AUTHORITY warning

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- extract_punch_ratios return type: [Vec<f64>; 3] -> [Vec<f64>; AXIS_COUNT]
- ms-to-samples conversions: / 1000.0 -> / OPTIMAL_P_SECONDS_TO_MS_MULTIPLIER
  (constant already existed; now imported and used consistently)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nerdCopter
Copy link
Copy Markdown
Owner Author

@coderabbitai, is this PR/idea/functionality proper, accurate, and worth merging? Should we consider an alternative or an additive method such as chirp-analysis? This branch works for full-flight and tuning-flights, but not chirp-logs (no punch-outs).

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Tip

For best results, initiate chat on the files or code changes.

@nerdCopter — great question. Here's a thorough technical assessment:


Is the PR Proper, Accurate, and Worth Merging?

Short answer: Yes, with appropriate caveats — the experimental label is well-earned and correctly applied.

Strengths ✅

  1. Physics-grounded approach. The TORQUE_PROFILER_TD_CALC_K = π/2 × 1000 constant is the classical relationship between natural frequency and 50%-rise time (Td = π/(2·ωn)), scaled to milliseconds. This is textbook control theory applied correctly to a time-domain measurement.

  2. Replaces hardcoded tables. The PR fulfills the #141 objective of eliminating empirical TD_TARGETS lookup tables with a per-aircraft derived target — a genuine improvement in generality and maintainability.

  3. Works on ordinary logs. Throttle punches are a standard, recommended logging maneuver (2–3 clean ramp-ups) to detect oscillations at various throttle levels, so the ≥5-event requirement is not an onerous constraint for any intentional tuning session.

  4. Graceful degradation. Skip-reasons, LOW_AUTHORITY detection, consistency (CV) metrics, and minimum sample guards are all present — the tool won't silently produce wrong answers.

  5. Well-structured two-phase pipeline. Aircraft-level profiling separated from per-file analysis is architecturally clean and correct; profiling across a log group before per-file analysis is the right order of operations.

  6. Statistical rigor. Coefficient of Variation thresholds, minimum event counts (TORQUE_PROFILER_MIN_EVENTS = 5), and the consistency classification are appropriate quality gates.

Concerns / Risks ⚠️

  1. Achievability factor (2.50) is empirical. This is the main uncertainty. It's documented and cited, but it may not transfer well across craft types outside the validation set (HELIO, FOXEER, APEX). Consider making it a CLI-tunable parameter in a follow-up, or at least logging its value in the output.

  2. Yaw excluded from P recommendation. This is fine for now — yaw authority is fundamentally different (reaction torque from motor inertia rather than direct gyro coupling), and treating it as diagnostic-only is reasonable.

  3. Test file vs. removed feature. The AI summary notes that tests_optimal_p.rs tests TdTargetSpec::for_frame_inches and FrameClass variants — but the PR removes --prop-size. If those types were deleted, these tests would fail to compile; if they're dead code that slipped through the rebase/squash, they should be cleaned up.

  4. Settle samples = 5 (~5 ms at 1 kHz). The PR notes this covers ESC/motor lag, which is reasonable, but ESC response varies widely (fast AM32/BLHeli32 vs. slow 50 Hz PWM). This is acceptable for the current scope.


Should Chirp-Analysis Be Added?

Yes — as an additive, complementary path, not a replacement.

There is already an offline controller tuning framework for Betaflight that enables analysis of Blackbox logs with automatic chirp excitation to tune linear filters and PID controllers offline. The approach is:

  • An automatic chirp signal generator excites the quadcopter during flight, with frequency increasing exponentially over a defined period from an initial to a final frequency.
  • It extracts measured controller dynamics, displays step responses of setpoint tracking and input disturbance rejection, and Bode plots of the closed-loop system.

Applying known inputs like step or chirp, recording outputs, and fitting a transfer function is useful when full dynamics are unknown or nonlinear.

The two approaches are fundamentally complementary:

This PR (punch-based) Chirp-based
Log type Any flight with ≥5 punch-outs Dedicated chirp-excitation flight
Output Time-domain Td, P/D recommendation Frequency-domain Bode plot, stability margins
Firmware req. None Betaflight chirp-generator PR (#13105)
Accessibility High (any tuning session) Lower (special flight needed)
Rigor Good (time-domain approximation) Higher (full FRF estimation)
Chirp logs ❌ No punch-outs ✅ Native use case

The best path forward would be to merge this PR as-is (it's ready for its stated scope), then open a follow-up issue for a chirp-log analysis path that produces a frequency-domain plant model. The two paths could share the same --estimate-optimal-p output format or add a complementary --estimate-from-chirp mode. Since chirp logs contain no punch-outs and punch-based logs may not contain chirp excitation, both paths are needed to cover all log types.


Recommendation

Merge. The implementation is physically sound, properly guarded, and experimentally labeled. The achievability factor is the main open question but is no worse than the hardcoded tables it replaces. A follow-up issue tracking chirp-analysis support would be a good way to capture the complementary approach without blocking this PR.

Would you like me to open a follow-up GitHub issue for chirp-log analysis support?

Tests for FrameClass/TdTargetSpec/for_frame_inches — all removed with the
prop-size approach in favour of the physics-derived torque-inertia profiler.
File was not declared in mod.rs and was never compiled; deleted to avoid
confusion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nerdCopter
Copy link
Copy Markdown
Owner Author

@coderabbitai re: Achievability factor (2.50) is empirical

The factor is intentionally not generalised beyond what calibration data supports. The key mitigating detail is that torque_inertia_ratio is derived from the log rather than looked up from a prop-size table — so it already self-calibrates for craft size: a 10" produces a smaller ratio and a correspondingly larger Td target; a 3" produces a larger ratio and a smaller target. The achievability factor sits on top as a second-order correction for the gap between physics and reality (ESC lag, motor startup, prop-wash efficiency losses).

That gap is not perfectly constant across craft sizes — larger/slower craft have proportionally more lag — but it is a second-order effect on top of the first-order self-calibration the log-derived ratio already provides. Attempting to model the factor as craft-dependent would require more calibration data than currently exists, and would add complexity without a clear improvement for the "Experimental" status of this feature.

The constant is named, clearly documented as empirical, and easily adjusted. Pilot testing on non-5" craft is the practical path to surfacing whether further tuning is needed. No change recommended at this time.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Tip

For best results, initiate chat on the files or code changes.

@nerdCopter — that's a well-reasoned clarification. The point that torque_inertia_ratio is log-derived (not prop-size-table-looked-up) is the key: the first-order size dependency is already baked into the measured ratio, so the achievability factor only needs to handle the second-order residual (ESC lag, motor startup transients, efficiency losses). Trying to make that residual craft-dependent without sufficient cross-size calibration data would introduce noise rather than signal.

The current design — named constant, clearly documented, easy to adjust, experimentally labeled — is the right call for this stage. Pilot testing on outlier craft sizes is exactly the right feedback mechanism.

🐇 (nodding thoughtfully)


✏️ Learnings added
Learnt from: nerdCopter
URL: https://github.com/nerdCopter/BlackBox_CSV_Render/pull/145

Timestamp: 2026-05-21T17:52:16.028Z
Learning: In `src/constants.rs` (BlackBox_CSV_Render project, Rust), `TORQUE_PROFILER_ACHIEVABILITY_FACTOR = 2.50` is intentionally a fixed empirical constant. The `torque_inertia_ratio` is derived directly from the flight log (not a prop-size lookup table), so it already self-calibrates for craft size: larger/slower craft produce smaller ratios and correspondingly larger Td targets; smaller/faster craft produce larger ratios and smaller targets. The achievability factor is only a second-order correction for the gap between physics and reality (ESC lag, motor startup, prop-wash efficiency losses). Attempting to make the factor craft-dependent would require more cross-size calibration data than currently exists. No change is recommended; the constant is named, documented as empirical, and easy to adjust as pilot testing on non-5" craft surfaces new data.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

TORQUE_PROFILER_SETTLE_SAMPLES (fixed 5 samples) was only correct at 1 kHz
(5 ms). At higher loop rates the settle period was far too short:
  - 3.2 kHz (BMI270): 1.56 ms — before most ESCs respond
  - 4 kHz: 1.25 ms
  - 8 kHz: 0.63 ms

Premature measurement produces near-zero angular acceleration, underestimates
torque_inertia_ratio, inflates the Td target, and biases recommendations toward
Optimal/Decrease when P may need increasing.

Replace with TORQUE_PROFILER_SETTLE_MS (5.0 ms) converted to samples at runtime
using the actual log sample rate. Behaviour at 1 kHz is unchanged; high-rate
logs now skip the correct number of samples to clear ESC/motor lag before
measuring peak angular acceleration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nerdCopter
Copy link
Copy Markdown
Owner Author

@coderabbitai Two additional commits since last exchange:

137f3dc — Remove orphaned tests_optimal_p.rs
The test file covered FrameClass, TdTargetSpec, and for_frame_inches — all removed with the prop-size approach. The file was never declared in mod.rs so it was never compiled; deleted for correctness.

26bbf62 — Settle window time-based (TORQUE_PROFILER_SETTLE_MS) not sample-count-based

Finding: The prior constant TORQUE_PROFILER_SETTLE_SAMPLES = 5 was calibrated for 1 kHz logs (5 samples = 5 ms). At high loop rates it collapses: BMI270 3.2 kHz → 1.56 ms, 4 kHz → 1.25 ms, 8 kHz → 0.63 ms — all before most ESCs have responded (DSHOT lag ≈ 1–5 ms). Measuring |Δgyro/Δt| before motor torque changes produces near-zero alpha, underestimates torque_inertia_ratio, inflates the physics-derived Td target, and biases recommendations toward Optimal/Decrease-P on high-rate logs.

Fix: replaced the constant with TORQUE_PROFILER_SETTLE_MS: f64 = 5.0 and convert to samples at runtime:

let settle_samples = (TORQUE_PROFILER_SETTLE_MS / OPTIMAL_P_SECONDS_TO_MS_MULTIPLIER * sample_rate).ceil() as usize;

sample_rate is the logging rate (Hz), already computed from actual time_sec row deltas in the parser — not the gyro/PID loop rate. This is the correct denominator since we are counting log rows, not hardware cycles. At 1 kHz: no change (5 samples). At 8 kHz log rate: 40 samples = 5 ms. Correct at all rates.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Tip

For best results, initiate chat on the files or code changes.

@nerdCopter — both commits are well-reasoned and correct.

137f3dc — Orphaned test removal ✅

Clean-up is correct. The file was testing FrameClass/TdTargetSpec/for_frame_inches which were all removed with the prop-size approach. Since it was never declared in mod.rs it was dead weight rather than a compile error, but deletion is the right call — an undeclared test file is worse than no test file (it creates a false impression of coverage that never runs). Good hygiene.

26bbf62 — Time-based settle window ✅

This is a genuine correctness fix, not cosmetic. The math checks out:

Log rate Old (fixed 5 samples) New (ceil(5.0 ms × rate))
1 kHz 5 ms ✅ 5 samples = 5 ms ✅
3.2 kHz (BMI270) 1.56 ms ❌ 16 samples = 5 ms ✅
4 kHz 1.25 ms ❌ 20 samples = 5 ms ✅
8 kHz 0.63 ms ❌ 40 samples = 5 ms ✅

The bug path you described is exactly right: measuring angular acceleration before ESC torque has arrived → near-zero alpha → underestimated ratio → inflated Td target → spurious Optimal/Decrease-P at high loop rates. Using the logging rate (derived from actual time_sec deltas) as the denominator rather than the hardware gyro/PID rate is the correct denominator here — we are indexing log rows, not hardware cycles.

The .ceil() is the right rounding direction (bias toward more settling, not less).

🐇 (ticks both boxes with a satisfied nod)


✏️ Learnings added
Learnt from: nerdCopter
URL: https://github.com/nerdCopter/BlackBox_CSV_Render/pull/145

Timestamp: 2026-05-21T18:20:47.097Z
Learning: In `src/data_analysis/torque_inertia_profiler.rs` (BlackBox_CSV_Render, Rust), the settle window before measuring angular acceleration after a throttle-punch event must be time-based, not sample-count-based. The constant `TORQUE_PROFILER_SETTLE_MS: f64 = 5.0` is converted to samples at runtime as `(TORQUE_PROFILER_SETTLE_MS / OPTIMAL_P_SECONDS_TO_MS_MULTIPLIER * sample_rate).ceil() as usize`, where `sample_rate` is the **logging rate** in Hz (computed from actual `time_sec` row deltas in the parser — not the gyro/PID loop rate). This is required because a fixed sample count (e.g., 5) collapses to sub-millisecond settle time at high log rates (BMI270 3.2 kHz → 1.56 ms, 8 kHz → 0.63 ms), which is shorter than typical DSHOT/ESC lag (1–5 ms), causing near-zero alpha measurements, underestimated torque_inertia_ratio, inflated Td targets, and spurious Optimal/Decrease-P recommendations on high-rate logs.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

The profiler may report Optimal when P is too high because an oscillatory
step response produces a short Td that matches the physics formula for the
current (excessive) P. CV/consistency warning is the indirect signal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nerdCopter
Copy link
Copy Markdown
Owner Author

@coderabbitai Three additional commits since last exchange:

26bbf62 — Settle window time-based (TORQUE_PROFILER_SETTLE_MS), not sample-count-based
Already reported above — included here for completeness.

dd8ec9d — Document over-P limitation in CV/reliability section
Finding (Q&A): when P is already too high and the craft oscillates, the profiler may report "Optimal" rather than "Decrease P." An oscillatory step response produces a short measured Td; fed into the physics formula this yields a P_optimal close to the current (excessive) P — the comparison is corrupted by the same oscillation that causes the problem. CV is the indirect signal: severe oscillation scatters Td samples and triggers the consistency warning. Added a note in bold recommending pilots start from a lower P if gains feel high or the craft oscillates, and clarifying this tool is a validator/refinement tool, not a recovery tool for badly mis-tuned aircraft.

012f4fe — Document high CV without [LOW AUTHORITY] as an over-P investigative signal
Finding (Q&A): high CV + no LOW AUTHORITY flag rules out hover/low-energy inputs as the scatter cause. Remaining causes are propwash, inconsistent maneuvers, and oscillation from over-P. An automated "P is too high" diagnosis was rejected because: (1) "high P" is craft-relative with no reliable absolute threshold; (2) P_optimal is corrupted by the same oscillation (Td²-dependence means a 20% short Td inflates P_optimal ~44%, masking the over-P); (3) false positives on legitimate aggressive tunes would erode trust. Added a documentation note only: high CV without LOW AUTHORITY is a prompt to verify the craft is not oscillating before acting on any recommendation.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Tip

For best results, initiate chat on the files or code changes.

@nerdCopter — all three commits reviewed and confirmed correct.

26bbf62 — Time-based settle window ✅ Already confirmed in the previous exchange.

dd8ec9d — Over-P limitation documentation ✅

The blind spot is real and the reasoning is exact: oscillation shortens measured Td (the step response peaks early due to ringing), the physics formula then yields a P_optimal close to the oscillating P, and the comparison reports "Optimal" when the true state is "too high." CV scatter is the correct indirect signal to surface this — it's the only observable symptom that doesn't require knowing what "too high" means in absolute terms. The framing of this tool as a validator/refinement tool rather than a recovery tool for badly mis-tuned craft is the right boundary to draw in documentation.

012f4fe — High CV without LOW_AUTHORITY as investigative signal ✅

The decision to reject automated diagnosis is well-justified on all three grounds:

  1. No reliable craft-agnostic absolute P threshold — correct, loop dynamics scale with motor KV, prop, weight.
  2. Td²-dependence corrupts P_optimal when oscillating — a short Td inflates P_optimal via P ∝ 1/Td², meaning an already-excessive P maps to a P_optimal that is also excessive-looking, masking the problem.
  3. False positive risk on legitimate aggressive tunes — a high-authority race tune should have tight Td scatter; penalising it would erode trust.

Documentation-only with a clear investigative prompt is the right scope here.

🐇 (approves all three with a considered nod)


✏️ Learnings added
Learnt from: nerdCopter
URL: https://github.com/nerdCopter/BlackBox_CSV_Render/pull/145

Timestamp: 2026-05-21T18:41:33.581Z
Learning: In `OVERVIEW.md` (BlackBox_CSV_Render project), the torque-inertia profiler has a documented over-P blind spot: when P is already too high and the craft oscillates, the oscillatory step response produces a short measured Td. Fed into the physics formula (P_optimal ∝ 1/Td²), this yields a P_optimal close to the current (excessive) P, causing the tool to report "Optimal" instead of "Decrease P." The CV (Coefficient of Variation) scatter is the only reliable indirect signal — severe oscillation scatters Td samples and triggers the consistency warning. The tool is documented as a validator/refinement tool, not a recovery tool for badly mis-tuned aircraft; pilots should start from a lower P if the craft oscillates.

Learnt from: nerdCopter
URL: https://github.com/nerdCopter/BlackBox_CSV_Render/pull/145

Timestamp: 2026-05-21T18:41:33.581Z
Learning: In `OVERVIEW.md` (BlackBox_CSV_Render project), high CV without a `[LOW AUTHORITY]` flag is documented as an investigative signal for over-P, but automated "P is too high" detection was explicitly rejected for three reasons: (1) "high P" has no reliable craft-agnostic absolute threshold; (2) Td²-dependence means oscillation corrupts P_optimal — a 20% shorter Td inflates P_optimal by ~44% (via 1/Td²), masking the over-P problem; (3) false positives on legitimate aggressive-but-correct tunes would erode user trust. Documentation-only guidance is added instead.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Add requirement note to OVERVIEW.md, README.md, and --help output.
Without a .headers.csv file P gain is unavailable and optimal P
estimation is skipped; all other analyses are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@nerdCopter nerdCopter force-pushed the feature/tip-squash-rebase branch from bf0e53e to 421fca5 Compare May 21, 2026 18:58
@nerdCopter nerdCopter merged commit 00d4b60 into master May 21, 2026
5 checks passed
@nerdCopter
Copy link
Copy Markdown
Owner Author

@coderabbitai, unexpected bug found when reviewing post-merge PNG's.
grepping some console outputs, i see multiple CV= < 40, yet reported as "unreliable > 40"

  ⚠ Low consistency (CV=93.4%) — unreliable (>40%)
  ⚠ Low consistency (CV=42.1%) — unreliable (>40%)
  ⚠ Low consistency (CV=43.0%) — unreliable (>40%)
  ⚠ Low consistency (CV=78.4%) — unreliable (>40%)
  ⚠ Low consistency (CV=97.3%) — unreliable (>40%)
  ⚠ Low consistency (CV=38.9%) — unreliable (>40%)
  ⚠ Low consistency (CV=99.4%) — unreliable (>40%)
  ⚠ Low consistency (CV=35.0%) — unreliable (>40%)
  ⚠ Low consistency (CV=41.6%) — unreliable (>40%)
  ⚠ Low consistency (CV=46.6%) — unreliable (>40%)
  ⚠ Low consistency (CV=48.9%) — unreliable (>40%)
  ⚠ Low consistency (CV=203.0%) — unreliable (>40%)
  ⚠ Low consistency (CV=55.1%) — unreliable (>40%)
  ⚠ Low consistency (CV=59.2%) — unreliable (>40%)
  ⚠ Low consistency (CV=117.1%) — unreliable (>40%)
  ⚠ Low consistency (CV=115.7%) — unreliable (>40%)
  ⚠ Low consistency (CV=96.7%) — unreliable (>40%)
  ⚠ Low consistency (CV=42.9%) — unreliable (>40%)
  ⚠ Low consistency (CV=31.9%) — unreliable (>40%)
  ⚠ Low consistency (CV=42.3%) — unreliable (>40%)
  ⚠ Low consistency (CV=42.9%) — unreliable (>40%)
  ⚠ Low consistency (CV=104.0%) — unreliable (>40%)
  ⚠ Low consistency (CV=121.3%) — unreliable (>40%)
  ⚠ Low consistency (CV=44.5%) — unreliable (>40%)
  ⚠ Low consistency (CV=43.0%) — unreliable (>40%)
  ⚠ Low consistency (CV=49.3%) — unreliable (>40%)
  ⚠ Low consistency (CV=93.3%) — unreliable (>40%)
  ⚠ Low consistency (CV=90.1%) — unreliable (>40%)
  ⚠ Low consistency (CV=65.9%) — unreliable (>40%)
  ⚠ Low consistency (CV=42.3%) — unreliable (>40%)
  ⚠ Low consistency (CV=84.9%) — unreliable (>40%)

please investigate and open issue ticket with problem/potential solution.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

Let me dig into the code to confirm the root cause before opening the issue.


🏁 Script executed:

#!/bin/bash
# Read the full TdStatistics implementation and format_console_output
cat src/data_analysis/optimal_p_estimation.rs

Length of output: 24510


Found the bug. Let me open the issue now.


🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo nerdCopter/BlackBox_CSV_Render \
  --assignee nerdCopter \
  --title "bug: misleading CV consistency warning message — blames CV>40% when actual failure is consistency<70%" \
  --body "## Bug Report

**Reported in:** PR `#145` post-merge console output review
**Reporter:** `@nerdCopter`

---

## Symptom

Console output shows \`⚠ Low consistency (CV=XX%) — unreliable (>40%)\` for CV values that are **below** 40%, e.g.:

\`\`\`
⚠ Low consistency (CV=38.9%) — unreliable (>40%)
⚠ Low consistency (CV=35.0%) — unreliable (>40%)
⚠ Low consistency (CV=31.9%) — unreliable (>40%)
\`\`\`

The message incorrectly implies the flag was triggered because \`CV > 40%\`, when the CV value is plainly within the acceptable threshold.

---

## Root Cause

In \`src/data_analysis/optimal_p_estimation.rs\`, \`TdStatistics::is_consistent()\` has **three** independent conditions that all must be true:

\`\`\`rust
pub fn is_consistent(&self) -> bool {
    self.num_samples >= TD_SAMPLES_MIN_FOR_STDDEV          // (1) enough samples
        && self.consistency >= TD_CONSISTENCY_MIN_THRESHOLD // (2) fraction within ±1 SD >= 0.70
        && self.coefficient_of_variation
            .map_or(true, |cv| cv <= TD_COEFFICIENT_OF_VARIATION_MAX) // (3) CV <= 0.40
}
\`\`\`

The warning is emitted whenever \`!is_consistent()\` fires — but the message in \`format_console_output\` **always** attributes the failure to the CV threshold, regardless of which condition actually caused the failure:

\`\`\`rust
if !self.td_stats.is_consistent() {
    let cv_percent = self.td_stats.coefficient_of_variation
        .map_or(0.0, |cv| cv * 100.0);
    output.push_str(&format!(
        \"  ⚠ Low consistency (CV={:.1}%) — unreliable (>{:.0}%)\\n\",
        cv_percent,
        TD_COEFFICIENT_OF_VARIATION_MAX * 100.0  // always prints \">40%\"
    ));
}
\`\`\`

When CV=31.9% (below 40%) but \`consistency < 0.70\` (fraction of samples within ±1 std dev), the flag fires correctly but the message incorrectly blames \`CV > 40%\`.

---

## Fix Suggestion

The warning message should identify **which condition(s)** actually failed. Two viable approaches:

**Option A — report both metrics when either fails:**
\`\`\`rust
if !self.td_stats.is_consistent() {
    let cv_percent = self.td_stats.coefficient_of_variation
        .map_or(0.0, |cv| cv * 100.0);
    let consistency_percent = self.td_stats.consistency * 100.0;
    output.push_str(&format!(
        \"  ⚠ Low consistency (CV={:.1}%, Consistency={:.0}%) — unreliable (CV>{:.0}% or Consistency<{:.0}%)\\n\",
        cv_percent,
        consistency_percent,
        TD_COEFFICIENT_OF_VARIATION_MAX * 100.0,
        TD_CONSISTENCY_MIN_THRESHOLD * 100.0,
    ));
}
\`\`\`

**Option B — conditionally show only the failing reason:**
\`\`\`rust
if !self.td_stats.is_consistent() {
    let cv_percent = self.td_stats.coefficient_of_variation
        .map_or(0.0, |cv| cv * 100.0);
    let consistency_percent = self.td_stats.consistency * 100.0;
    let cv_failed = self.td_stats.coefficient_of_variation
        .map_or(false, |cv| cv > TD_COEFFICIENT_OF_VARIATION_MAX);
    let consistency_failed = self.td_stats.consistency < TD_CONSISTENCY_MIN_THRESHOLD;
    let reason = match (cv_failed, consistency_failed) {
        (true, true)  => format!(\"CV={:.1}% (>{:.0}%) and Consistency={:.0}% (<{:.0}%)\",
                                  cv_percent, TD_COEFFICIENT_OF_VARIATION_MAX * 100.0,
                                  consistency_percent, TD_CONSISTENCY_MIN_THRESHOLD * 100.0),
        (true, false) => format!(\"CV={:.1}% (>{:.0}%)\",
                                  cv_percent, TD_COEFFICIENT_OF_VARIATION_MAX * 100.0),
        (false, true) => format!(\"Consistency={:.0}% (<{:.0}%)\",
                                  consistency_percent, TD_CONSISTENCY_MIN_THRESHOLD * 100.0),
        (false, false) => format!(\"insufficient samples (need >= {})\", TD_SAMPLES_MIN_FOR_STDDEV),
    };
    output.push_str(&format!(\"  ⚠ Low consistency ({}) — unreliable\\n\", reason));
}
\`\`\`

Option B is more precise and avoids printing a threshold that was not the cause of the failure.

---

## Impact

- **Correctness:** The warning fires correctly (the flagging logic in \`is_consistent()\` is sound); only the diagnostic message is wrong.
- **User confusion:** Pilots seeing \`CV=31.9% — unreliable (>40%)\` cannot reconcile the displayed value with the stated threshold, eroding trust in the tool output.
- **No false negatives:** No flight data or recommendations are silently wrong — only the explanation text is misleading.

---

**References:** PR `#145` comment thread (post-merge bug report)"

Length of output: 61


@nerdCopter — root cause confirmed and issue opened: #149.

What's happening:

TdStatistics::is_consistent() has three independent gates:

  1. num_samples >= 2
  2. consistency >= 0.70 ← fraction of samples within ±1 std dev
  3. cv <= 0.40

The warning fires whenever !is_consistent() — but format_console_output always formats the message as "unreliable (>40%)" regardless of which condition actually triggered the flag. When condition 2 fails (e.g. only 60% of samples fell within ±1 std dev) but CV is 31.9%, the message blames a threshold that was never crossed.

The flag logic is correct; only the diagnostic message is wrong. No recommendations are silently miscalculated.

Recommended fix (Option B in the issue): pattern-match which condition(s) failed and emit the appropriate reason string, e.g.:

  • CV only: CV=42.3% (>40%)
  • Consistency only: Consistency=62% (<70%)
  • Both: CV=42.3% (>40%) and Consistency=62% (<70%)
  • Neither (sample count): insufficient samples (need >= 2)

Issue #149 has the full diff-ready code for both Option A (always show both metrics) and Option B (show only the failing reason). Option B is cleaner for pilot readability.

🐇 (⊙_⊙) ← surprised it slipped through, good catch

@nerdCopter
Copy link
Copy Markdown
Owner Author

@coderabbitai, does this (already merged) PR group the filenames/flights properly and use all flights the group to calculate estimated-optimal-p as expected? as planned planned as per #132 (comment) ? did #132 (comment) change the expected behavior and regress the intention of this PR?

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 22, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

Let me pull up the referenced comments from #132 and the current grouping implementation to do a proper comparison.


🏁 Script executed:

#!/bin/bash
# Fetch the two specific comments from PR `#132`
gh api repos/nerdCopter/BlackBox_CSV_Render/issues/comments/4472746290 --jq '.body' 2>/dev/null || echo "NOT FOUND as issue comment"
echo "---SEPARATOR---"
gh api repos/nerdCopter/BlackBox_CSV_Render/issues/comments/4501386876 --jq '.body' 2>/dev/null || echo "NOT FOUND as issue comment"

Length of output: 21718


Good — now let me read the actual merged code in main.rs and related files to compare what was planned vs. what shipped.


🏁 Script executed:

#!/bin/bash
echo "=== main.rs: aircraft grouping / key extraction logic ==="
rg -n "aircraft_key\|aircraft_group\|group_by\|group_files\|extract_aircraft\|craft_name\|GroupBy\|HashMap.*aircraft\|BTreeMap.*aircraft" src/main.rs | head -60

echo -e "\n=== main.rs: full two-phase pipeline around torque_inertia ==="
rg -n "torque_inertia\|AircraftProfile\|aircraft_profile\|Phase 1\|Phase 2\|profil" src/main.rs | head -60

echo -e "\n=== main.rs: how files are iterated and grouped (main fn body) ==="
# Find the main() function and show the file iteration section
ast-grep --pattern 'fn main() {
  $$$
}' src/main.rs | head -200

echo -e "\n=== main.rs: full file showing grouping section ==="
grep -n "input_files\|for.*file\|group\|aircraft\|AircraftProfile\|torque_inertia_profiler" src/main.rs | head -80

echo -e "\n=== main.rs lines around the grouping logic (search for HashMap or group) ==="
rg -n "HashMap\|BTreeMap\|group\|aircraft" src/main.rs | head -60

Length of output: 3643


🏁 Script executed:

#!/bin/bash
echo "=== Show the main() function body - file processing section ==="
# Get the line numbers of fn main and fn process_file to extract the range
MAIN_START=$(grep -n "^fn main" src/main.rs | head -1 | cut -d: -f1)
PROCESS_FILE_LINE=$(grep -n "^fn process_file" src/main.rs | head -1 | cut -d: -f1)
echo "main() starts at line $MAIN_START, process_file at line $PROCESS_FILE_LINE"
sed -n "${MAIN_START},${PROCESS_FILE_LINE}p" src/main.rs

echo -e "\n=== extract_aircraft_key function if present ==="
rg -n "fn extract_aircraft_key\|fn group_files\|fn aircraft_key" src/main.rs src/

echo -e "\n=== torque_inertia_profiler.rs: check for grouping/key logic ==="
rg -n "aircraft_key\|group\|craft_name\|filename\|BTFL\|EMUF" src/data_analysis/torque_inertia_profiler.rs | head -40

Length of output: 275


Good — the two-phase grouping architecture IS present. Now let me read the actual implementations to verify correctness.


🏁 Script executed:

#!/bin/bash
echo "=== extract_aircraft_key() full implementation ==="
sed -n '390,465p' src/main.rs

echo -e "\n=== profile_aircraft_group() full implementation ==="
sed -n '463,520p' src/main.rs

echo -e "\n=== main() two-phase pipeline section ==="
sed -n '1665,1715p' src/main.rs

echo -e "\n=== AircraftProfile::default() impl ==="
rg -n "impl Default for AircraftProfile\|fn default" src/data_analysis/torque_inertia_profiler.rs -A 10 | head -30

echo -e "\n=== BTFL craft-name grouping fix mentioned in PR — check for BTFL special handling ==="
sed -n '390,450p' src/main.rs | grep -n "BTFL\|EMUF\|craft\|name\|Betaflight\|reverse\|suffix"

Length of output: 9487


@nerdCopter — good news: the grouping works correctly as planned, and the BTFL concern was fixed before merge.

Here's the full picture:


Is the grouping implemented as planned?

Yes — exactly. The two-phase pipeline is present and correct:

main()
  │
  ├─ group_files_by_aircraft()           ← BTreeMap<key, Vec<files>>
  │
  └─ for each (craft_key, group_files):
       │
       ├─ Phase 1: profile_aircraft_group()   ← aggregates ALL punch events across entire group
       │     → AircraftProfile (per-axis medians)
       │
       └─ Phase 2: for each file in group:
               process_file(..., &aircraft_profile)   ← shared Td target used per file

All flights in a group contribute their throttle-punch ratios to a single AircraftProfile in Phase 1 before any per-file analysis runs. This is exactly the intended behavior.


Did comment #4501386876 cause a regression?

No. That comment flagged a concern that BTFL_BLACKBOX_LOG_YYYYMMDD_HHMMSS_STELLARH7DEV.02.csv would collapse all BTFL files into a single BTFL_BLACKBOX_LOG group (since the date precedes the craft name in Betaflight's naming scheme). That concern was addressed in the merged code.

The extract_aircraft_key() function has an explicit fix (Lines 423–449):

// When the prefix is a generic firmware placeholder (ends with
// "BLACKBOX_LOG"), the craft name sits in the suffix; include it
// so files from different craft don't collapse into one group.
if !after_datetime.is_empty() && prefix.ends_with("BLACKBOX_LOG") {
    let craft = after_datetime.strip_prefix('_').unwrap_or(after_datetime);
    // Strip trailing log-number suffix (e.g. ".02", ".19")
    ...
    return format!("{}_{}", prefix, craft);
}

Verified grouping behavior:

Filename Extracted Key
EMUF_BLACKBOX_LOG_FOXEERF722V4_426_20240406_132335_notes.19.csv EMUF_BLACKBOX_LOG_FOXEERF722V4_426
EMUF_BLACKBOX_LOG_FOXEERF722V4_426_20240406_161530_notes.01.csv EMUF_BLACKBOX_LOG_FOXEERF722V4_426 ✅ (same group)
BTFL_BLACKBOX_LOG_20250517_130413_STELLARH7DEV.02.csv BTFL_BLACKBOX_LOG_STELLARH7DEV
BTFL_BLACKBOX_LOG_20250517_160413_STELLARH7DEV.01.csv BTFL_BLACKBOX_LOG_STELLARH7DEV ✅ (same group)
BTFL_BLACKBOX_LOG_20250517_130413_APEX5.01.csv BTFL_BLACKBOX_LOG_APEX5 ✅ (different group)
no_date_pattern_here.csv no_date_pattern_here (unique, full stem) ✅

One minor edge case to be aware of

If a BTFL file has notes in the suffix like BTFL_BLACKBOX_LOG_20250517_130413_STELLARH7DEV_notes.02.csv, the craft key becomes BTFL_BLACKBOX_LOG_STELLARH7DEV_notes — which still groups correctly (all sessions with that suffix land in the same group), just with _notes appended to the key. Harmless in practice.


Summary: The merged PR #145 implements aircraft-based grouping correctly, aggregates all flights in a group for the profiling phase, and the BTFL craft-name-after-datetime case is handled. No regression from the concern raised in #4501386876.

🐇 (•‿•) ← relieved it all shipped correctly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant