v0.4.1: Increase test coverage 52.68% -> 89.91%#3
Merged
Conversation
R/bindist.R provides the two-histogram comparison suite: IntersectHistograms, minkowski.dist, intersect.dist, kl.divergence, jeffrey.divergence Despite being one of the most-cited features of the package (specifically called out in the 2013 Google Open Source blog post), it had zero test coverage. This commit adds 17 test_that blocks covering the documented behaviour, the error paths, and a hand-computed numeric example for each distance/divergence. Numeric examples are written using the closed-form expressions (0.5 * log(3), 1.5 * log(1.5) + 0.5 * log(0.5), etc.) rather than hard-coded floats, so the tests document the math and don't drift on platform-specific floating point. Asymmetry test for kl.divergence deliberately picks a non-swap- symmetric pair: (0.75, 0.25) vs (0.5, 0.5). A symmetric swap like (0.8, 0.2) vs (0.2, 0.8) gives identical KL in both directions purely as a numerical coincidence and would be a misleading asymmetry assertion. Coverage delta: HistogramTools overall: 52.68% -> 60.57% R/bindist.R: 0.00% -> 100.00% Tests: [ FAIL 0 | WARN 9 | SKIP 0 | PASS 78 ] (was 55) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R/ash.R: HistToASH wraps ash::ash1 for histogram smoothing. Tests
verify (a) the ash-shaped result on an equidist histogram, (b) the
non-histogram input error, and (c) the explicit non-equidist guard
(`stop("Histogram must be equidist...")`).
R/plot.R: PlotRelativeFrequency, PlotLog2ByteEcdf, and
PlotLogTimeDurationEcdf are pure side-effect plotting functions.
Tests are smoke tests using a null `pdf(tempfile())` device --
they verify the functions run to completion without erroring on
representative input but do not assert visual correctness. (vdiffr
would be the right tool for visual regression but is overkill for
this codebase.)
Plot tests cover:
* PlotRelativeFrequency on a runif() histogram
* PlotLog2ByteEcdf on a histogram with power-of-2 breaks
* PlotLog2ByteEcdf on a pre-computed ecdf (the non-histogram input branch)
* PlotLog2ByteEcdf error when fewer than 3 power-of-2 boundaries in knots
* PlotLogTimeDurationEcdf on a histogram with time-duration-shaped breaks
* PlotLogTimeDurationEcdf on a pre-computed ecdf
Coverage delta:
HistogramTools overall: 60.57% -> 81.70%
R/ash.R: 0.00% -> 100.00%
R/plot.R: 0.00% -> 98.18%
R/quantiles.R: 81.82% -> 100.00% (Count exercised indirectly)
Tests: [ FAIL 0 | WARN 9 | SKIP 0 | PASS 91 ] (was 78)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R/histogram.R sits at the centre of the package -- it provides the
S3 histogram constructor (.BuildHistogram / PreBinnedHistogram), the
AddHistograms / ScaleHistogram aggregation helpers, and the two S3
methods that bridge to RProtoBuf. The existing test-histogram.R
covered AddHistograms and MergeBuckets; this commit adds:
* Count: sum-of-counts on a hand-built histogram + non-histogram
error path.
* ScaleHistogram: explicit factor (linear scaling), default factor
(1/Count, normalizes to sum 1), and three error paths
(non-histogram input, non-scalar factor, non-numeric factor).
* PreBinnedHistogram: structure assertions on the constructor's
output (class, breaks, counts, mids, equidist, xname), and
non-equidist detection.
* AddHistograms: the length-1 branch (returns the input list
unchanged), and the mismatched-breaks error path inside
.AddManyHistograms.
* Protocol Buffer round-trip: as.Message(h) produces an S4 Message,
and as.histogram(msg) reconstructs an equivalent histogram. Gated
by skip_if_not_installed("RProtoBuf"). On CI Linux runners that
have the protobuf system library, this exercises both S3 methods
(the major coverage gap in the file).
* Foreign-message-type error: as.histogram on a non-HistogramState
message produces the documented "Unknown protocol message type"
error. Uses RProtoBuf's bundled tutorial.Person descriptor so we
don't need to construct a fake one.
Coverage delta:
HistogramTools overall: 81.70% -> 82.33% (locally, no RProtoBuf)
R/histogram.R: 58.21% -> 61.19% (non-protobuf paths)
The protobuf test paths skip locally because RProtoBuf is not in
the local R library; they will run on CI runners and lift
histogram.R coverage further on Codecov.
Tests: [ FAIL 0 | WARN 9 | SKIP 2 | PASS 108 ] (was 91)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R/subset.R: add a test for the documented warning path when a caller invokes SubsetHistogram() with neither minbreak nor maxbreak set. The function emits "No new breakpoints specified, returning original histogram." and returns the input unchanged. This branch was the only remaining gap in subset.R after the existing tests covered both the trim-to-range and bound-equals-endpoint paths. R/informationloss.R: smoke tests for PlotKSDCC and PlotEMDCC, the two visual companions to KSDCC and EMDCC. Pattern mirrors the plot.R smoke tests -- a null pdf() device, expect_no_error() on a runif() histogram, plus the documented non-histogram input error path. Visual correctness is out of scope. Coverage delta: HistogramTools overall: 82.33% -> 89.91% R/informationloss.R: 18.52% -> 100.00% R/subset.R: 76.19% -> 85.71% Tests: [ FAIL 0 | WARN 9 | SKIP 2 | PASS 114 ] (was 108) Final v0.4.1 coverage summary: Started: 52.68% (pre-v0.4.1 master) Now: 89.91% +37 percentage points across 4 commits, 36 new tests, no new package dependencies. Files at 100% coverage now: ash, bindist, dtrace, ecdf, informationloss, quantiles. The remaining shortfall is primarily in R/histogram.R (61%) where the RProtoBuf code paths are only exercised when RProtoBuf is installed; on CI runners that have it the codecov number lifts further. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds 36 new test_that blocks across 4 commits, lifting code coverage from 52.68% to 89.91% (+37 percentage points). No new package dependencies; no production code changes; no DESCRIPTION/version bump (this PR is test-only and could merge into master at any time, with the version bump happening in a separate PR when ready to release v0.4.1).
Coverage progression
c79bacdbindist1213089ash + plot86ea44dhistogram0484499subset + plotsPer-file coverage on master
*R/histogram.R's remaining gap is the RProtoBuf code paths (as.Message.histogram,as.histogram.Message). They have tests butskip_if_not_installed("RProtoBuf")skips locally where RProtoBuf isn't available. On the CI Linux runner where RProtoBuf is installed viasetup-r-dependencies, those paths execute and codecov should report a higher number.What's tested
R/bindist.R(NEWtest-bindist.R, 17 tests) —IntersectHistograms,minkowski.dist,intersect.dist,kl.divergence,jeffrey.divergence. Hand-computed numeric examples written using closed-form expressions (0.5 * log(3),1.5 * log(1.5) + 0.5 * log(0.5), …) so they document the math and don't drift on platform-specific floating point.R/ash.R(NEWtest-ash.R, 3 tests) —HistToASHsmoke test on equidist input plus the two stopifnot guards.R/plot.R(NEWtest-plot.R, 6 tests) — smoke tests for the three plot wrappers usingpdf(tempfile())to capture output without rendering.R/histogram.R(extendedtest-histogram.R, 9 tests) —Count,ScaleHistogram(explicit + default factor + 3 error paths),PreBinnedHistogramstructure,AddHistogramslength-1 and mismatched-breaks branches, plus the protobuf round-trip and foreign-message-type error.R/subset.R(extendedtest-subset.R, 1 test) — the both-NULL-bounds warning branch.R/informationloss.R(extendedtest-informationloss.R, 4 tests) —PlotKSDCCandPlotEMDCCsmoke tests + their stopifnot error paths.What's deliberately NOT in scope
vdiffris the right tool but a heavy dep for marginal value here.histogram.R's RProtoBuf path locally — depends on the system protobuf library being installed; CI handles it.stopifnotclauses and edge-case-of-edge-case paths whose tests would be more contrived than informative.Test plan
R CMD check --as-cran HistogramTools_0.4.0.tar.gz: 0 ERRORs, 0 WARNINGs, 5 expected NOTEs (same as v0.4.0)[ FAIL 0 | WARN 9 | SKIP 2 | PASS 114 ](the 2 skips are RProtoBuf-gated tests on a checker without RProtoBuf; the 9 WARNs are the same pre-existing package warnings fromApproxQuantileandTrimHistogramthat have been there since v0.4.0)🤖 Generated with Claude Code