Fix total mean calculation in ANOVA #273

wildart · 2022-05-28T15:15:51Z

Fix for #242

codecov-commenter · 2022-05-28T15:18:44Z

Codecov Report

Merging #273 (f0ff08e) into master (be980f3) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #273   +/-   ##
=======================================
  Coverage   93.65%   93.65%           
=======================================
  Files          28       28           
  Lines        1717     1717           
=======================================
  Hits         1608     1608           
  Misses        109      109

Impacted Files	Coverage Δ
src/var_equality.jl	`98.24% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

devmotion · 2022-05-29T01:24:19Z

src/var_equality.jl

@@ -60,7 +60,7 @@ end
 function anova(scores::AbstractVector{<:Real}...)
    Nᵢ = [length(g) for g in scores]
    Z̄ᵢ = mean.(scores)
-    Z̄ = mean(Z̄ᵢ)
+    Z̄ = sum(Iterators.flatten(scores))/sum(Nᵢ)


Alternatively, one could use

Suggested change

-    Z̄ = sum(Iterators.flatten(scores))/sum(Nᵢ)
+    Z̄ = dot(Z̄ᵢ, Nᵢ) / sum(Nᵢ)

In a quick benchmark this seemed to be similarly fast, and usually even marginally faster.

wildart · 2022-06-01

You'd probably need a very large dataset to see the difference.

devmotion · 2022-06-01

Actually, a tiny dataset is sufficient. Of course, the difference is very small, but it seemed to be consistent:

julia> using Statistics, LinearAlgebra, BenchmarkTools

julia> function f(scores::AbstractVector{<:Real}...)
           Nᵢ = [length(g) for g in scores]
           Z̄ᵢ = mean.(scores)
           Z̄ = sum(Iterators.flatten(scores)) / sum(Nᵢ)
           return Nᵢ, Z̄ᵢ, Z̄
       end
f (generic function with 1 method)

julia> function g(scores::AbstractVector{<:Real}...)
           Nᵢ = [length(g) for g in scores]
           Z̄ᵢ = mean.(scores)
           Z̄ = dot(Z̄ᵢ, Nᵢ) / sum(Nᵢ)
           return Nᵢ, Z̄ᵢ, Z̄
       end
g (generic function with 1 method)

julia> scores = map(n -> rand(n), (3, 9, 12));

julia> @btime f($(scores...));
  33.893 ns (1 allocation: 64 bytes)

julia> @btime g($(scores...));
  33.687 ns (1 allocation: 64 bytes)

julia> scores = map(n -> rand(n), (3, 9, 12, 134));

julia> @btime f($(scores...));
  33.944 ns (1 allocation: 64 bytes)

julia> @btime g($(scores...));
  33.681 ns (1 allocation: 64 bytes)

julia> scores = map(n -> rand(n), (3, 9, 12, 134, 12, 4134, 1231, 122, 12, 1, 23, 58));

julia> @btime f($(scores...));
  34.070 ns (1 allocation: 64 bytes)

julia> @btime g($(scores...));
  33.781 ns (1 allocation: 64 bytes)

wildart · 2022-06-01

It looks like something other than the Z̄ evaluation dominates the function's runtime. But these are nanoseconds 😏.
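
For readers following the thread, here is a minimal sketch (not part of the PR) of why the total mean needed fixing and why the two candidate formulas are interchangeable. The data are made up for illustration and chosen so the arithmetic is easy to check by hand. With unequal group sizes, the mean of the group means differs from the pooled mean, while the size-weighted `dot` formula agrees with the flattened sum.

```julia
using Statistics, LinearAlgebra

# Hypothetical unbalanced groups (illustrative data, not from the PR).
scores = ([1.0, 2.0, 3.0], [10.0])

Nᵢ = [length(g) for g in scores]   # group sizes: [3, 1]
Z̄ᵢ = [mean(g) for g in scores]     # group means: [2.0, 10.0]

# Formula before the PR: every group counts equally, regardless of size.
unweighted = mean(Z̄ᵢ)                                  # (2 + 10) / 2

# Formula in the PR: pool all observations, then divide by the total count.
pooled = sum(Iterators.flatten(scores)) / sum(Nᵢ)      # 16 / 4

# Suggested alternative: weight each group mean by its group size.
weighted = dot(Z̄ᵢ, Nᵢ) / sum(Nᵢ)                       # (2*3 + 10*1) / 4

println((unweighted, pooled, weighted))                # (6.0, 4.0, 4.0)
```

When all groups have the same size, the three expressions coincide, which may be why the original formula went unnoticed on balanced data.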

nalimilan · 2022-06-10T20:00:12Z

Could you also add a test that failed before the PR?

fix JuliaStats#242

a229016

devmotion reviewed May 29, 2022

fixed ANOVA test

f0ff08e

nalimilan requested a review from devmotion August 25, 2022 21:24
