NAs in agePrecision #9

droglenc · 2016-02-02T18:24:53Z

Here is the super brief summary of my confusion:

When what = “difference” or “absolute difference” the N used to calculate the % appears to exclude the NA values. This makes sense to me.
When what = “precision”, the calculation appears to count NA values as “agreement”.

The long winded version of my confusion follows:

First, a simple example using the formula notation two different ways to achieve the same result (i.e., simply compare agerA1 and agerA2):

Example 1)
ap.A<-agePrecision(~agerA1+agerA2,data=shad)
summary(ap.A,what="precision")

Example 2)
ap.Ax<-agePrecision(agerA1~agerA2,data=shad)
summary(ap.Ax,what="precision")

How does agePrecision handle “NA” values? Did I miss something in the documentation? In the shad data, agerA1 codes two of the samples as “NA” and agerA2 codes the same two scales as NA on the second read, leaving 51 samples with an estimated age. There is no difference between agerA1 and agerA2 for 26 scales, so 26/51 = 50.98% (NA excluded). However, to reproduce the result calculated by agePrecision it is necessary to include the two NA values and count them as “agreement”: (26+2)/53 = 52.83%. I suppose that, as long as the reader is consistent in assigning scales as “NA”, it makes sense to include the NA values in the percent agreement calculation because NA is functionally serving as a value, and in this example reader A agreed across both reads on which scales were NA.
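The two percentages above can be reproduced from the counts quoted in this post alone. This is only an arithmetic sketch; the variable names are illustrative and it does not call agePrecision or reflect its internals.

```r
# Counts quoted in the post for agerA1 vs. agerA2 (names are illustrative).
n_total   <- 53   # all scales
n_na_both <- 2    # scales coded NA on both reads
n_agree   <- 26   # scales with identical non-missing ages

n_valid <- n_total - n_na_both   # 51 scales with an estimated age

# NA excluded from numerator and denominator.
pa_excl <- 100 * n_agree / n_valid

# NA rows counted as agreement, all scales in the denominator.
pa_incl <- 100 * (n_agree + n_na_both) / n_total

round(pa_excl, 2)   # 50.98
round(pa_incl, 2)   # 52.83
```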

What if I try comparing agerA1 and agerB1?

ap.AB<-agePrecision(~agerA1+agerB1,data=shad)
summary(ap.AB,what="difference")
-2 -1 0 1 2
18.182 15.152 45.455 12.121 9.091
The 45.455 is 15 agreed out of 33, so it looks like the 20 NAs were excluded. That sounds reasonable.

summary(ap.AB,what="precision")
n R ACV APE PercAgree
53 2 12.17 8.608 66.04
Now I am confused. To get 66.04% it is necessary to count the 20 NAs as agreement and divide by 53: (20 NAs + 15 agreed)/53 * 100 = 66.04%.

How do “NA” values work with more than two reads? Consider the example at the bottom of page 84.

ap.ABC<-agePrecision(~agerA1 + agerB1 + agerC1, data = shad)
summary(ap.ABC, what = "difference")
summary(ap.ABC, what = "precision")

Footnote 7 says “The sample size is much smaller ...because Ager C did not estimate an age for several fish.” Should it be Ager “B” who did not estimate ages? Either way, there are 33 samples where all 3 agers give an age. The summary results for what = "difference" are easy to calculate, and the row totals clearly show that NA values are excluded. For example, ap.ABC$absdiff shows row totals of 33 (A vs B), 51 (A vs C), and 33 (B vs C). With these row totals, I can duplicate the results in summary(ap.ABC, what="difference"). If what = "difference" or "absolute difference", NA values appear to be excluded from the N.

However, I am again confused by the inclusion of the “NA” values when calculating PercAgree with what = "precision". With reader B there are 20 instances where the scale age is NA, and there are only 2 instances where all three readers agree. Therefore, PercAgree is calculated as (20 NA values + 2 where everyone agrees)/53 * 100 = 41.51%. Why is the NA being counted as an agreement when reader B essentially said “I can’t read the scale”? Should percent agreement be calculated on all 53 samples or only on the 33 samples where everyone provided an age? If we only use the 33 samples with an age, then 2/33 = 6.06%.

Basically I am confused because it seems like NA values are not being treated the same when what is changed from “difference” to “precision”.
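To make the two treatments concrete, here is a minimal sketch on hypothetical toy data (not the shad data). It is not the FSA implementation; it only shows how an agreement rule that ignores missing ages ends up counting NA rows as agreement, while restricting to complete rows does not.

```r
# Hypothetical toy data: two readers, five fish (not from ShadCR.csv).
ages <- data.frame(r1 = c(3, 4, NA, 5, 6),
                   r2 = c(3, 5, NA, 5, NA))

# Rows where every reader supplied an age.
valid <- complete.cases(ages)

# A row "agrees" when its non-missing ages are all identical; rows that
# are entirely or partly NA therefore pass this test trivially.
agree <- apply(ages, 1, function(x) length(unique(x[!is.na(x)])) <= 1)

# NA-inclusive: NA rows count as agreement and n is every row.
pa_all <- 100 * sum(agree) / nrow(ages)

# NA-exclusive: only complete rows enter numerator and denominator.
pa_valid <- 100 * sum(agree[valid]) / sum(valid)

pa_all               # 80
round(pa_valid, 2)   # 66.67
```

In this toy case the NA-inclusive figure is inflated by the two rows in which at least one reader gave no age, which is the pattern described above for readers A and B.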

droglenc · 2016-02-06T21:41:12Z

I have corrected the bugs associated with this problem. Use the latest development version of FSA to get the corrections illustrated below.

The relevant examples from above are shown below.

> library(FSA)
> # User must set working directory appropriately.
> shad <- read.csv("ShadCR.csv")

Now the results that compare ager A and B match when using what="difference" and what="precision".

> ap.AB <- agePrecision(~agerA1+agerB1,data=shad)
> summary(ap.AB,what="difference")
    -2     -1      0      1      2 
18.182 15.152 45.455 12.121  9.091 
> summary(ap.AB,what="absolute")
    0     1     2 
45.45 27.27 27.27 
> summary(ap.AB,what="precision")
  n validn R   ACV   APE PercAgree
 53     33 2 12.17 8.608     45.45

The same holds in the three-way comparisons ...

> ap.ABC <- agePrecision(~agerA1+agerB1+agerC1,data=shad)
> summary(ap.ABC,what="difference")
                    -2     -1      0      1      2      3      4
agerA1 - agerB1 18.182 15.152 45.455 12.121  9.091  0.000  0.000
agerA1 - agerC1  0.000  5.882 19.608 41.176 19.608 11.765  1.961
agerB1 - agerC1  0.000  6.061  6.061 36.364 36.364 15.152  0.000
> summary(ap.ABC,what="absolute")
                      0      1      2      3      4
agerA1 v. agerB1 45.455 27.273 27.273  0.000  0.000
agerA1 v. agerC1 19.608 47.059 19.608 11.765  1.961
agerB1 v. agerC1  6.061 42.424 36.364 15.152  0.000
> summary(ap.ABC,what="precision")
  n validn R   ACV  APE PercAgree
 53     33 3 22.98 16.7     6.061
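As a quick arithmetic check, the PercAgree values can be recomputed from the counts quoted in the original report (15 two-way agreements, 2 three-way agreements, 20 fish without an age from Ager B). The variable names below are illustrative and nothing here calls FSA.

```r
# Counts quoted in the report (names are illustrative).
n_total   <- 53
n_na      <- 20                 # fish with no age from Ager B
valid_n   <- n_total - n_na     # 33, matching the validn column
agree_AB  <- 15                 # A and B agree
agree_ABC <- 2                  # A, B, and C all agree

# NA rows excluded: should match PercAgree in the corrected output.
new_AB  <- round(100 * agree_AB  / valid_n, 2)   # 45.45
new_ABC <- round(100 * agree_ABC / valid_n, 3)   # 6.061

# NA rows counted as agreement: the values questioned in the report.
old_AB  <- round(100 * (agree_AB  + n_na) / n_total, 2)   # 66.04
old_ABC <- round(100 * (agree_ABC + n_na) / n_total, 2)   # 41.51
```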

And, yes, it was Ager B that did not age many fish. I will make an errata entry for the book.

droglenc self-assigned this Feb 2, 2016

droglenc closed this as completed Feb 20, 2016
