Trouble with unbalanced design #1

waltermaldonado · 2015-02-04T22:16:40Z

First, congratulations on the good work; I really like your package.
I think I found an issue when working with unbalanced cases. As we know, in those cases the msd calculation has to use a separate variance estimate for each contrast. This can cause a problem: a mean can differ significantly from a closer mean but not from one further down the ranking, because the error estimate for the latter comparison may be larger.
I will try to explain with an example:

Browse[5]> m.tmp
       7        6        7        5 
5.389086 4.935529 4.868243 4.862371 

Browse[5]> msd
            MELOID/P MELOID/PS MELOID/V21 MELOID/TG
MELOID/P   0.0000000 0.5012896  0.4816234 0.5275920
MELOID/PS  0.5012896 0.0000000  0.5012896 0.5456037
MELOID/V21 0.4816234 0.5012896  0.0000000 0.5275920
MELOID/TG  0.5275920 0.5456037  0.5275920 0.0000000

Browse[5]> difm
            MELOID/P  MELOID/PS  MELOID/V21   MELOID/TG
MELOID/P   0.0000000 0.45355714 0.520842857 0.526714286
MELOID/PS  0.4535571 0.00000000 0.067285714 0.073157143
MELOID/V21 0.5208429 0.06728571 0.000000000 0.005871429
MELOID/TG  0.5267143 0.07315714 0.005871429 0.000000000

Browse[5]> dif
           MELOID/P MELOID/PS MELOID/V21 MELOID/TG
MELOID/P      FALSE     FALSE       TRUE     FALSE
MELOID/PS     FALSE     FALSE      FALSE     FALSE
MELOID/V21     TRUE     FALSE      FALSE     FALSE
MELOID/TG     FALSE     FALSE      FALSE     FALSE

As you can see, the first row is F F T F, where we would expect F F T T, since the means are sorted in decreasing order. This happens because the error estimate for the last comparison is larger. It may be expected behavior, but it leads to an infinite loop.
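For readers who want to reproduce the pattern without the debugger, here is a minimal sketch that rebuilds the logical matrix from the `msd` and `difm` values printed above. It assumes that `dif` is simply `difm > msd` element by element, which matches the posted output but is not confirmed from the package source; the variable names `nm`, `first_row` and `is_monotone` are only illustrative.

```r
# Treatment names, in decreasing order of their means (as posted above)
nm <- c("MELOID/P", "MELOID/PS", "MELOID/V21", "MELOID/TG")

# Minimum significant differences, one per pair (unbalanced, so they vary)
msd <- matrix(c(0.0000000, 0.5012896, 0.4816234, 0.5275920,
                0.5012896, 0.0000000, 0.5012896, 0.5456037,
                0.4816234, 0.5012896, 0.0000000, 0.5275920,
                0.5275920, 0.5456037, 0.5275920, 0.0000000),
              nrow = 4, byrow = TRUE, dimnames = list(nm, nm))

# Absolute differences between the means
difm <- matrix(c(0.000000000, 0.453557140, 0.520842857, 0.526714286,
                 0.453557140, 0.000000000, 0.067285714, 0.073157143,
                 0.520842857, 0.067285714, 0.000000000, 0.005871429,
                 0.526714286, 0.073157143, 0.005871429, 0.000000000),
               nrow = 4, byrow = TRUE, dimnames = list(nm, nm))

# Assumption: a pair is flagged when its difference exceeds its own msd
dif <- difm > msd
print(dif)

# With means sorted in decreasing order, the first row would normally go
# from FALSE to TRUE once and stay TRUE; here it goes back to FALSE.
first_row <- dif[1, ]
is_monotone <- !is.unsorted(as.integer(first_row))
print(is_monotone)
```

The last comparison fails by a small margin: the difference between MELOID/P and MELOID/TG is 0.5267, while the msd for that pair is 0.5276.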

Thanks in advance


jcfaria · 2015-02-04T22:42:36Z

The package is designed to handle unbalanced experiments only in a completely randomized design.

In other designs it is necessary to estimate the values of the missing plots.

waltermaldonado · 2015-02-04T22:46:48Z

That is the case here, and the design doesn't matter for this issue. What matters is the number of replications and the msd calculation, which produce the logical matrix. The infinite loop occurs while forming the groups (assigning the letters to the means).
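To illustrate that letters can be assigned to a non-monotone pattern without looping forever, here is a rough sketch of an insert-and-absorb style grouping. It is not the package's grouping code, and `assign_letters` is a hypothetical helper written only for this illustration. It takes a symmetric logical matrix in which TRUE means the pair differs, with rows already sorted by decreasing mean. The loop runs once per significant pair, so it always terminates.

```r
# Hypothetical helper, not part of TukeyC.
assign_letters <- function(dif) {
  n <- nrow(dif)
  groups <- list(seq_len(n))  # start with one group holding every mean
  pairs <- which(dif & upper.tri(dif), arr.ind = TRUE)

  for (p in seq_len(nrow(pairs))) {
    i <- pairs[p, 1]
    j <- pairs[p, 2]

    # Insert: split every group that still contains both i and j
    new_groups <- list()
    for (g in groups) {
      if (i %in% g && j %in% g) {
        new_groups <- c(new_groups, list(setdiff(g, i)), list(setdiff(g, j)))
      } else {
        new_groups <- c(new_groups, list(g))
      }
    }

    # Absorb: drop any group that is contained in another kept group
    keep <- rep(TRUE, length(new_groups))
    for (a in seq_along(new_groups)) {
      for (b in seq_along(new_groups)) {
        if (a != b && keep[a] && keep[b] &&
            all(new_groups[[a]] %in% new_groups[[b]])) {
          keep[a] <- FALSE
        }
      }
    }
    groups <- new_groups[keep]
  }

  # Order groups so the one containing the highest mean gets "a"
  groups <- groups[order(sapply(groups, min))]

  # Each mean gets the letters of all groups it belongs to
  labels <- vapply(seq_len(n), function(k) {
    in_group <- vapply(groups, function(g) k %in% g, logical(1))
    paste(letters[which(in_group)], collapse = "")
  }, character(1))
  names(labels) <- rownames(dif)
  labels
}

# The F F T F pattern from the first post: only P and V21 differ
nm <- c("MELOID/P", "MELOID/PS", "MELOID/V21", "MELOID/TG")
dif <- matrix(FALSE, 4, 4, dimnames = list(nm, nm))
dif["MELOID/P", "MELOID/V21"] <- TRUE
dif["MELOID/V21", "MELOID/P"] <- TRUE

res <- assign_letters(dif)
print(res)
```

For this pattern the sketch gives MELOID/P "a", MELOID/PS "ab", MELOID/V21 "b" and MELOID/TG "ab", so MELOID/TG shares a letter with both MELOID/P and MELOID/V21 even though those two differ from each other.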

waltermaldonado · 2015-02-04T22:59:16Z

You can test it with this data:

       [,1] [,2] [,3]   [,4]
  [1,]    4    4    1     NA
  [2,]    4    4    2     NA
  [3,]    4    4    3     NA
  [4,]    4    4    4     NA
  [5,]    4    4    5     NA
  [6,]    4    4    6     NA
  [7,]    4    4    7     NA
  [8,]    4    2    1     NA
  [9,]    4    2    2     NA
 [10,]    4    2    3     NA
 [11,]    4    2    4     NA
 [12,]    4    2    5     NA
 [13,]    4    2    6     NA
 [14,]    4    2    7     NA
 [15,]    4    1    1     NA
 [16,]    4    1    2     NA
 [17,]    4    1    3     NA
 [18,]    4    1    4     NA
 [19,]    4    1    5     NA
 [20,]    4    1    6     NA
 [21,]    4    1    7     NA
 [22,]    4    3    1     NA
 [23,]    4    3    2     NA
 [24,]    4    3    3     NA
 [25,]    4    3    4     NA
 [26,]    4    3    5     NA
 [27,]    4    3    6     NA
 [28,]    4    3    7     NA
 [29,]    2    4    1     NA
 [30,]    2    4    2 2.8482
 [31,]    2    4    3 2.9122
 [32,]    2    4    4 3.3107
 [33,]    2    4    5 2.2380
 [34,]    2    4    6 2.9845
 [35,]    2    4    7 2.5378
 [36,]    2    2    1 3.4613
 [37,]    2    2    2 2.8603
 [38,]    2    2    3 2.5623
 [39,]    2    2    4 2.8882
 [40,]    2    2    5 2.5011
 [41,]    2    2    6 2.9269
 [42,]    2    2    7 2.9015
 [43,]    2    1    1 3.5569
 [44,]    2    1    2 3.7955
 [45,]    2    1    3 3.6112
 [46,]    2    1    4 3.8429
 [47,]    2    1    5 3.5976
 [48,]    2    1    6     NA
 [49,]    2    1    7 3.3724
 [50,]    2    3    1     NA
 [51,]    2    3    2 2.5011
 [52,]    2    3    3 1.9685
 [53,]    2    3    4 1.9685
 [54,]    2    3    5 2.5428
 [55,]    2    3    6 2.5428
 [56,]    2    3    7 2.5065
 [57,]    3    4    1 2.5763
 [58,]    3    4    2     NA
 [59,]    3    4    3 2.7243
 [60,]    3    4    4 2.1206
 [61,]    3    4    5 1.3617
 [62,]    3    4    6 2.7316
 [63,]    3    4    7 2.7152
 [64,]    3    2    1 2.8306
 [65,]    3    2    2 2.9736
 [66,]    3    2    3 2.9269
 [67,]    3    2    4 2.6857
 [68,]    3    2    5 2.6304
 [69,]    3    2    6 2.1732
 [70,]    3    2    7 2.5527
 [71,]    3    1    1 3.8564
 [72,]    3    1    2     NA
 [73,]    3    1    3 4.0624
 [74,]    3    1    4 2.6405
 [75,]    3    1    5 3.2266
 [76,]    3    1    6 2.6821
 [77,]    3    1    7     NA
 [78,]    3    3    1 2.2577
 [79,]    3    3    2 2.2577
 [80,]    3    3    3     NA
 [81,]    3    3    4 2.3892
 [82,]    3    3    5 2.2304
 [83,]    3    3    6 2.0969
 [84,]    3    3    7     NA
 [85,]    1    4    1 4.5652
 [86,]    1    4    2 4.7870
 [87,]    1    4    3 4.9181
 [88,]    1    4    4 4.9714
 [89,]    1    4    5 4.6355
 [90,]    1    4    6 5.1127
 [91,]    1    4    7 5.0878
 [92,]    1    2    1 5.2887
 [93,]    1    2    2 4.6948
 [94,]    1    2    3 4.9877
 [95,]    1    2    4 4.5079
 [96,]    1    2    5 4.5978
 [97,]    1    2    6 5.6132
 [98,]    1    2    7 4.8586
 [99,]    1    1    1 5.3729
[100,]    1    1    2 5.3759
[101,]    1    1    3 4.9544
[102,]    1    1    4 5.1692
[103,]    1    1    5 5.6962
[104,]    1    1    6 5.7236
[105,]    1    1    7 5.4314
[106,]    1    3    1 4.8786
[107,]    1    3    2 4.7324
[108,]    1    3    3 4.6357
[109,]    1    3    4 5.1474
[110,]    1    3    5 4.6245
[111,]    1    3    6 5.0980
[112,]    1    3    7 4.9200

It is a 2-factor CRD and the issue happens here:

TukeyC.nest(k, model='X3 ~ X1 * X2', which='X1:X2', fl1=1)

jcfaria · 2015-02-04T23:34:43Z

Could you please post here (or send me) the structure of the data (based on the example above, I think it is named "k")?
I'll need to debug it to find the source of the bug.

dput(k)

For example:

dput(BOD)
structure(list(Time = c(1, 2, 3, 4, 5, 7), demand = c(8.3, 10.3,
19, 16, 15.6, 19.8)), .Names = c("Time", "demand"), row.names = c(NA,
-6L), class = "data.frame", reference = "A1.4, p. 270")

waltermaldonado · 2015-02-05T00:03:08Z

TukeyC.nest(k, model='X4 ~ X1 * X2', which='X1:X2', fl1=1)

> dput(k)
structure(list(X1 = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("4", "2", "3", "1"
), class = "factor"), X2 = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 
4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("3", "2", 
"4", "1"), class = "factor"), X3 = c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), X4 = c(NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, 2.8482, 2.9122, 3.3107, 2.238, 
2.9845, 2.5378, 3.4613, 2.8603, 2.5623, 2.8882, 2.5011, 2.9269, 
2.9015, 3.5569, 3.7955, 3.6112, 3.8429, 3.5976, NA, 3.3724, NA, 
2.5011, 1.9685, 1.9685, 2.5428, 2.5428, 2.5065, 2.5763, NA, 2.7243, 
2.1206, 1.3617, 2.7316, 2.7152, 2.8306, 2.9736, 2.9269, 2.6857, 
2.6304, 2.1732, 2.5527, 3.8564, NA, 4.0624, 2.6405, 3.2266, 2.6821, 
NA, 2.2577, 2.2577, NA, 2.3892, 2.2304, 2.0969, NA, 4.5652, 4.787, 
4.9181, 4.9714, 4.6355, 5.1127, 5.0878, 5.2887, 4.6948, 4.9877, 
4.5079, 4.5978, 5.6132, 4.8586, 5.3729, 5.3759, 4.9544, 5.1692, 
5.6962, 5.7236, 5.4314, 4.8786, 4.7324, 4.6357, 5.1474, 4.6245, 
5.098, 4.92)), .Names = c("X1", "X2", "X3", "X4"), row.names = c(NA, 
-112L), class = "data.frame")

jcfaria closed this as completed Dec 20, 2016
