Added an RCBD design example with two main independent variables. #34

moorepants · 2015-07-28T18:50:20Z

@msimmond I've moved this RCBD example to this pull request. I'll make a pull request for each example so we can discuss the code for each example.

Stuff:

> source('rcbd_two_var.R', print.eval = TRUE)
Loading required package: lattice
Loading required package: grid
Loading required package: latticeExtra
Loading required package: RColorBrewer
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
Loading required package: TH.data
Loading required package: gridExtra

Attaching package: ‘HH’

The following objects are masked from ‘package:car’:

    logit, vif

Use suppressPackageStartupMessages to eliminate package startup messages.

Attaching package: ‘ggplot2’

The following object is masked from ‘package:latticeExtra’:

    layer

Loading required package: plyr
Shapiro-Wilk Normality Test
===============================================================================

    Shapiro-Wilk normality test

data:  residuals(model)
W = 0.9474, p-value = 0.08646

===============================================================================
Levene's Test
===============================================================================
Levene's Test for Homogeneity of Variance (center = median)
      Df F value  Pr(>F)  
group  2  5.2195 0.01073 *
      33                  

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  1.0005 0.3786
      33               
===============================================================================
Tukey 1-df Test for Non-additivity
===============================================================================
Analysis of Variance Table

Response: log10.yield
               Df  Sum Sq  Mean Sq  F value    Pr(>F)    
clone           2 0.60563 0.302815 165.9591 2.155e-14 ***
nitrogen        2 0.05674 0.028371  15.5491 5.349e-05 ***
block           3 0.03799 0.012662   6.9395  0.001713 ** 
squared.preds   1 0.01893 0.018925  10.3722  0.003787 ** 
clone:nitrogen  4 0.00993 0.002482   1.3605  0.278312    
Residuals      23 0.04197 0.001825                       

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
===============================================================================
ANOVA Table
===============================================================================
Analysis of Variance Table

Response: log10.yield
               Df  Sum Sq  Mean Sq  F value    Pr(>F)    
block           3 0.03799 0.012662   6.3335  0.002562 ** 
clone           2 0.60563 0.302815 151.4680 2.449e-14 ***
nitrogen        2 0.05674 0.028371  14.1914 8.556e-05 ***
clone:nitrogen  4 0.02284 0.005710   2.8563  0.045536 *  
Residuals      24 0.04798 0.001999                       

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
===============================================================================
Least Significant Difference
===============================================================================

Study: model ~ c("clone", "nitrogen")

LSD t Test for log10.yield 

Mean Square Error:  0.001999199 

clone:nitrogen,  means and individual ( 95 %) CI

      log10.yield        std r       LCL       UCL       Min       Max
c1:n1   0.8701454 0.07582428 4 0.8240045 0.9162864 0.7781513 0.9542425
c1:n2   0.9256076 0.06669088 4 0.8794667 0.9717486 0.8450980 1.0000000
c1:n3   0.9528938 0.03959399 4 0.9067528 0.9990347 0.9030900 1.0000000
c2:n1   1.1288360 0.04171115 4 1.0826950 1.1749769 1.0791812 1.1760913
c2:n2   1.2234903 0.02455571 4 1.1773494 1.2696313 1.2041200 1.2552725
c2:n3   1.3052361 0.03735114 4 1.2590952 1.3513771 1.2552725 1.3424227
c3:n1   0.9746813 0.05953852 4 0.9285404 1.0208222 0.9030900 1.0413927
c3:n2   0.9726891 0.07497815 4 0.9265481 1.0188300 0.9030900 1.0791812
c3:n3   1.0072647 0.06313851 4 0.9611238 1.0534057 0.9542425 1.0791812

alpha: 0.05 ; Df Error: 24
Critical Value of t: 2.063899 

Least Significant Difference 0.06525313
Means with the same letter are not significantly different.

Groups, Treatments and means
a    c2:n3   1.305 
b    c2:n2   1.223 
c    c2:n1   1.129 
d    c3:n3   1.007 
de   c3:n1   0.9747 
de   c3:n2   0.9727 
de   c1:n3   0.9529 
ef   c1:n2   0.9256 
f    c1:n1   0.8701 
===============================================================================

msimmond · 2015-07-29T21:31:15Z

examples/rcbd_two_var.R

+sep(50)
+
+# Plot the mean yield with respect to each virus level.
+# TODO : Is the default plot ok here, or is something different desired?


The type of plots is good, but I think the units should be added to the y axis in parentheses. Can an option be added to the GUI where the user enters the units?

Yes, we can add units to plots via the GUI, but let's just leave that as icing on the cake for now. I opened an issue: #37

moorepants · 2015-07-29T22:18:27Z

examples/rcbd_two_var.R

+cat("Levene's Test\n")
+sep(79)
+#------------------------------------------------------------------------------#
+#leveneTest(model)


@msimmond I don't know what I'm doing here. Levene's test is giving me an error for this model. (Note that you can pass in the formula and the data, or like here, just the lm/aov model object.)

msimmond · 2015-07-30T18:06:34Z

examples/rcbd_two_var.R

+#------------------------------------------------------------------------------#
+sep(79)
+
+# The user will select the alpha level from the app.


I think we should keep this fixed at p = 0.05

Ok, that makes it easier. I forget why I thought it should be selectable by the user, maybe someone said that at some point.

msimmond · 2015-07-30T18:23:13Z

examples/rcbd_two_var.R

+#------------------------------------------------------------------------------#
+sep(50)
+
+# Show the confidence intervals for the intercept and each virus level.


is this comment left over from something else? there's no virus or intercept that I know of

yes, copy paste error

moorepants · 2015-07-30T21:07:44Z

Ok, maybe the only thing left on this one is to add in the correct transformation.

msimmond · 2015-08-03T22:01:33Z

examples/rcbd_two_var.R

+sep(50)
+
+# Run Levene's Test for a one-way ANOVA of each of the main factors.
+# TODO : clone is significant in Levene's test so transformation is necessary.


transformations added above. log-transformation was best fit.

moorepants · 2015-08-10T19:36:24Z

@msimmond I'm reading the transformation code here. What transformations do you want to support? Just base 10 log and raising to a power? Should we be doing the calculation to get the best exponent be done for the user should we simply let the user type in the desired value of the exponent?

msimmond · 2015-08-10T19:50:15Z

For now I think 3 transformations should be available: base 10 log, square root, and raising to a power using the algorithm to find best fit.

moorepants · 2015-08-10T19:55:18Z

Ok, that sounds fine.

moorepants · 2015-08-10T20:31:04Z

@msimmond I've updated this example to strictly use the log10 transform and added a bar plot for the LSD.test. Is this correct? See the results at the top of this PR.

moorepants · 2015-08-10T20:32:37Z

Should the confint() call be removed from all examples like we did in the CRD one?

moorepants · 2015-08-10T20:35:00Z

I think the two interaction plots are redundant, is that the case?

moorepants · 2015-08-12T05:07:10Z

From Maegen: The different transformations weren't meant to be run in sequence. I was using it to test each one to see how it affected the assumption tests - sorry for not specifying. For the app, it would need functions to identify each transformation as a new variable (which I think you did). Ex. possible dependent variables: factorA, log10.factorA, power.factorA.

And each time the user selects a new transformation (without reloading data), I assume it's applying it to the original variable that's selected in the dropdown for 'dependent variable'. That would be the correct way.

@msimmond One question for this one:

You had a triple transformation, something like sqrt(log10(Y))^power and I changed it to simply a log10(Y) tranformation because that is what you had in the comments. Is this correct?

moorepants · 2015-08-12T05:28:19Z

With the log10 transform I'm still seeing a sig p value in the Levene Test for clone.

msimmond · 2015-08-18T16:31:34Z

examples/rcbd_two_var.R

+sep(79)
+#------------------------------------------------------------------------------#
+leveneTest(yield ~ clone, data=my.data)
+leveneTest(yield ~ nitrogen, data=my.data)


Noticed the Levene's tests are still using the original units, and not log10. It seems like it's corrected in the app at least for this example.

msimmond · 2015-08-18T16:46:53Z

I'm not seeing the sig p value in levene's for clone with the log 10 transform in the app.

msimmond · 2015-08-18T21:08:13Z

examples/rcbd_two_var.R

+sep(79)
+#------------------------------------------------------------------------------#
+lsd.results <- LSD.test(model, c('clone', 'nitrogen'), console = TRUE)
+#------------------------------------------------------------------------------#


Since talking with Lauren, we decided that we don't need to back-transform the LS means (lsd.results$means$yield or lsd.results$groups$means). However, what we want here is for the user to view 2 outputs: First, a table for the lsd.results of the transformed dataset (done in line 148), and second, a partial amended table of the lsd.results of the original dataset. For this second table based on the original dataset: i.e., aov(yield ~ block + clone + nitrogen + clone:nitrogen, data=my.data), we want the user to see the lsd.results$means table with the corresponding LSD letters (lsd.results$groups$M) from the transformed dataset copied to it, by matching treatments. The second dataframe would be used for generating the post-hoc plot(s).

This is where I need your help. I've been working on creating the second table, but since there isn't a column within lsd.results$means specifying the treatment names, I don't know how join the column lsd.results$groups$M from transformed dataframe to the original dataframe lsd.results$means using 'by' statement.

Whatever we create from this, will be fed into the ggplot, and I want to make sure that we don't show or use the lsd letters from the original dataset.

Not sure how to do this.

This is the workflow I have so far:

'# When there is a transformed variable, as in this example, do second lsd.test on original dataset to obtain dataframe '# lsd.results.original, but replace lsd.results.original$groups$M with lsd.results$groups$M from transformed dataset.

'# Do lsd test on original data
model.original <- aov(yield ~ block + clone + nitrogen + clone:nitrogen,
data=my.data)
get.orig.LSmeans <- LSD.test(model.original, c('clone', 'nitrogen')) #Don't show output yet
'# Replace get.orig.LSmeans$groups$M with lsd.test$groups$M
??
'# Show single table with LS mean, std, r, LCL, UCL, min, max, and lsd.test$groups$M. Display with heading:
' # 'Lease square means of original dataset. Note that LSD test was performed on log-transformed dataset.' Followed by 'ATTENTION: When reporting LSD groups in the original units shown here, remember to specify that the LSD test was performed on log-transformed data.'

moorepants · 2015-08-18T21:36:30Z

Ok, I'll look at this soon. In the meantime does this do what you want?:

https://github.com/ucd-ipo/aip-analysis/pull/51/files#diff-07838b7a58641aa441c07b70abcfaa1fR843

I wrote this code to merge the summarySE and lsd.results$groups tables.

msimmond · 2015-08-18T21:57:44Z

Yes, but not for the case where you have a transformed variable. If it's for an analysis of a transformed variable, columns 5, 6, 7, 8, and 9 would be replaced with the output from the original variable. Column 10 'M' would stay the same.

Added an RCBD design example with two main independent variables.

moorepants added 2 commits July 28, 2015 11:49

Added an RCBD design example with two main independent variables.

0b73ce6

Spelling.

4cd52e1

msimmond reviewed Jul 29, 2015
View reviewed changes

moorepants mentioned this pull request Jul 29, 2015

Let users set units on plots #37

Closed

Tidied it up and tried to add Levene's Test but failed.

2412169

moorepants reviewed Jul 29, 2015
View reviewed changes

Added levene's test for each treatment factor

b306452

msimmond reviewed Jul 30, 2015
View reviewed changes

all factors significant including interaction

b411e83

msimmond reviewed Jul 30, 2015
View reviewed changes

msimmond and others added 4 commits July 30, 2015 11:46

generate more explicit interaction plots

445aefb

Adding Tukey 1-df Test for Non-additivity

9ae071a

Create Box Plots

d6e5e80

Cleaned up the new plotting code.

6dd3e67

msimmond added 2 commits July 31, 2015 13:42

missing the interaction term in model

dba88b6

Added 3 transformation types

44b0571

msimmond reviewed Aug 3, 2015
View reviewed changes

Fixed sep length.

6f90824

Strictly using log10 transform, added final bar chart, output to png.

645f99f

moorepants added 2 commits August 10, 2015 23:28

Invisible.

2fb6a09

Removed confint.

0a2c7cc

msimmond reviewed Aug 18, 2015
View reviewed changes

use log10 transformed variable for Levene's test

bd4d74a

msimmond reviewed Aug 18, 2015
View reviewed changes

msimmond mentioned this pull request Aug 18, 2015

Add functions to transform datasets #13

Closed

moorepants mentioned this pull request Aug 19, 2015

Add function to "detransform" data #41

Closed

moorepants added a commit that referenced this pull request Aug 24, 2015

Merge pull request #34 from ucd-ipo/rcbd-example

cc7f65a

Added an RCBD design example with two main independent variables.

moorepants merged commit cc7f65a into master Aug 24, 2015

moorepants deleted the rcbd-example branch August 24, 2015 08:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added an RCBD design example with two main independent variables. #34

Added an RCBD design example with two main independent variables. #34

moorepants commented Jul 28, 2015

msimmond Jul 29, 2015

moorepants Jul 29, 2015

moorepants Jul 29, 2015

msimmond Jul 30, 2015

moorepants Jul 30, 2015

msimmond Jul 30, 2015

moorepants Jul 30, 2015

moorepants commented Jul 30, 2015

msimmond Aug 3, 2015

moorepants commented Aug 10, 2015

msimmond commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 12, 2015

moorepants commented Aug 12, 2015

msimmond Aug 18, 2015

msimmond commented Aug 18, 2015

msimmond Aug 18, 2015

moorepants commented Aug 18, 2015

msimmond commented Aug 18, 2015

Added an RCBD design example with two main independent variables. #34

Added an RCBD design example with two main independent variables. #34

Conversation

moorepants commented Jul 28, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

moorepants commented Jul 30, 2015

Choose a reason for hiding this comment

moorepants commented Aug 10, 2015

msimmond commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 10, 2015

moorepants commented Aug 12, 2015

moorepants commented Aug 12, 2015

Choose a reason for hiding this comment

msimmond commented Aug 18, 2015

Choose a reason for hiding this comment

moorepants commented Aug 18, 2015

msimmond commented Aug 18, 2015