Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added an RCBD design example with two main independent variables. #34

Merged
merged 16 commits into from
Aug 24, 2015

Conversation

moorepants
Copy link
Member

@msimmond I've moved this RCBD example to this pull request. I'll make a pull request for each example so we can discuss the code for each example.

Stuff:

  • Add in the transformation of the dependent variable.
  • Why is the "block" variable ignored in LSD?
  • Is this the desired interaction plot?
  • Levene's test: I getting this error "Model must be completely crossed formula only."
  • Add Tukey's 1-df test
  • Add box plot
  • Remove Tukey 1-df test?
  • Add in final bar plot from LSD.test() results.
  • Remove confint() call?
  • Switch to non-ggplot bar chart.
  • Confirm that the log10 transformation is correct.
  • Why are two interaction plots needed for two variables? Doesn't one show it?
> source('rcbd_two_var.R', print.eval = TRUE)
Loading required package: lattice
Loading required package: grid
Loading required package: latticeExtra
Loading required package: RColorBrewer
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
Loading required package: TH.data
Loading required package: gridExtra

Attaching package: ‘HH’

The following objects are masked from ‘package:car’:

    logit, vif

Use suppressPackageStartupMessages to eliminate package startup messages.

Attaching package: ‘ggplot2’

The following object is masked from ‘package:latticeExtra’:

    layer

Loading required package: plyr
Shapiro-Wilk Normality Test
===============================================================================

    Shapiro-Wilk normality test

data:  residuals(model)
W = 0.9474, p-value = 0.08646

===============================================================================
Levene's Test
===============================================================================
Levene's Test for Homogeneity of Variance (center = median)
      Df F value  Pr(>F)  
group  2  5.2195 0.01073 *
      33                  

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  1.0005 0.3786
      33               
===============================================================================
Tukey 1-df Test for Non-additivity
===============================================================================
Analysis of Variance Table

Response: log10.yield
               Df  Sum Sq  Mean Sq  F value    Pr(>F)    
clone           2 0.60563 0.302815 165.9591 2.155e-14 ***
nitrogen        2 0.05674 0.028371  15.5491 5.349e-05 ***
block           3 0.03799 0.012662   6.9395  0.001713 ** 
squared.preds   1 0.01893 0.018925  10.3722  0.003787 ** 
clone:nitrogen  4 0.00993 0.002482   1.3605  0.278312    
Residuals      23 0.04197 0.001825                       

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
===============================================================================
ANOVA Table
===============================================================================
Analysis of Variance Table

Response: log10.yield
               Df  Sum Sq  Mean Sq  F value    Pr(>F)    
block           3 0.03799 0.012662   6.3335  0.002562 ** 
clone           2 0.60563 0.302815 151.4680 2.449e-14 ***
nitrogen        2 0.05674 0.028371  14.1914 8.556e-05 ***
clone:nitrogen  4 0.02284 0.005710   2.8563  0.045536 *  
Residuals      24 0.04798 0.001999                       

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
===============================================================================
Least Significant Difference
===============================================================================

Study: model ~ c("clone", "nitrogen")

LSD t Test for log10.yield 

Mean Square Error:  0.001999199 

clone:nitrogen,  means and individual ( 95 %) CI

      log10.yield        std r       LCL       UCL       Min       Max
c1:n1   0.8701454 0.07582428 4 0.8240045 0.9162864 0.7781513 0.9542425
c1:n2   0.9256076 0.06669088 4 0.8794667 0.9717486 0.8450980 1.0000000
c1:n3   0.9528938 0.03959399 4 0.9067528 0.9990347 0.9030900 1.0000000
c2:n1   1.1288360 0.04171115 4 1.0826950 1.1749769 1.0791812 1.1760913
c2:n2   1.2234903 0.02455571 4 1.1773494 1.2696313 1.2041200 1.2552725
c2:n3   1.3052361 0.03735114 4 1.2590952 1.3513771 1.2552725 1.3424227
c3:n1   0.9746813 0.05953852 4 0.9285404 1.0208222 0.9030900 1.0413927
c3:n2   0.9726891 0.07497815 4 0.9265481 1.0188300 0.9030900 1.0791812
c3:n3   1.0072647 0.06313851 4 0.9611238 1.0534057 0.9542425 1.0791812

alpha: 0.05 ; Df Error: 24
Critical Value of t: 2.063899 

Least Significant Difference 0.06525313
Means with the same letter are not significantly different.

Groups, Treatments and means
a    c2:n3   1.305 
b    c2:n2   1.223 
c    c2:n1   1.129 
d    c3:n3   1.007 
de   c3:n1   0.9747 
de   c3:n2   0.9727 
de   c1:n3   0.9529 
ef   c1:n2   0.9256 
f    c1:n1   0.8701 
===============================================================================

rcbd-two-var-box-plots
rcbd-two-var-fit-plots
rcbd-two-var-clone-int-plot
rcbd-two-var-nitrogen-int-plot
rcbd-two-var-clone-bar-graph

sep(50)

# Plot the mean yield with respect to each virus level.
# TODO : Is the default plot ok here, or is something different desired?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of plots is good, but I think the units should be added to the y axis in parentheses. Can an option be added to the GUI where the user enters the units?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can add units to plots via the GUI, but let's just leave that as icing on the cake for now. I opened an issue: #37

cat("Levene's Test\n")
sep(79)
#------------------------------------------------------------------------------#
#leveneTest(model)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@msimmond I don't know what I'm doing here. Levene's test is giving me an error for this model. (Note that you can pass in the formula and the data, or like here, just the lm/aov model object.)

#------------------------------------------------------------------------------#
sep(79)

# The user will select the alpha level from the app.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep this fixed at p = 0.05

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes it easier. I forget why I thought it should be selectable by the user, maybe someone said that at some point.

#------------------------------------------------------------------------------#
sep(50)

# Show the confidence intervals for the intercept and each virus level.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment left over from something else? there's no virus or intercept that I know of

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, copy paste error

@moorepants
Copy link
Member Author

Ok, maybe the only thing left on this one is to add in the correct transformation.

sep(50)

# Run Levene's Test for a one-way ANOVA of each of the main factors.
# TODO : clone is significant in Levene's test so transformation is necessary.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transformations added above. log-transformation was best fit.

@moorepants
Copy link
Member Author

@msimmond I'm reading the transformation code here. What transformations do you want to support? Just base 10 log and raising to a power? Should we be doing the calculation to get the best exponent be done for the user should we simply let the user type in the desired value of the exponent?

@msimmond
Copy link
Member

For now I think 3 transformations should be available: base 10 log, square root, and raising to a power using the algorithm to find best fit.

@moorepants
Copy link
Member Author

Ok, that sounds fine.

@moorepants
Copy link
Member Author

@msimmond I've updated this example to strictly use the log10 transform and added a bar plot for the LSD.test. Is this correct? See the results at the top of this PR.

@moorepants
Copy link
Member Author

Should the confint() call be removed from all examples like we did in the CRD one?

@moorepants
Copy link
Member Author

I think the two interaction plots are redundant, is that the case?

@moorepants
Copy link
Member Author

From Maegen: The different transformations weren't meant to be run in sequence. I was using it to test each one to see how it affected the assumption tests - sorry for not specifying. For the app, it would need functions to identify each transformation as a new variable (which I think you did). Ex. possible dependent variables: factorA, log10.factorA, power.factorA.

And each time the user selects a new transformation (without reloading data), I assume it's applying it to the original variable that's selected in the dropdown for 'dependent variable'. That would be the correct way.

@msimmond One question for this one:

You had a triple transformation, something like sqrt(log10(Y))^power and I changed it to simply a log10(Y) tranformation because that is what you had in the comments. Is this correct?

@moorepants
Copy link
Member Author

With the log10 transform I'm still seeing a sig p value in the Levene Test for clone.

sep(79)
#------------------------------------------------------------------------------#
leveneTest(yield ~ clone, data=my.data)
leveneTest(yield ~ nitrogen, data=my.data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed the Levene's tests are still using the original units, and not log10. It seems like it's corrected in the app at least for this example.

@msimmond
Copy link
Member

I'm not seeing the sig p value in levene's for clone with the log 10 transform in the app.

sep(79)
#------------------------------------------------------------------------------#
lsd.results <- LSD.test(model, c('clone', 'nitrogen'), console = TRUE)
#------------------------------------------------------------------------------#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since talking with Lauren, we decided that we don't need to back-transform the LS means (lsd.results$means$yield or lsd.results$groups$means). However, what we want here is for the user to view 2 outputs: First, a table for the lsd.results of the transformed dataset (done in line 148), and second, a partial amended table of the lsd.results of the original dataset. For this second table based on the original dataset: i.e., aov(yield ~ block + clone + nitrogen + clone:nitrogen, data=my.data), we want the user to see the lsd.results$means table with the corresponding LSD letters (lsd.results$groups$M) from the transformed dataset copied to it, by matching treatments. The second dataframe would be used for generating the post-hoc plot(s).

This is where I need your help. I've been working on creating the second table, but since there isn't a column within lsd.results$means specifying the treatment names, I don't know how join the column lsd.results$groups$M from transformed dataframe to the original dataframe lsd.results$means using 'by' statement.

Whatever we create from this, will be fed into the ggplot, and I want to make sure that we don't show or use the lsd letters from the original dataset.

Not sure how to do this.

This is the workflow I have so far:

'# When there is a transformed variable, as in this example, do second lsd.test on original dataset to obtain dataframe '# lsd.results.original, but replace lsd.results.original$groups$M with lsd.results$groups$M from transformed dataset.

'# Do lsd test on original data
model.original <- aov(yield ~ block + clone + nitrogen + clone:nitrogen,
data=my.data)
get.orig.LSmeans <- LSD.test(model.original, c('clone', 'nitrogen')) #Don't show output yet
'# Replace get.orig.LSmeans$groups$M with lsd.test$groups$M
??
'# Show single table with LS mean, std, r, LCL, UCL, min, max, and lsd.test$groups$M. Display with heading:
' # 'Lease square means of original dataset. Note that LSD test was performed on log-transformed dataset.' Followed by 'ATTENTION: When reporting LSD groups in the original units shown here, remember to specify that the LSD test was performed on log-transformed data.'

@moorepants
Copy link
Member Author

Ok, I'll look at this soon. In the meantime does this do what you want?:

https://github.com/ucd-ipo/aip-analysis/pull/51/files#diff-07838b7a58641aa441c07b70abcfaa1fR843

I wrote this code to merge the summarySE and lsd.results$groups tables.

@msimmond
Copy link
Member

Yes, but not for the case where you have a transformed variable. If it's for an analysis of a transformed variable, columns 5, 6, 7, 8, and 9 would be replaced with the output from the original variable. Column 10 'M' would stay the same.

moorepants added a commit that referenced this pull request Aug 24, 2015
Added an RCBD design example with two main independent variables.
@moorepants moorepants merged commit cc7f65a into master Aug 24, 2015
@moorepants moorepants deleted the rcbd-example branch August 24, 2015 08:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants