---
### **Data Bootcamp for Genomic Prediction in Plant Breeding** ###
### **University of Minnesota Plant Breeding Center** ###
#### **June 20 - 22, 2022** ####
---

### **Practical 5:  Modeling Genotype-Environment Interactions (GxE)** ###

<br />
<br />

#### **Source Scripts and Load Data**


In [1]:
WorkDir <- getwd()
setwd(WorkDir)

##Source in functions to be used
source("R_Functions/GS_Pipeline_Jan_2022_FnsApp.R")
source("R_Functions/bootcamp_functions.R")

gc()



Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



   *****       ***   vcfR   ***       *****
   This is vcfR 1.12.0 
     browseVignettes('vcfR') # Documentation
     citation('vcfR') # Citation
   *****       *****      *****       *****



Attaching package: ‘bWGR’


The following objects are masked from ‘package:NAM’:

    CNT, emBA, emBB, emBC, emBL, emCV, emDE, emEN, emGWA, emML, emML2,
    emRR, GAU, GRM, IMP, KMUP, KMUP2, markov, mkr, mkr2X, mrr, mrr2X,
    SPC, SPM, wgr



Attaching package: ‘emoa’


The following object is masked from ‘package:dplyr’:

    coalesce


Installing package into ‘/srv/rlibs’
(as ‘lib’ is unspecified)

also installing the dependencies ‘RcppArmadillo’, ‘RcppProgress’


Loading required package: Matrix

Loading required package: MASS


Attaching package: ‘MASS’


The following object is masked from ‘p

| | used | (Mb) | gc trigger | (Mb) | max used | (Mb) |
|---|---|---|---|---|---|---|
| Ncells | 5775666 | 308.5 | 11408738 | 609.3 | 6529186 | 348.7 |
| Vcells | 10050671 | 76.7 | 14786712 | 112.9 | 12255578 | 93.6 |


#### **Read Genotype File using vcfR** ####

In [2]:

##Load in genotype data. Use package vcfR to read in and work with vcf file.
infileVCF <- "Data/SoyNAM_Geno.vcf"
genotypes_VCF <- read.table(infileVCF)
vcf <- read.vcfR(infileVCF, verbose = FALSE)
vcf

***** Object of Class vcfR *****
5189 samples
20 CHROMs
4,292 variants
Object size: 171.1 Mb
25.41 percent missing data
*****        *****         *****


#### **Convert VCF file format to numerical matrix format.**
#### Final genotype matrix is geno_num

In [3]:
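## Extract genotype calls as character strings (e.g. "0/0", "0|1", "./.")
gt <- extract.gt(vcf, element = "GT", as.numeric = FALSE)
fix_T <- as_tibble(getFIX(vcf))

## Recode calls to dosage: hom-ref = 0, het = 0.5, hom-alt = 1, missing = NA.
## Homozygotes are recoded first, for both "/" (unphased) and "|" (phased)
## separators, so the remaining two-allele calls are heterozygotes.
gt2a <- gsub("1[/|]1","1",gt)
gt2b <- gsub("0[/|]0","0",gt2a)
gt2c <- gsub("[10][/|][10]","0.5",gt2b)
gt2d <- gsub("\\.[/|]\\.","NA",gt2c)

## as.numeric() turns the "NA" strings into NA (with a coercion warning)
gt2d_num<- apply(gt2d,2,as.numeric)
rownames(gt2d_num)<- rownames(gt2d)
## Transpose so that samples are in rows and markers in columns
geno_num <- t(gt2d_num)
dim(geno_num)
rm(list=grep("gt2",ls(),value=TRUE))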
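The recoding step can also be written as a lookup table, which makes the mapping from call to dosage explicit. The following is a minimal sketch on a made-up 2 x 4 call matrix; `gt_toy`, `dosage` and `geno_toy` are hypothetical names and are not part of the bootcamp scripts.

```r
## Toy genotype calls (hypothetical), mixing unphased "/" and phased "|" separators
gt_toy <- matrix(c("0/0", "0|1", "1/1", "./.",
                   "1|1", "1/0", "0|0", "0/1"),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("m1", "m2"), paste0("s", 1:4)))

## Lookup from normalised call to alternate-allele dosage on a 0 / 0.5 / 1 scale
dosage <- c("0/0" = 0, "0/1" = 0.5, "1/0" = 0.5, "1/1" = 1)

## Normalise phased calls to "/" and look each call up; unknown calls become NA
calls <- gsub("|", "/", as.vector(gt_toy), fixed = TRUE)
geno_toy <- matrix(unname(dosage[calls]), nrow = nrow(gt_toy),
                   dimnames = dimnames(gt_toy))

## Samples in rows, markers in columns
geno_toy <- t(geno_toy)
geno_toy
```

Any call that is not in the lookup (missing or multi-allelic) ends up as `NA` without a coercion warning.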


#### **Filter Genotypic Data**

In [4]:
##Filter markers on % missing
miss <- function(x){length(which(is.na(x)))}
mrkNA <- (apply(geno_num, MARGIN=2, FUN=miss))/dim(geno_num)[1]
ndx <- which(mrkNA > 0.2)

if (length(ndx)>0) geno_num2 <- geno_num[, -ndx] else geno_num2 <- geno_num

##Filter individuals on % missing
indNA <- (apply(geno_num2, MARGIN=1, FUN=miss))/dim(geno_num2)[2]
ndx2 <- which(indNA > 0.5)

 if(length(ndx2)>0) geno_num3 <- geno_num2[-ndx2, ] else geno_num3 <- geno_num2


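##Filter markers based on MAF (the mean dosage on the 0-1 scale is the allele frequency)
maf <- apply(geno_num3, MARGIN=2, FUN=mean, na.rm=TRUE)
ndx3 <- which(maf<0.05 | maf>0.95) 

## Subset geno_num3 (not geno_num2) so the individual filter above is retained
if (length(ndx3)>0) geno_num4 <- geno_num3[, -ndx3] else geno_num4 <- geno_num3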
  
dim(geno_num4)
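The same two marker filters can be expressed with `colMeans()` and logical indexing. This is a small sketch on a made-up dosage matrix (the object names `X`, `X1`, `X2` are hypothetical), using the same thresholds as above: at most 20% missing calls and an allele frequency between 0.05 and 0.95.

```r
## Toy dosage matrix (hypothetical): 6 lines x 4 markers, NA = missing call
X <- matrix(c(0,   1, NA, 0,
              0.5, 1, NA, 0,
              1,   1, NA, 0.5,
              0,   1, 1,  1,
              NA,  1, 0,  1,
              1,   1, NA, 0.5),
            nrow = 6, byrow = TRUE,
            dimnames = list(paste0("line", 1:6), paste0("mrk", 1:4)))

## Fraction of missing calls per marker; keep markers with <= 20% missing
mrk_missing <- colMeans(is.na(X))
X1 <- X[, mrk_missing <= 0.2, drop = FALSE]

## Allele frequency on the 0-1 dosage scale; drop near-monomorphic markers
freq <- colMeans(X1, na.rm = TRUE)
X2 <- X1[, freq >= 0.05 & freq <= 0.95, drop = FALSE]
colnames(X2)
```

Logical indexing with `drop = FALSE` keeps the result a matrix even if a single marker survives, and it needs no special case when nothing is filtered out.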

#### **Import Phenotypic Data and Merge Geno-Pheno Data**

In [5]:

pheno <- read.csv("Data/SoyNAM_Pheno.csv")

geno_num4_x <- cbind(rownames(geno_num4),geno_num4)
colnames(geno_num4_x)[1]<- "strain"

### Check strain names have same format in pheno and geno 
pheno[,1] <- gsub("[-.]","",pheno[,1])
geno_num4_x[,1] <- gsub("[-.]","",geno_num4_x[,1])

## Merge Geno and Pheno Data
Data <- merge(geno_num4_x,pheno,by="strain",all=TRUE)

## Remove rows with missing yield values 

YldNA_Indices <- which(is.na(Data$yield))
if(length(YldNA_Indices) >0){Data_Sub <- Data[-YldNA_Indices,]}else{Data_Sub <- Data}


genoStrain <- unique(as.character(geno_num4_x[,"strain"]))

genoStrainIndices <- which(Data_Sub[,"strain"] %in% genoStrain)
length(genoStrainIndices)
genoIndices <- grep("ss",colnames(geno_num4_x))
initGenoIndx <- genoIndices[1]
finalGenoIndx <- genoIndices[length(genoIndices)]
phenoIndices <- c(1,c((finalGenoIndx+1):ncol(Data_Sub)))

pheno_sub <- Data_Sub[genoStrainIndices,phenoIndices]
geno_num4b <- Data_Sub[genoStrainIndices,c(1,genoIndices)]


uniqueStrainIndices<- which(!duplicated(geno_num4b[,"strain"]))

if(length(uniqueStrainIndices)>0) {geno_num5 <- geno_num4b[uniqueStrainIndices,]}else{geno_num5 <- geno_num4b}

dim(geno_num5)

rm(geno_num4b)
rm(geno_num4)
rm(geno_num3)
rm(geno_num2)

### set 'yield' colname to 'Yield_blup'

yldCol <- which(colnames(pheno_sub) %in% "yield")
colnames(pheno_sub)[yldCol] <- "Yield_blup" 



#### **Subset Environments** 

In [6]:
### Select the first 3 environs (in table order) with more than 5100 evaluations (lines)  

env_sub <-  names(which(table(pheno_sub[,"environ"])>5100)[1:3])

env_sub_indices <- which(pheno_sub[,"environ"] %in% env_sub)

## Subset Data and Geno tables 
DT <- pheno_sub[env_sub_indices,]

DT$environ <- as.factor(DT$environ)

dim(DT)

#### **Impute Genotype Table** ###

In [7]:
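#### Impute missing genotypes with markov(). Both 'NAM' and 'bWGR' export markov(); 
#### 'bWGR' was attached last, so its version masks the 'NAM' one in this session 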

geno_imp <- markov(apply(geno_num5[,-1],2,as.numeric))
rownames(geno_imp) <- geno_num5[,"strain"]
dim(geno_imp)

In [8]:
### 
env_geno_sub_indices <- which(rownames(geno_imp) %in% unique(DT[,"strain"]))
geno_imp_sub <- geno_imp[env_geno_sub_indices,]

dim(geno_imp_sub)

#### **Relationship Matrix Using A.mat** 

In [9]:
K_rr <- A.mat(geno_imp_sub)
colnames(K_rr) <-rownames(geno_imp_sub)
rownames(K_rr) <- rownames(geno_imp_sub)
A <- K_rr
dim(A)
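For intuition about what a marker-based relationship matrix contains, here is a minimal base-R sketch of a VanRaden-type estimator. It assumes markers coded -1/0/1, which is the coding the `A.mat` help pages in rrBLUP and sommer describe; the genotype table in this practical is coded 0/0.5/1, so it may be worth checking how that coding interacts with `A.mat`. The objects `M`, `p`, `W` and `G` are hypothetical, and this is not a substitute for `A.mat`.

```r
set.seed(1)
## Hypothetical marker matrix: 5 lines x 50 markers coded -1 / 0 / 1
M <- matrix(sample(c(-1, 0, 1), 5 * 50, replace = TRUE), nrow = 5,
            dimnames = list(paste0("line", 1:5), NULL))

## Allele frequency per marker under the -1/0/1 coding
p <- (colMeans(M) + 1) / 2

## Centre each marker on its mean (2p - 1), then scale the cross-product
W <- sweep(M, 2, 2 * p - 1)
G <- tcrossprod(W) / (2 * sum(p * (1 - p)))
round(G, 2)
```

Because every marker column is centred, each row of `G` sums to zero, which is a quick sanity check on the computation.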



#### **Subset Genotypes for Computation Demo** 

In [10]:
  
A_Sub <- A[1:500,1:500]
DT_Sub <- DT[which(DT[,"strain"] %in% rownames(A_Sub)),]

E <- diag(length(unique(DT$environ)))
rownames(E) <- colnames(E) <- unique(DT$environ)
dim(E)

### Keep only strains evaluated in all three environments 

rmStrains <- names(which(table(DT_Sub[,"strain"]) <3))
## Logical indexing stays correct even when rmStrains is empty
DT_Sub1 <- DT_Sub[!(DT_Sub[,"strain"] %in% rmStrains),]

keepA <- !(rownames(A_Sub) %in% rmStrains)
A_Sub1 <- A_Sub[keepA,keepA]
dim(A_Sub1)
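The balancing step can be checked on a small example. The sketch below uses a made-up phenotype table (`ph` is hypothetical) in which one strain is missing from one environment, and keeps only strains observed in every environment.

```r
## Hypothetical long-format phenotype table: g3 is missing from environment E2
ph <- data.frame(strain  = c("g1", "g2", "g3", "g1", "g2", "g1", "g2", "g3"),
                 environ = c("E1", "E1", "E1", "E2", "E2", "E3", "E3", "E3"),
                 stringsAsFactors = FALSE)

## Number of records per strain, compared with the number of environments
nEnv   <- length(unique(ph$environ))
counts <- table(ph$strain)
keep   <- names(counts)[counts == nEnv]

## Logical indexing: also safe when no strain has to be dropped, whereas
## x[-which(cond), ] returns zero rows if cond is never TRUE
ph_bal <- ph[ph$strain %in% keep, ]
table(ph_bal$strain, ph_bal$environ)
```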

<br />
<br />

### **Exercise 1 - Compare a few Var-Covar structures in SOMMER package**


<br />

#### **1a -Model with Main Effect** #### 
##### Model environment as fixed effect (estimate population mean for each of the environments) and estimate random effects for genotypes 


In [11]:

fitMain <- mmer(Yield_blup~environ-1,
                random=~vs(strain,Gu=A_Sub1),
                rcov=~units,
                data=DT_Sub1,verbose=FALSE)
summary(fitMain)


| | Yield_blup |
|---|---|
| u:strain | 494 |

| | VarComp | VarCompSE | Zratio | Constraint |
|---|---|---|---|---|
| u:strain.Yield_blup-Yield_blup | 8583.317 | 5945.864 | 1.443578 | Positive |
| units.Yield_blup-Yield_blup | 375176.598 | 13971.321 | 26.853337 | Positive |

| Trait | Effect | Estimate | Std.Error | t.value |
|---|---|---|---|---|
| Yield_blup | environIA_2012 | 3127.902 | 27.73914 | 112.7613 |
| Yield_blup | environIA_2013 | 2794.875 | 27.73914 | 100.7557 |
| Yield_blup | environIL_2012 | 3611.41 | 27.73914 | 130.1918 |

| | logLik | AIC | BIC | Method | Converge |
|---|---|---|---|---|---|
| Value | -546.3708 | 1098.742 | 1114.645 | NR | TRUE |


In [12]:

m <- model.matrix(~ environ-1 ,data=DT_Sub1)
m_beta <- m %*% as.numeric(fitMain$Beta[,3]) 
PredMain <- m_beta+fitMain$U$`u:strain`$Yield_blup
cor(PredMain,DT_Sub1[,"Yield_blup"]) 
plot(PredMain,DT_Sub1[,"Yield_blup"])

0
0.4772414
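Adding a vector of strain BLUPs to a longer vector of environment means relies on R's recycling, which only lines up if the data rows are sorted by environment and then by strain in the same order as the BLUPs. A more defensive pattern is to look BLUPs up by strain name. The sketch below uses made-up values (`u`, `dat` and `envMean` are hypothetical) and assumes the BLUP vector carries strain names.

```r
## Hypothetical BLUPs named by strain, in an order that differs from the data rows
u <- c(g2 = -1.5, g1 = 2.0, g3 = 0.5)

dat <- data.frame(strain  = c("g1", "g2", "g3", "g1", "g2", "g3"),
                  environ = rep(c("E1", "E2"), each = 3),
                  stringsAsFactors = FALSE)

## Hypothetical environment means (fixed effects), named by environment
envMean <- c(E1 = 100, E2 = 110)

## Look both pieces up by name, so row order no longer matters
pred <- unname(envMean[as.character(dat$environ)] + u[as.character(dat$strain)])
pred
```

With name-based lookup the prediction stays aligned with the data even if the rows are shuffled or the BLUPs come back in a different order.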


<br />

#### **1b -Model with Compound Symmetry var-covar structure** ####
##### Compound symmetry adds a GxE term with one common variance across environments, which implies a constant genetic correlation between every pair of environments

In [13]:

E <- diag(length(unique(DT_Sub1$environ)))
rownames(E) <- colnames(E) <- unique(DT_Sub1$environ)

EA <- kronecker(E,A_Sub1, make.dimnames = TRUE)
DT_Sub1$environ <- as.factor(DT_Sub1$environ)
DT_Sub1$strain <- as.factor(DT_Sub1$strain)

fitCS <- mmer(Yield_blup~environ-1,
              random= ~ vs(strain, Gu=A_Sub1) + vs(environ:strain, Gu=EA),
              rcov= ~ units,
              data=DT_Sub1, verbose = FALSE)
summary(fitCS)

| | Yield_blup |
|---|---|
| u:strain | 494 |
| u:environ:strain | 1482 |

| | VarComp | VarCompSE | Zratio | Constraint |
|---|---|---|---|---|
| u:strain.Yield_blup-Yield_blup | 0.0 | 7976.67 | 0.0 | Positive |
| u:environ:strain.Yield_blup-Yield_blup | 33142.11 | 13968.56 | 2.372622 | Positive |
| units.Yield_blup-Yield_blup | 360760.54 | 13832.11 | 26.081388 | Positive |

| Trait | Effect | Estimate | Std.Error | t.value |
|---|---|---|---|---|
| Yield_blup | environIA_2012 | 3131.253 | 27.67495 | 113.1439 |
| Yield_blup | environIA_2013 | 2789.801 | 27.67495 | 100.806 |
| Yield_blup | environIL_2012 | 3612.428 | 27.67495 | 130.5306 |

| | logLik | AIC | BIC | Method | Converge |
|---|---|---|---|---|---|
| Value | -539.1218 | 1084.244 | 1100.147 | NR | TRUE |
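The covariance passed to the `environ:strain` term is the Kronecker product of an identity matrix over environments and the relationship matrix over strains. A toy version (with hypothetical `E_toy` and `A_toy`) shows the block structure and the `environ:strain` labels that `make.dimnames = TRUE` produces.

```r
## Two environments, treated as independent with equal variance
E_toy <- diag(2)
dimnames(E_toy) <- list(c("E1", "E2"), c("E1", "E2"))

## Two strains with a relationship of 0.5
A_toy <- matrix(c(1, 0.5, 0.5, 1), nrow = 2,
                dimnames = list(c("g1", "g2"), c("g1", "g2")))

## Block-diagonal result: A_toy repeated once per environment, zeros elsewhere
EA_toy <- kronecker(E_toy, A_toy, make.dimnames = TRUE)
EA_toy
```

The row and column labels take the form `E1:g1`, which is what lets the matrix be matched to the levels of the `environ:strain` interaction.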


In [14]:

m <- model.matrix(~ environ-1 ,data=DT_Sub1)
m_beta <- m %*% as.numeric(fitCS$Beta[,3]) 
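## The u:strain variance component is estimated as 0 in this fit, so the
## main-effect BLUPs are all zero and only the GxE BLUPs contribute here
PredCS <- m_beta+fitCS$U$`u:environ:strain`$Yield_blup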
cor(PredCS,DT_Sub1[,"Yield_blup"]) 
plot(PredCS,DT_Sub1[,"Yield_blup"])

0
0.4686045


<br /> 

#### **1c -Model with Compound Symmetry + Diagonal Structure** ####
##### Heterogeneous GxE variance for each environment and a constant genetic covariance between environments 

In [15]:
fitCSDG <- mmer(Yield_blup~environ-1,
                random=~vs(strain,Gu=A_Sub1) +vs(ds(environ),strain,Gu=A_Sub1),
                rcov=~units,
                data=DT_Sub1,verbose=FALSE) 

summary(fitCSDG)

| | Yield_blup |
|---|---|
| u:strain | 494 |
| IA_2012:strain | 494 |
| IA_2013:strain | 494 |
| IL_2012:strain | 494 |

| | VarComp | VarCompSE | Zratio | Constraint |
|---|---|---|---|---|
| u:strain.Yield_blup-Yield_blup | 1108.983 | 7976.429 | 0.1390325 | Positive |
| IA_2012:strain.Yield_blup-Yield_blup | 11202.649 | 14990.66 | 0.7473086 | Positive |
| IA_2013:strain.Yield_blup-Yield_blup | 38446.633 | 22793.832 | 1.6867122 | Positive |
| IL_2012:strain.Yield_blup-Yield_blup | 56959.82 | 27643.322 | 2.0605273 | Positive |
| units.Yield_blup-Yield_blup | 359480.998 | 13783.392 | 26.0807347 | Positive |

| Trait | Effect | Estimate | Std.Error | t.value |
|---|---|---|---|---|
| Yield_blup | environIA_2012 | 3130.223 | 27.28389 | 114.7279 |
| Yield_blup | environIA_2013 | 2789.228 | 27.71231 | 100.6494 |
| Yield_blup | environIL_2012 | 3612.461 | 27.92732 | 129.3522 |

| | logLik | AIC | BIC | Method | Converge |
|---|---|---|---|---|---|
| Value | -537.5147 | 1081.029 | 1096.933 | NR | TRUE |
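One way to read these variance components is to assemble the genetic covariance matrix among environments that the model implies: the `u:strain` variance is shared by all environments (the off-diagonal), and each environment adds its own GxE variance on the diagonal. The sketch below plugs in the estimates from the summary above; the object names are hypothetical.

```r
envs <- c("IA_2012", "IA_2013", "IL_2012")

## Estimates copied from the variance-component summary of the CS + diagonal fit
var_g  <- 1108.983                           # u:strain
var_ge <- c(11202.649, 38446.633, 56959.820) # per-environment GxE variances

## Shared covariance everywhere, plus environment-specific variance on the diagonal
G_E <- matrix(var_g, 3, 3, dimnames = list(envs, envs)) + diag(var_ge)

## Implied genetic correlations between environments
round(cov2cor(G_E), 3)
```

Because the shared variance is small relative to the GxE variances, the implied genetic correlations between environments are low under this model.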


In [16]:

m2 <- cbind(c(rep(1,nrow(DT_Sub1)/3),rep(0,2*nrow(DT_Sub1)/3)),c(rep(0,nrow(DT_Sub1)/3),rep(1,nrow(DT_Sub1)/3),rep(0,nrow(DT_Sub1)/3)),
c(rep(0,nrow(DT_Sub1)/3),rep(0,nrow(DT_Sub1)/3),rep(1,nrow(DT_Sub1)/3)))

m_beta <- m2 %*% as.numeric(fitCSDG$Beta[,3]) 
length(m_beta)
m_env_strain <- do.call(cbind,lapply(fitCSDG$U,function(x) x$Yield_blup))
dim(m_env_strain)
envStrain_blup <-c(m_env_strain[,2:4])                              
                  
strain_blup <- rep(fitCSDG$U$`u:strain`$Yield_blup,3)
length(strain_blup)

In [17]:
PredCSDG <- m_beta+strain_blup+envStrain_blup

indES <-  sort.int(as.numeric(DT_Sub1[,"environ"]),decreasing=FALSE,index.return=TRUE)[[2]]

cor(PredCSDG,DT_Sub1[indES,"Yield_blup"]) 
plot(PredCSDG,DT_Sub1[indES,"Yield_blup"])


0
0.543284


<br />

#### **1d- Model with US - Unstructured Variance-Covariance** ####
##### Heterogeneous variance for each environment and a separate covariance for every pair of environments

In [18]:
fitUS <- mmer(Yield_blup~environ-1,
                random=~vs(us(environ),strain,Gu=A_Sub1),
                rcov=~units,
                data=DT_Sub1,verbose=FALSE) 
summary(fitUS)

| | Yield_blup |
|---|---|
| IA_2012:strain | 494 |
| IA_2013:IA_2012:strain | 988 |
| IA_2013:strain | 494 |
| IL_2012:IA_2012:strain | 988 |
| IL_2012:IA_2013:strain | 988 |
| IL_2012:strain | 494 |

| | VarComp | VarCompSE | Zratio | Constraint |
|---|---|---|---|---|
| IA_2012:strain.Yield_blup-Yield_blup | 13999.8931 | 13343.98 | 1.04915433 | Positive |
| IA_2013:IA_2012:strain.Yield_blup-Yield_blup | -378.4475 | 11900.65 | -0.03180059 | Unconstr |
| IA_2013:strain.Yield_blup-Yield_blup | 40023.9472 | 21164.99 | 1.89104488 | Positive |
| IL_2012:IA_2012:strain.Yield_blup-Yield_blup | 16497.5831 | 14083.02 | 1.1714522 | Unconstr |
| IL_2012:IA_2013:strain.Yield_blup-Yield_blup | -36052.1154 | 18347.46 | -1.96496513 | Unconstr |
| IL_2012:strain.Yield_blup-Yield_blup | 63232.4178 | 27475.82 | 2.30138399 | Positive |
| units.Yield_blup-Yield_blup | 358944.8349 | 13740.5 | 26.1231327 | Positive |

| Trait | Effect | Estimate | Std.Error | t.value |
|---|---|---|---|---|
| Yield_blup | environIA_2012 | 3129.223 | 27.25633 | 114.8072 |
| Yield_blup | environIA_2013 | 2790.704 | 27.60819 | 101.0825 |
| Yield_blup | environIL_2012 | 3616.559 | 27.86794 | 129.7749 |

| | logLik | AIC | BIC | Method | Converge |
|---|---|---|---|---|---|
| Value | -534.6765 | 1075.353 | 1091.256 | NR | TRUE |
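The unstructured estimates can be arranged into a 3 x 3 genetic covariance matrix among environments and converted to correlations, which is often easier to interpret than the raw components. The sketch below uses the estimates from the summary above (object names are hypothetical) and also reports the smallest eigenvalue as a check on whether the estimated matrix is positive definite.

```r
envs <- c("IA_2012", "IA_2013", "IL_2012")
G_US <- matrix(0, 3, 3, dimnames = list(envs, envs))

## Variances (diagonal) and covariances (lower triangle) from the US fit summary
diag(G_US) <- c(13999.8931, 40023.9472, 63232.4178)
G_US["IA_2013", "IA_2012"] <- -378.4475
G_US["IL_2012", "IA_2012"] <- 16497.5831
G_US["IL_2012", "IA_2013"] <- -36052.1154

## Mirror the lower triangle into the upper triangle
G_US[upper.tri(G_US)] <- t(G_US)[upper.tri(G_US)]

## Genetic correlations between environments
R_US <- cov2cor(G_US)
round(R_US, 3)

## Smallest eigenvalue: a negative value would mean the matrix is not positive definite
min(eigen(G_US, symmetric = TRUE)$values)
```

Given the size of the standard errors in the summary, the individual correlations should be read as rough indications rather than precise estimates.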


In [23]:
envNames <- levels(factor(DT_Sub1$environ))
print(envNames)
print(names(fitUS$U))
## Positions of the per-environment variance terms in fitUS$U
env1Ind <- c(1,3,6)
U_envStrain <- list()
PredUS <- list()
for(i in seq_along(envNames)){
    envInd <- grep(envNames[i],names(fitUS$U))
    ## Covariance terms involving environment i (exclude its own variance term,
    ## which is added separately below, so it is not counted twice)
    covInd <- setdiff(envInd,env1Ind[i])
    uMain <- fitUS$U[[env1Ind[i]]]$Yield_blup
    U_envStrain[[i]] <- as.numeric(uMain)
    for(indJ in covInd){
        uCov <- fitUS$U[[indJ]]$Yield_blup
        ## Sum the covariance-term effects per strain and align to uMain by name
        uCovSum <- tapply(as.numeric(uCov),names(uCov),sum)
        U_envStrain[[i]] <- U_envStrain[[i]] + as.numeric(uCovSum[names(uMain)])
    }
    ## Add the fixed effect (environment mean) for environment i
    PredUS[[i]] <- U_envStrain[[i]] + fitUS$Beta[i,3]
}

indES <-  sort.int(as.numeric(DT_Sub1[,"environ"]),decreasing=FALSE,index.return=TRUE)[[2]]  
cor(unlist(PredUS),DT_Sub1[indES,"Yield_blup"]) 
plot(unlist(PredUS),DT_Sub1[indES,"Yield_blup"]) 

[1] "IA_2012" "IA_2013" "IL_2012"
[1] "IA_2012:strain"         "IA_2013:IA_2012:strain" "IA_2013:strain"        
[4] "IL_2012:IA_2012:strain" "IL_2012:IA_2013:strain" "IL_2012:strain"        
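The four fits from Exercise 1 can be lined up by the log-likelihood and AIC values reported in their summaries. The sketch below only tabulates those reported numbers; the data frame `fits` is a hypothetical helper. Note that each reported AIC equals -2 x logLik + 6, so the penalty appears to be the same for all four models regardless of the number of variance components, and the ranking should be read with that in mind.

```r
## Values copied from the summary() output of each fit
fits <- data.frame(model  = c("Main", "CS", "CS+DG", "US"),
                   logLik = c(-546.3708, -539.1218, -537.5147, -534.6765),
                   AIC    = c(1098.742, 1084.244, 1081.029, 1075.353),
                   stringsAsFactors = FALSE)

## Difference from the lowest AIC, and the penalty implied by each reported AIC
fits$deltaAIC <- fits$AIC - min(fits$AIC)
fits$penalty  <- fits$AIC + 2 * fits$logLik

fits[order(fits$AIC), ]
```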


<br />
<br />

### **Exercise 2 - Predict performance of tested and untested genotypes in tested and untested environments** ###

#### **2a - Tested Genotypes in Untested Environment**

In [20]:
### Mask yield for IA_2013, train the model using IA_2012 and IL_2012 only, and predict 
### performance of lines in IA_2013 (untested environ). Compare accuracy with the model 
### that included IA_2013 data in training 

tstIndices1 <- which(DT_Sub1[,"environ"] %in% "IA_2013") 

DT_Sub1A <- DT_Sub1
DT_Sub1A[tstIndices1 ,"Yield_blup"] <- NA
#DT_Sub1A[tstIndices1 ,"environ"] <- NA

dim(DT_Sub1A)
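The mask-then-evaluate pattern is independent of the model used. The sketch below illustrates it on simulated data, with a deliberately naive predictor (the mean of each strain over the training environments) standing in for the mixed model; `toy`, `truth` and `strainMean` are hypothetical and the simulated values carry no biological meaning.

```r
set.seed(42)
## Simulated long-format data: 20 strains x 3 environments
toy <- data.frame(strain  = rep(paste0("g", 1:20), times = 3),
                  environ = rep(c("E1", "E2", "E3"), each = 20),
                  y       = rnorm(60),
                  stringsAsFactors = FALSE)

## Keep the observed values aside, then mask the untested environment
truth    <- toy$y
testRows <- which(toy$environ == "E2")
toy$y[testRows] <- NA

## Stand-in predictor: strain mean over the remaining (training) environments
strainMean <- tapply(toy$y, toy$strain, mean, na.rm = TRUE)
pred <- as.numeric(strainMean[toy$strain[testRows]])

## Predictive ability: correlation between predictions and the masked observations
cor(pred, truth[testRows])
```

The key points are that the observed values are stored before masking and that accuracy is computed only on the masked rows.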

#### **Unstructured Var-Covar for Untested Environment**

In [24]:
fitUS1A <- mmer(Yield_blup~environ-1,
                random=~vs(us(environ),strain,Gu=A_Sub1),
                rcov=~units,
                data=DT_Sub1A,verbose=FALSE) 
summary(fitUS1A)

| | Yield_blup |
|---|---|
| IA_2012:strain | 494 |
| IA_2013:IA_2012:strain | 988 |
| IA_2013:strain | 494 |
| IL_2012:IA_2012:strain | 988 |
| IL_2012:IA_2013:strain | 988 |
| IL_2012:strain | 494 |

| | VarComp | VarCompSE | Zratio | Constraint |
|---|---|---|---|---|
| IA_2012:strain.Yield_blup-Yield_blup | 13999.8931 | 13343.98 | 1.04915433 | Positive |
| IA_2013:IA_2012:strain.Yield_blup-Yield_blup | -378.4475 | 11900.65 | -0.03180059 | Unconstr |
| IA_2013:strain.Yield_blup-Yield_blup | 40023.9472 | 21164.99 | 1.89104488 | Positive |
| IL_2012:IA_2012:strain.Yield_blup-Yield_blup | 16497.5831 | 14083.02 | 1.1714522 | Unconstr |
| IL_2012:IA_2013:strain.Yield_blup-Yield_blup | -36052.1154 | 18347.46 | -1.96496513 | Unconstr |
| IL_2012:strain.Yield_blup-Yield_blup | 63232.4178 | 27475.82 | 2.30138399 | Positive |
| units.Yield_blup-Yield_blup | 358944.8349 | 13740.5 | 26.1231327 | Positive |

| Trait | Effect | Estimate | Std.Error | t.value |
|---|---|---|---|---|
| Yield_blup | environIA_2012 | 3129.223 | 27.25633 | 114.8072 |
| Yield_blup | environIA_2013 | 2790.704 | 27.60819 | 101.0825 |
| Yield_blup | environIL_2012 | 3616.559 | 27.86794 | 129.7749 |

| | logLik | AIC | BIC | Method | Converge |
|---|---|---|---|---|---|
| Value | -534.6765 | 1075.353 | 1091.256 | NR | TRUE |


In [33]:
envNames <- levels(factor(DT_Sub1A$environ))
## Positions of the per-environment variance terms in fitUS1A$U
env1Ind <- c(1,3,6)
U_envStrain <- list()
PredUS1A <- list()
for(i in seq_along(envNames)){
    envInd <- grep(envNames[i],names(fitUS1A$U))
    ## Covariance terms involving environment i (exclude its own variance term,
    ## which is added separately below, so it is not counted twice)
    covInd <- setdiff(envInd,env1Ind[i])
    uMain <- fitUS1A$U[[env1Ind[i]]]$Yield_blup
    U_envStrain[[i]] <- as.numeric(uMain)
    for(indJ in covInd){
        uCov <- fitUS1A$U[[indJ]]$Yield_blup
        ## Sum the covariance-term effects per strain and align to uMain by name
        uCovSum <- tapply(as.numeric(uCov),names(uCov),sum)
        U_envStrain[[i]] <- U_envStrain[[i]] + as.numeric(uCovSum[names(uMain)])
    }
    ## Use fixed effects from the model trained without IA_2013 yields (fitUS1A, not fitUS)
    PredUS1A[[i]] <- U_envStrain[[i]] + fitUS1A$Beta[i,3]
}
    
length(unlist(PredUS1A[[2]]))

length(tstIndices1)

cor(unlist(PredUS1A[[2]]),DT_Sub1[tstIndices1,"Yield_blup"]) 
plot(unlist(PredUS1A[[2]]),DT_Sub1[tstIndices1,"Yield_blup"]) 

<br />

#### **2b - Untested Genotypes in Tested Environments** ####

In [26]:
### Subset Data to generate untested genotypes 

set.seed(125)
tstStrain <- sample(unique(DT_Sub1[,"strain"]),0.2*length(unique(DT_Sub1[,"strain"])))
length(tstStrain)
tstIndices2 <- which(DT_Sub1[,"strain"] %in% tstStrain)
DT_Sub1B <- DT_Sub1
## Mask the response for the test strains (column name must match exactly)
DT_Sub1B[tstIndices2 ,"Yield_blup"] <- NA
dim(DT_Sub1B)
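When masking by column name it is worth verifying that the intended column was actually modified, because assigning to a misspelled column name in a data frame silently creates a new column and leaves the response untouched. The sketch below shows strain-level masking on simulated data together with two such checks; `toy`, `tst` and `masked` are hypothetical.

```r
set.seed(125)
## Simulated long-format data: 10 strains x 3 environments
toy <- data.frame(strain     = rep(paste0("g", 1:10), times = 3),
                  environ    = rep(c("E1", "E2", "E3"), each = 10),
                  Yield_blup = rnorm(30),
                  stringsAsFactors = FALSE)

## Hold out 20% of the strains; every record of a held-out strain is masked
allStrains <- unique(toy$strain)
tst <- sample(allStrains, floor(0.2 * length(allStrains)))

masked <- toy
masked[masked$strain %in% tst, "Yield_blup"] <- NA

## Checks: no column was added by accident, and all test records are masked
stopifnot(identical(names(masked), names(toy)))
stopifnot(all(is.na(masked$Yield_blup[masked$strain %in% tst])))
sum(is.na(masked$Yield_blup))
```

Masking whole strains, rather than individual records, is what makes the held-out genotypes untested in every environment.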

#### **Fit Compound Symmetry + Diagonal Var-Covar Structure for Tested Environments**

In [27]:
fitCSDG1B <- mmer(Yield_blup~environ-1,
                random=~vs(strain,Gu=A_Sub1) +vs(ds(environ),strain,Gu=A_Sub1),
                rcov=~units,
                data=DT_Sub1B,verbose=FALSE) 

summary(fitCSDG1B)



In [28]:
m2 <- cbind(c(rep(1,nrow(DT_Sub1B)/3),rep(0,2*nrow(DT_Sub1B)/3)),c(rep(0,nrow(DT_Sub1B)/3),rep(1,nrow(DT_Sub1B)/3),rep(0,nrow(DT_Sub1B)/3)),
c(rep(0,nrow(DT_Sub1B)/3),rep(0,nrow(DT_Sub1B)/3),rep(1,nrow(DT_Sub1B)/3)))

m_beta <- m2 %*% as.numeric(fitCSDG1B$Beta[,3]) 
length(m_beta)
m_env_strain <- do.call(cbind,lapply(fitCSDG1B$U,function(x) x$Yield_blup))
dim(m_env_strain)
envStrain_blup <-c(m_env_strain[,2:4])                              
                  
strain_blup <- rep(fitCSDG1B$U$`u:strain`$Yield_blup,3)
length(strain_blup)

In [29]:
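PredCSDG1B<- m_beta+strain_blup+envStrain_blup

indES <-  sort.int(as.numeric(DT_Sub1[,"environ"]),decreasing=FALSE,index.return=TRUE)[[2]]

## Evaluate accuracy only on the held-out (untested) strains
tstRows <- which(DT_Sub1[indES,"strain"] %in% tstStrain)

cor(PredCSDG1B[tstRows],DT_Sub1[indES[tstRows],"Yield_blup"]) 
plot(PredCSDG1B[tstRows],DT_Sub1[indES[tstRows],"Yield_blup"])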


#### **Discuss other ways to model these scenarios and refine these models**