# Integrative Analysis. Robust Rank Aggregation

Integrative analysis aims to combine heterogeneous data from different omic levels.

The integration is performed using the Robust Rank Aggregation (RRA) method (Kolde R et al., 2012). It detects genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs and assigns a significance score to each gene.

For each item, the algorithm looks at how the item is positioned in the ranked lists and compares this to the baseline case where all the preference lists are randomly shuffled. As a result, it assigns a P-value to each item, showing how much better the item is positioned in the ranked lists than expected by chance. This P-value is used both for re-ranking the items and for deciding their significance.

Since the number of informative ranks is not known, the algorithm computes a P-value for each order statistic of the rank vector r, defines the final score ρ for r as the minimum of these P-values, and orders all rank vectors according to their ρ scores.
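
As a minimal sketch of this scoring step (not the package's own code), the ρ score of a single rank vector can be computed in base R. It assumes the ranks are normalized to (0, 1] and uses the fact that, under the null hypothesis, the k-th smallest of n uniform ranks follows a Beta(k, n + 1 - k) distribution. The helper name `rho_sketch` is hypothetical, and any multiple-testing correction the package may apply to the minimum is omitted.

```r
# Hypothetical helper: rho score of one normalized rank vector (values in (0, 1]).
rho_sketch <- function(r) {
  r <- sort(r[!is.na(r)])   # order statistics r_(1) <= ... <= r_(n)
  n <- length(r)
  k <- seq_len(n)
  # P-value for each order statistic: probability that the k-th smallest
  # of n uniform ranks is at most r_(k)
  beta_p <- pbeta(r, shape1 = k, shape2 = n - k + 1)
  min(beta_p)
}

# A gene near the top of two of three lists gets a much lower score
# than a gene with middling ranks in all three.
rho_sketch(c(0.01, 0.02, 0.50))
rho_sketch(c(0.40, 0.50, 0.60))
```

Taking the minimum over all order statistics is what lets the score pick up a gene that ranks well in only a subset of the lists.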

In [1]:
library(RobustRankAggreg)

### 1) Have a look at the input datasets

As an example, we integrate the results of the meta-analysis of expression data with the results of the meta-analysis of GWAS data.

Ensure that the datasets to integrate share common gene symbols.
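
One way to check the overlap is sketched below in base R. The two symbol vectors are toy stand-ins for the real datasets, and the variable names are placeholders.

```r
# Toy symbol vectors standing in for the two datasets (hypothetical values).
expr_genes <- c("TACR1", "ARHGEF9", "ATP6V1H", "TFF1")
gwas_genes <- c("TACR1", "TFF1", "S1PR4")

common    <- intersect(expr_genes, gwas_genes)  # symbols present in both
only_expr <- setdiff(expr_genes, gwas_genes)    # symbols only in the expression list
only_gwas <- setdiff(gwas_genes, expr_genes)    # symbols only in the GWAS list

length(common)
common
```

A very small overlap usually points to mismatched identifiers (for example Entrez IDs in one dataset and HUGO symbols in the other) rather than a true lack of shared genes.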

In [3]:
metaGWES=read.table("/home/guess/MetaAnalysis/GeneExprMeta/meta_result_case-ctl")
head(metaGWES,n=3)

Unnamed: 0,rank,logFC.case.ctl,Var,Qpvalue,REM.Pvalue,REM.FDR,Fisher.Pvalue,Fisher.FDR,n.estimators
ANKHD1-EIF4EBP3,1,0.231182,0.14104115,5.669907e-12,0.53817492,0.7716119,0,0,2
ARHGEF9,1,-0.3601272,0.01322088,0.07571364,0.001736033,0.02073393,0,0,2
ATP6V1H,1,-0.4756527,0.11613876,6.065009e-05,0.162795809,0.44010271,0,0,2


In [8]:
metaGWAS=read.table("/mnt/data/GWAS_data/output/imputed_files/dataset.b37.imputed.dosage.maf.0.01.LOC.50kb.genes.annot.magma.genes.out.sorted.annot", header=TRUE)
dim(metaGWAS)
head(metaGWAS,n=3)

magma_rank,GENE,CHR.x,START.x,STOP.x,NSNPS,NPARAM,N,ZSTAT,P_JOINT,P_SNPWISE_MEAN,P_SNPWISE_TOP1,CHR.y,START.y,STOP.y,STRAND,HUGO
1,6869,2,75223590,75476645,729,80,10000,3.8263,6.5051e-05,1.6509e-05,0.0092927,2,75273590,75426645,-,TACR1
2,7031,21,43732391,43836644,485,67,10000,3.4969,0.00023536,2.5108e-05,0.039246,21,43782391,43786644,-,TFF1
3,8698,19,3128250,3230335,495,71,10000,3.2452,0.00058691,5.7742e-05,0.073409,19,3178250,3180335,+,S1PR4


### 2) RRA method

Ensure each input list is ordered by ascending p-value.
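
A minimal sketch of that ordering step on a toy table follows. The column name `pvalue` is a placeholder; which p-value column to sort on in each real dataset is a choice this sketch does not make.

```r
# Toy results table; the column names are placeholders.
res <- data.frame(
  gene   = c("A", "B", "C", "D"),
  pvalue = c(0.20, 0.001, 0.05, 0.001),
  stringsAsFactors = FALSE
)

# Sort rows by ascending p-value; tied rows keep their original order
res_sorted <- res[order(res$pvalue), ]

# is.unsorted() returns FALSE when the vector is already non-decreasing
is.unsorted(res_sorted$pvalue)

# The ordered gene vector is what goes into the list passed to rankMatrix()
ordered_genes <- res_sorted$gene
ordered_genes
```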

In [None]:
# create a list object with the ordered genes from each dataset to integrate
# metaGWES has gene symbols as row names; metaGWAS stores them in the HUGO column
genelist <- list(as.character(rownames(metaGWES)), as.character(metaGWAS$HUGO))

In [None]:
# call aggregateRanks method from RobustRankAggreg library
agglist<-aggregateRanks(rmat=rankMatrix(genelist,full = TRUE),method = "RRA") 
dim(agglist)
agglist
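
To make the input to `aggregateRanks` more concrete, the base R sketch below builds the kind of normalized rank matrix the method works on: one row per gene, one column per list, each entry the gene's position divided by the list length, and `NA` where a gene is absent from a list. It is an illustration on toy lists, not the package's implementation, and the helper name `toy_rank_matrix` is hypothetical. How `rankMatrix` treats absent genes depends on its `full` argument, so check `?rankMatrix` for the exact behaviour.

```r
# Hypothetical helper: normalized rank matrix from a list of ordered gene vectors.
toy_rank_matrix <- function(glist) {
  genes <- unique(unlist(glist))
  rmat <- matrix(NA_real_, nrow = length(genes), ncol = length(glist),
                 dimnames = list(genes, names(glist)))
  for (i in seq_along(glist)) {
    g <- glist[[i]]
    # position in the list divided by list length, so the top gene is closest to 0
    rmat[g, i] <- seq_along(g) / length(g)
  }
  rmat
}

toy_lists <- list(expr = c("G1", "G2", "G3", "G4"),
                  gwas = c("G2", "G1", "G5"))
toy_rank_matrix(toy_lists)
```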

In [None]:
# rank the final list using the rank() function from base R
rank<-rank(agglist$Score,na.last = TRUE, ties.method = "min")
ranked<-cbind(rank,agglist)
head(ranked)

In [None]:
# rank the final list using the rank() function from base R, keeping NA for genes
# with an NA score so that they can all be assigned the same rank afterwards
rankE2 <- rank(agglist$Score, na.last = "keep", ties.method = "min")
ranked.E2 <- cbind(rankE2, agglist)
head(ranked.E2)

In [None]:
sum(is.na(ranked.E2$rankE2))

In [None]:
# get the last non NA index
NonNAindex <- which(!is.na(ranked.E2$rankE2))
lastNonNA <- max(NonNAindex)
lastNonNA
# change all NA index to last non NA +1
ranked.E2$rankE2[is.na(ranked.E2$rankE2)]<-lastNonNA+1
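
The difference between the two `na.last` settings can be seen on a toy score vector. This is a minimal sketch with made-up scores, and it assumes, as in the cells above, that the scores are sorted with any `NA` values at the end.

```r
scores <- c(0.001, 0.001, 0.02, NA, NA)

# na.last = TRUE: every NA gets its own distinct rank at the end
rank(scores, na.last = TRUE, ties.method = "min")    # 1 1 3 4 5

# na.last = "keep": NA scores stay NA in the ranks
r <- rank(scores, na.last = "keep", ties.method = "min")

# give every NA the same rank, one past the number of scored genes
r[is.na(r)] <- sum(!is.na(scores)) + 1
r                                                    # 1 1 3 4 4
```

Sharing one rank avoids implying an order among genes that received no score.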

In [None]:
write.table(ranked,"RRAresult")