# Integrative Analysis. Robust Rank Aggregation

Integrative analysis combines heterogeneous data from different omic levels.

The integration is performed using the Robust Rank Aggregation (RRA) method (Kolde R et al., 2012). It detects genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs and assigns a significance score to each gene.

For each item, the algorithm looks at how the item is positioned in the ranked lists and compares this to the baseline case where all the preference lists are randomly shuffled. As a result, it assigns a P-value to each item, showing how much better the item is positioned in the ranked lists than expected by chance. This P-value is used both for re-ranking the items and for deciding their significance.

Since the number of informative ranks is not known, the method defines the final score ρ for a rank vector r as the minimum of the P-values computed for its order statistics, and orders all rank vectors according to their ρ scores.
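As a rough illustration of this score, the sketch below computes ρ for a single gene in base R, assuming its ranks have already been normalised to the interval (0, 1] by dividing by the list length. The values in `r` are made up for the example, and the code leaves out any further correction that `aggregateRanks` may apply to the raw minimum.

```r
# Hypothetical normalised ranks of one gene in two lists (rank / list length)
r <- c(0.001, 0.02)
n <- length(r)

# Under the null, normalised ranks are uniform on (0, 1], so the k-th smallest
# of n ranks follows a Beta(k, n - k + 1) distribution
k <- seq_len(n)
beta_p <- pbeta(sort(r), k, n - k + 1)

# rho is the smallest of these order-statistic P-values
rho <- min(beta_p)
print(beta_p)
print(rho)
```

A gene that sits near the top of several lists gets a small ρ, while a gene that is high in only one list is judged mainly on that single rank.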

In [1]:
library(RobustRankAggreg)

### 1) Have a look at input datasets

Here we combine the results of the meta-analysis of the GWES microarray data with the results of the GWAS analysis. Note that only one GWAS dataset was analysed, so there are no meta-GWAS results.

Ensure you have common gene symbols in the datasets to integrate.

In [2]:
metaGWES=read.table("/mnt/data/MetaAnalysis/output/meta_result_case-ctl")
head(metaGWES,n=3)

| | rank | logFC.case.ctl | Var | Qpvalue | REM.Pvalue | REM.FDR | Fisher.Pvalue | Fisher.FDR | n.estimators |
|---|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` |
| ZNF264 | 1 | 0.3552904 | 0.299334526 | 1.5021519999999998e-26 | 0.516086802 | 0.67039523 | 1.427335e-31 | 1.172413e-27 | 2 |
| SVOP | 2 | -0.7902721 | 0.005249884 | 0.3104079 | 0.0 | 0.0 | 2.022028e-27 | 8.304468e-24 | 2 |
| NFKB1 | 3 | 0.3817504 | 0.016106871 | 0.009565457 | 0.002629964 | 0.01517535 | 8.712374e-27 | 2.263827e-23 | 2 |


In [3]:
GWAS=read.table("/mnt/data/GWAS/output/build38/task6_genewise/dataset.b38.imputed.dosage.maf.0.01.LOC.50kb.genes.annot.magma.genes.out.sorted.annot", header=TRUE)
head(GWAS,n=3)

| | magma_rank | GENE | CHR | START | STOP | NSNPS | NPARAM | N | ZSTAT | P_JOINT | P_SNPWISE_MEAN | P_SNPWISE_TOP1 | STRAND | HUGO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<chr>` | `<chr>` |
| 1 | 1 | 11214 | 15 | 85330616 | 85799358 | 20 | 13 | 495 | 3.813 | 6.8633e-05 | 1.8683e-05 | 0.0013226 | + | AKAP13 |
| 2 | 2 | 134111 | 5 | 6387347 | 6546721 | 17 | 14 | 495 | 3.7768 | 7.9438e-05 | 5.6238e-05 | 0.0018005 | + | UBE2QL1 |
| 3 | 3 | 26074 | 20 | 20002514 | 20410714 | 25 | 19 | 495 | 3.6031 | 0.00015722 | 7.9538e-05 | 0.0047978 | + | CFAP61 |
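
One way to check that the two datasets share gene symbols is to compare the row names of the expression results with the `HUGO` column of the GWAS results. The sketch below does this on two toy data frames (`meta_toy` and `gwas_toy` are hypothetical stand-ins, not the real inputs); the same calls should apply to `metaGWES` and `GWAS` once they are loaded.

```r
# Toy stand-ins for the two inputs (hypothetical values)
meta_toy <- data.frame(rank = 1:3, row.names = c("ZNF264", "SVOP", "NFKB1"))
gwas_toy <- data.frame(magma_rank = 1:3,
                       HUGO = c("AKAP13", "SVOP", "ZNF264"),
                       stringsAsFactors = FALSE)

# Symbols present in both datasets
common <- intersect(rownames(meta_toy), gwas_toy$HUGO)
length(common)

# Symbols found in only one of the datasets
only_meta <- setdiff(rownames(meta_toy), gwas_toy$HUGO)
only_gwas <- setdiff(gwas_toy$HUGO, rownames(meta_toy))
```

A very small overlap usually points to mismatched identifiers (for example gene IDs in one dataset and symbols in the other) rather than to a biological difference.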


### 2) RRA method

Ensure that each input list is ordered by ascending p-value.
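
A quick way to verify the ordering, and to fix it if needed, is sketched below on a toy table. The data frame `gwas_toy` and the choice of `P_JOINT` as the sorting column are assumptions made for illustration; use whichever p-value column defines the ranking in your own data.

```r
# Toy stand-in for a result table (hypothetical values)
gwas_toy <- data.frame(HUGO = c("UBE2QL1", "CFAP61", "AKAP13"),
                       P_JOINT = c(7.9e-05, 1.6e-04, 6.9e-05),
                       stringsAsFactors = FALSE)

# TRUE means the rows are not in ascending P-value order
is.unsorted(gwas_toy$P_JOINT)

# Reorder by ascending P-value before extracting the gene list
gwas_sorted <- gwas_toy[order(gwas_toy$P_JOINT), ]
gwas_sorted$HUGO
```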

In [6]:
# create a list object with the ordered genes from each dataset to integrate
genelist <- list(as.character(rownames(metaGWES)),as.character(GWAS$HUGO))
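
To see what kind of input the aggregation works on, the following base R sketch builds a matrix of normalised ranks from two toy lists. It is meant to mirror what `rankMatrix(..., full = TRUE)` is understood to produce (position divided by the length of each list, `NA` for genes missing from a list); that correspondence is an assumption here, so check the package documentation. `toy_list` and `rmat_toy` are illustrative names.

```r
# Toy ordered gene lists (hypothetical), best gene first
toy_list <- list(expr = c("ZNF264", "SVOP", "NFKB1"),
                 gwas = c("AKAP13", "SVOP"))

genes <- unique(unlist(toy_list))

# One row per gene, one column per list; NA where a gene is absent from a list
rmat_toy <- matrix(NA_real_, nrow = length(genes), ncol = length(toy_list),
                   dimnames = list(genes, names(toy_list)))
for (i in seq_along(toy_list)) {
  g <- toy_list[[i]]
  # position in the list divided by the list length
  rmat_toy[g, i] <- seq_along(g) / length(g)
}
rmat_toy
```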

In [8]:
?aggregateRanks

aggregateRanks {RobustRankAggreg}, R Documentation

| Argument | Description |
|---|---|
| glist | list of element vectors, the order of the vectors is used as the ranking. |
| rmat | the rankings in matrix format. The glist is by default converted to this format. |
| N | the number of ranked elements, important when using only top-k ranks, by default it is calculated as the number of unique elements in the input. |
| method | rank aggregation method, by default 'RRA', other options are 'min', 'geom.mean', 'mean', 'median' and 'stuart' |
| full | indicates if the full rankings are given, used if the sets of ranked elements do not match perfectly |
| exact | indicator showing if exact p-value will be calculated based on rho score (Default: if number of lists smaller than 10, exact is used) |
| topCutoff | a vector of cutoff values used to limit the number of elements in the input lists |


In [7]:
# call aggregateRanks method from RobustRankAggreg library
agglist<-aggregateRanks(rmat=rankMatrix(genelist,full = TRUE),method = "RRA", exact=TRUE) 
dim(agglist)
agglist

| | Name | Score |
|---|---|---|
| | `<chr>` | `<dbl>` |
| ZNF264 | ZNF264 | 0.0001808018 |
| AKAP13 | AKAP13 | 0.0002141081 |
| SVOP | SVOP | 0.0003608823 |
| UBE2QL1 | UBE2QL1 | 0.0004272857 |
| RAB29 | RAB29 | 0.0004315235 |
| AGK | AGK | 0.0005307621 |
| NFKB1 | NFKB1 | 0.0005404918 |
| CFAP61 | CFAP61 | 0.0006398554 |
| CLN8 | CLN8 | 0.0006788943 |
| SRGAP1 | SRGAP1 | 0.0007197197 |


In [10]:
agglist$adjP.Val=p.adjust(agglist$Score, method = "bonferroni")
head(agglist)

| | Name | Score | adjP.Val |
|---|---|---|---|
| | `<chr>` | `<dbl>` | `<dbl>` |
| ZNF264 | ZNF264 | 0.0001808018 | 1 |
| AKAP13 | AKAP13 | 0.0002141081 | 1 |
| SVOP | SVOP | 0.0003608823 | 1 |
| UBE2QL1 | UBE2QL1 | 0.0004272857 | 1 |
| RAB29 | RAB29 | 0.0004315235 | 1 |
| AGK | AGK | 0.0005307621 | 1 |
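
The adjusted values above are all 1, which is what the Bonferroni correction gives whenever a score multiplied by the number of tests exceeds 1. The sketch below reproduces this behaviour with made-up scores and a made-up gene count (`scores` and `n_genes` are assumptions, not values taken from the real result).

```r
# Hypothetical scores and gene count, for illustration only
scores <- c(0.0002, 0.01, 0.5)
n_genes <- 20000

# Bonferroni: multiply each value by the number of tests and cap at 1
manual <- pmin(1, scores * n_genes)

# p.adjust() does the same; n sets the number of tests explicitly
adjusted <- p.adjust(scores, method = "bonferroni", n = n_genes)
print(adjusted)
```

With tens of thousands of genes, only scores well below 1e-4 would survive this correction, so it is a very conservative choice for aggregated ranks.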


In [17]:
# second adjustment: double each score and cap the result at 1
agglist$adjP.Val2 <- pmin(agglist$Score * 2, 1)
head(agglist)

| | Name | Score | adjP.Val | adjP.Val2 |
|---|---|---|---|---|
| | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` |
| ZNF264 | ZNF264 | 0.0001808018 | 1 | 0.0003616035 |
| AKAP13 | AKAP13 | 0.0002141081 | 1 | 0.0004282162 |
| SVOP | SVOP | 0.0003608823 | 1 | 0.0007217645 |
| UBE2QL1 | UBE2QL1 | 0.0004272857 | 1 | 0.0008545713 |
| RAB29 | RAB29 | 0.0004315235 | 1 | 0.000863047 |
| AGK | AGK | 0.0005307621 | 1 | 0.0010615242 |


In [18]:
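# repeat the aggregation without the exact argument
agglist2 <- aggregateRanks(rmat = rankMatrix(genelist, full = TRUE), method = "RRA")
# NOTE: both adjusted columns are computed from agglist$Score (the exact = TRUE run),
# not from agglist2$Score; use agglist2$Score here to adjust the new scores instead
agglist2$adjP.Val <- p.adjust(agglist$Score, method = "bonferroni")
agglist2$adjP.Val2 <- agglist$Score * 2
head(agglist2)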

| | Name | Score | adjP.Val | adjP.Val2 |
|---|---|---|---|---|
| | `<chr>` | `<dbl>` | `<dbl>` | `<dbl>` |
| ZNF264 | ZNF264 | 0.0001816654 | 1 | 0.0003616035 |
| AKAP13 | AKAP13 | 0.0002152215 | 1 | 0.0004282162 |
| SVOP | SVOP | 0.0003633226 | 1 | 0.0007217645 |
| UBE2QL1 | UBE2QL1 | 0.0004304315 | 1 | 0.0008545713 |
| RAB29 | RAB29 | 0.0004347164 | 1 | 0.000863047 |
| AGK | AGK | 0.000535121 | 1 | 0.0010615242 |


In [10]:
# rank the final list using the rank() function from base R
# (named inside cbind() so the base function is not masked by a variable called rank)
ranked <- cbind(rank = rank(agglist$Score, na.last = "keep", ties.method = "min"), agglist)
head(ranked)

| | rank | Name | Score |
|---|---|---|---|
| | `<int>` | `<chr>` | `<dbl>` |
| ZNF264 | 1 | ZNF264 | 0.0001816654 |
| AKAP13 | 2 | AKAP13 | 0.0002152215 |
| SVOP | 3 | SVOP | 0.0003633226 |
| UBE2QL1 | 4 | UBE2QL1 | 0.0004304315 |
| RAB29 | 5 | RAB29 | 0.0004347164 |
| AGK | 6 | AGK | 0.000535121 |


In [None]:
# If some ranks are NA, set them to one more than the position of the last non-NA rank
# find the row index of the last non-NA rank
NonNAindex <- which(!is.na(ranked$rank))
lastNonNA <- max(NonNAindex)
lastNonNA
# set every NA rank to that index + 1
ranked$rank[is.na(ranked$rank)] <- lastNonNA + 1

In [15]:
dir.create("/mnt/data/IntegrativeAnalysis/output", recursive = TRUE)

In [16]:
write.table(ranked,"/mnt/data/IntegrativeAnalysis/output/RRAresult")