# Experiments with 257 cells

## Differences between Epithelial Tissues

We would like to see whether the distributions of these variables differ significantly across the epithelial tissues. To do so, we first perform a Kruskal-Wallis test over the six tissues for each variable, and then a pairwise Dunn test. Since we repeat this test over 22 variables, we have fixed the p-value threshold at 0.01.
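
As a minimal sketch of the first step on made-up data, the global test in base R could look like the following. The groups `A`, `B`, `C` and the simulated values are hypothetical and are not the tissues analysed in this notebook.

```r
# Toy data: three hypothetical groups, one variable (not the real tissues).
set.seed(1)
toy <- data.frame(
  type  = factor(rep(c("A", "B", "C"), each = 20)),
  value = c(rnorm(20, mean = 0), rnorm(20, mean = 0), rnorm(20, mean = 3))
)

alpha <- 0.01  # same threshold as the one fixed in the text

# Global test: do all groups follow the same distribution?
kw <- kruskal.test(value ~ type, data = toy)
kw_significant <- kw$p.value <= alpha

# Only when the global test rejects does it make sense to ask
# which pairs of groups differ (the pairwise Dunn test).
print(kw$p.value)
print(kw_significant)
```

The pairwise step is only informative for variables where `kw_significant` is `TRUE`.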

#### Kruskal-Wallis p-values

We compute the p-values of the Kruskal-Wallis test. First, we load the data frame

In [1]:
require("dunn.test")
cellNumber <- 257
load(paste("../2_variables/frames",as.character(cellNumber), "cells_frame.RData", sep="/"))
colNames <- colnames(cells_frame)

Loading required package: dunn.test


and save the p-values in a vector

In [2]:
KWpvalue <- c()
# Skip column 1 (type); test every remaining variable against the tissue type
for (j in seq(2, length(cells_frame))){
  colName <- colNames[j]
  kwAux <- kruskal.test(cells_frame[,colName] ~ cells_frame$type)$p.value
  KWpvalue <- c(KWpvalue, kwAux)
}
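
An alternative sketch of the same computation keeps the p-values in a named vector, so that each value carries its variable name. The data frame `toy_frame` and the variables `v1` and `v2` are made up for illustration; only the layout (group label in a `type` column, variables in the remaining columns) is assumed to match `cells_frame`.

```r
# Toy stand-in for cells_frame: one column with the group label,
# the remaining columns are variables (names are hypothetical).
set.seed(2)
toy_frame <- data.frame(
  type = factor(rep(c("A", "B", "C"), each = 15)),
  v1   = c(rnorm(15, 0), rnorm(15, 0), rnorm(15, 4)),
  v2   = rnorm(45)
)

# One Kruskal-Wallis p-value per variable, stored as a named vector.
var_names <- setdiff(colnames(toy_frame), "type")
kw_p <- sapply(var_names, function(v) {
  kruskal.test(toy_frame[[v]], toy_frame$type)$p.value
})

# Names travel with the values, so no leading FALSE is needed
# to skip the 'type' column when filtering.
print(names(kw_p)[kw_p <= 0.01])
```

With a named vector, filtering by threshold returns the variable names directly.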

The variables with a p-value smaller than or equal to 0.01 are

In [3]:
#Note the first entry in colNames is type and we do not want to take it.
colNames[c(FALSE,KWpvalue<=0.01)]
validCols <- colNames[c(FALSE, KWpvalue<=0.01)]

The variables with a p-value greater than 0.01 are

In [4]:
#Note the first entry in colNames is type and we do not want to take it.
colNames[c(FALSE,KWpvalue>0.01)]

There are no variables with a p-value greater than 0.01.

#### Dunn Test

We run the Dunn test to compare the tissues pairwise.

In [5]:

colNames <- colnames(cells_frame)

# we perform the first dunn test before the loop to set the row names
colName <- colNames[2]
x1 <- cells_frame[cells_frame$type == 'cNT', colName]
x2 <- cells_frame[cells_frame$type == 'dWL', colName]
x3 <- cells_frame[cells_frame$type == 'dWP', colName]
x4 <- cells_frame[cells_frame$type == 'dNP', colName]
w <- c(x1, x2, x3, x4)
g <- factor(rep(1:4, c(length(x1),length(x2),length(x3),length(x4))),
            labels = c("cNT","dWL","dWP",'dNP'))

P <- dunn.test(w, g)

Pvalues <- data.frame(P$P.adjusted)
colnames(Pvalues) <- c(colNames[2])
row.names(Pvalues) <- P[["comparisons"]]

for (j in seq(3,length(cells_frame))){
  colName <- colNames[j]
  x1 <- cells_frame[cells_frame$type == 'cNT', colName]
  x2 <- cells_frame[cells_frame$type == 'dWL', colName]
  x3 <- cells_frame[cells_frame$type == 'dWP', colName]
  x4 <- cells_frame[cells_frame$type == 'dNP', colName]
  w <- c(x1, x2, x3, x4)
  g <- factor(rep(1:4, c(length(x1),length(x2),length(x3),length(x4))),
              labels = c("cNT","dWL","dWP",'dNP'))
    
  P <- dunn.test(w, g)
  Pvalues[colName] <- P$P.adjusted
}

  Kruskal-Wallis rank sum test

data: w and g
Kruskal-Wallis chi-squared = 36.491, df = 3, p-value = 0


                             Comparison of w by g                              
                                (No adjustment)                                
Col Mean-|
Row Mean |        cNT        dNP        dWL
---------+---------------------------------
     dNP |   3.760775
         |    0.0001*
         |
     dWL |   4.322750   0.303168
         |    0.0000*     0.3809
         |
     dWP |   5.738061   1.551637   1.322003
         |    0.0000*     0.0604     0.0931

alpha = 0.05
Reject Ho if p <= alpha/2
  Kruskal-Wallis rank sum test

data: w and g
Kruskal-Wallis chi-squared = 16.2534, df = 3, p-value = 0


                             Comparison of w by g                              
                                (No adjustment)                                
Col Mean-|
Row Mean |        cNT        dNP        dWL
---------+---------------------------------
     dNP | 
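
For reference, z statistics and p-values like the ones printed above can be reproduced, under the usual definition of Dunn's test, with a few lines of base R. The function `dunn_sketch` and the toy data are hypothetical illustrations and are not part of the `dunn.test` package. The sketch assumes one-sided, unadjusted p-values, which is what the "Reject Ho if p <= alpha/2" note in the output suggests. The sign of z depends on the order in which the two groups are subtracted.

```r
# Sketch of Dunn's z statistic for every pair of groups, base R only.
# x: numeric values, g: group labels.
dunn_sketch <- function(x, g) {
  g <- factor(g)
  n_total <- length(x)
  r <- rank(x)                              # mid-ranks over the pooled sample
  mean_rank <- sapply(split(r, g), mean)    # mean rank per group
  n_group   <- sapply(split(r, g), length)  # sample size per group

  # Tie correction: sum of (t^3 - t) over sets of tied values.
  ties <- as.numeric(table(x))
  tie_term <- sum(ties^3 - ties) / (12 * (n_total - 1))
  base_var <- n_total * (n_total + 1) / 12 - tie_term

  pairs <- combn(levels(g), 2)
  z <- apply(pairs, 2, function(p) {
    (mean_rank[[p[1]]] - mean_rank[[p[2]]]) /
      sqrt(base_var * (1 / n_group[[p[1]]] + 1 / n_group[[p[2]]]))
  })
  data.frame(
    comparison  = paste(pairs[1, ], pairs[2, ], sep = " - "),
    z           = z,
    p_one_sided = pnorm(abs(z), lower.tail = FALSE)  # unadjusted
  )
}

# Toy data with made-up group labels.
set.seed(3)
toy_x <- c(rnorm(12, 0), rnorm(12, 0.2), rnorm(12, 3))
toy_g <- rep(c("A", "B", "C"), each = 12)
res <- dunn_sketch(toy_x, toy_g)
print(res)
```

Each row of `res` corresponds to one pairwise comparison, in the same "group - group" naming style used for the row names of `Pvalues`.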

We only consider the variables with a p-value smaller than or equal to 0.01 in the Kruskal-Wallis test

In [6]:
cols <- colnames(Pvalues)%in%validCols
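
The selection pattern used in the following cells combines this column mask with a per-row threshold. A minimal sketch on a made-up p-value table (the names `toy_p`, `toy_valid` and the numbers are hypothetical) could look like this:

```r
# Toy p-value table: one row per comparison, one column per variable
# (all names and numbers are made up for illustration).
toy_p <- data.frame(
  v1 = c(0.001, 0.200),
  v2 = c(0.004, 0.003),
  v3 = c(0.500, 0.002),
  row.names = c("A - B", "A - C")
)
toy_valid <- c("v1", "v3")   # variables that passed the global test

# Logical mask over columns: TRUE where the column is a valid variable.
keep <- colnames(toy_p) %in% toy_valid

# Combine the mask with the per-row threshold; drop = FALSE keeps a
# data frame even when a single column (or none) survives.
row_name <- "A - B"
below <- unlist(toy_p[row_name, ]) <= 0.01
hits <- toy_p[row_name, keep & below, drop = FALSE]
print(hits)
```

Here `v2` is below the threshold for "A - B" but is excluded because it is not in `toy_valid`.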

##### Summary:

- cEE is differentiated from cNT only by the network distribution, and from dNP, dWL and dWP by both network and centroids.
- cNT is differentiated from cEE and dNP only by the network distribution, and from dWL and dWP by both network and centroids.
- dWL is differentiated from dNP by centroids, and from cNT and cEE by both network and centroids. There are no significant differences with dWP.
- dWP is differentiated from dNP by centroids, and from cNT and cEE by both network and centroids. It cannot be differentiated from dWL.
- dNP is differentiated from cNT by the network distribution, from cEE by both network and centroids, and from dWL and dWP by centroids.

##### cNT vs dNP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [7]:
Pvalues["cNT - dNP",cols & Pvalues["cNT - dNP",]<=0.01]

Unnamed: 0,PE.0.sub,PE.1.sub,landscape.0.2.sub,landscape.0.5.sub,lan.0.05.sub,lan.0.10.sub,lan.0.15.sub,landscape.1.1.sub,landscape.1.2.sub,max.0.1.2.sub,...,max.0.1.1.sup,max.0.1.2.sup,MAX.0.1.02.sup,max.0.2.1.sup,max.0.2.2.sup,MAX.0.2.02.sup,max.1.1.1.sup,max.1.2.1.sup,MAX.1.2.02.sup,MAX.1.2.02.sup.1
cNT - dNP,8.469372e-05,0.0002993798,1.124005e-05,1.410087e-05,1.487277e-07,2.208937e-08,2.215424e-07,0.0002349069,1.546052e-06,0.0003886114,...,4.063769e-05,5.544999e-05,0.0001895261,8.225259e-05,0.0002302664,0.001621564,1.328867e-06,1.060719e-06,1.056985e-05,1.056985e-05


We can find differences only in the contact network

##### cNT vs dWL

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [8]:
Pvalues["cNT - dWL",cols & Pvalues["cNT - dWL",]<=0.01]

Unnamed: 0,PE.0.sub,PE.1.sub,landscape.0.2.sub,landscape.0.5.sub,lan.0.05.sub,lan.0.10.sub,lan.0.15.sub,landscape.1.1.sub,landscape.1.2.sub,max.0.1.2.sub,...,MAX.1.2.02.sup.1,PE.0.rips,LEN.15,LEN.20,LEN.05,LEN.10,MAX.0.1.05.rips,MAX.0.1.10.rips,MAX.0.1.20.rips,MAX.0.1.50.rips
cNT - dWL,7.70479e-06,0.001493112,9.368157e-05,3.256924e-05,2.032333e-06,8.548197e-06,3.661022e-07,5.421652e-06,1.88818e-05,0.007136355,...,0.0003625917,3.65105e-06,1.001134e-05,2.303996e-06,0.0003754792,2.327925e-05,0.003111752,0.0004099832,7.59052e-05,5.877382e-06


We can find differences in both the contact network and the centroid distribution

##### dNP vs dWL

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [9]:
Pvalues["dNP - dWL",cols & Pvalues["dNP - dWL",]<=0.01]

Unnamed: 0,PE.0.rips,LEN.15,LEN.20,LEN.10,MAX.0.1.20.rips,MAX.0.1.50.rips
dNP - dWL,0.005653788,0.001055225,0.0006589938,0.003109631,0.00642412,0.003868547


We can find differences between dNP and dWL only in the centroid distribution

##### cNT vs dWP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [10]:
Pvalues["cNT - dWP",cols & Pvalues["cNT - dWP",]<=0.01]

Unnamed: 0,PE.0.sub,PE.1.sub,landscape.0.2.sub,landscape.0.5.sub,lan.0.05.sub,lan.0.10.sub,lan.0.15.sub,landscape.1.1.sub,landscape.1.2.sub,max.0.1.2.sub,...,LEN.20,max.0.1.1.rips,LEN.05,LEN.10,max.0.1.2.rips,max.0.1.3.rips,MAX.0.1.05.rips,MAX.0.1.10.rips,MAX.0.1.20.rips,MAX.0.1.50.rips
cNT - dWP,4.788309e-09,0.0005718349,2.392783e-07,2.884914e-05,2.670933e-06,4.614546e-09,9.071409e-09,0.0001071057,1.10954e-08,0.0006216042,...,7.015795e-06,0.001118477,6.714447e-05,7.352085e-06,0.0007098283,0.000513078,0.0001013876,3.870869e-05,1.338863e-05,2.827145e-06


We can find differences in both the contact network and the centroid distribution

##### dNP vs dWP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [11]:
Pvalues["dNP - dWP",cols & Pvalues["dNP - dWP",]<=0.01]

Unnamed: 0,PE.0.rips,len.5,LEN.15,LEN.20,LEN.05,LEN.10,MAX.0.1.05.rips,MAX.0.1.10.rips,MAX.0.1.20.rips,MAX.0.1.50.rips
dNP - dWP,0.005090596,0.003061383,0.0007540823,0.00150532,0.00594121,0.001568703,0.00649845,0.004817346,0.002213953,0.00272424


We can find differences between dNP and dWP only in the centroid distribution

##### dWL vs dWP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [20]:
Pvalues["dWL - dWP",cols & Pvalues["dWL - dWP",]<=0.01]

dWL - dWP


We cannot find differences between dWL and dWP

## Differences between CVT path and Epithelial tissues

Some steps of the CVT paths are thought to represent the contact network of some epithelial tissues well.
In particular, cNT is similar to CVT001, dWL is similar to CVT004 and dWP is similar to CVT005.

We will perform Mann-Whitney U tests to see if we can find significant differences in their contact network and centroid distribution with our method.

In [13]:
variables <- colnames(cells_frame)
variables <- variables[seq(2,length(variables))]
CVTvsEPI <- data.frame(variables)

#calculate p-values of the Mann-Whitney U test
aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'cNT', colName]
  cvt = cells_frame[cells_frame$type == 'CVT001', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["001vsCNT"] <- aux

aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'dWL', colName]
  cvt = cells_frame[cells_frame$type == 'CVT004', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["004vsDWL"] <- aux

aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'dWP', colName]
  cvt = cells_frame[cells_frame$type == 'CVT005', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["005vsDWP"] <- aux
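
The three loops above follow the same pattern, which could be sketched as a small helper. The function `mw_pvalues`, the data frame `toy_cells` and its labels are hypothetical names introduced for illustration; only the presence of a `type` column is assumed to match `cells_frame`.

```r
# Hypothetical helper: Mann-Whitney U p-value for every variable,
# comparing two groups identified by their label in the 'type' column.
mw_pvalues <- function(frame, group_a, group_b, variables) {
  sapply(variables, function(v) {
    a <- frame[frame$type == group_a, v]
    b <- frame[frame$type == group_b, v]
    wilcox.test(a, b, exact = FALSE)$p.value
  })
}

# Toy data with made-up labels and variables.
set.seed(4)
toy_cells <- data.frame(
  type = rep(c("epi", "cvt"), each = 15),
  v1   = c(rnorm(15, 0), rnorm(15, 4)),
  v2   = rnorm(30)
)
toy_vars <- c("v1", "v2")

# Same table layout as CVTvsEPI: one row per variable, one column per comparison.
toy_result <- data.frame(variables = toy_vars)
toy_result["cvt_vs_epi"] <- mw_pvalues(toy_cells, "epi", "cvt", toy_vars)
print(toy_result)
```

With such a helper, each comparison becomes a single call, which makes it harder to mix up tissue and CVT labels between blocks.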

##### Summary:
cNT can be differentiated from CVT001 by both network and centroid distribution. dWL and dWP can be differentiated from CVT004 and CVT005, respectively, only by the centroid distribution.

##### cNT vs CVT001

These are the variables for which the p-value of the Mann-Whitney U test is smaller than or equal to 0.01

In [14]:
as.vector(CVTvsEPI[CVTvsEPI[, "001vsCNT"]<=0.01, c("variables", "001vsCNT")])

Unnamed: 0,variables,001vsCNT
1,PE.0.sub,0.0005519762
6,lan.0.10.sub,0.003515385
7,lan.0.15.sub,8.710074e-05
18,MAX.0.2.10.sub,0.003524482
42,PE.0.rips,0.003231585
44,len.2,0.001713749
45,len.5,0.001713749
46,LEN.15,0.0004902788
47,LEN.20,0.001713749
50,LEN.05,0.004381588


We can find differences in both the contact network and the centroid distribution

##### dWL vs CVT004

These are the variables for which the p-value of the Mann-Whitney U test is smaller than or equal to 0.01

In [15]:
as.vector(CVTvsEPI[CVTvsEPI[, "004vsDWL"]<=0.01, c("variables", "004vsDWL")])

Unnamed: 0,variables,004vsDWL
42,PE.0.rips,0.003537936
43,len.1,0.002555707
44,len.2,0.003537936
45,len.5,0.003537936
49,max.0.1.1.rips,0.002555707
52,max.0.1.2.rips,0.001632705
53,max.0.1.3.rips,0.003934908
54,MAX.0.1.05.rips,0.005959526


##### dWP vs CVT005

These are the variables for which the p-value of the Mann-Whitney U test is smaller than or equal to 0.01

In [16]:
as.vector(CVTvsEPI[CVTvsEPI[, "005vsDWP"]<=0.01, c("variables", "005vsDWP")])

Unnamed: 0,variables,005vsDWP
42,PE.0.rips,1.852999e-05
46,LEN.15,0.001232172
47,LEN.20,0.0002359686
51,LEN.10,0.001536739
52,max.0.1.2.rips,0.009469782
53,max.0.1.3.rips,0.006489705
55,MAX.0.1.10.rips,0.009469782
56,MAX.0.1.20.rips,0.001101765
57,MAX.0.1.50.rips,0.00018351
