# Experiments with 187 cells

## Differences between Epithelial Tissues

We would like to see if there are significant differences between the distributions followed by these variables in each of the epithelial tissues. To do that, we first perform a Kruskal-Wallis test over the six tissues for each of the variables, and then a pairwise Dunn test. Since we repeat this test over 22 variables, we have fixed the p-value threshold at 0.01.

#### Kruskal-Wallis p-values

We calculate the p-values from the Kruskal-Wallis test. First, we load the frame

In [1]:
require("dunn.test")
cellNumber <- 187
load(paste("../2_variables/frames",as.character(cellNumber), "cells_frame.RData", sep="/"))
colNames <- colnames(cells_frame)

Loading required package: dunn.test


and save the p-values in an array

In [2]:
# Kruskal-Wallis p-value of every variable (column 1 is `type`, so we start at 2)
KWpvalue <- c()
for (j in seq(2, length(cells_frame))) {
  colName <- colNames[j]
  kwAux <- kruskal.test(cells_frame[, colName] ~ cells_frame$type)$p.value
  KWpvalue <- c(KWpvalue, kwAux)
}

The variables with a p-value smaller than or equal to 0.01 are

In [3]:
#Note the first entry in colNames is type and we do not want to take it.
colNames[c(FALSE,KWpvalue<=0.01)]
validCols <- colNames[c(FALSE, KWpvalue<=0.01)]

The variables with a p-value greater than 0.01 are

In [4]:
#Note the first entry in colNames is type and we do not want to take it.
colNames[c(FALSE,KWpvalue>0.01)]

There are no variables with a p-value greater than 0.01.
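
As a compact alternative to cells 2 to 4, the p-values can be kept in a named vector, so that no positional `c(FALSE, ...)` offset is needed. The sketch below runs on a small synthetic frame; `toy_frame`, its column names and the random data are assumptions made for illustration, not the real `cells_frame`.

```r
# Synthetic stand-in for cells_frame: first column `type`, then numeric variables.
set.seed(1)
toy_frame <- data.frame(
  type = rep(c("A", "B", "C"), each = 20),
  var.shift = c(rnorm(20, 0), rnorm(20, 3), rnorm(20, 6)),  # groups clearly differ
  var.flat  = rnorm(60)                                      # same distribution in all groups
)

# One Kruskal-Wallis p-value per variable, named after the variable.
variables <- colnames(toy_frame)[-1]
kw_pvalues <- sapply(variables, function(v) {
  kruskal.test(toy_frame[[v]], factor(toy_frame$type))$p.value
})

alpha <- 0.01
valid_cols    <- names(kw_pvalues)[kw_pvalues <= alpha]  # analogue of validCols
rejected_cols <- names(kw_pvalues)[kw_pvalues >  alpha]  # analogue of cell 4

print(kw_pvalues)
print(valid_cols)
```

Because the vector carries the variable names, the selection stays correct even if the column order of the frame changes.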

#### Dunn Test

We run the Dunn test to compare the tissues pairwise.

In [5]:

colNames <- colnames(cells_frame)
tissues <- c("cNT", "dWL", "dWP", "cEE", "dNP")

# Run the Dunn test for one variable over the five epithelial tissues
runDunn <- function(colName) {
  x <- lapply(tissues, function(tissue) cells_frame[cells_frame$type == tissue, colName])
  w <- unlist(x)
  g <- factor(rep(seq_along(tissues), lengths(x)), labels = tissues)
  dunn.test(w, g)
}

# We perform the first Dunn test before the loop to set the row names
P <- runDunn(colNames[2])
Pvalues <- data.frame(P$P.adjusted)
colnames(Pvalues) <- c(colNames[2])
row.names(Pvalues) <- P[["comparisons"]]

for (j in seq(3, length(cells_frame))) {
  colName <- colNames[j]
  P <- runDunn(colName)
  Pvalues[colName] <- P$P.adjusted
}

  Kruskal-Wallis rank sum test

data: w and g
Kruskal-Wallis chi-squared = 50.5219, df = 4, p-value = 0


                             Comparison of w by g                              
                                (No adjustment)                                
Col Mean-|
Row Mean |        cEE        cNT        dNP        dWL
---------+--------------------------------------------
     cNT |   2.189594
         |    0.0143*
         |
     dNP |   4.392217   2.463364
         |    0.0000*    0.0069*
         |
     dWL |   5.605819   3.635652   0.944833
         |    0.0000*    0.0001*     0.1724
         |
     dWP |   5.888650   3.906638   1.153480   0.207458
         |    0.0000*    0.0000*     0.1244     0.4178

alpha = 0.05
Reject Ho if p <= alpha/2
  Kruskal-Wallis rank sum test

data: w and g
Kruskal-Wallis chi-squared = 32.4956, df = 4, p-value = 0


                             Comparison of w by g                              
                                (No adjustment
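
The header `(No adjustment)` and the rule `Reject Ho if p <= alpha/2` in the output above indicate that the reported p-values are one-sided and carry no multiple-comparison adjustment. As a rough illustration of where such a value comes from, the following base R sketch computes pairwise z statistics from mean ranks. It deliberately omits the tie correction, so its numbers may differ from those of `dunn.test` when ties are present; `dunn_sketch` and the synthetic data are assumptions made for this example.

```r
# Hypothetical helper: pairwise Dunn-style z statistics and one-sided p-values,
# WITHOUT the tie correction that a full implementation applies.
dunn_sketch <- function(x, g) {
  g <- factor(g)
  r <- rank(x)                              # ranks over the pooled sample
  n <- sapply(split(r, g), length)          # group sizes
  mean_rank <- sapply(split(r, g), mean)    # mean rank per group
  N <- length(x)
  pairs <- combn(levels(g), 2)              # all unordered pairs of groups
  z <- apply(pairs, 2, function(p) {
    se <- sqrt(N * (N + 1) / 12 * (1 / n[[p[1]]] + 1 / n[[p[2]]]))
    (mean_rank[[p[1]]] - mean_rank[[p[2]]]) / se
  })
  data.frame(
    comparison = apply(pairs, 2, paste, collapse = " - "),
    z = z,
    p.one.sided = pnorm(abs(z), lower.tail = FALSE),  # P(Z >= |z|)
    stringsAsFactors = FALSE
  )
}

set.seed(2)
x <- c(rnorm(15, 0), rnorm(15, 0), rnorm(15, 4))  # synthetic data, third group shifted
g <- rep(c("g1", "g2", "g3"), each = 15)
res <- dunn_sketch(x, g)
print(res)
```

A one-sided p-value of this kind is at most 0.5, which is why it is compared against `alpha/2` rather than `alpha` in the printed rule.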

We only consider the variables with a p-value smaller than or equal to 0.01 in the Kruskal-Wallis test

In [6]:
cols <- colnames(Pvalues)%in%validCols
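
The per-comparison selections in the cells that follow all share one pattern: keep the columns that passed the Kruskal-Wallis filter and whose Dunn p-value reaches the threshold. A minimal sketch of that pattern as a reusable function is given here; `significant_vars`, `toy_pvalues` and `toy_valid` are hypothetical names, and the numbers are made up.

```r
# Synthetic stand-in for the Pvalues frame: rows are comparisons, columns are variables.
toy_pvalues <- data.frame(
  var.a = c(0.002, 0.300),
  var.b = c(0.040, 0.008),
  var.c = c(0.001, 0.500),
  row.names = c("A - B", "A - C")
)
toy_valid <- c("var.a", "var.b")  # pretend var.c failed the Kruskal-Wallis filter

# Hypothetical helper: variables that pass the filter AND reach the threshold
# for one comparison. drop = FALSE keeps a data frame even with 0 or 1 columns.
significant_vars <- function(pvalues, comparison, valid, alpha = 0.01) {
  keep <- colnames(pvalues) %in% valid & unlist(pvalues[comparison, ]) <= alpha
  pvalues[comparison, keep, drop = FALSE]
}

print(significant_vars(toy_pvalues, "A - B", toy_valid))
print(significant_vars(toy_pvalues, "A - C", toy_valid))
```

The `drop = FALSE` argument matters: when exactly one column is selected without it, R returns a bare number, and `colnames()` of that result is `NULL`.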

##### Summary:

- cEE is differentiated from cNT only by the network distribution, and from dNP, dWL and dWP by both network and centroids.
- cNT is differentiated from cEE and dNP only by the network distribution, and from dWL and dWP by both network and centroids.
- dWL is differentiated from dNP by centroids (but only 1 variable), and from cNT and cEE by both network and centroids. There are no significant differences with dWP.
- dWP is differentiated from dNP by centroids (but only 1 variable), and from cNT and cEE by both network and centroids. It cannot be differentiated from dWL.
- dNP is differentiated from cNT by the network distribution, from cEE by both network and centroids, and from dWL and dWP by centroids (but only 1 variable).

##### cEE vs cNT

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [7]:
Pvalues["cEE - cNT",cols & Pvalues["cEE - cNT",]<=0.01]

Unnamed: 0,PE.1.sub,landscape.0.5.sub,lan.1.05.sub,max.0.1.1.sub,max.0.1.2.sub,max.0.1.3.sub,max.0.2.1.sub,max.0.2.2.sub,PE.0.sup
cEE - cNT,0.00215293,0.007126708,5.253052e-05,0.00696307,0.0005727106,0.003283421,0.001635052,0.006643707,0.007881721


We can find differences only in the contact network

##### cEE vs dNP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [8]:
colnames(Pvalues["cEE - dNP",cols & Pvalues["cEE - dNP",]<=0.01])

We can find differences in both the contact network and the centroid distribution

##### cNT vs dNP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [9]:
colnames(Pvalues["cNT - dNP",cols & Pvalues["cNT - dNP",]<=0.01])

We can find differences only in the contact network

##### cEE vs dWL

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [10]:
colnames(Pvalues["cEE - dWL",cols & Pvalues["cEE - dWL",]<=0.01])

We can find differences in both the contact network and the centroid distribution

##### cNT vs dWL

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [11]:
colnames(Pvalues["cNT - dWL",cols & Pvalues["cNT - dWL",]<=0.01])

We can find differences in both the contact network and the centroid distribution

##### dNP vs dWL

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [12]:
Pvalues["dNP - dWL",cols & Pvalues["dNP - dWL",]<=0.01]

Unnamed: 0,LEN.15,LEN.05,LEN.10
dNP - dWL,0.007626408,0.004730528,0.009432594


We can differentiate dNP from dWL only by LEN.10

##### cEE vs dWP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [13]:
colnames(Pvalues["cEE - dWP",cols & Pvalues["cEE - dWP",]<=0.01])

We can find differences in both the contact network and the centroid distribution

##### cNT vs dWP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [14]:
colnames(Pvalues["cNT - dWP",cols & Pvalues["cNT - dWP",]<=0.01])

We can find differences in both the contact network and the centroid distribution

##### dNP vs dWP

The variables with a Dunn test p-value smaller than or equal to 0.01 are

In [15]:
Pvalues["dNP - dWP",cols & Pvalues["dNP - dWP",]<=0.01]

Unnamed: 0,PE.0.rips,len.5,LEN.05,MAX.0.1.20.rips
dNP - dWP,0.009961491,0.00560102,0.005518176,0.008657976


We can differentiate dNP from dWP only by len.5

##### dWL vs dWP

The variables with a Dunn test p-value smaller than 0.01 are

In [16]:
Pvalues["dWL - dWP",cols & Pvalues["dWL - dWP",]<0.01]

In [17]:
Pvalues["dWL - dWP",cols & Pvalues["dWL - dWP",]<0.05]

Unnamed: 0,lan.0.03.sup,lan.1.02.sup,MAX.1.2.02.sup,MAX.1.2.02.sup.1
dWL - dWP,0.007341815,0.03575871,0.04358183,0.04358183


We cannot find differences between dWL and dWP

## Differences between CVT path and Epithelial tissues

Some steps of the CVT paths are thought to represent well the contact network of some epithelial tissues.
In particular, cNT is similar to CVT001, dWL is similar to CVT004 and dWP is similar to CVT005.

We will perform Mann-Whitney U tests to see if we can find significant differences in their contact network and centroid distribution with our method.

In [18]:
variables <- colnames(cells_frame)
variables <- variables[seq(2,length(variables))]
CVTvsEPI <- data.frame(variables)

#calculate p-values of the Mann-Whitney U test
aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'cNT', colName]
  cvt = cells_frame[cells_frame$type == 'CVT001', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["001vsCNT"] <- aux

aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'dWL', colName]
  cvt = cells_frame[cells_frame$type == 'CVT004', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["004vsDWL"] <- aux

aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'dWP', colName]
  cvt = cells_frame[cells_frame$type == 'CVT005', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["005vsDWP"] <- aux
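
The three loops above differ only in the pair of types being compared. As a sketch of how that repetition could be factored out, the following self-contained example wraps the Mann-Whitney U test in a function and applies it to a synthetic frame; `mw_pvalues`, `toy_frame` and the type labels `"tissue"` and `"cvt"` are hypothetical and stand in for the real data.

```r
# Synthetic stand-in for cells_frame with one tissue and one CVT step.
set.seed(3)
toy_frame <- data.frame(
  type = rep(c("tissue", "cvt"), each = 25),
  var.shift = c(rnorm(25, 0), rnorm(25, 3)),  # clearly different groups
  var.flat  = rnorm(50)                       # same distribution in both groups
)

# Hypothetical helper: Mann-Whitney U p-value of every variable for two types.
mw_pvalues <- function(frame, type_a, type_b, variables) {
  sapply(variables, function(v) {
    a <- frame[frame$type == type_a, v]
    b <- frame[frame$type == type_b, v]
    wilcox.test(a, b, exact = FALSE)$p.value
  })
}

variables <- colnames(toy_frame)[-1]
toy_result <- data.frame(variables, stringsAsFactors = FALSE)
toy_result$cvtVsTissue <- unname(mw_pvalues(toy_frame, "tissue", "cvt", variables))
print(toy_result)
```

With such a helper, each of the three comparisons would reduce to a single call with a different pair of type labels.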

##### Summary:
cNT and dWP can be differentiated from CVT001 and CVT005 respectively by both network and centroid distribution. dWL can be differentiated from CVT004 only by centroid distribution.

##### cNT vs CVT001

These are the variables for which the p-value of the Mann-Whitney U test is smaller than or equal to 0.01

In [19]:
as.vector(CVTvsEPI[CVTvsEPI[, "001vsCNT"]<=0.01, c("variables", "001vsCNT")])

Unnamed: 0,variables,001vsCNT
1,PE.0.sub,0.0032315853
42,PE.0.rips,0.0071426558
45,len.5,0.00069764
46,LEN.15,0.0007831876
47,LEN.20,0.0013767069
50,LEN.05,0.0017137487
54,MAX.0.1.05.rips,0.0008783885
55,MAX.0.1.10.rips,0.0009842247
56,MAX.0.1.20.rips,0.0007831876
57,MAX.0.1.50.rips,0.0007831876


We can find differences in both the contact network and the centroid distribution

##### dWL vs CVT004

These are the variables for which the p-value of the Mann-Whitney U test is smaller than or equal to 0.01

In [20]:
as.vector(CVTvsEPI[CVTvsEPI[, "004vsDWL"]<=0.01, c("variables", "004vsDWL")])

Unnamed: 0,variables,004vsDWL
42,PE.0.rips,0.0088793
43,len.1,0.001829083
49,max.0.1.1.rips,0.001829083
52,max.0.1.2.rips,0.003537936
53,max.0.1.3.rips,0.003177739
54,MAX.0.1.05.rips,0.009785073


##### dWP vs CVT005

These are the variables for which the p-value of the Mann-Whitney U test is smaller than or equal to 0.01

In [21]:
as.vector(CVTvsEPI[CVTvsEPI[, "005vsDWP"]<=0.01, c("variables", "005vsDWP")])

Unnamed: 0,variables,005vsDWP
12,max.0.1.1.sub,0.008542651
42,PE.0.rips,9.624174e-05
43,len.1,0.0003416019
44,len.2,0.002125246
46,LEN.15,0.004381588
47,LEN.20,0.002363332
49,max.0.1.1.rips,0.0003416019
52,max.0.1.2.rips,0.0005519762
53,max.0.1.3.rips,0.001713749