# Experiments with 257 cells

## Differences between Epithelial Tissues

We would like to see whether the distributions of these variables differ significantly between the epithelial tissues. To do so, we first perform a Kruskal-Wallis test over the six tissues for each variable, and then a pairwise Dunn test. Since we repeat this test over 22 variables, we fix the p-value threshold at 0.01.
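As a reminder of what the Kruskal-Wallis test measures, the sketch below recomputes its statistic by hand from the ranks and compares it with `kruskal.test`. The data are synthetic and the names `x`, `g`, `H` and `p` are made up for the illustration; the hand formula assumes there are no ties, which holds for continuous random draws.

```r
# Synthetic data: three groups of 10 values each (not taken from cells_frame).
set.seed(3)
x <- c(rnorm(10, 0), rnorm(10, 1), rnorm(10, 2))
g <- factor(rep(c("A", "B", "C"), each = 10))

N <- length(x)
r <- rank(x)  # ranks over the pooled sample

# H = 12 / (N (N + 1)) * sum_i (R_i^2 / n_i) - 3 (N + 1), valid without ties.
H <- 12 / (N * (N + 1)) * sum(tapply(r, g, sum)^2 / tapply(r, g, length)) - 3 * (N + 1)

# Under the null hypothesis H is approximately chi-squared with k - 1 degrees of freedom.
p <- pchisq(H, df = nlevels(g) - 1, lower.tail = FALSE)

kw <- kruskal.test(x, g)
all.equal(H, unname(kw$statistic))
all.equal(p, kw$p.value)
```

Because the statistic only depends on ranks, the test makes no normality assumption about the variables.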

#### Kruskal-Wallis p-values

We calculate the p-values of the Kruskal-Wallis test. First, we load the data frame

In [1]:
require("dunn.test")
cellNumber <- 257
load(paste("../2_variables/frames",as.character(cellNumber), "cells_frame.RData", sep="/"))
colNames <- colnames(cells_frame)

Loading required package: dunn.test


and save the p-values in a vector

In [2]:
KWpvalue <- c()
for (j in seq(2, length(cells_frame))) {
  colName <- colNames[j]
  # p-value of the Kruskal-Wallis test of this variable across tissue types
  kwAux <- kruskal.test(cells_frame[, colName] ~ cells_frame$type)$p.value
  KWpvalue <- c(KWpvalue, kwAux)
}
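The same computation can be written without growing a vector inside a loop. The following is a minimal sketch on a synthetic stand-in for `cells_frame`; `toy_frame`, its column names and `kw_pvalues` are hypothetical and only mimic the layout used here (first column `type`, remaining columns numeric).

```r
# Synthetic stand-in for cells_frame: first column is the tissue type,
# the remaining columns are numeric variables (all names are hypothetical).
set.seed(1)
toy_frame <- data.frame(
  type = rep(c("A", "B", "C"), each = 20),
  var1 = c(rnorm(20, 0), rnorm(20, 3), rnorm(20, 6)),  # groups clearly shifted
  var2 = rnorm(60)                                      # same distribution in every group
)

# One Kruskal-Wallis p-value per variable, named after the column.
kw_pvalues <- sapply(toy_frame[-1], function(x) {
  kruskal.test(x, factor(toy_frame$type))$p.value
})

# Variables at or below the 0.01 threshold.
names(kw_pvalues)[kw_pvalues <= 0.01]
```

A named vector also removes the need to pad the logical index with a leading `FALSE` for the `type` column.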

The variables with a p-value less than or equal to 0.01 are

In [3]:
#Note the first entry in colNames is type and we do not want to take it.
colNames[c(FALSE,KWpvalue<=0.01)]
validCols <- colNames[c(FALSE, KWpvalue<=0.01)]

The variables with a p-value greater than 0.01 are

In [4]:
#Note the first entry in colNames is type and we do not want to take it.
colNames[c(FALSE,KWpvalue>0.01)]

There are no variables with a p-value greater than 0.01.

#### Dunn Test

We run the Dunn test to compare the tissues pairwise.

In [5]:

colNames <- colnames(cells_frame)
tissues <- c("cNT", "dWL", "dWP", "dNP")

# Run the Dunn test for one variable over the selected tissues.
# The grouping factor is built from the labels actually present: if one of
# the tissues has no rows in this frame, factor(rep(1:4, ...), labels = ...)
# fails with "invalid 'labels'; length 4 should be 1 or 3".
runDunn <- function(colName) {
  sel <- cells_frame$type %in% tissues
  w <- cells_frame[sel, colName]
  # keep the original group order and drop tissues with no rows
  g <- droplevels(factor(as.character(cells_frame$type[sel]), levels = tissues))
  dunn.test(w, g)
}

# we perform the first Dunn test before the loop to set the row names
P <- runDunn(colNames[2])
Pvalues <- data.frame(P$P.adjusted)
colnames(Pvalues) <- colNames[2]
row.names(Pvalues) <- P[["comparisons"]]

for (colName in colNames[-(1:2)]) {
  P <- runDunn(colName)
  Pvalues[colName] <- P$P.adjusted
}


We only consider the variables with a p-value less than or equal to 0.01 in the Kruskal-Wallis test

In [None]:
cols <- colnames(Pvalues)%in%validCols
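The per-pair selections in the subsections that follow all apply the same pattern: take one row of the p-value table and keep the columns that passed the Kruskal-Wallis filter and fall at or below the threshold. A small helper makes that explicit. This is only a sketch on a made-up table; `toy_pvalues`, `toy_valid` and `significantVars` are hypothetical names, and the numbers are not results from this analysis.

```r
# Hypothetical p-value table: rows are comparisons, columns are variables.
toy_pvalues <- data.frame(
  var1 = c(0.001, 0.200),
  var2 = c(0.500, 0.004),
  var3 = c(0.002, 0.003),
  row.names = c("A - B", "A - C")
)
toy_valid <- c("var1", "var3")  # variables that passed the Kruskal-Wallis filter

# Return the p-values of one comparison, restricted to valid columns
# whose p-value is at or below alpha.
significantVars <- function(pvalues, comparison, validCols, alpha = 0.01) {
  keep <- colnames(pvalues) %in% validCols &
    unlist(pvalues[comparison, ]) <= alpha
  pvalues[comparison, keep, drop = FALSE]
}

significantVars(toy_pvalues, "A - B", toy_valid)
significantVars(toy_pvalues, "A - C", toy_valid)
```

Using `drop = FALSE` keeps the result a data frame even when a single variable is selected, so the variable name stays visible.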

##### Summary:

- cEE is differentiated from cNT only by the network distribution, and from dNP, dWL and dWP by both network and centroids.
- cNT is differentiated from cEE and dNP only by the network distribution, and from dWL and dWP by both network and centroids.
- dWL is differentiated from dNP by centroids, and from cNT and cEE by both network and centroids. There are no significant differences with dWP.
- dWP is differentiated from dNP by centroids, and from cNT and cEE by both network and centroids. It cannot be differentiated from dWL.
- dNP is differentiated from cNT by network distribution, from cEE by both network and centroids, and from dWL and dWP by centroids.

##### cNT vs dNP

The variables with a Dunn test p-value less than or equal to 0.01 are

In [None]:
Pvalues["cNT - dNP",cols & Pvalues["cNT - dNP",]<=0.01]

We can find differences only in the contact network

##### cNT vs dWL

The variables with a Dunn test p-value less than or equal to 0.01 are

In [None]:
Pvalues["cNT - dWL",cols & Pvalues["cNT - dWL",]<=0.01]

We can find differences in both the contact network and the centroid distribution

##### dNP vs dWL

The variables with a Dunn test p-value less than or equal to 0.01 are

In [None]:
Pvalues["dNP - dWL",cols & Pvalues["dNP - dWL",]<=0.01]

We cannot find differences between dNP and dWL

##### cNT vs dWP

The variables with a Dunn test p-value less than or equal to 0.01 are

In [None]:
Pvalues["cNT - dWP",cols & Pvalues["cNT - dWP",]<=0.01]

We can find differences in both the contact network and the centroid distribution

##### dNP vs dWP

The variables with a Dunn test p-value less than or equal to 0.01 are

In [None]:
Pvalues["dNP - dWP",cols & Pvalues["dNP - dWP",]<=0.01]

We can find differences between dNP and dWP

##### dWL vs dWP

The variables with a Dunn test p-value less than or equal to 0.01 are

In [None]:
Pvalues["dWL - dWP",cols & Pvalues["dWL - dWP",]<=0.01]

We cannot find differences between dWL and dWP

## Differences between CVT path and Epithelial tissues

Some steps of the CVT paths are thought to represent the contact network of some epithelial tissues well.
In particular, cNT is similar to CVT001, dWL is similar to CVT004 and dWP is similar to CVT005.

We will perform Mann-Whitney U tests to see if we can find significant differences in their contact network and centroid distribution with our method.

In [None]:
variables <- colnames(cells_frame)
variables <- variables[seq(2,length(variables))]
CVTvsEPI <- data.frame(variables)

#calculate p-values of the Mann-Whitney U test
aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'cNT', colName]
  cvt = cells_frame[cells_frame$type == 'CVT001', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["001vsCNT"] <- aux

aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'dWL', colName]
  cvt = cells_frame[cells_frame$type == 'CVT004', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["004vsDWL"] <- aux

aux <- c()
for (colName in variables){
  epi = cells_frame[cells_frame$type == 'dWP', colName]
  cvt = cells_frame[cells_frame$type == 'CVT005', colName]
  aux <- c(aux, as.numeric(wilcox.test(epi,cvt, exact = FALSE)[3]))
}
CVTvsEPI["005vsDWP"] <- aux
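The three loops above differ only in the pair of types being compared, so they could be folded into one function. The sketch below shows the idea on synthetic data; `toy_frame`, `mwPvalues` and the type labels `"tissue"` and `"model"` are hypothetical and stand in for `cells_frame` and the tissue/CVT labels.

```r
# Synthetic stand-in for cells_frame with two types (names are hypothetical).
set.seed(2)
toy_frame <- data.frame(
  type = rep(c("tissue", "model"), each = 30),
  var1 = c(rnorm(30, 0), rnorm(30, 4)),  # clearly shifted between the two types
  var2 = rnorm(60)                        # same distribution in both types
)

# Mann-Whitney U p-value of every variable for one pair of types.
mwPvalues <- function(frame, typeA, typeB, variables) {
  sapply(variables, function(v) {
    a <- frame[frame$type == typeA, v]
    b <- frame[frame$type == typeB, v]
    wilcox.test(a, b, exact = FALSE)$p.value
  })
}

toy_variables <- c("var1", "var2")
toy_result <- data.frame(
  variables = toy_variables,
  tissueVsModel = mwPvalues(toy_frame, "tissue", "model", toy_variables)
)
toy_result[toy_result$tissueVsModel <= 0.01, ]
```

With such a helper, each column of `CVTvsEPI` would come from a single call instead of a copied loop.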

##### Summary:
cNT and dWP can be differentiated from CVT001 and CVT005, respectively, by both network and centroid distribution. dWL can be differentiated from CVT004 only by network distribution.

##### cNT vs CVT001

These are the variables for which the p-value of the Mann-Whitney U test is less than or equal to 0.01

In [None]:
as.vector(CVTvsEPI[CVTvsEPI[, "001vsCNT"]<=0.01, c("variables", "001vsCNT")])

We can find differences in both the contact network and the centroid distribution

##### dWL vs CVT004

These are the variables for which the p-value of the Mann-Whitney U test is less than or equal to 0.01

In [None]:
as.vector(CVTvsEPI[CVTvsEPI[, "004vsDWL"]<=0.01, c("variables", "004vsDWL")])

##### dWP vs CVT005

These are the variables for which the p-value of the Mann-Whitney U test is less than or equal to 0.01

In [None]:
as.vector(CVTvsEPI[CVTvsEPI[, "005vsDWP"]<=0.01, c("variables", "005vsDWP")])