# RMixtComp basic example

## Data

Load the CSV data file as a data frame.

In [6]:
data <- read.table("blockcluster-example.csv", sep = ";")
head(data)

,V1
,<chr>
1,1110111110011110011000110011001101000111100000001100000001000101000000100011101111110100000000001011
2,0010000001001110000111001001101001000101010111110011011000110110010110010101000100010101000100111100
3,0110111110011011010000110010001011000110000000000100000000100101001000100101000001101100110011101000
4,0000101101100100001111100000100000000000010111110101111100010010011110011000000100011000101110011000
5,0001111110010000010010100011011101100101001000000101000011000101001011100111100000100110000000110001
6,1111001101000100000111011001100000000100010101101111111000111010011110111011000100011000001100111000
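
The column `V1` holds each row as a single string of 0s and 1s, whereas co-clustering functions generally expect a numeric matrix. Below is a minimal sketch of that conversion, assuming every string has the same length. `bits_to_matrix` is a hypothetical helper name, and two short strings stand in for the real data.

```r
# Hypothetical helper: turn a character vector of "0"/"1" strings into an
# integer matrix with one row per string and one column per character.
bits_to_matrix <- function(x) {
  stopifnot(is.character(x), length(unique(nchar(x))) == 1)
  chars <- strsplit(x, split = "")
  do.call(rbind, lapply(chars, as.integer))
}

# Two short strings stand in for data$V1 (assumption: same layout as above).
example <- c("1101", "0010")
binary_matrix <- bits_to_matrix(example)
print(binary_matrix)
```

On the data frame loaded above, the call would presumably be `bits_to_matrix(data$V1)`.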


## Clustering with RMixtComp

Load the blockcluster package.

In [7]:
library(blockcluster)

Define the ...

In [None]:
# mixmodStrategy() is provided by the Rmixmod package.
library(Rmixmod)

mixmodStrat <- mixmodStrategy(algo = "EM",
                              epsilonInAlgo = 0.001,
                              nbIterationInAlgo = 200,
                              initMethod = "smallEM",
                              nbTryInInit = 10,
                              nbIterationInInit = 5,
                              epsilonInInit = 0.001)

Choose the desired number of classes and the number of runs for each given number of classes.

In [8]:
nbcocluster <- 1:5

In [11]:
res <-  cocluster( data,
                  datatype="Categorical",
                  semisupervised = FALSE,
                  rowlabels = numeric(0),
                  collabels = numeric(0),
                  model = NULL,
                  nbcocluster,
                  strategy = coclusterStrategy())

ERROR: Error in cocluster(data, datatype = "Categorical", rowlabels = numeric(0), : object 'inpobj' not found
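
One plausible cause of this error is the `datatype` value: blockcluster documents lowercase values ("binary", "contingency", "continuous", "categorical"), so "Categorical" may not be recognised. In addition, `nbcocluster` is documented as a vector of two integers (number of row clusters, number of column clusters) rather than a range such as `1:5`, and the input is expected to be a numeric matrix rather than a single character column. The sketch below illustrates a call of that shape on the `binarydata` set bundled with blockcluster. The cluster counts `c(2, 3)` are illustrative only, and `normalise_datatype` is a hypothetical helper, not part of the package.

```r
# Datatype values accepted by blockcluster::cocluster() (lowercase).
valid_datatypes <- c("binary", "contingency", "continuous", "categorical")

# Hypothetical helper: lower-case the datatype and reject unknown values.
normalise_datatype <- function(datatype) {
  datatype <- tolower(datatype)
  if (!datatype %in% valid_datatypes) {
    stop("Unknown datatype: ", datatype)
  }
  datatype
}

datatype <- "binary"    # 0/1 data; "Categorical" would need lower-casing
nbcocluster <- c(2, 3)  # illustrative: 2 row clusters, 3 column clusters

# Run only when blockcluster is installed; binarydata ships with the package.
if (requireNamespace("blockcluster", quietly = TRUE)) {
  library(blockcluster)
  data("binarydata")
  out <- cocluster(binarydata, datatype = normalise_datatype(datatype),
                   nbcocluster = nbcocluster)
  summary(out)
}
```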


## Output Analysis

### Criterion

This chart shows the criterion value of each model that was fitted. The closer the value is to 0, the better the model.

### Variables

Draw the discriminating level of each variable. A high value (close to one) means that the variable is highly discriminating. A low value (close to zero) means that the variable is poorly discriminating. 

In [None]:
barplot(mm_res)

Draw the distribution of the variables.

In [None]:
hist(res)

Draw the similarity between every pair of variables. A high value (close to one) means that the two variables provide the same information for the clustering task (i.e. similar partitions). A low value (close to zero) means that the two variables provide different information for the clustering task (i.e. different partitions).

In [None]:
heatmapVar(resK, pkg = "plotly")

Select a variable to draw its distribution.

In [None]:
variable <- "SG"
plotDataBoxplot(resK, variable, grl = TRUE, pkg = "plotly")

### Classes

Draw the proportion of individuals in each class.

In [None]:
plotProportion(resK, pkg = "plotly")
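
As a rough illustration of what the proportion chart summarises, class proportions can be computed directly from a vector of class labels. The labels in `z` are made up for this sketch and are not extracted from `resK`.

```r
# Made-up class labels for eight individuals (illustration only).
z <- c(1, 2, 2, 3, 1, 2, 3, 2)

# Share of individuals assigned to each class.
proportions <- prop.table(table(z))
print(proportions)
```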

Draw the similarity level between each pair of classes. A high value (close to one) means that the two classes are strongly different (i.e. low overlap). A low value (close to zero) means that the two classes are similar for the clustering task (i.e. high overlap).

In [None]:
heatmapClass(resK, pkg = "plotly")

Draw the discriminating level of each variable for the selected class.

In [None]:
class <- 2
plotDiscrimVar(resK, class = class, pkg = "plotly")

Select a variable to draw its distribution for the selected class.

In [None]:
variable <- "SG"
plotDataBoxplot(resK, variable, class = class, grl = TRUE, pkg = "plotly")

### Probabilities

Draw the probability of assignment to each class for every individual. Individuals are ordered by decreasing assignment probability.

In [None]:
heatmapTikSorted(resK, pkg = "plotly")
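
The ordering described above can be sketched in base R. This assumes `tik` is a matrix of assignment probabilities with one row per individual and one column per class. The values are made up, and grouping by most probable class before sorting is an assumption about the layout, not a confirmed description of `heatmapTikSorted`.

```r
# Made-up assignment probabilities: 4 individuals (rows), 2 classes (columns).
tik <- matrix(c(0.9, 0.1,
                0.2, 0.8,
                0.6, 0.4,
                0.3, 0.7),
              ncol = 2, byrow = TRUE)

z <- apply(tik, 1, which.max)  # most probable class per individual
p_max <- apply(tik, 1, max)    # probability of that class

# Group rows by class, then sort by decreasing probability within each class.
ord <- order(z, -p_max)
tik_sorted <- tik[ord, , drop = FALSE]
print(tik_sorted)
```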

### Advanced

Visualize the results of Gaussian or non-Gaussian model-based clustering in a *Gaussian-like way*, projected onto R².

In [None]:
library(ClusVis)

In [None]:
logTik <- getTik(resK, log = TRUE)
prop <- getProportion(resK)
resVisu <- clusvis(logTik, prop)

#### Component Interpretation

In [None]:
plotDensityClusVisu(resVisu, add.obs = FALSE)

#### Observation Scatter-plot 

In [None]:
plotDensityClusVisu(resVisu, add.obs = TRUE)