## Motivation

In the experiment designed to test the independence of cell fate decisions in the ICM (see main text and Figure 1), we mix wild type embryo cells labelled with H2B-GFP with unlabelled cells (either wild type or Gata6 null). The presence or absence of GFP lets us identify the progeny of each initial population. Because either population can make up any fraction of the resulting chimeric embryo, manually scoring GFP+ and GFP- cells is impractical.
To identify both subpopulations automatically, we separate GFP+ and GFP- cells using clustering methods. GFP levels vary because of mosaic expression of the transgene and noise in image acquisition and processing, so a single intensity threshold does not separate the two populations reliably.

In this notebook, I compare the performance of three approaches (hierarchical clustering, 1D K-means clustering and 2D K-means clustering) and show the rationale for the approach we ultimately used in the study.
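
As a rough orientation before working with the real data, the three approaches can be sketched on simulated values using base R only. Everything here is an assumption made for illustration: the data are random draws, not measurements, the column names `gfp` and `ref` are hypothetical, and the choice of a second (reference) channel for the 2D case and of average linkage for the hierarchical case is mine, not necessarily what the study used.

```r
# Simulated data only: two overlapping log-intensity populations standing in
# for GFP- and GFP+ nuclei. Column names (gfp, ref) are hypothetical.
set.seed(1)
n <- 200
sim <- data.frame(
  truth = rep(c('GFP-', 'GFP+'), each = n),
  gfp   = c(rnorm(n, mean = 3, sd = 0.5), rnorm(n, mean = 6, sd = 0.8)),
  ref   = c(rnorm(n, mean = 5, sd = 0.5), rnorm(n, mean = 5, sd = 0.5))
)

# 1D K-means: cluster on the GFP channel alone
km1 <- kmeans(sim$gfp, centers = 2, nstart = 20)

# 2D K-means: cluster on GFP plus a second (assumed) reference channel
km2 <- kmeans(sim[, c('gfp', 'ref')], centers = 2, nstart = 20)

# Hierarchical clustering on the same two channels, cut into two groups
hc <- cutree(hclust(dist(sim[, c('gfp', 'ref')]), method = 'average'), k = 2)

# Cross-tabulate each assignment against the simulated ground truth
print(table(sim$truth, km1$cluster))
print(table(sim$truth, km2$cluster))
print(table(sim$truth, hc))
```

Cluster labels (1 or 2) are arbitrary in all three methods, so each table should be read by looking at which cluster holds most of each simulated population rather than by the label itself.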

### Set up
Load the necessary packages and create an object holding the plotting aesthetics.

In [None]:
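# Data wrangling (plyr is loaded before dplyr so that dplyr's verbs are not masked)
library('reshape2')
library('plyr')
library('dplyr')
# Plotting
library('ggplot2')

# Reusable ggplot2 theme object with the plotting aesthetics
looks <- theme_bw() + theme(panel.grid = element_blank(),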
                            strip.background = element_blank(), 
                            panel.border = element_rect(color = 'black', 
                                                        size = 1), 
                            axis.ticks = element_line(color = 'black', 
                                                      size = 0.5), 
                            axis.text = element_text(size = 6, 
                                                     color = 'black'), 
                            axis.title = element_text(size = 8, 
                                                      color = 'black'), 
                            legend.text = element_text(size = 8, 
                                                       color = 'black'), 
                            legend.title = element_text(size = 10, 
                                                        color = 'black'), 
                            strip.text = element_text(size = 8, 
                                                      color = 'black'))

### Load data

Read in the data.