This repository contains a Python implementation of the FlowSOM algorithm for clustering and visualizing mass cytometry data sets.
For more details about the algorithm, see the introduction (English | Chinese).
Install with pip:
pip install FlowSom
Or download this repository to a directory of your choice and then run:
pip install -r requirements.txt
- Read Files
- To use FlowSOM, your data must be saved as a .csv file or a .fcs file.
file = r'flowmetry.fcs'
- Or
file = 'flowmetry.csv'
- Import Package
- Next, import the package.
- If you installed the package via pip, run
from flowsom import flowsom
- If you downloaded the repository, run
from flowsom import *
- Play Around
- Then you can run FlowSOM just as follows:
from sklearn.cluster import AgglomerativeClustering  # assumed to be scikit-learn's class

fsom = flowsom(file)  # read the data
fsom.som_mapping(50, 50, 31, sigma=2.5,
                 learning_rate=0.1, batch_size=100)  # train the SOM with 100 iterations
fsom.meta_clustering(AgglomerativeClustering, min_n=40,
                     max_n=45,
                     iter_n=3)  # run the meta-clustering for each cluster count in range(40, 45)
After the training, you will be able to:
- Get the weights of the SOM with
fsom.map_som
- Get the best number of clusters with
fsom.bestk
- Get the prediction dataframe with
fsom.df
and fsom.tf_df
- Visualize the final clustering outcome with method
fsom.vis
The demo code can be found here.
The distance map of SOM trained from a sample flow cytometry data:
The visualization example after meta-clustering using Minimal Spanning Tree (MST):
FlowSOM analyzes flow or mass cytometry data using a Self-Organizing Map (SOM). Using two-level clustering and star charts, FlowSOM helps to obtain a clear overview of how all markers behave on all cells, and to detect subsets that might otherwise be missed.
The algorithm consists of four steps:
- reading the data
- building a Self-Organizing Map
- building a minimal spanning tree
- computing a meta-clustering
A SOM is a type of unsupervised artificial neural network that converts complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display.
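To make the idea concrete, here is a minimal pure-Python sketch of SOM training. It is not this package's implementation (which relies on MiniSom); the function names `train_som` and `best_matching_unit`, the linear decay schedule, and the made-up data are all assumptions chosen for illustration. The `sigma` and `learning_rate` arguments only mirror the names used in the usage example above.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, n_iter=500, sigma=1.0, learning_rate=0.5, seed=0):
    """Train a small SOM; returns a dict mapping (row, col) -> weight vector."""
    rng = random.Random(seed)
    # Initialise every node from a randomly chosen sample (hypothetical choice).
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    for t in range(n_iter):
        # Linearly shrink the learning rate and the neighbourhood radius.
        decay = 1.0 - t / n_iter
        lr = learning_rate * decay
        radius = max(sigma * decay, 1e-3)
        x = rng.choice(data)
        bmu_r, bmu_c = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            # Gaussian neighbourhood: nodes near the winner on the grid move more.
            grid_d2 = (r - bmu_r) ** 2 + (c - bmu_c) ** 2
            h = math.exp(-grid_d2 / (2 * radius ** 2))
            for i, xi in enumerate(x):
                w[i] += lr * h * (xi - w[i])
    return weights


# Two well-separated groups of made-up 3-marker "cells".
rng = random.Random(1)
cells = [[rng.gauss(0, 0.3) for _ in range(3)] for _ in range(50)]
cells += [[rng.gauss(5, 0.3) for _ in range(3)] for _ in range(50)]
som = train_som(cells, rows=3, cols=3)
print(best_matching_unit(som, [0, 0, 0]), best_matching_unit(som, [5, 5, 5]))
```

Each cell is assigned to its best matching unit, so the grid nodes act as a first, fine-grained level of clusters.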
A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.
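The definition above can be illustrated with a short sketch of Prim's algorithm. It assumes the graph is the complete graph over a handful of made-up weight vectors, with Euclidean distance as the edge weight; the function name `minimum_spanning_tree` is hypothetical and is not part of this package.

```python
import heapq
import math


def minimum_spanning_tree(points):
    """Return MST edges (i, j, distance) over points using Prim's algorithm."""
    n = len(points)
    if n == 0:
        return []
    visited = [False] * n
    visited[0] = True
    # Heap of candidate edges (distance, from_index, to_index) leaving the tree.
    heap = [(math.dist(points[0], points[j]), 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    edges = []
    while heap and len(edges) < n - 1:
        d, i, j = heapq.heappop(heap)
        if visited[j]:
            continue  # this edge would create a cycle
        visited[j] = True
        edges.append((i, j, d))
        for k in range(n):
            if not visited[k]:
                heapq.heappush(heap, (math.dist(points[j], points[k]), j, k))
    return edges


# Made-up 2-D weight vectors standing in for SOM nodes.
nodes = [(0, 0), (1, 0), (5, 0), (5, 1)]
tree = minimum_spanning_tree(nodes)
total_weight = sum(d for _, _, d in tree)
```

A tree over n nodes always has n - 1 edges, which is what keeps the resulting layout free of cycles and easy to draw.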
The meta-clustering technique applied to the SOM is hierarchical consensus meta-clustering, which groups the weights of the trained SOM into clusters.
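As a rough illustration of consensus meta-clustering, the sketch below repeatedly clusters random subsamples of the node weights with average-linkage agglomerative clustering, records how often each pair of nodes lands in the same cluster, and then clusters the nodes once more using that co-clustering frequency as a similarity. It is a simplified pure-Python sketch under these assumptions, not the code used by this package; the names `agglomerative` and `consensus_clustering` and their parameters are hypothetical.

```python
import math
import random


def agglomerative(n_items, dist, k):
    """Average-linkage clustering of items 0..n_items-1 into k groups."""
    clusters = [[i] for i in range(n_items)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(dist(i, j) for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    return clusters


def consensus_clustering(weights, k, n_resamples=20, sample_frac=0.8, seed=0):
    """Return one cluster label per weight vector (assumes k <= len(weights))."""
    rng = random.Random(seed)
    n = len(weights)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    size = min(n, max(k, int(sample_frac * n)))
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)
        groups = agglomerative(
            len(idx), lambda a, b: math.dist(weights[idx[a]], weights[idx[b]]), k
        )
        for i in idx:
            for j in idx:
                sampled[i][j] += 1
        for group in groups:
            members = [idx[a] for a in group]
            for i in members:
                for j in members:
                    together[i][j] += 1

    def consensus_dist(i, j):
        # 1 - fraction of shared resamples in which i and j were co-clustered.
        if sampled[i][j] == 0:
            return 1.0
        return 1.0 - together[i][j] / sampled[i][j]

    labels = [0] * n
    for label, group in enumerate(agglomerative(n, consensus_dist, k)):
        for i in group:
            labels[i] = label
    return labels


# Made-up node weights forming two obvious groups.
node_weights = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = consensus_clustering(node_weights, k=2)
```

The naive agglomerative step here is far too slow for a real 50 x 50 map and is only meant to show the logic; the usage example above passes a ready-made clustering class instead.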
This implementation of FlowSOM is built on FlowCytometryTools, MiniSom and Consensus Clustering.
Update pypi: source