# Overview of CyTOF Data
The original data was given as two tab-separated matrices:
* ``Plasma.txt`` (original name: 160202_CGI002_Plasma_Plasma_singlets.fcs_raw_events.txt)
* ``PMA.txt`` (original name: 160202_CGI002_PMA_PMA_singlets.fcs_raw_events.txt)

These files have individual cell measurements as rows and dimensions (e.g. antibodies) as columns. I kept only the dimensions of interest, namely the surface marker and phospho marker antibody columns, and saved the filtered files as ``Plasma_clean.txt`` and ``PMA_clean.txt``.
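
The column filtering described above is not shown in this notebook. A minimal sketch of how it could be done with pandas follows; the column names (`Time`, `Cell_length`, `CD3`, `CD4`, `pSTAT3`) and the toy values are hypothetical placeholders, not the actual columns of the CyTOF export.

```python
import io
import pandas as pd

# Toy stand-in for a raw tab-separated export (hypothetical columns and values)
raw = io.StringIO(
    "Time\tCell_length\tCD3\tCD4\tpSTAT3\n"
    "1.0\t20\t3.2\t0.1\t1.5\n"
    "2.0\t22\t0.4\t2.8\t0.2\n"
)
df_raw = pd.read_csv(raw, sep="\t")

# Keep only the marker columns of interest (hypothetical names)
keep_cols = ["CD3", "CD4", "pSTAT3"]
df_clean = df_raw[keep_cols]

# Write the filtered matrix back out as tab-separated text
out = io.StringIO()
df_clean.to_csv(out, sep="\t")
```

With real data, `raw` and `out` would be file paths such as the original export and ``Plasma_clean.txt``.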

# Plasma

In [4]:
import pandas as pd
import numpy as np

from clustergrammer_widget import *
net = Network()

In [27]:
net.load_file('cytof_data/Plasma_clean.txt')
df_plasma = net.export_df()
df_plasma.shape

(141859, 28)

In [28]:
net.normalize(axis='col', norm_type='zscore', keep_orig=False)
net.downsample(ds_type='kmeans', axis='row', num_samples=1000)
net.dat['mat'].shape

(1000, 28)
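
To illustrate what k-means downsampling does conceptually, here is a small NumPy sketch: the rows are grouped into `k` clusters and each cluster is replaced by its centroid, so 141859 cells can be summarized by 1000 representative rows. This is plain Lloyd's algorithm on random toy data, written as an assumption about the general idea; it is not clustergrammer's implementation, and `kmeans_downsample` is a hypothetical helper.

```python
import numpy as np

def kmeans_downsample(X, k, n_iter=20, seed=0):
    """Reduce the rows of X to k centroid rows using basic Lloyd's k-means."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows of the data
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centroid: shape (n, k)
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each centroid to the mean of its assigned rows (skip empty clusters)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

# Toy data: 500 "cells" measured on 4 "markers"
X = np.random.default_rng(1).normal(size=(500, 4))
reduced, labels = kmeans_downsample(X, k=10)
```

Here `reduced` has one row per cluster and `labels` records which cluster each original row was assigned to in the final iteration.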

In [29]:
# clip z-scores since we do not care about extreme outliers
net.clip(-10,10)
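
For readers without clustergrammer, the effect of column z-scoring followed by clipping can be sketched in pandas. This assumes the z-score is computed per column as (value - column mean) / column standard deviation; the exact standard deviation convention used by `net.normalize` is not confirmed here, and the toy data and column names are made up.

```python
import numpy as np
import pandas as pd

# Toy data with one extreme outlier in column "A" (hypothetical values)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["A", "B", "C"])
df.iloc[0, 0] = 1e6

# Column-wise z-score: subtract each column's mean, divide by its std
df_z = (df - df.mean(axis=0)) / df.std(axis=0)

# Clip z-scores to [-10, 10] so extreme outliers do not dominate the color scale
df_clip = df_z.clip(lower=-10, upper=10)
```

The outlier's z-score exceeds 10 before clipping and is capped at exactly 10 afterwards, while all other values are left unchanged.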

In [30]:
net.make_clust(views=[])
clustergrammer_widget(network=net.widget())

# PMA

In [31]:
net.load_file('cytof_data/PMA_clean.txt')
df_pma = net.export_df()

In [32]:
net.load_df(df_pma)
df_pma.shape

(110705, 28)

In [33]:
net.normalize(axis='col', norm_type='zscore', keep_orig=False)
net.downsample(ds_type='kmeans', axis='row', num_samples=1000)
net.dat['mat'].shape
net.clip(-10,10)

In [34]:
net.make_clust(views=[])
clustergrammer_widget(network=net.widget())

# Merge Plasma and PMA

In [35]:
df_merge = pd.concat([df_plasma, df_pma])

In [36]:
df_merge.shape

(252564, 28)
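
The merged row count is the sum of the two inputs (141859 + 110705 = 252564), since `pd.concat` stacks rows by default. The sketch below shows the same behavior on toy frames and, as an optional variation not used in this notebook, passes `keys` so each row keeps a label for the condition it came from. The frame contents and marker names are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the two conditions (hypothetical markers and values)
df_a = pd.DataFrame({"m1": [1.0, 2.0], "m2": [3.0, 4.0]}, index=["cell-0", "cell-1"])
df_b = pd.DataFrame({"m1": [5.0], "m2": [6.0]}, index=["cell-0"])

# Stack rows; keys adds an outer index level recording the source condition
merged = pd.concat([df_a, df_b], keys=["Plasma", "PMA"])

# Rows from one condition can then be recovered by the outer label
pma_rows = merged.loc["PMA"]
```

Whether a labeled index like this can be passed directly to `net.load_df` is not verified here; the labels may need to be flattened first.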

In [37]:
net.load_df(df_merge)

In [38]:
net.normalize(axis='col', norm_type='zscore', keep_orig=False)
net.downsample(ds_type='kmeans', axis='row', num_samples=1000)
net.clip(-10,10)
net.dat['mat'].shape

(1000, 28)

In [39]:
net.make_clust(views=[])
clustergrammer_widget(network=net.widget())