# Exploring fcs readers

This notebook explores several popular Python FCS readers:
- pytometry
- fcsparser
- flowio
- cytopy

In [None]:
from nbproject import header

In [1]:
%load_ext autoreload
%autoreload 2

## Data

In [4]:
import os
from pathlib import Path

# adjust this directory as needed
emailed_data = (
    Path.home() / "Library/Mobile Documents/com~apple~CloudDocs/Lamin/Emailed data"
)

# os.chdir() returns None, so keep the directory in its own variable
basedir = emailed_data / "2021-03-24 Maren DZNE/data"
os.chdir(basedir)

path_data = Path("A1 3804-CV-1 DMSO.785309.fcs")

## pytometry

- pytometry parses data into an AnnData object
- stores headers in .uns['meta']
- stores the spillover matrix in .uns['spill_mat']
- computes a compensation matrix from the spillover matrix and stores it in .uns['comp_mat']

In [6]:
import pytometry as pm

adata = pm.io.readandconvert(path_data)
adata

AnnData object with n_obs × n_vars = 500886 × 19
    uns: 'meta', 'spill_mat', 'comp_mat'

In [76]:
adata.uns["spill_mat"]

,CCR7-FITC-A,OX40-PE-Cy7-A,CD69-PE-A,CD8-BV650-A,CD45RA-BV421-A,CD4-BV605-A,DUMP-V500-A,CD3-AF700-A,CD137-APC-A
0,1.0,0.0,0.0001,0.0004,0.0,0.0101,0.052,0.0,0.0001
1,0.0059,1.0,0.0353,0.0003,0.0,0.0064,0.0062,0.0179,0.0018
2,0.0199,0.0009,1.0,0.0047,0.0001,0.0887,0.0487,0.0071,0.0012
3,0.0034,0.0125,0.0005,1.0,0.012,0.6845,0.0503,0.2263,0.4818
4,0.0087,0.0001,0.0006,0.005,1.0,0.1044,1.6071,0.0005,0.0011
5,0.0019,0.0059,0.0159,0.1277,0.0021,1.0,0.0282,0.0002,0.0007
6,0.047,0.0,0.0001,0.0175,0.0,0.2314,1.0,0.0,0.0001
7,0.0077,0.012,0.0002,0.0008,0.0,0.0047,0.0078,1.0,0.0367
8,0.0007,0.0067,0.0001,0.0202,0.0,0.0022,0.0024,0.3411,1.0
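
A compensation matrix is conventionally obtained by inverting the spillover matrix. The sketch below illustrates that relationship with NumPy on the top-left 3×3 block of the matrix above. Whether pytometry computes `comp_mat` in exactly this way is an assumption, and `spill`, `comp`, `true_signal`, and `observed` are illustrative names, not part of any library.

```python
import numpy as np

# Top-left 3x3 block of the spillover matrix shown above
# (CCR7-FITC-A, OX40-PE-Cy7-A, CD69-PE-A), used only for illustration.
spill = np.array(
    [
        [1.0, 0.0, 0.0001],
        [0.0059, 1.0, 0.0353],
        [0.0199, 0.0009, 1.0],
    ]
)

# Assumption: compensation matrix = inverse of the spillover matrix.
comp = np.linalg.inv(spill)

# Two synthetic events: true signals are mixed by spillover, then recovered.
true_signal = np.array([[100.0, 0.0, 50.0], [0.0, 200.0, 10.0]])
observed = true_signal @ spill
compensated = observed @ comp
```

In practice the full matrix would be inverted, since the inverse of a sub-block is not the corresponding sub-block of the full inverse.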


## fcsparser

fcsparser is minimal: it parses the file into two objects:
- meta: a dict of header and keyword metadata
- data: a dataframe of events

In [5]:
import fcsparser

meta, data = fcsparser.parse(path_data, reformat_meta=True)

In [9]:
data.shape

(500886, 19)

In [37]:
meta.keys()

dict_keys(['__header__', '$BEGINANALYSIS', '$BEGINDATA', '$BEGINSTEXT', '$BYTEORD', '$DATATYPE', '$ENDANALYSIS', '$ENDDATA', '$ENDSTEXT', '$MODE', '$NEXTDATA', '$PAR', '$P4F', '$P4L', '$P5F', '$P5L', '$P6F', '$P6L', '$P7F', '$P7L', '$P8F', '$P8L', '$P9F', '$P9L', '$P10F', '$P10L', '$P11F', '$P11L', '$P12F', '$P12L', '$P13F', '$P13L', '$P14F', '$P14L', '$P15F', '$P15L', '$P16F', '$P16L', '$P17F', '$P17L', '$P18F', '$P18L', '$TOT', '$BTIM', '$CYT', '$CYTSN', '$DATE', '$ETIM', '$FIL', '$SPILLOVER', '$TIMESTEP', '$TR', '$VOL', '$WELLID', '_channels_', '_channel_names_'])

In [35]:
meta.get("_channels_")

Channel Number,$PnB,$PnE,$PnN,$PnR,$PnO,$PnS,$PnV
1,32,"[0, 0]",T0,2147483647,0,TLSW,0
2,32,"[0, 0]",T1,2147483647,0,TMSW,0
3,32,"[0, 0]",INFO,2147483647,0,Event Info,0
4,32,"[0, 0]",FS00-H,2147483647,100,FSC 488/10-H,294
5,32,"[0, 0]",FS00-A,2147483647,100,FSC 488/10-A,294
6,32,"[0, 0]",FS00-W,2147483647,100,FSC 488/10-W,294
7,32,"[0, 0]",SS01-H,2147483647,100,SSC 488/10-H,622
8,32,"[0, 0]",SS01-A,2147483647,100,SSC 488/10-A,622
9,32,"[0, 0]",SS01-W,2147483647,100,SSC 488/10-W,622
10,32,"[0, 0]",FL02-A,2147483647,100,CCR7-FITC-A,573
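
The `_channels_` table pairs each parameter's short name (`$PnN`) with its longer label (`$PnS`). A minimal sketch of turning that pairing into a lookup follows, using a hand-copied subset of the rows above. `channels` and `name_map` are illustrative names; with a real file, `channels` would presumably be `meta["_channels_"]`.

```python
import pandas as pd

# Hand-copied subset of the channel table above (not read from a file).
channels = pd.DataFrame(
    {
        "$PnN": ["FS00-A", "SS01-A", "FL02-A"],
        "$PnS": ["FSC 488/10-A", "SSC 488/10-A", "CCR7-FITC-A"],
    },
    index=pd.Index([5, 8, 10], name="Channel Number"),
)

# Map short parameter names to their longer labels.
name_map = dict(zip(channels["$PnN"], channels["$PnS"]))
```

Such a mapping could be passed to `data.rename(columns=name_map)` if the event dataframe uses the short names as column labels.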


## flowio

FlowIO retrieves event data exactly as it is encoded in the FCS file: as a 1-dimensional list without separating the events into channels. However, all the metadata found within the FCS file is available as a dictionary via the 'text' attribute. Basic attributes are also available for commonly accessed properties.

- exposes each file as a FlowData object
- couldn't handle a Path object as input, so the file name is passed as a string
- emitted a warning recommending `ignore_offset_error=True` for this file
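
Because the events come back as one flat list, they need reshaping before per-channel analysis. The sketch below shows the reshape with NumPy on synthetic values. `events_flat` and `channel_count` are stand-ins for `fcs_data.events` and `fcs_data.channel_count`, and the numbers are made up.

```python
import numpy as np

# Synthetic stand-ins for fcs_data.events and fcs_data.channel_count:
# 4 events x 3 channels, stored event by event in one flat list.
events_flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
channel_count = 3

# One row per event, one column per channel.
events = np.reshape(events_flat, (-1, channel_count))
```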

In [14]:
import flowio

fcs_data = flowio.FlowData("A1 3804-CV-1 DMSO.785309.fcs", ignore_offset_error=True)

  warn(warn_msg)


In [52]:
[i for i in fcs_data.__dir__() if not i.startswith("_")]

['name',
 'file_size',
 'header',
 'text',
 'channel_count',
 'event_count',
 'analysis',
 'events',
 'channels',
 'write_fcs']

In [57]:
fcs_data.text.get("spillover")

'9,FL02-A,FL08-A,FL12-A,FL15-A,FL19-A,FL20-A,FL21-A,FL22-A,FL25-A,1.0000,0.0000,0.0001,0.0004,0.0000,0.0101,0.0520,0.0000,0.0001,0.0059,1.0000,0.0353,0.0003,0.0000,0.0064,0.0062,0.0179,0.0018,0.0199,0.0009,1.0000,0.0047,0.0001,0.0887,0.0487,0.0071,0.0012,0.0034,0.0125,0.0005,1.0000,0.0120,0.6845,0.0503,0.2263,0.4818,0.0087,0.0001,0.0006,0.0050,1.0000,0.1044,1.6071,0.0005,0.0011,0.0019,0.0059,0.0159,0.1277,0.0021,1.0000,0.0282,0.0002,0.0007,0.0470,0.0000,0.0001,0.0175,0.0000,0.2314,1.0000,0.0000,0.0001,0.0077,0.0120,0.0002,0.0008,0.0000,0.0047,0.0078,1.0000,0.0367,0.0007,0.0067,0.0001,0.0202,0.0000,0.0022,0.0024,0.3411,1.0000'
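
The spillover keyword is a single comma-separated string: the number of channels `n`, then `n` channel names, then `n × n` values in row-major order. A minimal sketch of parsing it into a labelled matrix follows. `parse_spillover` is a hypothetical helper rather than part of flowio, and the example string is a shortened, made-up one.

```python
import numpy as np
import pandas as pd


def parse_spillover(value: str) -> pd.DataFrame:
    """Parse a spillover string into a square, labelled DataFrame."""
    fields = value.split(",")
    n = int(fields[0])
    names = fields[1 : n + 1]
    values = np.array([float(v) for v in fields[n + 1 :]])
    if values.size != n * n:
        raise ValueError(f"expected {n * n} values, got {values.size}")
    # Values are laid out row by row.
    return pd.DataFrame(values.reshape(n, n), index=names, columns=names)


# Shortened, made-up example with two channels.
example = "2,FL02-A,FL08-A,1.0000,0.0059,0.0000,1.0000"
spill = parse_spillover(example)
```

With the real file, the input would be `fcs_data.text.get("spillover")`.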

## cytopy

cytopy builds on FlowIO.

I couldn't get cytopy installed, but its data-parsing code is high quality, so I copied its read_write.py to _core.py with a few modifications:
- allow reading from a Path
- catch ValueError when reading an FCS file with flowio.FlowData
- catch ParserError when processing the date
- fix .compensate()
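
The first two modifications can be sketched as a small wrapper. This is not the code in _core.py: `read_with_fallback` is a hypothetical helper, and retrying with `ignore_offset_error=True` after a ValueError is an assumption about how the error might be handled. The reader is passed in as an argument so the sketch runs without flowio installed.

```python
from pathlib import Path
from typing import Any, Callable, Union


def read_with_fallback(path: Union[str, Path], reader: Callable[..., Any]) -> Any:
    """Call `reader` with a string path; retry leniently on ValueError."""
    filename = str(path)  # accept Path objects by converting to str
    try:
        return reader(filename)
    except ValueError:
        # Assumption: the failure is the offset error, so retry with the
        # flag that flowio.FlowData accepts for that case.
        return reader(filename, ignore_offset_error=True)
```

With flowio installed, a call might look like `read_with_fallback(path_data, flowio.FlowData)`.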


See the [quickstart](https://lamin.ai/readfcs/guides/quickstart).
