# FlowKit Tutorial - Part 5 - The Session Class

https://flowkit.readthedocs.io/en/latest/?badge=latest

This tutorial covers the `Session` class. The `Session` class combines multiple `Sample` instances with a `GatingStrategy` to programmatically create gates for a collection of FCS files. A `Session` uses the `GatingStrategy` template and custom gates, so a node in the gate tree can have a common template gate while still being customized for a particular `Sample`.

In this notebook we will use everything we've learned from the previous tutorials to programmatically create a gating strategy for a collection of FCS samples.

If you have any questions about FlowKit, find any bugs, or feel something is missing from these tutorials [please submit an issue to the GitHub repository here](https://github.com/whitews/FlowKit/issues/new/).

## Table of Contents

* [Session Class](#Session-Class)
  * [Extract Gated Event Data](#Extract-Gated-Event-Data)

In [1]:
import os
import bokeh
from bokeh.plotting import show
import matplotlib.pyplot as plt

import flowkit as fk

bokeh.io.output_notebook()
%matplotlib inline

_ = plt.ioff()

In [2]:
# check version so users can verify they have the same version/API
fk.__version__

'0.9.90b0'

## Session Class

The `Session` class is intended as the main interface in FlowKit for complex flow cytometry analysis. A `Session` allows creating a gating strategy for a collection of FCS samples. The `Session` class also supports importing a GatingML 2.0 document to serve as the gating strategy.

The gates in a `Session`'s gating strategy can be shared across samples (common gates) or customized per sample. Unlike the `GatingStrategy` class, which does not retain any `Sample` instances, the `Session` class stores the `Sample` instances that have been loaded. It also stores the `GatingResults` data produced when the gating strategy is applied to the loaded samples.
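
The relationship between common and per-sample gates can be pictured as a lookup with a fallback. The sketch below is plain Python and is not FlowKit's internal implementation; `template_gates`, `custom_gates`, `resolve_gate`, the sample IDs, and the boundary values are all hypothetical and chosen only for illustration.

```python
# Hypothetical illustration of template vs. custom gates (not FlowKit internals).
# Each gate is reduced to a dict of made-up boundary values.
template_gates = {
    "TimeGate": {"min": 0.0, "max": 70.0},
    "Singlets": {"min": 0.2, "max": 0.8},
}

# Custom gates are keyed by (gate name, sample ID) and override the template.
custom_gates = {
    ("Singlets", "sample_B.fcs"): {"min": 0.25, "max": 0.85},
}


def resolve_gate(gate_name, sample_id):
    """Return the sample-specific gate if one exists, else the template gate."""
    return custom_gates.get((gate_name, sample_id), template_gates[gate_name])


print(resolve_gate("Singlets", "sample_A.fcs"))  # falls back to the template
print(resolve_gate("Singlets", "sample_B.fcs"))  # uses the custom gate
```

Only `sample_B.fcs` has a custom `Singlets` gate here, so every other sample shares the template definition.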

Let's have a look at the constructor:

    Session(
        gating_strategy=None, 
        fcs_samples=None
    )

The `gating_strategy` argument may be a `GatingStrategy` instance or a file path to a GatingML 2.0 compliant document. If None, then an empty `GatingStrategy` will be created for use in the Session.

The argument `fcs_samples` may be a `Sample` instance, string or a list. If given a string, it can be a directory path or a file path. If a directory, any .fcs files in the directory will be loaded. If a list, then it must be a list of file paths or a list of Sample instances. Lists of mixed types are not supported.
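
To make the accepted path forms concrete, here is a minimal sketch of how a directory, a single file path, or a list of paths could be normalized into a list of file paths. The helper `collect_fcs_paths` is hypothetical and is not part of FlowKit; it handles only path strings (not `Sample` instances) and assumes an exact, case-sensitive `.fcs` suffix match.

```python
import os
import tempfile


def collect_fcs_paths(fcs_samples):
    """Hypothetical helper mirroring the accepted path forms (not FlowKit code)."""
    if isinstance(fcs_samples, str):
        if os.path.isdir(fcs_samples):
            # A directory: pick up every .fcs file it contains.
            return sorted(
                os.path.join(fcs_samples, name)
                for name in os.listdir(fcs_samples)
                if name.endswith(".fcs")
            )
        # Otherwise treat the string as a single file path.
        return [fcs_samples]
    if isinstance(fcs_samples, list):
        # Lists must be homogeneous; this sketch only handles lists of paths.
        if not all(isinstance(item, str) for item in fcs_samples):
            raise TypeError("mixed or non-path lists are not handled in this sketch")
        return list(fcs_samples)
    raise TypeError("expected a path string or a list of path strings")


# Demonstrate with a temporary directory holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.fcs", "b.fcs", "notes.txt"):
        open(os.path.join(tmp_dir, name), "w").close()
    found = [os.path.basename(p) for p in collect_fcs_paths(tmp_dir)]

print(found)
```

The non-FCS file in the directory is skipped, and a list mixing paths with other types is rejected instead of being partially loaded.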

Many of the methods in the `Session` class are similar to those found in the `GatingStrategy` class, with additional methods for managing the loaded `Sample` instances. There are also a few methods for retrieving gated event data and for plotting gated events.

Let's jump in and load a GatingML document. We'll then review the imported data and analyze the files.

In [3]:
# setup some file paths for our data
base_dir = "../../data/8_color_data_set"

fcs_dir = os.path.join(base_dir, "fcs_files")
gml_path = os.path.join(base_dir, "8_color_ICS.xml")

In [4]:
# Create a Session with the path to our GatingML document and the directory containing our FCS files. 
# Alternatively, FCS files can be added later using the 'add_samples' method.
session = fk.Session(gating_strategy=gml_path, fcs_samples=fcs_dir)

In [5]:
# get the sample IDs that were loaded
sample_list = session.get_sample_ids()

In [6]:
sample_list

['101_DEN084Y5_15_E03_009_clean.fcs',
 '101_DEN084Y5_15_E01_008_clean.fcs',
 '101_DEN084Y5_15_E05_010_clean.fcs']

In [7]:
# review the gating hierarchy that was in the GatingML document
print(session.get_gate_hierarchy())

root
╰── TimeGate
    ╰── Singlets
        ╰── aAmine-
            ╰── CD3-pos
                ├── CD4-pos
                ╰── CD8-pos


In [8]:
# looks good, let's analyze the samples (using verbose mode to see each gate as it's processed)
session.analyze_samples(verbose=True)

#### Processing gates for 3 samples (multiprocessing is enabled - 3 cpus) ####
101_DEN084Y5_15_E03_009_clean.fcs: processing gate TimeGate
101_DEN084Y5_15_E03_009_clean.fcs: processing gate Singlets
101_DEN084Y5_15_E03_009_clean.fcs: processing gate aAmine-
101_DEN084Y5_15_E01_008_clean.fcs: processing gate TimeGate
101_DEN084Y5_15_E01_008_clean.fcs: processing gate Singlets
101_DEN084Y5_15_E01_008_clean.fcs: processing gate aAmine-
101_DEN084Y5_15_E05_010_clean.fcs: processing gate TimeGate
101_DEN084Y5_15_E05_010_clean.fcs: processing gate Singlets
101_DEN084Y5_15_E03_009_clean.fcs: processing gate CD3-pos
101_DEN084Y5_15_E05_010_clean.fcs: processing gate aAmine-
101_DEN084Y5_15_E03_009_clean.fcs: processing gate CD4-pos
101_DEN084Y5_15_E01_008_clean.fcs: processing gate CD3-pos
101_DEN084Y5_15_E01_008_clean.fcs: processing gate CD4-pos
101_DEN084Y5_15_E05_010_clean.fcs: processing gate CD3-pos
101_DEN084Y5_15_E03_009_clean.fcs: processing gate CD8-pos
101_DEN084Y5_15_E05_010_clean.

In [9]:
# and take a look at the results
session.get_analysis_report()

| | sample | gate_path | gate_name | gate_type | quadrant_parent | parent | count | absolute_percent | relative_percent | level |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 101_DEN084Y5_15_E03_009_clean.fcs | (root,) | TimeGate | RectangleGate | | root | 283968 | 99.999648 | 99.999648 | 1 |
| 1 | 101_DEN084Y5_15_E03_009_clean.fcs | (root, TimeGate) | Singlets | PolygonGate | | TimeGate | 236780 | 83.382341 | 83.382635 | 2 |
| 2 | 101_DEN084Y5_15_E03_009_clean.fcs | (root, TimeGate, Singlets) | aAmine- | PolygonGate | | Singlets | 161823 | 56.98615 | 68.343188 | 3 |
| 3 | 101_DEN084Y5_15_E03_009_clean.fcs | (root, TimeGate, Singlets, aAmine-) | CD3-pos | PolygonGate | | aAmine- | 132200 | 46.554377 | 81.694197 | 4 |
| 4 | 101_DEN084Y5_15_E03_009_clean.fcs | (root, TimeGate, Singlets, aAmine-, CD3-pos) | CD4-pos | PolygonGate | | CD3-pos | 81855 | 28.82533 | 61.917549 | 5 |
| 5 | 101_DEN084Y5_15_E03_009_clean.fcs | (root, TimeGate, Singlets, aAmine-, CD3-pos) | CD8-pos | PolygonGate | | CD3-pos | 46965 | 16.538777 | 35.525719 | 5 |
| 0 | 101_DEN084Y5_15_E01_008_clean.fcs | (root,) | TimeGate | RectangleGate | | root | 290166 | 99.997932 | 99.997932 | 1 |
| 1 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate) | Singlets | PolygonGate | | TimeGate | 239001 | 82.365287 | 82.36699 | 2 |
| 2 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate, Singlets) | aAmine- | PolygonGate | | Singlets | 164655 | 56.743931 | 68.893017 | 3 |
| 3 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate, Singlets, aAmine-) | CD3-pos | PolygonGate | | aAmine- | 133670 | 46.065782 | 81.181865 | 4 |


In [10]:
# what if we want to review the gates for a sample
sample_id = '101_DEN084Y5_15_E01_008_clean.fcs'
sample_results = session.get_gating_results(sample_id)
sample_results.report

| | sample | gate_path | gate_name | gate_type | quadrant_parent | parent | count | absolute_percent | relative_percent | level |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 101_DEN084Y5_15_E01_008_clean.fcs | (root,) | TimeGate | RectangleGate | | root | 290166 | 99.997932 | 99.997932 | 1 |
| 1 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate) | Singlets | PolygonGate | | TimeGate | 239001 | 82.365287 | 82.36699 | 2 |
| 2 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate, Singlets) | aAmine- | PolygonGate | | Singlets | 164655 | 56.743931 | 68.893017 | 3 |
| 3 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate, Singlets, aAmine-) | CD3-pos | PolygonGate | | aAmine- | 133670 | 46.065782 | 81.181865 | 4 |
| 4 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate, Singlets, aAmine-, CD3-pos) | CD4-pos | PolygonGate | | CD3-pos | 82484 | 28.425899 | 61.707189 | 5 |
| 5 | 101_DEN084Y5_15_E01_008_clean.fcs | (root, TimeGate, Singlets, aAmine-, CD3-pos) | CD8-pos | PolygonGate | | CD3-pos | 47165 | 16.254153 | 35.284656 | 5 |
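
The two percentage columns in the report can be checked by hand: `relative_percent` is a gate's count divided by its parent gate's count, and `absolute_percent` is the count divided by the sample's total event count. The short check below reuses the `Singlets` row from the report. The total of 290172 events does not appear in the report; it is an assumption inferred from the `TimeGate` row (290166 events at 99.997932%).

```python
# Counts copied from the report for sample 101_DEN084Y5_15_E01_008_clean.fcs.
time_gate_count = 290166  # TimeGate is the parent of Singlets
singlets_count = 239001

# Assumption: the total event count is not shown in the report; 290172 is
# inferred from the TimeGate row (290166 events at 99.997932 %).
total_events = 290172

# Percent of the parent gate's events that fall inside Singlets.
relative_percent = singlets_count / time_gate_count * 100

# Percent of all events in the sample that fall inside Singlets.
absolute_percent = singlets_count / total_events * 100

print(f"relative: {relative_percent:.4f}")
print(f"absolute: {absolute_percent:.4f}")
```

Both values should agree with the `Singlets` row to within rounding.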


In [11]:
# plot the gates for a sample
for i, row in sample_results.report.iterrows():    
    p = session.plot_gate(
        row['sample'],  # 'sample' is also a pandas method name, so use bracket lookup
        gate_name=row.gate_name,
        gate_path=row.gate_path,
        x_min=0, 
        x_max=1.2, 
        y_min=0, 
        y_max=1.2
    )
    show(p)

### Extract Gated Event Data

**TODO: Consider changing `get_gate_events` to remove it and use `get_gate_membership`, then rely on Sample class methods**

In [12]:
cd3_pos_events = session.get_gate_events(sample_id=sample_id, gate_name='CD3-pos')

In [13]:
# The gated events for the requested sample are returned as a DataFrame
# Rows are the individual events (the index keeps each event's position in the sample)
# Columns are the channels, labeled by PnN and PnS
cd3_pos_events

| pnn | FSC-A | FSC-H | FSC-W | SSC-A | SSC-H | SSC-W | TNFa FITC FLR-A | CD8 PerCP-Cy55 FLR-A | IL2 BV421 FLR-A | Aqua Amine FLR-A | IFNg APC FLR-A | CD3 APC-H7 FLR-A | CD107a PE FLR-A | CD4 PE-Cy7 FLR-A | Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pns | | | | | | | | | | | | | | | |
| 6 | 165875.515625 | 136158.0 | 79839.726562 | 30412.320312 | 29198.0 | 68261.585938 | 146.880005 | 68.339996 | 145.080002 | 126.480003 | 61.380001 | 1475.099976 | 393.800018 | 3631.100098 | 1.280000 |
| 9 | 108877.015625 | 86248.0 | 82730.781250 | 52511.640625 | 47880.0 | 71875.578125 | 150.959991 | 678.299988 | 512.119995 | 163.680008 | 275.220001 | 2758.140137 | 700.700012 | 5526.399902 | 1.287000 |
| 10 | 111956.429688 | 86024.0 | 85292.203125 | 77629.140625 | 68860.0 | 73881.835938 | 181.559998 | 499.799988 | 489.800018 | 189.720001 | 236.610001 | 2061.179932 | 542.299988 | 6578.000000 | 1.288000 |
| 11 | 183806.140625 | 150217.0 | 80190.125000 | 42267.777344 | 39405.0 | 70297.203125 | 180.539993 | 4299.299805 | 64.480003 | 137.639999 | 154.440002 | 1152.359985 | 358.600006 | 803.000000 | 1.289000 |
| 14 | 184054.203125 | 154340.0 | 78153.273438 | 28808.878906 | 28292.0 | 66733.304688 | 83.639999 | 233.580002 | 97.959999 | 76.879997 | 120.779999 | 1089.000000 | 321.200012 | 2814.900146 | 1.293000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 290163 | 160751.578125 | 130817.0 | 80532.460938 | 41516.039062 | 38705.0 | 70295.703125 | 108.119995 | 427.380005 | 59.520000 | 48.360001 | 202.949997 | 2280.959961 | 290.399994 | 5511.000000 | 68.966001 |
| 290164 | 94923.242188 | 79691.0 | 78062.640625 | 36205.917969 | 33896.0 | 70002.101562 | 96.900002 | 326.399994 | 318.679993 | 50.840000 | 217.800003 | 4101.569824 | 364.100006 | 5009.399902 | 68.966001 |
| 290166 | 127303.765625 | 105138.0 | 79352.664062 | 49167.058594 | 47074.0 | 68449.937500 | 213.179993 | 4848.060059 | 494.760010 | 145.080002 | 285.119995 | 2589.840088 | 332.200012 | 1290.300049 | 68.967002 |
| 290167 | 111575.656250 | 94812.0 | 77123.390625 | 51329.460938 | 47500.0 | 70819.531250 | 71.400002 | 9262.620117 | 416.640015 | 182.279999 | 195.029999 | 2439.360107 | 399.300018 | 1834.800049 | 68.967002 |


In [14]:
# Retrieve all the gate IDs
# Note a gate ID is a combination of the gate name plus its gate path
session.get_gate_ids()

[('TimeGate', ('root',)),
 ('Singlets', ('root', 'TimeGate')),
 ('aAmine-', ('root', 'TimeGate', 'Singlets')),
 ('CD3-pos', ('root', 'TimeGate', 'Singlets', 'aAmine-')),
 ('CD4-pos', ('root', 'TimeGate', 'Singlets', 'aAmine-', 'CD3-pos')),
 ('CD8-pos', ('root', 'TimeGate', 'Singlets', 'aAmine-', 'CD3-pos'))]
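
A gate ID pairs the name with the path because a gate name alone may not be unique within a hierarchy. The sketch below shows the idea in plain Python. `find_gate_id` is a hypothetical helper, not a FlowKit function, and the second `CD3-pos` entry is invented to create an ambiguous name.

```python
# A few gate IDs from the listing above, plus one hypothetical duplicate name.
gate_ids = [
    ("TimeGate", ("root",)),
    ("Singlets", ("root", "TimeGate")),
    ("CD3-pos", ("root", "TimeGate", "Singlets", "aAmine-")),
    # Hypothetical: a second 'CD3-pos' gate under a different parent.
    ("CD3-pos", ("root", "TimeGate", "Singlets")),
]


def find_gate_id(gate_name, gate_path=None):
    """Hypothetical lookup: the name alone suffices only when it is unique."""
    matches = [
        gid for gid in gate_ids
        if gid[0] == gate_name and (gate_path is None or gid[1] == gate_path)
    ]
    if not matches:
        raise KeyError(f"no gate named {gate_name!r}")
    if len(matches) > 1:
        raise ValueError(f"{gate_name!r} is ambiguous; specify gate_path")
    return matches[0]


print(find_gate_id("Singlets"))
print(find_gate_id("CD3-pos", gate_path=("root", "TimeGate", "Singlets")))
```

Calling `find_gate_id("CD3-pos")` without a path raises `ValueError` in this sketch, which is the situation where a gate path has to be supplied.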

In [15]:
# Instead of getting the gated events, you can also
# retrieve the gate membership for all events.
# This is a boolean array (True value means the event is in the gate)
# Note: If the gate name is ambiguous, you must specify the gate path
session.get_gate_membership(sample_id=sample_id, gate_name='Singlets')

array([False, False, False, ..., False, False,  True])

In [16]:
# Here we'll collect the gate membership arrays for all gates for a sample
results = {}

for gate_name, gate_path in session.get_gate_ids():
    result = session.get_gate_membership(
        sample_id=sample_id, 
        gate_name=gate_name, 
        gate_path=gate_path
    )
    results[(gate_name, gate_path)] = result

In [17]:
list(results.keys())

[('TimeGate', ('root',)),
 ('Singlets', ('root', 'TimeGate')),
 ('aAmine-', ('root', 'TimeGate', 'Singlets')),
 ('CD3-pos', ('root', 'TimeGate', 'Singlets', 'aAmine-')),
 ('CD4-pos', ('root', 'TimeGate', 'Singlets', 'aAmine-', 'CD3-pos')),
 ('CD8-pos', ('root', 'TimeGate', 'Singlets', 'aAmine-', 'CD3-pos'))]

In [18]:
results[('aAmine-', ('root', 'TimeGate', 'Singlets'))]

array([False, False, False, ..., False, False,  True])
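
Boolean membership arrays like these are convenient for counting and selecting events. The sketch below uses toy data with plain Python lists standing in for the NumPy arrays returned in the notebook. `event_ids`, `toy_results`, and the mask values are made up for illustration.

```python
from itertools import compress

# Toy stand-ins: five events and made-up membership masks (plain lists here,
# whereas the notebook returns NumPy boolean arrays).
event_ids = [0, 1, 2, 3, 4]
toy_results = {
    ("Singlets", ("root", "TimeGate")): [True, True, False, True, False],
    ("aAmine-", ("root", "TimeGate", "Singlets")): [True, False, False, True, False],
}

# Summing a boolean mask counts the events inside the gate.
counts = {gate_id: sum(mask) for gate_id, mask in toy_results.items()}

# A mask also selects the matching events.
amine_mask = toy_results[("aAmine-", ("root", "TimeGate", "Singlets"))]
amine_events = list(compress(event_ids, amine_mask))

print(counts)
print(amine_events)
```

With NumPy arrays the same two steps would typically be written as `mask.sum()` and boolean indexing.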