# FlowKit Tutorial - Part 3 - The `GatingStrategy` & `GatingResults` Classes

https://flowkit.readthedocs.io/en/latest/?badge=latest

So far, we've seen how to load FCS files using the Sample class and perform basic pre-processing like compensation and transformation for better visualization of event data. In part 3, we will explore using FlowKit for gating Sample event data using the `GatingStrategy` and `GatingResults` classes.

If you have any questions about FlowKit, find any bugs, or feel something is missing from these tutorials, [please submit an issue to the GitHub repository here](https://github.com/whitews/FlowKit/issues/new/).

## Table of Contents

* [GatingStrategy Class](#GatingStrategy-Class)
  * [The Gate ID Concept](#The-Gate-ID-Concept)
  * [Create a GatingStrategy from GatingML Document](#Create-a-GatingStrategy-from-GatingML-Document)
    * [Retrieve the Gate Hierarchy](#Retrieve-the-Gate-Hierarchy)
    * [Export Gate Hierarchy as Image](#Export-Gate-Hierarchy-as-Image)
    * [Retrieve Gate IDs](#Retrieve-Gate-IDs)
    * [Retrieve Gate Instances](#Retrieve-Gate-Instances)
    * [Retrieve Compensation Matrices](#Retrieve-Compensation-Matrices)
    * [Retrieve Transformations](#Retrieve-Transformations)
* [GatingResults Class](#GatingResults-Class)
  * [GatingResults Report](#GatingResults-Report)
  * [Retrieve Gate Membership](#Retrieve-Gate-Membership)

In [None]:
import bokeh
from bokeh.plotting import show
import matplotlib.pyplot as plt

import flowkit as fk

bokeh.io.output_notebook()
%matplotlib inline

_ = plt.ioff()

In [None]:
# check version so users can verify they have the same version/API
fk.__version__

## GatingStrategy Class

A GatingStrategy object represents a collection of hierarchical gates along with the compensation and transformation information referenced by any gate Dimension objects (covered in Part 4 of the tutorial series). A GatingStrategy can be created from a valid GatingML document or built programmatically. Methods in the GatingStrategy class fall into three main categories: adding gate-related objects, retrieving those objects, and applying the gating strategy to a Sample.

### The Gate ID Concept

Quite a lot of thought has been put into the design of the GatingStrategy class to support the various ways gates are used and processed in typical FCM workflows. The most important concept to understand when interacting with a GatingStrategy instance is how gate IDs are used to reference gates and their position within the gating hierarchy. 

For example, gates are sometimes "re-used" in different branches of the hierarchy, like the same quadrant gate applied to each of the CD4+ and CD8+ populations. Because of this, the name of the gate is not sufficient to fully identify it. Further, simply coupling the gate name with its parent gate name can also become problematic if the nested gates are re-used.

The GatingStrategy class solves this ambiguity by defining a gate ID as a tuple combining the gate name and the full ancestor path of gate names, similar in concept to a computer file system. However, this approach can be cumbersome for the common case where gates are not re-used. Therefore, the GatingStrategy allows for referencing gates simply by their gate name string for cases where that name is not re-used within the gate hierarchy. For ambiguous cases, referencing a gate requires the full gate ID tuple of the gate name and gate path. 
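
The lookup rule described above can be sketched in plain Python. This is only a toy model of the idea, not FlowKit's internal implementation: the `resolve_gate_id` helper and the gate names and paths are hypothetical, chosen to show one gate name re-used in two branches.

```python
# Illustrative sketch only: a toy lookup, not FlowKit's internal implementation.
# All gate names and paths below are hypothetical.

# Each gate ID is a (gate_name, gate_path) tuple, where gate_path is the
# tuple of ancestor gate names starting at the root.
gate_ids = [
    ("CD4-pos", ("root",)),
    ("CD8-pos", ("root",)),
    ("Cytokine-pos", ("root", "CD4-pos")),
    ("Cytokine-pos", ("root", "CD8-pos")),
]


def resolve_gate_id(gate_name, gate_path=None):
    """Return the single gate ID matching the name (and the path, if given)."""
    matches = [
        gate_id for gate_id in gate_ids
        if gate_id[0] == gate_name
        and (gate_path is None or gate_id[1] == gate_path)
    ]
    if not matches:
        raise KeyError(f"No gate matches {gate_name!r} with path {gate_path!r}")
    if len(matches) > 1:
        raise ValueError(
            f"Gate name {gate_name!r} is ambiguous; a gate path is required"
        )
    return matches[0]


# A name that occurs once can be resolved from the name alone.
unique_id = resolve_gate_id("CD4-pos")

# A re-used name needs the full ancestor path to pick one branch.
full_id = resolve_gate_id("Cytokine-pos", ("root", "CD4-pos"))
```

Calling `resolve_gate_id("Cytokine-pos")` without a path raises a `ValueError` in this sketch, which mirrors the ambiguous case where the full gate ID tuple is required.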

We will see how this works in practice later, but for now let's create a GatingStrategy from an existing GatingML 2.0 document.

### Create a GatingStrategy from GatingML Document

In [None]:
gml_path = '../../examples/data/8_color_data_set/8_color_ICS.xml'
g_strat = fk.parse_gating_xml(gml_path)

In [None]:
g_strat

The string representation reveals this GatingStrategy has 6 gates, 3 transforms, and 1 compensation (Matrix instance).

#### Retrieve the Gate Hierarchy

We can retrieve the gate hierarchy in a variety of formats using the `get_gate_hierarchy` method. The method takes the following `output` options:

* `ascii`: Generates a text-based representation of the gate tree, and is likely the most human-readable for reviewing the hierarchy. This is the default option.
* `json`: Generates a JSON representation of the gate tree, useful for programmatic parsing, especially outside of Python. When this option is used, all extra keywords are passed to `json.dumps` (e.g. `indent=2` works to indent the output).
* `dict`: Generates a Python dictionary representation of the gate tree, useful for programmatic parsing within Python.
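
To illustrate why the `json` and `dict` forms are convenient for programmatic use, here is a small standalone sketch that uses only the standard library. The nested dictionary is a hypothetical stand-in for a gate tree: the `name` and `children` keys, the gate names, and the `iter_gate_paths` helper are assumptions made for illustration, so inspect the actual output of `get_gate_hierarchy` before relying on any particular key.

```python
import json

# Hypothetical nested dict standing in for a gate tree. The 'name' and
# 'children' keys and the gate names are assumptions for illustration.
gate_tree = {
    "name": "root",
    "children": [
        {"name": "Lymphocytes", "children": [{"name": "T-cells"}]},
    ],
}


def iter_gate_paths(node, path=()):
    """Yield (gate_name, gate_path) pairs by walking the tree depth-first."""
    child_path = path + (node["name"],)
    for child in node.get("children", []):
        yield child["name"], child_path
        yield from iter_gate_paths(child, child_path)


# Extra keyword arguments such as indent are ordinary json.dumps options.
compact = json.dumps(gate_tree)
indented = json.dumps(gate_tree, indent=2)

# Walking the dict form recovers a (name, path) pair for every gate.
gate_paths = list(iter_gate_paths(gate_tree))
```

Both JSON strings decode to the same structure; `indent=2` changes only the layout, which is why it is handy for printing but irrelevant for parsing.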

In [None]:
text = g_strat.get_gate_hierarchy(output='ascii')

In [None]:
print(text)

In [None]:
gs_json = g_strat.get_gate_hierarchy(output='json', indent=2)

In [None]:
print(gs_json)

In [None]:
gs_dict = g_strat.get_gate_hierarchy(output='dict')

In [None]:
gs_dict

#### Export Gate Hierarchy as Image 

*Note: Exporting as an image requires the `graphviz` package.*

In [None]:
g_strat.export_gate_hierarchy_image('gs.png')

In [None]:
img = plt.imread('gs.png')

In [None]:
f = plt.figure(figsize=(12, 8))
ax = f.subplots(1)
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

plt.imshow(img)
plt.tight_layout()
plt.show()

#### Retrieve Gate IDs 

Remember, a gate ID is a tuple of the gate name and the gate path.

In [None]:
g_strat.get_gate_ids()

#### Retrieve Gate Instances

Below we show retrieving a Gate instance by its gate name, which works here because the name is unambiguous within this gate hierarchy. We can also retrieve a gate's parent Gate instance or the list of child gates.

In [None]:
g_strat.get_gate('TimeGate')

In [None]:
g_strat.get_parent_gate('CD3-pos')

In [None]:
g_strat.get_child_gates('CD3-pos')

#### Retrieve Compensation Matrices

In [None]:
g_strat.comp_matrices

#### Retrieve Transformations

In [None]:
g_strat.transformations

## GatingResults Class

A GatingResults instance is returned from calling the GatingStrategy `gate_sample` method on a Sample instance, and is never created by an end user directly. A GatingResults instance contains the results of applying the gating hierarchy on a single Sample. Let's load a Sample and apply the previous GatingStrategy via the `gate_sample` method (setting `verbose=True` to print out each gate as it is processed). 

In [None]:
sample = fk.Sample("../../examples/data/8_color_data_set/fcs_files/101_DEN084Y5_15_E01_008_clean.fcs")

In [None]:
gs_results = g_strat.gate_sample(sample, verbose=True)

In [None]:
# get the Sample ID for the GatingResults instance
gs_results.sample_id

### GatingResults Report

As we can see, the GatingResults class is relatively simple, and its main purpose is to provide a Pandas DataFrame of the results via the `report` attribute. The report contains a row for every gate and includes the following columns:

* **sample**: the Sample ID of the processed Sample instance
* **gate_path**: tuple of the gate path
* **gate_name**: the name of the gate (or name of the Quadrant of a QuadrantGate)
* **gate_type**: the class name of the gate (RectangleGate, PolygonGate, etc.)
* **quadrant_parent**: the name of the QuadrantGate that a Quadrant belongs to. Quadrant gates are a bit different because they are really a collection of gates: each Quadrant gets its own row with the Quadrant name in the gate_name field and the QuadrantGate name in this field.
* **parent**: the gate name of the parent gate
* **count**: the absolute event count for events inside the gate
* **absolute_percent**: the percentage of events inside the gate relative to the total event count in the Sample
* **relative_percent**: the percentage of events inside the gate relative to the number of events in the parent gate
* **level**: the depth of the gate in the gate tree relative to the root of the tree
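
The difference between the two percentage columns comes down to the denominator. The arithmetic can be sketched as follows, using made-up event counts rather than values from a real Sample. The `percent` helper is hypothetical, and returning 0.0 for an empty denominator is an assumption of this sketch, not a statement about how FlowKit handles that case.

```python
# Hypothetical event counts for illustration; not taken from any real Sample.
total_events = 200_000   # all events in the Sample
parent_count = 50_000    # events inside the parent gate
gate_count = 10_000      # events inside the gate of interest


def percent(part, whole):
    """Return part as a percentage of whole (0.0 if whole is empty)."""
    return 100.0 * part / whole if whole else 0.0


# Same numerator, different denominators.
absolute_percent = percent(gate_count, total_events)   # relative to the Sample
relative_percent = percent(gate_count, parent_count)   # relative to the parent
```

With these counts the gate holds 5% of all events but 20% of its parent's events, so the relative percentage is never smaller than the absolute one for the same gate.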

In [None]:
gs_results.report

### Retrieve Gate Membership

The `get_gate_membership` method returns a Boolean array representing which of the Sample events are inside the specified gate.

In [26]:
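# Boolean array marking which Sample events fall inside the 'CD3-pos' gate
cd3_pos_gate_membership = gs_results.get_gate_membership('CD3-pos')
cd3_pos_gate_membership.sum()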

133670

We can then use the membership array to retrieve those events from the Sample.

*Note: The events we extract here are not necessarily pre-processed in the same way as the gate's own instructions specify, even when using the 'comp' or 'xform' source option.*

In [27]:
gated_raw_events = sample.get_events(source='raw')
gated_raw_events = gated_raw_events[cd3_pos_gate_membership]

In [28]:
gated_raw_events.shape

(133670, 15)