<h1>Loading ILTIS data into Immunova</h1>

In [1]:
import sys
if '/home/ross/immunova' not in sys.path:
    sys.path.append('/home/ross/immunova')
from immunova.data.fcs_experiments import FCSExperiment
from immunova.data.utilities import get_fcs_file_paths
from immunova.data.panel import Panel
from immunova.data.mongo_setup import test_init

In [2]:
test_init()

<h2>Create Panels</h2>

Start by using the Excel templates in the root folder of this notebook to create `Panel` objects. A `Panel` represents the expected channel/marker mappings of FCS files and is used to standardise input at the point of entry.
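
To make the idea of a channel/marker mapping concrete, here is a minimal sketch in plain Python. It is not how Immunova implements `Panel`. The `PANEL_MAPPINGS` dictionary, the `standardise` function, and the fluorochrome names are all hypothetical, chosen only to show how raw column labels might be normalised against an expected panel.

```python
# Hypothetical illustration only: none of these names come from Immunova.
# Each expected (channel, marker) pair lists the raw labels that map onto it.
PANEL_MAPPINGS = {
    ('FITC-A', 'CD57'): {'fitc-a', 'fitc', 'cd57'},
    ('PE-A', 'CCR7'): {'pe-a', 'pe', 'ccr7'},
}


def standardise(raw_label: str) -> tuple:
    """Return the expected (channel, marker) pair for a raw column label."""
    key = raw_label.strip().lower()
    for standard, aliases in PANEL_MAPPINGS.items():
        if key in aliases:
            return standard
    # Failing loudly at the point of entry keeps unexpected columns out.
    raise ValueError(f'No mapping found for column label: {raw_label!r}')


standardised = [standardise(label) for label in [' FITC ', 'ccr7']]
```

Raising on an unknown label, rather than passing it through, is one possible design choice; it means a file that does not match the panel is rejected before it is stored.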

In [3]:
t1_panel = Panel()
t1_panel.panel_name = 'ILTIS_T1'
t1_panel.create_from_excel('t1_mappings.xlsx')
t1_panel.save()

<Panel: Panel object>

In [4]:
t2_panel = Panel()
t2_panel.panel_name = 'ILTIS_T2'
t2_panel.create_from_excel('t2_mappings.xlsx')
t2_panel.save()

<Panel: Panel object>

In [5]:
n_panel = Panel()
n_panel.panel_name = 'ILTIS_N'
n_panel.create_from_excel('mononeutro_mappings.xlsx')
n_panel.save()

<Panel: Panel object>

<h2>Create experiments</h2>

I need three experiments, one for each of the above panels. `FCSExperiment` is the central object used to add FCS files and sample data, and then to interact with that data.

In [6]:
t1 = FCSExperiment()
t1.experiment_id = 'ILTIS_T1'
t1.panel = t1_panel
t1.save()

<FCSExperiment: FCSExperiment object>

In [7]:
t2 = FCSExperiment()
t2.experiment_id = 'ILTIS_T2'
t2.panel = t2_panel
t2.save()

<FCSExperiment: FCSExperiment object>

In [8]:
n = FCSExperiment()
n.experiment_id = 'ILTIS_N'
n.panel = n_panel
n.save()

<FCSExperiment: FCSExperiment object>

<h2>Add data to experiments</h2>
I'm going to add ILTIS samples `hc1` to `hc21` to each experiment. The Immunova utility function `get_fcs_file_paths` generates a dictionary of FCS file paths for a given directory.

In [9]:
help(get_fcs_file_paths)

Help on function get_fcs_file_paths in module immunova.data.utilities:

get_fcs_file_paths(fcs_dir:str, control_names:list, ctrl_id:str, ignore_comp=True) -> dict
    Generate a standard dictionary object of fcs files in given directory
    :param fcs_dir: target directory for search
    :param control_names: names of expected control files (names must appear in filenames)
    :param ctrl_id: global identifier for control file e.g. 'FMO' (must appear in filenames)
    :param ignore_comp: If True, files with 'compensation' in their name will be ignored (default = True)
    :return: standard dictionary of fcs files contained in target directory
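
As a rough illustration of the behaviour described in the help text above, the sketch below walks a directory and sorts FCS files into primary and control lists. It is an assumption-laden stand-in, not Immunova's code: the function name `collect_fcs_paths` is hypothetical, and the `'primary'`/`'controls'` keys are inferred from how the returned dictionary is used later in this notebook.

```python
import os


def collect_fcs_paths(fcs_dir, control_names, ctrl_id, ignore_comp=True):
    """Hypothetical stand-in for get_fcs_file_paths; not Immunova's code."""
    file_tree = {'primary': [], 'controls': []}
    for dirpath, _, filenames in os.walk(fcs_dir):
        for name in sorted(filenames):
            lowered = name.lower()
            if not lowered.endswith('.fcs'):
                continue  # skip anything that is not an FCS file
            if ignore_comp and 'compensation' in lowered:
                continue  # skip compensation files when requested
            path = os.path.join(dirpath, name)
            # Assumption: a control file carries both the global identifier
            # (e.g. 'FMO') and one of the expected control names.
            is_control = ctrl_id.lower() in lowered and any(
                c.lower() in lowered for c in control_names)
            if is_control:
                file_tree['controls'].append(path)
            else:
                file_tree['primary'].append(path)
    return file_tree
```

Matching is done on lowercased filenames here, which is a simplification; the real function may be stricter or looser about case.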



In [10]:
# File paths per sample, keyed by sample ID, for each of the three panels
t1_files = dict()
t2_files = dict()
n_files = dict()
# Names of the FMO controls expected for each panel
t1_ctrls = ['CD57', 'CCR7', 'CD45RA', 'CD27']
t2_ctrls = ['CXCR3', 'HLA-DR', 'CD69', 'CD25']
n_ctrls = ['CD11b', 'HLA-DR', 'CD40', 'CD62L']
root = '/media/ross/extdrive/ILTIS/study_fcs_files/'
for x in [f'hc{i}' for i in range(1, 22)]:  # hc1 to hc21
    print(f'Fetching files for: {x}')
    print('T1 panel...')
    t1_files[x] = get_fcs_file_paths(f'{root}{x}/day1/t1', control_names=t1_ctrls, ctrl_id='FMO')
    print('T2 panel...')
    t2_files[x] = get_fcs_file_paths(f'{root}{x}/day1/t2', control_names=t2_ctrls, ctrl_id='FMO')
    print('Monocyte/Neutrophil panel...')
    n_files[x] = get_fcs_file_paths(f'{root}{x}/day1/mononeutro', control_names=n_ctrls, ctrl_id='FMO')
    print('Complete!')
    print('---------------------------------------------------')

Fetching files for: hc1
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc2
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc3
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc4
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc5
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc6
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc7
T1 panel...
T2 panel...
Monocyte/Neutrophil panel...
Complete!
---------------------------------------------------
Fetching files for: hc8
T1 

Some samples are missing FMO controls, so let's remove them.

In [11]:
for x in [f'hc{i}' for i in [2, 3, 6, 8, 14, 20, 21]]:
    t1_files.pop(x)
    t2_files.pop(x)
    n_files.pop(x)
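
The sample numbers removed above are hard-coded. One way they could be derived instead is sketched below, assuming each per-sample entry is a dictionary with a `'controls'` list of file paths (as the later `add_new_sample` calls suggest). The helper `samples_missing_controls` and the toy file names are hypothetical and not part of Immunova.

```python
import os


def samples_missing_controls(files_by_sample, control_names):
    """Return sample IDs lacking a control file for any expected control name."""
    incomplete = []
    for sample_id, files in files_by_sample.items():
        # Compare against file names only, so directory names cannot match.
        names = [os.path.basename(p).lower() for p in files.get('controls', [])]
        missing = [c for c in control_names
                   if not any(c.lower() in name for name in names)]
        if missing:
            incomplete.append(sample_id)
    return incomplete


# Toy data standing in for the dictionaries built earlier (paths are made up).
example_files = {
    'hc1': {'primary': ['hc1.fcs'],
            'controls': ['hc1_FMO_CD57.fcs', 'hc1_FMO_CCR7.fcs']},
    'hc2': {'primary': ['hc2.fcs'],
            'controls': ['hc2_FMO_CD57.fcs']},
}
to_drop = samples_missing_controls(example_files, ['CD57', 'CCR7'])
```

In this toy example `to_drop` contains only `'hc2'`, because its CCR7 control is absent. Running the same check over all three panel dictionaries and taking the union would give one list of samples to remove.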

In [12]:
sample_ids = ['hc1',
 'hc4',
 'hc5',
 'hc7',
 'hc9',
 'hc10',
 'hc11',
 'hc12',
 'hc13',
 'hc15',
 'hc16',
 'hc17',
 'hc18',
 'hc19']

In [13]:
help(t1.add_new_sample)

Help on method add_new_sample in module immunova.data.fcs_experiments:

add_new_sample(sample_id:str, file_path:str, controls:list, comp_matrix:<built-in function array>=None, compensate:bool=True, feedback:bool=True, catch_standardisation_errors:bool=False) -> str method of immunova.data.fcs_experiments.FCSExperiment instance
    Add a new sample (FileGroup) to this experiment
    :param sample_id: primary ID for identification of sample (FileGroup.primary_id)
    :param file_path: file path of the primary fcs file (e.g. the fcs file that is of primary interest such as the
    file with complete staining)
    :param controls: list of file paths for control files e.g. a list of file paths for associated FMO controls
    :param comp_matrix: (optional) numpy array for spillover matrix for compensation calculation; if not supplied
    the matrix linked within the fcs file will be used, if not present will present an error
    :param compensate: boolean value as to whether compensation sho

In [14]:
for x in sample_ids:
    print('--------------------------------------------------------------------')
    print(f'Processing fcs files for  {x}')
    print('Adding files for T1 experiment...')
    t1.add_new_sample(sample_id=x, file_path=t1_files[x]['primary'][0],
                      controls=t1_files[x]['controls'], feedback=False)
    print('Adding files for T2 experiment...')
    t2.add_new_sample(sample_id=x, file_path=t2_files[x]['primary'][0],
                      controls=t2_files[x]['controls'], feedback=False)
    print('Adding files for Monocyte/Neutrophil experiment...')
    n.add_new_sample(sample_id=x, file_path=n_files[x]['primary'][0],
                     controls=n_files[x]['controls'], feedback=False)
    print('Completed!')
    print('--------------------------------------------------------------------')

--------------------------------------------------------------------
Processing fcs files for  hc1
Adding files for T1 experiment...
Adding files for T2 experiment...
Adding files for Monocyte/Neutrophil experiment...
Completed!
--------------------------------------------------------------------
--------------------------------------------------------------------
Processing fcs files for  hc4
Adding files for T1 experiment...
Adding files for T2 experiment...
Adding files for Monocyte/Neutrophil experiment...
Completed!
--------------------------------------------------------------------
--------------------------------------------------------------------
Processing fcs files for  hc5
Adding files for T1 experiment...
Adding files for T2 experiment...
Adding files for Monocyte/Neutrophil experiment...
Completed!
--------------------------------------------------------------------
--------------------------------------------------------------------
Processing fcs files for  hc7
Adding 