# Runs brainvisa preprocessing

This notebook runs the whole BrainVISA preprocessing pipeline that produces the inputs for deep learning algorithms.
Note that you need BrainVISA installed, or you need to run the notebook server inside the BrainVISA Singularity container.

You also need access to the HCP database (the HCP folder is a subfolder of root_unsupervised) and a path to the root of the supervised folder (/neurospin for people working at NeuroSpin).

# Sets root directories

Contrary to the other notebooks, this one relies on data that are outside the deep_folding/data folder.

In [1]:
# This is the root of the HCP directory
# Could be either /tgcc, or /nfs/tgcc, for example
root_unsupervised = '/tgcc'  

# This is the root of the supervised directory
# Could be either /neurospin, or /nfs/neurospin, for example
root_supervised = '/neurospin' 

# Imports

General imports

In [2]:
import sys
import os
from os.path import join
import glob
import json
import inspect

Deep_folding imports

In [44]:
from deep_folding.brainvisa import generate_skeletons
from deep_folding.brainvisa import generate_ICBM2009c_transforms
from deep_folding.brainvisa import resample_files
from deep_folding.brainvisa import compute_bounding_box
from deep_folding.brainvisa import compute_mask
from deep_folding.brainvisa import generate_crops
print(inspect.getfile(compute_bounding_box))
print(inspect.getfile(generate_crops))

/neurospin/dico/jchavas/Runs/32_deep_folding_foldlabel_clean/Program/deep_folding/deep_folding/brainvisa/compute_bounding_box.py
/neurospin/dico/jchavas/Runs/32_deep_folding_foldlabel_clean/Program/deep_folding/deep_folding/brainvisa/generate_crops.py


Constants

In [4]:
_ALL_SUBJECTS = -1

In [5]:
out_voxel_size = 1

In [6]:
number_subjects_supervised = 10 # Number of subjects for which we determine the box. We can set it to _ALL_SUBJECTS

In [7]:
number_subjects = 10 # Number of subjects for which we generate the crops. We can set it to _ALL_SUBJECTS

# Creates useful functions

In [8]:
def check_directory(directory_path):
    """Checks directory path and returns absolute path"""
    directory_path = os.path.abspath(directory_path)
    if os.path.isdir(directory_path):
        print(directory_path + ' is a directory')
    else:
        print(directory_path + ' does not exist or is not a directory.')
    return directory_path
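
`check_directory` only reports whether the path exists. If you would rather create a missing output folder up front, a variant such as the hypothetical `ensure_directory` below could be used. Whether the deep_folding functions create their own output folders is not shown in this notebook, so treat this as an optional sketch.

```python
import os
import tempfile


def ensure_directory(directory_path):
    """Create the directory if it is missing and return its absolute path.

    Hypothetical helper for illustration; not part of deep_folding.
    """
    directory_path = os.path.abspath(directory_path)
    os.makedirs(directory_path, exist_ok=True)
    return directory_path


# Demonstration in a temporary folder, so nothing is written to real data
with tempfile.TemporaryDirectory() as tmp_dir:
    created = ensure_directory(os.path.join(tmp_dir, 'skeletons', 'raw'))
    created_exists = os.path.isdir(created)
    print(created_exists)
```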

# Variables used by all sub-computations

The following boolean variables decide which preprocessing steps to run:

In [42]:
run_bbox = True  # If set to True, it generates new bounding boxes
run_mask = True  # If set to True, it generates new masks
run_crop = True  # If set to True, it generates crops

We now assign path names and other user-specific variables.

The unsupervised source directory is where the unsupervised database lies. It contains the morphologist analysis subfolder ANALYSIS/3T_morphologist


In [23]:
unsupervised_src_dir = check_directory(join(root_unsupervised, 'hcp', 'ANALYSIS/3T_morphologist'))

/tgcc/hcp/ANALYSIS/3T_morphologist is a directory


The supervised source directories are where the manually labelled databases lie. The variable is a list of full paths to the manually labelled datasets.

In [11]:
human_supervised_dir = join(root_supervised, 'dico/data/bv_databases/human')
supervised_src_dir = [check_directory(join(human_supervised_dir, 'pclean/all'))
                     ]
path_to_graph = ["t1mri/t1/default_analysis/folds/3.3/base2018_manual"
                 ]

/neurospin/dico/data/bv_databases/human/pclean/all is a directory


# Generates bounding boxes

### User variables

In [12]:
bbox_dir = check_directory(join(root_supervised, 'dico/data/deep_folding/test', 'bbox'))

/neurospin/dico/data/deep_folding/test/bbox is a directory


In [13]:
mask_dir = check_directory(join(root_supervised, 'dico/data/deep_folding/current/mask/2mm'))

/neurospin/dico/data/deep_folding/current/mask/2mm is a directory


Lists the sulci of the left side that we want to analyze:

In [14]:
sulci_left = ['S.T.s.ter.asc.ant.', 'S.T.s.ter.asc.post.']

Lists the sulci of the right side that we want to analyze:

In [15]:
sulci_right = ['S.T.s.ter.asc.ant.', 'S.T.s.ter.asc.post.']

### Generates bounding boxes (actual program)

We first display the compute_bounding_box help, as if it were called from the command line:

In [16]:
args = "--help"
argv = args.split(' ')
compute_bounding_box.main(argv)

usage: compute_bounding_box.py [-h] [-s SRC_DIR [SRC_DIR ...]] [-o OUTPUT_DIR]
                               [-u SULCUS] [-w NEW_SULCUS] [-i SIDE]
                               [-p PATH_TO_GRAPH] [-n NB_SUBJECTS] [-v]
                               [-x OUT_VOXEL_SIZE]

Computes bounding box around the named sulcus

optional arguments:
  -h, --help            show this help message and exit
  -s SRC_DIR [SRC_DIR ...], --src_dir SRC_DIR [SRC_DIR ...]
                        Source directory where the MRI data lies. If there are
                        several directories, add all directories one after the
                        other. Example: -s DIR_1 DIR_2. Default is :
                        /neurospin/dico/data/bv_databases/human/pclean/all
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Output directory where to store the output bbox json
                        files. Default is : test/bbox/1mm
  -u SULCUS, --sulcus SULCUS
                        Sulcus name ar
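
Passing a list of strings to `main(argv)` mirrors a command-line call. The sketch below shows the general pattern with `argparse`. The parser, its program name and its defaults are hypothetical stand-ins; only the option names `-s`, `-i` and `-n` are borrowed from the help text above, and this is not the actual `compute_bounding_box` code.

```python
import argparse


def parse_args(argv):
    """Parse a list of command-line style strings (hypothetical parser)."""
    parser = argparse.ArgumentParser(
        prog='example.py',
        description='Illustrative parser, not the deep_folding one')
    parser.add_argument('-s', '--src_dir', nargs='+', default=['.'],
                        help='One or several source directories')
    parser.add_argument('-i', '--side', default='L', help="'L' or 'R'")
    parser.add_argument('-n', '--nb_subjects', type=int, default=-1)
    return parser.parse_args(argv)


def main(argv):
    """Entry point usable both from a shell and from a notebook cell."""
    try:
        return parse_args(argv)
    except SystemExit:
        # argparse raises SystemExit after printing --help or a usage error;
        # catching it lets the calling code continue
        return None


args = main("-s DIR_1 DIR_2 -i R -n 10".split(' '))
print(args.src_dir, args.side, args.nb_subjects)
```

Keeping the parsing behind a `main(argv)` function is what makes the same module callable from a shell and from a notebook.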

In [17]:
print(sulci_left, sulci_right)

['S.T.s.ter.asc.ant.', 'S.T.s.ter.asc.post.'] ['S.T.s.ter.asc.ant.', 'S.T.s.ter.asc.post.']


We now run the actual program.
It saves the bounding box characteristics as JSON files in bbox_dir:

In [18]:
if run_bbox:
    for sulcus in sulci_left:
        compute_bounding_box.compute_bounding_box(
            src_dir=supervised_src_dir, 
            path_to_graph=path_to_graph,
            bbox_dir=bbox_dir,
            sulcus=sulcus,
            side='L',
            number_subjects=number_subjects_supervised,
            out_voxel_size=out_voxel_size)
    for sulcus in sulci_right:
        compute_bounding_box.compute_bounding_box(
            src_dir=supervised_src_dir, 
            path_to_graph=path_to_graph,
            bbox_dir=bbox_dir,
            sulcus=sulcus,
            side='R',
            number_subjects=number_subjects_supervised,
            out_voxel_size=out_voxel_size)

INFO:compute_bounding_box.py: {'subject': 'sujet01', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'sujet01/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Lsujet01*.arg'}
INFO:compute_bounding_box.py: {'subject': 's12590', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 's12590/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Ls12590*.arg'}
INFO:compute_bounding_box.py: {'subject': 'ammon', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'ammon/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Lammon*.arg'}
INFO:compute_bounding_box.py: {'subject': 'vayu', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'vayu/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Lvayu*.arg'}
INFO:compute_bounding_box.py: {'subject': 'osiris', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'osiri

INFO:compute_bounding_box.py: {'subject': 'beflo', 'side': 'R', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'beflo/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Rbeflo*.arg'}
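
Conceptually, a bounding box around a sulcus is the pair of minimum and maximum voxel coordinates that enclose it. The following is a minimal sketch of that idea. The voxel coordinates are made up, and the JSON keys (`sulcus`, `side`, `bbmin`, `bbmax`) are assumptions for illustration; they are not claimed to match the files written by `compute_bounding_box`.

```python
import json


def bounding_box(voxels):
    """Return (bbmin, bbmax) enclosing a list of (x, y, z) voxel coordinates."""
    xs, ys, zs = zip(*voxels)
    return [min(xs), min(ys), min(zs)], [max(xs), max(ys), max(zs)]


# Made-up voxel coordinates standing in for one labelled sulcus
voxels = [(10, 42, 7), (12, 40, 9), (11, 45, 8)]
bbmin, bbmax = bounding_box(voxels)

# Hypothetical JSON layout, for illustration only
record = {'sulcus': 'S.T.s.ter.asc.ant.', 'side': 'L',
          'bbmin': bbmin, 'bbmax': bbmax}
text = json.dumps(record, indent=2)
print(text)
```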


# Generates crops

### User variables 

In [19]:
interp = 'nearest'

In [27]:
crop_dir = check_directory(join(root_supervised, 'dico/data/deep_folding/test', 'crops'))

/neurospin/dico/data/deep_folding/test/crops is a directory


### Generates crops (actual program)

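We now generate the raw skeletons, the transforms to the ICBM2009c template and the resampled skeletons, then save in {crop_dir}/Lcrops and {crop_dir}/Rcrops the actual crops based on bounding boxes: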

In [29]:
skeleton_raw_dir = check_directory(join(root_supervised, 'dico/data/deep_folding/test', 'skeletons/raw'))

/neurospin/dico/data/deep_folding/test/skeletons/raw does not exist or is not a directory.


In [32]:
skeleton_1mm_dir = check_directory(join(root_supervised, 'dico/data/deep_folding/test', 'skeletons/1mm'))

/neurospin/dico/data/deep_folding/test/skeletons/1mm does not exist or is not a directory.


In [33]:
transform_dir = check_directory(join(root_supervised, 'dico/data/deep_folding/test', 'transform'))

/neurospin/dico/data/deep_folding/test/transform is a directory


In [30]:
if run_crop:
    generate_skeletons.generate_skeletons(
        src_dir=unsupervised_src_dir,
        skeleton_dir=skeleton_raw_dir,
        side='L',
        number_subjects=number_subjects)
    generate_skeletons.generate_skeletons(
        src_dir=unsupervised_src_dir,
        skeleton_dir=skeleton_raw_dir,
        side='R',
        number_subjects=number_subjects)

INFO:generate_skeletons.py: list_subjects[:5] = ['585862', '586460', '727654', '210617', '123420']
INFO:generate_skeletons.py: SERIAL MODE: subjects are scanned serially, without parallelism
INFO:generate_skeletons.py: list_subjects[:5] = ['585862', '586460', '727654', '210617', '123420']
INFO:generate_skeletons.py: SERIAL MODE: subjects are scanned serially, without parallelism


In [34]:
if run_crop:
    generate_ICBM2009c_transforms.generate_ICBM2009c_transforms(
        src_dir=unsupervised_src_dir,
        transform_dir=transform_dir,
        side='L',
        number_subjects=number_subjects)
    generate_ICBM2009c_transforms.generate_ICBM2009c_transforms(
        src_dir=unsupervised_src_dir,
        transform_dir=transform_dir,
        side='R',
        number_subjects=number_subjects)

INFO:generate_ICBM2009c_transforms.py: filenames[:5] = ['/tgcc/hcp/ANALYSIS/3T_morphologist/585862/', '/tgcc/hcp/ANALYSIS/3T_morphologist/586460/', '/tgcc/hcp/ANALYSIS/3T_morphologist/727654/', '/tgcc/hcp/ANALYSIS/3T_morphologist/210617/', '/tgcc/hcp/ANALYSIS/3T_morphologist/123420/']
INFO:generate_ICBM2009c_transforms.py: list_subjects[:5] = ['585862', '586460', '727654', '210617', '123420']
INFO:generate_ICBM2009c_transforms.py: SERIAL MODE: transforms are generated serially, without parallelism
INFO:generate_ICBM2009c_transforms.py: filenames[:5] = ['/tgcc/hcp/ANALYSIS/3T_morphologist/585862/', '/tgcc/hcp/ANALYSIS/3T_morphologist/586460/', '/tgcc/hcp/ANALYSIS/3T_morphologist/727654/', '/tgcc/hcp/ANALYSIS/3T_morphologist/210617/', '/tgcc/hcp/ANALYSIS/3T_morphologist/123420/']
INFO:generate_ICBM2009c_transforms.py: list_subjects[:5] = ['585862', '586460', '727654', '210617', '123420']
INFO:generate_ICBM2009c_transforms.py: SERIAL MODE: transforms are generated serially, without parall

In [35]:
if run_crop:
    resample_files.resample_files(
        src_dir=skeleton_raw_dir,
        input_type='skeleton',
        resampled_dir=skeleton_1mm_dir,
        transform_dir=transform_dir,
        side='L',
        number_subjects=number_subjects)
    resample_files.resample_files(
        src_dir=skeleton_raw_dir,
        input_type='skeleton',
        resampled_dir=skeleton_1mm_dir,
        transform_dir=transform_dir,
        side='R',
        number_subjects=number_subjects)
    print("Done")

INFO:resample_files.py: list_subjects[:5] = ['585862', '727654', '123420', '145127', '887373']
INFO:resample_files.py: SERIAL MODE: subjects are scanned serially
INFO:resample_files.py: list_subjects[:5] = ['156536', '887373', '123420', '586460', '210617']
INFO:resample_files.py: SERIAL MODE: subjects are scanned serially


In [39]:
if run_crop:
    # Runs on left hemisphere
    generate_crops.generate_crops(
        src_dir=skeleton_1mm_dir,
        crop_dir=crop_dir,
        bbox_dir=bbox_dir,
        cropping_type='bbox',
        list_sulci=sulci_left,
        side='L',
        number_subjects=number_subjects)
    # Runs on right hemisphere
    generate_crops.generate_crops(
        src_dir=skeleton_1mm_dir,
        crop_dir=crop_dir,
        bbox_dir=bbox_dir,
        cropping_type='bbox',
        list_sulci=sulci_right,
        side='R',
        number_subjects=number_subjects)
    print("Done")

INFO:generate_crops.py: list_subjects[:5] = ['586460', '810843', '887373', '896778', '210617']
INFO:generate_crops.py: SERIAL MODE: subjects are scanned serially
INFO:generate_crops.py: list_subjects[:5] = ['896778', '585862', '727654', '156536', '210617']
INFO:generate_crops.py: SERIAL MODE: subjects are scanned serially


Done
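
Cropping with a bounding box amounts to slicing the volume between the box's minimum and maximum coordinates. The sketch below uses nested Python lists and treats both bounds as inclusive. These conventions and the helper `crop_volume` are assumptions for illustration; they do not reproduce the `generate_crops` implementation.

```python
def crop_volume(volume, bbmin, bbmax):
    """Crop a 3D nested-list volume indexed as volume[x][y][z].

    Both bbmin and bbmax are treated as inclusive (an assumption).
    """
    (x0, y0, z0), (x1, y1, z1) = bbmin, bbmax
    return [[row[z0:z1 + 1] for row in plane[y0:y1 + 1]]
            for plane in volume[x0:x1 + 1]]


# Toy 4x4x4 volume whose voxel value encodes its own coordinates
volume = [[[100 * x + 10 * y + z for z in range(4)]
           for y in range(4)] for x in range(4)]

crop = crop_volume(volume, bbmin=(1, 0, 2), bbmax=(2, 1, 3))
shape = (len(crop), len(crop[0]), len(crop[0][0]))
print(shape)
print(crop[0][0])
```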


We now list the generated files, sorted by modification date, most recent first. Older files may remain in the folder, since existing files are not deleted before new ones are written:

In [40]:
files = glob.glob(f"{crop_dir}/Lcrops/*.nii.gz")
files.sort(key=os.path.getmtime, reverse=True)
print("\n".join(files[:number_subjects]))

/neurospin/dico/data/deep_folding/test/crops/Lcrops/727654_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/145127_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/585862_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/156536_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/123420_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/210617_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/896778_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/887373_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/810843_cropped_skeleton.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Lcrops/586460_cropped_skeleton.nii.gz


In [31]:
files = glob.glob(f"{crop_dir}/Rcrops/*.nii.gz")
files.sort(key=os.path.getmtime, reverse=True)
print("\n".join(files[:number_subjects]))

/neurospin/dico/data/deep_folding/test/crops/Rcrops/145127_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/810843_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/887373_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/896778_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/586460_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/210617_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/727654_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/585862_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/123420_normalized.nii.gz
/neurospin/dico/data/deep_folding/test/crops/Rcrops/156536_normalized.nii.gz
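
Because files from earlier runs may remain in the folder, sorting by date alone does not tell you which files belong to the current run. One option, sketched below under the assumption that you record a timestamp just before launching the crop generation, is to keep only the files modified after that timestamp. The helper `files_modified_since` and the temporary folder are illustrative and not part of deep_folding.

```python
import glob
import os
import tempfile
import time


def files_modified_since(pattern, start_time):
    """Return files matching `pattern` modified at or after `start_time`,
    most recent first."""
    files = [f for f in glob.glob(pattern)
             if os.path.getmtime(f) >= start_time]
    files.sort(key=os.path.getmtime, reverse=True)
    return files


with tempfile.TemporaryDirectory() as tmp_dir:
    old_file = os.path.join(tmp_dir, 'old_cropped_skeleton.nii.gz')
    new_file = os.path.join(tmp_dir, 'new_cropped_skeleton.nii.gz')
    for name in (old_file, new_file):
        open(name, 'w').close()
    start_time = time.time()
    # Backdate one file by an hour to mimic a leftover from a previous run
    os.utime(old_file, (start_time - 3600, start_time - 3600))
    # Give the other file a modification time just after the recorded start
    os.utime(new_file, (start_time + 1, start_time + 1))
    recent = files_modified_since(os.path.join(tmp_dir, '*.nii.gz'),
                                  start_time)
    recent_names = [os.path.basename(f) for f in recent]
    print(recent_names)
```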


### Generates mask-based crops

In [45]:
if run_mask:
    for sulcus in sulci_left:
        compute_mask.compute_mask(
            src_dir=supervised_src_dir, 
            path_to_graph=path_to_graph,
            mask_dir=mask_dir,
            sulcus=sulcus,
            side='L',
            number_subjects=number_subjects_supervised,
            out_voxel_size=out_voxel_size)
    for sulcus in sulci_right:
        compute_mask.compute_mask(
            src_dir=supervised_src_dir, 
            path_to_graph=path_to_graph,
            mask_dir=mask_dir,
            sulcus=sulcus,
            side='R',
            number_subjects=number_subjects_supervised,
            out_voxel_size=out_voxel_size)
    print("Done")

INFO:compute_mask.py: {'subject': 'sujet01', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'sujet01/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Lsujet01*.arg'}
INFO:compute_mask.py: {'subject': 's12590', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 's12590/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Ls12590*.arg'}
INFO:compute_mask.py: {'subject': 'ammon', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'ammon/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Lammon*.arg'}
INFO:compute_mask.py: {'subject': 'vayu', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'vayu/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Lvayu*.arg'}
INFO:compute_mask.py: {'subject': 'osiris', 'side': 'L', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'osiris/t1mri/t1/default_analysis/folds/3.3/ba

INFO:compute_mask.py: {'subject': 'cronos', 'side': 'R', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'cronos/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Rcronos*.arg'}
INFO:compute_mask.py: {'subject': 'beflo', 'side': 'R', 'dir': '/neurospin/dico/data/bv_databases/human/pclean/all', 'graph_file': 'beflo/t1mri/t1/default_analysis/folds/3.3/base2018_manual/Rbeflo*.arg'}
INFO:compute_mask.py: Final mask file: /neurospin/dico/data/deep_folding/current/mask/2mm/R/S.T.s.ter.asc.post._right.nii.gz


In [47]:
if run_crop:
    # Runs on left hemisphere
    generate_crops.generate_crops(
        src_dir=skeleton_1mm_dir,
        crop_dir=crop_dir,
        mask_dir=mask_dir,
        list_sulci=sulci_left,
        side='L',
        cropping_type='mask',
        number_subjects=number_subjects)
    # Runs on right hemisphere
    generate_crops.generate_crops(
        src_dir=skeleton_1mm_dir,
        crop_dir=crop_dir,
        mask_dir=mask_dir,
        list_sulci=sulci_right,
        side='R',
        cropping_type='mask',
        number_subjects=number_subjects)
    print("Done")

INFO:generate_crops.py: list_subjects[:5] = ['586460', '810843', '887373', '896778', '210617']
INFO:generate_crops.py: SERIAL MODE: subjects are scanned serially
INFO:generate_crops.py: list_subjects[:5] = ['896778', '585862', '727654', '156536', '210617']
INFO:generate_crops.py: SERIAL MODE: subjects are scanned serially


Done
