<font size = 6>*GMOdetector notebook* </font><br>
**Template to analyze a new batch of images** (v.0.7.2)

In this workflow, images taken with the macroPhor Array dual RGB/hyperspectral imaging platform are analyzed in three stages: regression quantifies fluorescent signals in hyperspectral images, deep learning segments RGB images into different tissues, and the two datasets are cross-referenced to produce statistics on the growth of transgenic callus and shoot.

# Experiment ID and description

<div class="alert alert-block alert-success">
Provide a short description of the experiment in the box below. This should include unique identifier codes for the experiment, a short description of the genotypes and treatments studied, and the timepoint. </div>

# Parameters for analysis

<div class="alert alert-block alert-success">
The variables below must be set appropriately every time this workflow is run over new images.
</div>

## Data location
The `data` variable below provides the **complete** path to the folder containing the data to be analyzed. This should include all folders and subfolders by which the data of interest is organized. For the organizational system used for our lab's data, the path should follow the format "/Experiment/Subexperiment/Timepoint/".

In [None]:
data="ENTER_DATA_PATH"

## Sample information
Every experiment has a spreadsheet of metadata to organize treatment and genotype information for each plate, prepare labels, and randomize plates. [For details, see tutorial on preparing this spreadsheet](https://github.com/naglemi/GMOnotebook/blob/master/1_Decide_parameters/1_Metadata_and_randomization/1-Generate_randomization_scheme.ipynb).

In [None]:
randomization_datasheet="ENTER_RANDOMIZATION_DATASHEET_PATH"

In [None]:
grid=ENTER_GRID # 12 or 20

## Detection of missing or contaminated explants

Set the `missing_explants` variable to `"Automatic"` or to the path of a manually prepared data file. [For details, see this tutorial and example file](https://github.com/naglemi/GMOnotebook/tree/master/1_Decide_parameters/3_Other_parameters).

Note: Our automatic missing explant detection model is only trained for poplar.

In [None]:
missing_explants="ENTER_DENSENET_OPTION_OR_SHEET"

## Segmentation settings and models

In [None]:
segmentation_mode="ENTER_SEGMENTATION_MODE"

The three settings below need to be changed only if you wish to use a different model for hyperspectral segmentation.

In [None]:
segmentation_model_key="ENTER_HYP-SEGMENTATION_MODEL_KEY"
segmentation_model_path="ENTER_HYP-SEGMENTATION_MODEL_PATH"
segmentation_model_type="ENTER_HYP-SEGMENTATION_MODEL_TYPE"

This is a list of classes that should not be included in the "All regenerated tissues" statistics.

In [None]:
unregenerated_tissues="Background Stem Necrotic Explant"

## Computing weights for fluorescent proteins

[See this notebook for details on all below fluorescent protein settings.](https://github.com/naglemi/GMOnotebook/blob/master/1_Decide_parameters/3_Other_parameters/3_Hyperspectral_settings.ipynb)

In [None]:
# ALL known fluorescent components in the sample should be included.
# Library has DsRed, ZsYellow, GFP, Chl, ChlA, ChlB, Noise
fluorophores=ENTER_FLUOROPHORES # Order doesn't matter here. Names must match library.
desired_wavelength_range=ENTER_WAVELENGTHS # (first last), e.g. (500 900)
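The placeholders above are filled with bash arrays, written as space-separated values in parentheses. The sketch below uses hypothetical values and `demo_`-prefixed names (assumptions for illustration, chosen so that running it does not overwrite the real settings) to show how array elements are counted and indexed, as later steps do with `${desired_wavelength_range[0]}`.

```shell
# Hypothetical values for illustration only; real names must match the spectral library.
demo_fluorophores=(GFP Chl Noise)
demo_wavelength_range=(500 900)

n_components=${#demo_fluorophores[@]}        # number of array elements
min_wavelength=${demo_wavelength_range[0]}   # first element
max_wavelength=${demo_wavelength_range[1]}   # second element

echo "Components: $n_components (${demo_fluorophores[*]})"
echo "Wavelength range: $min_wavelength to $max_wavelength"
```

Note that a bare `$demo_fluorophores` expands to the first element only, so the braces and `[@]` or `[*]` are needed to reach the full list.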

### Producing false-color plots for fluorescent proteins

In [None]:
FalseColor_channels=ENTER_CHANNELS # (Red Green Blue), e.g. (Chl GFP Noise)
FalseColor_caps=ENTER_CAPS # (Red Green Blue); recommend 400 for reporters, 200 for others; e.g. (200 400 200)

### Producing summary statistics for fluorescent proteins

In [None]:
reporters=ENTER_REPORTERS # Will compute summary stats for these proteins, e.g. (GFP) or (GFP Chl)
pixel_threshold=ENTER_PIXEL_THRESHOLD # If this many pixels... (recommended: 3)
reporter_threshold=ENTER_REPORTER_THRESHOLD # ...have this much signal (recommended: 38), then the tissue is "Positive"
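The two thresholds work together: a tissue is called positive when enough pixels each carry enough reporter signal. The sketch below illustrates that rule on a hypothetical list of per-pixel signal values. The function name `is_positive` and the use of "at least" (`>=`) for both comparisons are assumptions made for illustration; the actual classification is performed downstream by GMOlabeler.

```shell
# Usage: is_positive PIXEL_THRESHOLD REPORTER_THRESHOLD VALUE...
# Succeeds (exit status 0) when at least PIXEL_THRESHOLD of the values
# are at or above REPORTER_THRESHOLD. Hypothetical helper, not part of GMOlabeler.
is_positive() {
    local pixel_threshold=$1 reporter_threshold=$2
    shift 2
    # Count values at or above the reporter threshold, then compare the count.
    printf '%s\n' "$@" | awk -v n="$pixel_threshold" -v t="$reporter_threshold" \
        '$1 >= t { count++ } END { exit !(count >= n) }'
}

# Three of these five hypothetical pixel values reach 38, so the tissue is positive.
if is_positive 3 38 10 40 50 39 5; then
    echo "Positive"
else
    echo "Negative"
fi
```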

## Plot settings

In [None]:
composite=ENTER_COMPOSITE_OPTION # 1 to make composite images with side-by-side RGB, segmentation outputs and blended images (slow), 0 to skip
width=ENTER_PLOT_WIDTH # GGplot box/violin plot output (inches)
height=ENTER_PLOT_HEIGHT # GGplot box/violin plot output (inches)

## Parallelization

In [None]:
parallel=ENTER_PARALLEL_OPTION # 1 if parallelizing CubeGLM with GNU Parallel, 0 if not

## Paths to workflow modules

These only need to be modified if you are setting up a `GMOnotebook` template in a new environment.

In [None]:
gmodetector_wd="/home/cubeglm/"
spectral_library_path="${gmodetector_wd}spectral_library/"
deeplab_path="/mnt/models/rgb/poplar_model_2_w_contam/"
densenet_model_path="/mnt/models/densenet_contamination/model_finetune.h5"
cubeml_path="/home/cubeml/"
alignment_path="/home/ImageAlignment/"
gmolabeler_path="/home/GMOlabeler/"
contamination_path="/home/DenseNet"
data_prefix="/mnt/output/"
output_directory_prefix="${data_prefix}gmodetector_out/"

In [None]:
cwd="/home/GMOnotebook"

<div class="alert alert-block alert-info">
With all above variables set, please "Save as..." with a filename referencing this specific dataset. <br>Finally, deploy the workflow (Step 4 in above instructions).
</div>

# Check if inputs are OK

This script will print warnings for any common problems that are detected with input variables we set above.

In [None]:
echo ${fluorophores[@]}

In [None]:
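echo "${#fluorophores[@]} fluorophores set"  # braces are required; $fluorophores[@] would print only the first element followed by a literal [@]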

In [None]:
IFS=','  # Setting the Internal Field Separator to ',' for array joining

echo "opt\$data <- \"$data\""
echo "opt\$randomization_datasheet <- \"$randomization_datasheet\""
echo "opt\$segmentation_mode <- \"$segmentation_mode\""
echo "opt\$unregenerated_tissues <- \"$unregenerated_tissues\""
echo "opt\$grid <- \"$grid\""
echo "opt\$missing_explants <- \"$missing_explants\""
echo "opt\$fluorophores <- \"${fluorophores[*]}\""  # Joining array elements with IFS
echo "opt\$desired_wavelength_range <- \"${desired_wavelength_range[*]}\""  # Joining array elements with IFS
echo "opt\$FalseColor_channels <- \"${FalseColor_channels[*]}\""  # Joining array elements with IFS
echo "opt\$FalseColor_caps <- \"${FalseColor_caps[*]}\""  # Joining array elements with IFS
echo "opt\$reporters <- \"${reporters[*]}\""  # Joining array elements with IFS
echo "opt\$pixel_threshold <- \"$pixel_threshold\""
echo "opt\$reporter_threshold <- \"$reporter_threshold\""
echo "opt\$segmentation_model_key <- \"$segmentation_model_key\""
echo "opt\$segmentation_model_path <- \"$segmentation_model_path\""
echo "opt\$gmodetector_wd <- \"$gmodetector_wd\""
echo "opt\$spectral_library_path <- \"$spectral_library_path\""
echo "opt\$deeplab_path <- \"$deeplab_path\""
echo "opt\$cubeml_path <- \"$cubeml_path\""
echo "opt\$alignment_path <- \"$alignment_path\""
echo "opt\$gmolabeler_path <- \"$gmolabeler_path\""
echo "opt\$contamination_path <- \"$contamination_path\""
echo "opt\$data_prefix <- \"$data_prefix\""
echo "opt\$output_directory_prefix <- \"$output_directory_prefix\""
echo "opt\$cwd <- \"$cwd\""

unset IFS  # Resetting IFS back to default


In [None]:
Rscript ${cwd}/intermediates/are_inputs_ok.R \
  --data "$data" \
  --randomization_datasheet "$randomization_datasheet" \
  --segmentation_mode "$segmentation_mode" \
  --unregenerated_tissues "$unregenerated_tissues" \
  --grid "$grid" \
  --missing_explants "$missing_explants" \
  --fluorophores "$(IFS=,; echo "${fluorophores[*]}")" \
  --desired_wavelength_range "$(IFS=,; echo "${desired_wavelength_range[*]}")" \
  --FalseColor_channels "$(IFS=,; echo "${FalseColor_channels[*]}")" \
  --FalseColor_caps "$(IFS=,; echo "${FalseColor_caps[*]}")" \
  --reporters "$(IFS=,; echo "${reporters[*]}")" \
  --pixel_threshold "$pixel_threshold" \
  --reporter_threshold "$reporter_threshold" \
  --segmentation_model_key "$segmentation_model_key" \
  --segmentation_model_path "$segmentation_model_path" \
  --gmodetector_wd "$gmodetector_wd" \
  --spectral_library_path "$spectral_library_path" \
  --deeplab_path "$deeplab_path" \
  --cubeml_path "$cubeml_path" \
  --alignment_path "$alignment_path" \
  --gmolabeler_path "$gmolabeler_path" \
  --contamination_path "$contamination_path" \
  --data_prefix "$data_prefix" \
  --output_directory_prefix "$output_directory_prefix" \
  --cwd "$cwd"
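The `$(IFS=,; echo "${array[*]}")` idiom used above joins array elements with commas inside a command substitution, so the change to `IFS` stays in that subshell. A minimal sketch with a hypothetical array (`demo_reporters` is an illustrative name):

```shell
demo_reporters=(GFP Chl)   # hypothetical values

# "${array[*]}" joins elements with the first character of IFS.
joined="$(IFS=,; echo "${demo_reporters[*]}")"
echo "$joined"

# IFS in the calling shell is untouched, so the default join still uses a space.
default_join="${demo_reporters[*]}"
echo "$default_join"
```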


# Automated workflow to be deployed

See the below code for a walkthrough of how GMOnotebook works, or view the outputs after running the workflow for help troubleshooting errors in specific steps of analysis.

<div class="alert alert-block alert-danger"> <b>Danger:</b> Do not modify any below code without creating a new version of the template notebook. During routine usage, this workflow should be customized only by modifying variables above, while leaving the below code unmodified. </div>

These internal variables are set automatically.

In [None]:
datestamp=$(date +"%Y-%m-%d")
data_folder=$(echo $data | cut -d/ -f5-)
timepoint="$(basename -- $data_folder)"
output_directory_full="$output_directory_prefix$data_folder"
dataset_name=$(echo $data_folder | sed -e 's/\///g')
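To see what these derived variables look like, the sketch below repeats the same commands on a hypothetical input path, using `demo_`-prefixed names so the real variables are not overwritten. Because `cut -d/ -f5-` drops the first four `/`-delimited fields (the empty field before the leading slash plus three directories), the result depends on how deep the data folder sits.

```shell
# Hypothetical paths for illustration only.
demo_data="/mnt/output/data/EXP1/SubA/wk3/"
demo_prefix="/mnt/output/gmodetector_out/"

demo_data_folder=$(echo "$demo_data" | cut -d/ -f5-)         # keep field 5 onward
demo_timepoint="$(basename -- "$demo_data_folder")"          # last path component
demo_output_full="$demo_prefix$demo_data_folder"             # prefix + relative folder
demo_dataset_name=$(echo "$demo_data_folder" | sed -e 's/\///g')  # strip all slashes

echo "data_folder:           $demo_data_folder"
echo "timepoint:             $demo_timepoint"
echo "output_directory_full: $demo_output_full"
echo "dataset_name:          $demo_dataset_name"
```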

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    unset segmentation_model_key
    unset segmentation_model_path
    unset segmentation_model_type
fi

Time analysis begins:

In [None]:
echo $(date)

## Quantification of fluorescent proteins by regression

The Python package `CubeGLM` is used to quantify fluorescent proteins in each pixel of hyperspectral images via linear regression. Hyperspectral images are regressed over spectra of known components, and pixelwise maps of test statistics are constructed for each component in the sample. This approach to quantifying components of hyperspectral images is described in depth in the Methods section of <a href="https://link.springer.com/article/10.1007/s40789-019-0252-7" target="_blank">Böhme, et al. 2019</a>. Code and documentation for `CubeGLM` are on <a href="https://github.com/naglemi/GMOdetector_py" target="_blank">Github</a>.
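As a sketch of the per-pixel regression, assume ordinary least squares with no further constraints; the exact estimator and test statistic used by `CubeGLM` are not specified in this notebook, and all symbols here are introduced for illustration. Let $\mathbf{y} \in \mathbb{R}^{B}$ be the spectrum of one pixel over $B$ wavelength bands, and let the $K$ columns of $X \in \mathbb{R}^{B \times K}$ hold the library spectra of the known components. Under these assumptions, the weight and test statistic for component $k$ would be:

```latex
\begin{aligned}
\mathbf{y} &= X\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \\
\hat{\boldsymbol{\beta}} &= (X^\top X)^{-1} X^\top \mathbf{y}, \\
\hat{\sigma}^2 &= \frac{\lVert \mathbf{y} - X\hat{\boldsymbol{\beta}} \rVert^2}{B - K}, \\
t_k &= \frac{\hat{\beta}_k}{\sqrt{\hat{\sigma}^2 \,\bigl[(X^\top X)^{-1}\bigr]_{kk}}}
\end{aligned}
```

Repeating this for every pixel gives one map of $t_k$ values per component.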

In [None]:
cd $data_prefix

In [None]:
job_list_name="$dataset_name.jobs"

In [None]:
rm -rf $job_list_name

In [None]:
for file in $data/*.hdr
do
 if [[ "$file" != *'roadband'* ]]; then
  echo "python -W ignore ${gmodetector_wd}/wrappers/analyze_sample.py \
--file_path $file \
--fluorophores ${fluorophores[*]} \
--min_desired_wavelength ${desired_wavelength_range[0]} \
--max_desired_wavelength ${desired_wavelength_range[1]} \
--red_channel ${FalseColor_channels[0]} \
--green_channel ${FalseColor_channels[1]} \
--blue_channel ${FalseColor_channels[2]} \
--red_cap ${FalseColor_caps[0]} \
--green_cap ${FalseColor_caps[1]} \
--blue_cap ${FalseColor_caps[2]} \
--plot 1 \
--spectral_library_path "$spectral_library_path" \
--output_dir $output_directory_full \
--threshold 38" >> $job_list_name
 fi
done
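The loop above writes one command per `.hdr` file to a job list, skipping any file whose name contains `roadband`. The sketch below reproduces that pattern in a temporary directory with hypothetical filenames and harmless `echo` commands standing in for the real `analyze_sample.py` calls.

```shell
# Hypothetical files in a scratch directory; nothing here touches real data.
demo_dir=$(mktemp -d)
touch "$demo_dir/plate1.hdr" "$demo_dir/plate2.hdr" "$demo_dir/plate1_Broadband.hdr"
demo_jobs="$demo_dir/demo.jobs"

for file in "$demo_dir"/*.hdr; do
    # Same filter as above: skip files with "roadband" in the name.
    if [[ "$file" != *'roadband'* ]]; then
        echo "echo analyzing $file" >> "$demo_jobs"
    fi
done

n_jobs=$(grep -c '' "$demo_jobs")   # count lines in the job list
echo "$n_jobs jobs queued"
bash "$demo_jobs"                   # run the jobs sequentially
rm -rf "$demo_dir"
```

Each line of the job list is an independent command, which is what allows the same file to be run either sequentially with `bash` or concurrently with GNU Parallel.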

In [None]:
if [ $parallel -eq 1 ]
then
    parallel --jobs 20 -a $job_list_name
fi

if [ $parallel -eq 0 ]
then
    bash $job_list_name
fi

Time regression completes:

In [None]:
echo $(date)

## Detection of grid edges

Determine the borders of the grid in the hyperspectral image, which we will use later for alignment and/or cropping to explants.

In [None]:
# Function to process orientation and extract grid borders
process_orientation() {
    local mode=$1
    local rotation=$2
    local flip_horizontal=$3
    local label=$4
    local half_scale=$5

    # Navigate to the intermediates directory and run the Python script
    cd "${cwd}/intermediates/"
    command="python3.8 find_grid_position.py --mode $mode --cap 500 --index 130 --data $data --plot --rotation $rotation"
    #command="python3.8 ${cwd}/intermediates/find_grid_position.py --mode $mode --cap 500 --index 130 --data $data --plot --rotation $rotation"
    if [ "$flip_horizontal" = true ]; then
        command="$command --flip_horizontal"
    fi
    eval $command

    # Navigate to the data directory where coordinates.csv is saved
    cd "$data"
    cat coordinates.csv

    # Read the CSV file and extract x1, x2, y1, y2
    firstline=1
    while IFS=',' read -r mode x1 x2 y1 y2; do
        if [ "$firstline" -eq "0" ]; then
            if [ "$half_scale" = true ]; then
                # Divide by 2 and round to integer
                x1=$(printf "%.0f" $(echo "$x1 / 2" | bc -l))
                x2=$(printf "%.0f" $(echo "$x2 / 2" | bc -l))
                y1=$(printf "%.0f" $(echo "$y1 / 2" | bc -l))
                y2=$(printf "%.0f" $(echo "$y2 / 2" | bc -l))
            fi
            eval "$label=\"$x1,$y1,$x2,$y2\""
        fi
        firstline=0
    done < coordinates.csv
}

# Process original orientation
process_orientation "hyperspectral" 0 false "aligned_grid_borders_original" false

# Process label bottom orientation
process_orientation "hyperspectral" 270 true "aligned_grid_borders_label_bottom" false

# Process for DenseNet model orientation
process_orientation "RGB" 270 false "pre_aligned_resized_grid_borders_densenet" true

# Process for hyperspectral layer matching orientation
process_orientation "RGB" 180 true "pre_aligned_grid_borders_hyp_oriented" false


# Print results
echo "Hyperspectral, original orientation:      $aligned_grid_borders_original"
echo "Hyperspectral, transf. for labels on bot: $aligned_grid_borders_label_bottom"
echo "RGB, rotated and scaled for Densenet:     $pre_aligned_resized_grid_borders_densenet"
echo "RGB, oriented for alignment with hyp:     $pre_aligned_grid_borders_hyp_oriented"
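The CSV-reading loop inside `process_orientation` can be tried in isolation. The sketch below writes a hypothetical `coordinates.csv` (the header and values are assumptions; the column order follows the `read` call above), skips the header row, halves each coordinate, and stores the result as `x1,y1,x2,y2`. Shell integer arithmetic is swapped in for the `bc` and `printf` rounding used above, so odd values would be truncated rather than rounded.

```shell
demo_dir=$(mktemp -d)
# Hypothetical file contents for illustration only.
printf 'mode,x1,x2,y1,y2\nRGB,200,1800,300,1700\n' > "$demo_dir/coordinates.csv"

firstline=1
while IFS=',' read -r mode x1 x2 y1 y2; do
    if [ "$firstline" -eq 0 ]; then
        # Halve each coordinate with integer arithmetic (exact for even values).
        demo_borders="$((x1 / 2)),$((y1 / 2)),$((x2 / 2)),$((y2 / 2))"
    fi
    firstline=0   # the first row is the header, so it is skipped
done < "$demo_dir/coordinates.csv"

echo "$demo_borders"
rm -rf "$demo_dir"
```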

## Semantic segmentation of tissues

Images are segmented into specific plant tissues by a deep neural network of the state-of-the-art Deeplab v3 architecture (<a href="https://arxiv.org/abs/1706.05587" target="_blank">Chen et al., 2017</a>). The model has been trained using training sets generated with our annotation GUI Intelligent DEep Annotator for Segmentation (IDEAS, available on <a href="https://bitbucket.org/JialinYuan/image-annotator/src/master/" target="_blank">Bitbucket</a>, publication pending). Our branch of the Deeplab v3 repo, including a Jupyter walkthrough for training, can be found on Github.

Training is completed upstream of this notebook, which only entails analysis of test data using the latest model.

<img src="Figures/downsized/segmentation_composite2.png">

Figure: This example image was taken from an experiment on the effects of different CIMs on cottonwood regeneration. This composite image illustrates that for every sample, tissues are segmented into stem (red), callus (blue) and shoot (green). These composite images, useful for manual inspection of results, are produced when the 'composite' option is on.

### Pre-processing

#### Normalize orientation

We want all images to be in the same orientation. At one point, the camera on the *macroPhor Array* was set to detect orientation automatically, which left images in portrait or landscape unpredictably. Here we standardize the orientation.

In [None]:
for filename in $data/*.jpg; do
    exiftool -Orientation=8 -n "$filename" >> "${data}/log_exiftool.txt"  # append, so the log keeps output for every image
    done

In [None]:
rm -f $data/*original*

#### Crop and resize

This script resizes images to 900x900 and then crops away top and bottom 150 pixels for a final image size of 900x600.

The purpose of cropping is to remove labels, which has been standard practice for all training and testing. Otherwise, we could run into problems such as the neural network "learning" that plants labeled as control have more or less regeneration.<br>The purpose of resizing is to reduce computational expense.

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    cd ${cwd}/intermediates/
    python crop.py $data
fi

#### Prepare input list

The script `inference.py` requires a list of all files to be analyzed. We will create this file as `test.csv`. This will be a list of all our (pre-processed) image files.

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    cd $data
    ls -d $PWD/* $data | grep -i "rgb_cropped.jpg" > test.csv
    sed -i '/hroma/d' "${data}/test.csv"
fi

### Inference

The trained model is deployed to perform semantic segmentation of experimental images. A list of RGB images to be segmented by the trained model is passed through the `--image_lists` option. For each of these images, we obtain an output mask (.png) of labeled tissues.

Dependencies include `opencv`, `scipy`, `yaml` and `tensorflow` (version 1.14).

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    cd $deeplab_path
    export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim  # set after cd so that pwd refers to the Deeplab directory
    python3.7 -W ignore deeplab/inference.py \
        --image_lists "${data}/test.csv" \
        --crop_size 900 --crop_size 600 \
        --seg_results segmentation_results \
        --model_dir "${deeplab_path}/deeplab/model/" \
        >> $data/log_inference.txt
    mv "${deeplab_path}/segmentation_results/raw/"* $data/
fi

In [None]:
if [ "$segmentation_mode" = "hyperspectral" ]; then
    cd $cubeml_path
    #cd /mnt/cubeml/
    python scripts/batch_inference.py \
    --dir $data \
    --pickle $segmentation_model_path \
    --method $segmentation_model_type \
    --false_color \
    >> $data/log_inference.txt
fi

### Post-processing

Name outputs to reflect that they are segmentation results

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    cd $data
    for file in *_rgb_cropped.png; do mv -f "$file" "${file%_rgb_cropped.png}_segment_cropped.png"; done
fi
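The rename above relies on the `${variable%suffix}` parameter expansion, which removes a matching suffix from the end of a value. A minimal sketch with a hypothetical filename:

```shell
demo_file="plate1_rgb_cropped.png"   # hypothetical filename

# Strip the old suffix, then append the new one.
demo_renamed="${demo_file%_rgb_cropped.png}_segment_cropped.png"
echo "$demo_file -> $demo_renamed"
```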

Re-expand segment outputs to same size as original RGB files

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    cd $alignment_path
    python expand.py $data >> $data/log_expand.txt
fi

Make composite images with side-by-side RGB, segmentation outputs and blended images

In [None]:
if [ $composite -eq 1 ]
then
    echo "making composites"
    cd $gmolabeler_path
    python image_blender.py $data 0.75 'both' 1 0
fi

## Classification of contaminated/missing explants

Plates are cropped into sub-images for each explant, and each sub-image is analyzed to determine whether the explant position should be excluded from analysis because the explant is missing or contaminated. Missing and contaminated explants are recognized using a trained Densenet model (<a href="https://arxiv.org/abs/1608.06993" target="_blank">Huang, et al. 2018</a>). Our fork of the Densenet repository is available on <a href="https://github.com/Contamination-Classification/DenseNet" target="_blank">GitHub</a>.

<img src="Figures/Densenet.png">
Figure: These are four examples of contaminated explants used in the training set for this pre-trained model

To check the grid cropping dimensions, we can run the following script. Note that these are the dimensions to crop the image to after resizing to 2000x2000 (from 4000x4000 in the case of the *macroPhor Array*).

### Prepare list of images

In [None]:
if [ $missing_explants = "Automatic" ]; then
   echo "Missing explants will be inferred."
   cd $data
   ls -d $PWD/* $data | grep -i "rgb.jpg" > rgb_list.txt
   sed -i '/hroma/d' rgb_list.txt
   img_list_path="${data}/rgb_list.txt"
else
   echo "Missing explants input manually by user, in file: "
   echo $missing_explants
fi

If the mode for missing explant data is automatic, prepare input file for script to detect missing explants and run this script.

### Infer contaminated/missing explants

In [None]:
if [ $missing_explants = "Automatic" ]; then
    cd $data
    python3.7 -W ignore ${contamination_path}/inference.py \
    --img-list=$img_list_path \
    --crop_dims $pre_aligned_resized_grid_borders_densenet \
    --weights_path $densenet_model_path \
    --output_file="${data}/output.csv" >> $data/log_contam.txt
fi

In [None]:
if [ $missing_explants = "Automatic" ]; then
    missing_explants="${data}/output.csv"
    echo "Missing explants inferred by model and written to file:"
    echo $missing_explants
else
    echo "Missing explants input manually by user, in file: "
    echo $missing_explants
fi

## Alignment of RGB and hyperspectral layers

To match the frame and angle of RGB and hyperspectral image layers, we perform a homography transformation using a method described [in these notebooks](https://github.com/naglemi/GMOnotebook/tree/master/1_Decide_parameters/2_Align_and_crop_parameters/2_find_alignment_parameters). Using a pair of standard images, a homography matrix is calculated for the transformation needed to align RGB images with hyperspectral images. The transformation can then be applied rapidly to large batches of images, as long as the RGB and hyperspectral cameras remain in the same positions relative to one another (as they do in the macroPhor Array platform).

<img src="Figures/Alignment.png">
Figure: To enable precise calculation of a homography matrix for transformation of RGB images to match hyperspectral images, we used images of a piece of paper with grid marks.
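For reference, a homography maps a pixel at $(x, y)$ in the RGB image to a pixel in the hyperspectral frame through a $3 \times 3$ matrix $H$ acting on homogeneous coordinates. The symbols below are introduced for illustration and do not come from the alignment scripts.

```latex
\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}
= H \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad H \in \mathbb{R}^{3 \times 3},
\qquad
(x_{\mathrm{hyp}},\, y_{\mathrm{hyp}}) = \left( \frac{x'}{w'},\ \frac{y'}{w'} \right)
```

Because $H$ is defined only up to scale, it has eight free parameters, so at least four point correspondences between the two images (such as grid corners) are needed to estimate it.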

In [None]:
echo "Original Borders: $aligned_grid_borders_original"
echo "Label Bottom Borders: $aligned_grid_borders_label_bottom"
echo "DenseNet Borders: $pre_aligned_resized_grid_borders_densenet"
echo "Hyperspectral Oriented Borders: $pre_aligned_grid_borders_hyp_oriented"

In [None]:
# Assuming that the borders are provided as four integers each, in the order: left, top, right, bottom
pre_aligned_borders=(${pre_aligned_grid_borders_hyp_oriented//,/ })
aligned_borders=(${aligned_grid_borders_original//,/ })

# Call the new script with these borders and other necessary arguments
#if [ "$segmentation_mode" = "rgb" ]; then
python3.8 ${cwd}/intermediates/batch_align_auto.py \
--left_source ${pre_aligned_borders[0]} \
--top_source ${pre_aligned_borders[1]} \
--right_source ${pre_aligned_borders[2]} \
--bottom_source ${pre_aligned_borders[3]} \
--left_target ${aligned_borders[0]} \
--top_target ${aligned_borders[1]} \
--right_target ${aligned_borders[2]} \
--bottom_target ${aligned_borders[3]} \
--img_dir $data \
--hyp_path $output_directory_full \
--channel_index 130 \
--overlay_dir "/mnt/output/debug/"
#fi
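The border strings are turned into arrays with the `${variable//,/ }` substitution, which replaces every comma with a space before word splitting. A minimal sketch with hypothetical values, assuming the default `IFS`:

```shell
demo_borders_csv="120,80,1880,1920"   # hypothetical left,top,right,bottom

# Replace commas with spaces, then let word splitting build the array.
demo_borders=(${demo_borders_csv//,/ })

echo "left=${demo_borders[0]} top=${demo_borders[1]} right=${demo_borders[2]} bottom=${demo_borders[3]}"
```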

In [None]:
if [ "$segmentation_mode" = "rgb" ]; then
    hypercube_jpg=$(echo $hypercube_csv | sed -e 's/\.csv/.jpg/g')
fi

## Cross-analyze deep segmentation and regression results

Scripts in the <a href="https://github.com/naglemi/GMOlabeler" target="_blank">GMOlabeler repository</a> are used to cross-reference results from deep segmentation of RGB images and regression of hyperspectral imaging, apply thresholding parameters to classify tissues as transgenic or escapes, and produce plots.

<img src="Figures/GMOlabeler.png">

Figure: The various steps of data processing in GMOlabeler are illustrated for an example explant from an experiment on CIM optimization for cottonwood. Images of plates are cropped to a sub-image for each explant. RGB segmentation results and hyperspectral regression results are cross-referenced to calculate fluorescent proteins in specific tissues and infer whether these tissues are transgenic.

### Prepare sample datasheet input

Prepare input file we will use for making plots. This file contains paths to CLS results, RGB images, and hyperspectral images.

In [None]:
echo $datestamp

In [None]:
cd "${cwd}/intermediates/"

# Check if segmentation_mode is set to "hyperspectral"
if [ "$segmentation_mode" = "hyperspectral" ]; then
  Rscript pre_label.R \
  -r "${data}/" \
  -R "${output_directory_prefix}" \
  -i 1 \
  -d $datestamp \
  --segmentation_model_key $segmentation_model_key # Only included if segmentation_mode is hyperspectral
else
  Rscript pre_label.R \
  -r "${data}/" \
  -R "${output_directory_prefix}" \
  -i 1 \
  -d $datestamp
fi

In [None]:
# Check if $aligned_grid is unset or the file doesn't exist
if [[ -z "$aligned_grid" || ! -f "$aligned_grid" ]]; then
  echo "No grid found at a path given by user, searching for one..."

  # Find files, extract filenames and timestamps, sort by timestamp, and get the filename with the second-to-last timestamp
  #  in Strauss Lab, this will always represent the grid with bold numbers and lines, useful for visualizing grid cropping
  aligned_grid=$(find "$data" -maxdepth 1 -type f -name "*hroma*rgb_processed.png*" ! -name "*csv*" \
                 | while read -r file; do
                     timestamp=$(echo "$file" | grep -o '[0-9]\{6\}')
                     echo "$timestamp $file"
                   done \
                 | sort -k1,1nr \
                 | head -n 2 \
                 | tail -n 1 \
                 | cut -d' ' -f2-)

  # Check if a file was found
  if [[ -z "$aligned_grid" ]]; then
    echo "No suitable grid file found."
  else
    # Output the found path
    echo "Grid found: $aligned_grid"
  fi
fi
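The selection pipeline above can be checked on a fixed list of names. In the sketch below, the three paths are hypothetical and each contains exactly one six-digit timestamp; the pipeline prefixes each path with its timestamp, sorts in descending numeric order, and keeps the second entry.

```shell
# Hypothetical grid image paths for illustration only.
demo_candidates="/data/Chroma_101500_rgb_processed.png
/data/Chroma_101900_rgb_processed.png
/data/Chroma_101730_rgb_processed.png"

second_newest=$(printf '%s\n' "$demo_candidates" \
    | while read -r file; do
        timestamp=$(echo "$file" | grep -o '[0-9]\{6\}')
        echo "$timestamp $file"
      done \
    | sort -k1,1nr \
    | head -n 2 \
    | tail -n 1 \
    | cut -d' ' -f2-)

echo "$second_newest"
```

A path with more than one six-digit run would yield several timestamp lines from `grep -o`, so this approach assumes filenames follow a single-timestamp convention.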


In [None]:
echo $aligned_grid_borders_label_bottom

In [None]:
echo ${data}/samples_pre_labeling.csv
echo $aligned_grid
echo $reporter_threshold
echo "Chl"
echo $grid
echo "hdf"
echo ${output_directory_prefix}/gmolabeler_logic_outputs/

In [None]:
aligned_grid_borders_label_bottom_formatted=$(echo $aligned_grid_borders_original | awk -F',' '{print $2, $4, $3, $1}')

In [None]:
echo $aligned_grid_borders_label_bottom_formatted
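The `awk` call reorders the four comma-separated border values and joins them with spaces. Since the borders are stored as `x1,y1,x2,y2`, printing fields 2, 4, 3 and 1 yields `y1 y2 x2 x1`. A minimal sketch with hypothetical values:

```shell
demo_borders="120,80,1880,1920"   # hypothetical x1,y1,x2,y2

# Split on commas and print fields in the order 2, 4, 3, 1, space-separated.
demo_formatted=$(echo "$demo_borders" | awk -F',' '{print $2, $4, $3, $1}')
echo "$demo_formatted"
```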

In [None]:
aligned_grid_borders_label_bottom_formatted=$(echo $aligned_grid_borders_original | awk -F',' '{print $2, $4, $3, $1}')
cd $gmolabeler_path
for reporter in ${reporters[@]}; do
    # Start the command with the basic arguments
    cmd="python main.py \
    \"${data}/samples_pre_labeling.csv\" \
    $aligned_grid \
    $reporter_threshold \
    $reporter \
    $grid \
    \"hdf\" \
    \"${output_directory_prefix}/gmolabeler_logic_outputs/\""

    # Append the grid borders to the command
    if [[ -n "$aligned_grid_borders_label_bottom_formatted" ]]; then
        cmd+=" \"$aligned_grid_borders_label_bottom_formatted\""
    fi

    # Check if segmentation_model_key is set and points to a file
    if [[ -n $segmentation_model_key && -f $segmentation_model_key ]]; then
        # Append the segmentation model key to the command as a named argument
        cmd+=" --segmentation_model_key \"$segmentation_model_key\""
    fi

    # Run the command and redirect stdout to the log file
    eval $cmd > "$data/log_gmolabeler_$reporter.txt"

done

### Calculate sums of statistics over combined segments

We are interested in all regenerated tissue (callus + shoot) as well as all tissue (including stem as well). We will calculate aggregate statistics over these groups.

In [None]:
echo $segmentation_model_key

In [None]:
cd $gmolabeler_path
for reporter in "${reporters[@]}"; do
    # Start the command with the base part
    cmd="Rscript calculate_sum_stats_over_combined_segments.R \
    --output_dir \"${output_directory_prefix}/gmolabeler_logic_outputs/\" \
    --datapath \"${data_folder}/${reporter}/\""

    # Append the model key path if it's set and not None
    if [ -n "${segmentation_model_key}" ] && [ "${segmentation_model_key}" != "None" ]; then
        cmd+=" --keypath \"${segmentation_model_key}\""
    fi

    # Append the exclude tissues string if it's set and not None
    if [ -n "${unregenerated_tissues}" ] && [ "${unregenerated_tissues}" != "None" ]; then
        echo $unregenerated_tissues
        cmd+=" --exclude_tissues \"${unregenerated_tissues}\""
    fi
    
    echo $cmd

    # Execute the command
    eval $cmd
done

### Make plots of results

In [None]:
cd $gmolabeler_path
for reporter in ${reporters[@]}; do
    # Start the command with the base part
    cmd="Rscript grid_item_plots.R \
    -d \"${data_folder}/\" \
    -r \"$randomization_datasheet\" \
    -p $pixel_threshold \
    -v categorical \
    -m 1 \
    -M $missing_explants \
    -g $grid \
    --samples-pre-labeling ${data}/samples_pre_labeling.csv \
    --sort 1 \
    --height $height \
    --width $width \
    --Reporter $reporter \
    --outdir \"${output_directory_prefix}\""

    # Append the model key path if it's set and not None
    if [ -n "$segmentation_model_key" ] && [ "$segmentation_model_key" != "None" ]; then
        cmd+=" --keypath \"$segmentation_model_key\""
    fi

    echo $cmd

    # Execute the command
    eval $cmd
done

In [None]:
echo -e "Complete \u2705"