# *GMOnotebook*
#### **Notebook template for applying hyperspectral/segmentation cross-analysis phenomics workflow over new datasets**

###### Notebook template v0.1.60 (Aug 18, 2020)
###### New feature: Automated detection and removal of contaminated explants (now using DNN both for contaminated and missing explants, accelerated relative to linear classifier) – from v0.1.51
###### In v0.1.52, I cleaned up bash commands in accordance with cleanup of GMOdetectoR repository...
###### In v0.1.60, Use new version of contamination model and use a SINGLE environment for everything, for easy installation...

In [1]:
conda activate gmonotebook

(gmonotebook) 

: 1

<img src="WorkflowFlowchart.png">

In this workflow, images taken with the macroPhor Array dual RGB/hyperspectral imaging platform are analyzed in three stages: regression quantifies fluorescent signals in hyperspectral images, deep learning segments RGB images into different tissues, and the two datasets are cross-referenced to produce statistics on the growth of transgenic callus and shoot.

#### Instructions for use
1.  Enter information for the experiment below
2. Set <font color=blue>variables</font> for data paths and parameters, as instructed by colored boxes.
3. "Save as" with filename describing experiment and anything special about this analysis (e.g. T18_OD_TAO_wk7_automation_test_attempt2.ipynb)
4. Run notebook from console, using the below command with the notebook filename inserted<br>
```jupyter nbconvert --to HTML --ExecutePreprocessor.timeout=-1 --allow-errors --execute insert_filename_here```
5. Wait for email

#### Experiment ID and quick description:

<div class="alert alert-block alert-success">
Provide a short description of the experiment in the below box. This should include unique identifier codes for the experiment, along with a short description of genotypes and treatments studied. </div>

#### Parameters for analysis:

<div class="alert alert-block alert-success">
The below variables must be modified appropriately every time this workflow is run over new images.
</div>

##### Data location
The path below points to the dataset. It should include every folder and subfolder under which the data of interest are organized. For the organizational system used for our lab's data, it should follow the format "/Experiment/Subexperiment/Timepoint".

In [2]:
data_folder="/T16_DEV_genes/EA/wk7"

(gmonotebook) 

: 1

##### Sample information
Every experiment has a randomization datasheet which was used to organize treatment and genotype information for each plate, prepare labels, and randomize plates. This workflow requires this datasheet in order to know which plates have which genotype/treatment. At a later date, we will integrate an ability to read this data directly from labels.

In [3]:
randomization_datasheet="/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EA_randomized.xlsx"

(gmonotebook) 

: 1

##### Exclusion of missing/contaminated explants from analysis

Set the variable below to "Automatic" to use the model that automatically detects missing and contaminated explants. Note that this model is only supported for plates with 12 explants. Otherwise, provide the path to an appropriately formatted .csv spreadsheet (see example) of manually scored contamination / missing explant data.

In [4]:
missing_explants="Automatic"

(gmonotebook) 

: 1

Enter the email address to which results will be sent.

In [5]:
email=michael.nagle@oregonstate.edu

(gmonotebook) 

: 1

<div class="alert alert-block alert-warning">
The below variables should be modified only as needed to indicate the fluorescent proteins in samples and the grid layout of explants. </div>

Select the appropriate fluorophore list depending on reporter protein(s) being used

In [6]:
#fluorophores=GFP_in_poplar_fluorophore_list.txt
fluorophores=PHPCTR_in_poplar_fluorophore_list.txt

(gmonotebook) (gmonotebook) 

: 1

In [7]:
#reporter=GFP
reporter=DsRed

(gmonotebook) (gmonotebook) 

: 1

The parameters for reporter signal threshold and pixel threshold must be provided by the user. Our brute-force analyses yielded several candidate values, which are noted below.

In [8]:
reporter_threshold=8.76
#reporter_threshold=3.97

(gmonotebook) (gmonotebook) 

: 1

In [9]:
pixel_threshold=5

(gmonotebook) 

: 1

In [10]:
#grid=20
grid=12

(gmonotebook) (gmonotebook) 

: 1

This variable is for the number of grid positions on a plate. This workflow currently supports 12 and 20.

In [11]:
#grid_file="/scratch2/NSF_GWAS/macroPhor_Array/grids/grid20_post_processed.png"
grid_file="/scratch2/NSF_GWAS/macroPhor_Array/grids/grids_left_facing_125208_1_0_1_rgb_processed.jpg"

(gmonotebook) (gmonotebook) 

: 1

<div class="alert alert-block alert-danger">
The below variables do not need to be modified during any routine use of the workflow.
</div>

In [12]:
Ncores=25

(gmonotebook) 

: 1

This option should be either "continuous" or "categorical" and indicates the type of treatment (e.g. continuous variable for different levels of a chemical, or categorical for different gene vectors).

In [13]:
variable_type="categorical"

(gmonotebook) 

: 1

For the following three variables, use 0 (False) or 1 (True)

If these options are true, plots are produced for every plate. This is helpful when trying new fluorophores or a new type of dataset, but is not needed for every analysis. Producing either or both types of plot will make the analysis take several times longer.

In [14]:
residualplots=0

(gmonotebook) 

: 1

In [15]:
regressionplots=0

(gmonotebook) 

: 1

In [16]:
composite=1

(gmonotebook) 

: 1

The option below should always be 0 (False) in this workflow, since we run regression over entire plates, later integrate these data with deep segmentation results, and only then divide plates by grid position and tissue type at the same time. <br>If the option is 1 (True), plates are instead divided prior to regression and explant-level regression outputs are written at the end of regression.

In [17]:
explantstatistics=0

(gmonotebook) 

: 1

Set dimensions for plot outputs

In [18]:
#width=15
width=6
height=5

(gmonotebook) (gmonotebook) (gmonotebook) 

: 1

<div class="alert alert-block alert-info">
With all above variables set, please "Save as..." with a filename referencing this specific dataset. <br>Finally, execute the workflow by running the <code>jupyter nbconvert</code> command from step 4 of the instructions above in a console, replacing insert_filename_here with the filename of the saved notebook.
</div>

# Automated workflow to be deployed

See the below code for a walkthrough of how GMOnotebook works, or view the outputs after running the workflow for help troubleshooting errors in specific steps of analysis.

<div class="alert alert-block alert-danger"> <b>Danger:</b> Do not modify any below code without creating a new version of the template notebook. Interact with this workflow by modifying variables above while leaving the below code unmodified. </div>

##### Time analysis begins:

In [19]:
echo $(date)

Tue Aug 18 21:12:40 PDT 2020
(gmonotebook) 

: 1

These internal variables are set automatically.

In [20]:
datestamp=$(date +"%Y-%m-%d")

(gmonotebook) 

: 1

In [21]:
timepoint="$(basename -- "$data_folder")"

(gmonotebook) 

: 1

In [22]:
data="/scratch2/NSF_GWAS/macroPhor_Array${data_folder}"

(gmonotebook) 

: 1
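
As a quick check of how these internal variables resolve, the sketch below reproduces the two derivations for the example `data_folder` set earlier. The `base_dir` name is introduced here only for illustration; the notebook itself hard-codes that prefix.

```shell
#!/usr/bin/env bash
# Illustrative only: reproduce how the timepoint and full data path are derived.
data_folder="/T16_DEV_genes/EA/wk7"
base_dir="/scratch2/NSF_GWAS/macroPhor_Array"   # hypothetical variable name

# The last path component is treated as the timepoint.
timepoint="$(basename -- "$data_folder")"

# The full data path is the imaging root followed by the relative folder.
data="${base_dir}${data_folder}"

echo "timepoint: $timepoint"
echo "data:      $data"
```

Because `timepoint` is taken from the final path component, `data_folder` should be given without a trailing subfolder that is not a timepoint.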

In [24]:
TMPDIR="/scratch2/NSF_GWAS/Rtmp/"

(gmonotebook) 

: 1

## 1. Quantification of fluorescent proteins by regression

The R library GMOdetectoR is used to quantify fluorescent proteins in each pixel of hyperspectral images via linear regression. Hyperspectral images are regressed over spectra of known components, and pixelwise maps of test statistics are constructed for each component in the sample. This approach to quantifying components of hyperspectral images is described in depth in the Methods section of <a href="https://link.springer.com/article/10.1007/s40789-019-0252-7" target="_blank">Böhme, et al. 2019</a>. Code and documentation for GMOdetectoR are on <a href="https://github.com/naglemi/GMOdetectoR" target="_blank">GitHub</a>.
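
In standard ordinary least squares notation (used here as an assumption; the exact estimator and test statistic implemented in GMOdetectoR may differ), the per-pixel model can be written as:

$$
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad
\hat{\boldsymbol{\beta}} = (\mathbf{X}^\top \mathbf{X})^{-1}\mathbf{X}^\top \mathbf{y}, \qquad
t_j = \frac{\hat{\beta}_j}{\operatorname{se}(\hat{\beta}_j)}
$$

Here $\mathbf{y}$ is the measured spectrum of one pixel, the columns of $\mathbf{X}$ are the known component spectra (plus an intercept if one is fitted), $\hat{\beta}_j$ is the estimated contribution of component $j$, and $t_j$ is the kind of per-component statistic that would be mapped pixel by pixel.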

In [27]:
cd /scratch2/NSF_GWAS/GMOdetectoR/
Rscript wrappers/GMOdetectoR_wrapper.R \
-F $data \
-g $grid \
-m $Ncores \
-r $residualplots \
-p $regressionplots \
-e $explantstatistics \
-f $fluorophores

(gmonotebook) Error in library(GMOdetectoR) : there is no package called ‘GMOdetectoR’
Execution halted
(gmonotebook) 

: 1

##### Time regression completes:

In [24]:
echo $(date)

Tue Jul 21 01:26:09 PDT 2020


## 2. Neural networks to segment tissues, classify missing/contaminated explants

### 2.1. Semantic segmentation of tissues

Images are segmented into specific plant tissues by a deep neural network with the state-of-the-art DeepLab v3 architecture (<a href="https://arxiv.org/abs/1706.05587" target="_blank">Chen et al., 2017</a>). The model has been trained using training sets generated with our annotation GUI, Intelligent DEep Annotator for Segmentation (IDEAS, available on <a href="https://bitbucket.org/JialinYuan/image-annotator/src/master/" target="_blank">Bitbucket</a>, publication pending). Our branch of the DeepLab v3 repo, including a Jupyter walkthrough for training, can be found on GitHub.

Training is completed upstream of this notebook, which only entails analysis of test data using the latest model.

<img src="Figures/downsized/segmentation_composite2.png">

Figure: This example image was taken from an experiment on the effects of different CIMs on cottonwood regeneration. This composite image illustrates that for every sample, tissues are segmented into stem (red), callus (blue) and shoot (green). These composite images, useful for manual inspection of results, are produced when the 'composite' option is on.

#### 2.1.1. Pre-processing
##### 2.1.1.1. Crop to remove labels, and resize images to 900x600

This script resizes images to 900x900 and then crops away top and bottom 150 pixels for a final image size of 900x600.
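
The arithmetic behind this step can be sketched as follows. This is not the contents of `crop.py`, which is not shown in this notebook; it assumes ImageMagick's `convert` is available and uses hypothetical file names, and the conversion only runs if both conditions hold.

```shell
#!/usr/bin/env bash
# Sketch only: crop.py's real implementation is not shown in this notebook.
# Assumes ImageMagick ("convert") and hypothetical input/output file names.
target=900        # side length after the initial resize
margin=150        # rows removed from the top and from the bottom

final_height=$(( target - 2 * margin ))
geometry="${target}x${final_height}+0+${margin}"
echo "Crop geometry: $geometry"

input="example_rgb.jpg"              # hypothetical file name
output="example_rgb_cropped.png"     # hypothetical file name
if command -v convert >/dev/null 2>&1 && [ -f "$input" ]; then
    # "!" forces exactly 900x900 regardless of aspect ratio;
    # +repage resets the canvas offset left behind by -crop.
    convert "$input" -resize "${target}x${target}!" -crop "$geometry" +repage "$output"
fi
```

The geometry string reads as width x height, then the x and y offsets of the kept region, so a y offset of 150 discards the top 150 rows and the 600-pixel height discards the bottom 150.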

In [25]:
cd /scratch2/NSF_GWAS/GMOdetectoR/

In [26]:
conda activate base
python crop.py $data

(base) (base) 

: 1

##### 2.1.1.2. Prepare test.csv file for inference

Make a list of all our image files

In [27]:
cd $data
ls -d $PWD/* $data | grep -i "rgb_cropped" > test.csv

(base) (base) 

: 1

Remove the chroma standard from list of RGB image data to be segmented

In [28]:
sed -i '/hroma/d' "${data}/test.csv"

(base) 

: 1
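
The two list-building steps above can be tried safely in a throwaway directory. The sketch below uses made-up file names and a `find`/`grep -v` pipeline in place of `ls` and `sed -i`; it is meant only to show which files end up in `test.csv`.

```shell
#!/usr/bin/env bash
# Sketch only: builds the image list in a temporary directory with made-up file names.
demo_dir="$(mktemp -d)"
touch "$demo_dir/plateA_rgb_cropped.jpg" \
      "$demo_dir/plateB_rgb_cropped.jpg" \
      "$demo_dir/Chroma_standard_rgb_cropped.jpg" \
      "$demo_dir/plateA_rgb.jpg"

# Keep only cropped RGB images (one absolute path per line) and drop the
# chroma standard; matching "hroma" covers both "Chroma" and "chroma".
find "$demo_dir" -maxdepth 1 -type f -iname '*rgb_cropped*' \
    | grep -v 'hroma' \
    | sort > "$demo_dir/test.csv"

n_images=$(wc -l < "$demo_dir/test.csv")
echo "Images queued for segmentation: $n_images"
cat "$demo_dir/test.csv"
```

Only the two cropped plate images remain in the list; the uncropped image and the chroma standard are excluded.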

#### 2.1.2. Inference

The trained model is deployed to perform semantic segmentation of experimental images. A list of RGB images to be segmented by the trained model is passed through the --image-list option. For each of these images, we obtain an output mask (.png) of labeled tissues.

In [29]:
cd /scratch2/NSF_GWAS/deeplab/
conda activate deeplab
python /scratch2/NSF_GWAS/deeplab/inference.py --image-list "${data}/test.csv"




2020-07-21 01:27:13.275318: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-07-21 01:27:13.342012: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-07-21 01:27:13.342164: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (steed.eecs.oregonstate.edu): /proc/driver/nvidia/version does not exist
2020-07-21 01:27:13.343390: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-21 01:27:13.507405: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1800080000 Hz
2020-07-21 01:27:13.526236: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5584dec1d5e0 executing computations on platform Host. Devices:
2020-07-21 01:27:13.526318: I ten

: 1

#### 2.1.3. Post-processing

Name outputs to reflect that they are segmentation results

In [30]:
cd $data
for file in *_rgb_cropped.png; do mv -f "$file" "${file%_rgb_cropped.png}_segment_cropped.png"; done

(deeplab) (deeplab) 

: 1
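
The renaming loop relies on bash suffix removal. The short sketch below shows what the expansion produces for a single file name of the kind seen in this dataset, without touching any files.

```shell
#!/usr/bin/env bash
# Illustrative only: show what the suffix-stripping expansion produces for one name.
file="EC1_I5.0_F1.9_L100_142252_2_0_6_rgb_cropped.png"

# ${file%pattern} removes the shortest match of pattern from the end of the value.
stem="${file%_rgb_cropped.png}"
renamed="${stem}_segment_cropped.png"

echo "$file -> $renamed"
```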

Re-expand segment outputs to same size as original RGB files

In [31]:
#cd /scratch2/NSF_GWAS/deeplab/
conda activate alignment
cd /scratch2/NSF_GWAS/ImageAlignment/
python expand.py $data

(deeplab) (alignment) (alignment) Working in directory/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31
Number of files: 5
Reading file EC1_I5.0_F1.9_L100_142252_2_0_6_segment_cropped.png
Writing file EC1_I5.0_F1.9_L100_142252_2_0_6_segment_uncropped.png
Reading file EC1_I5.0_F1.9_L100_143328_0_2_6_segment_cropped.png
Writing file EC1_I5.0_F1.9_L100_143328_0_2_6_segment_uncropped.png
Reading file EC1_I5.0_F1.9_L100_142348_3_1_2_segment_cropped.png
Writing file EC1_I5.0_F1.9_L100_142348_3_1_2_segment_uncropped.png
Reading file EC1_I5.0_F1.9_L100_142158_1_0_4_segment_cropped.png
Writing file EC1_I5.0_F1.9_L100_142158_1_0_4_segment_uncropped.png
Reading file EC1_I5.0_F1.9_L100_142501_4_1_0_segment_cropped.png
Writing file EC1_I5.0_F1.9_L100_142501_4_1_0_segment_uncropped.png
(alignment) 

: 1

Make composite images with side-by-side RGB, segmentation outputs and blended images

In [34]:
if [ $composite -eq 1 ]
then
    cd /scratch2/NSF_GWAS/GMOlabeler/
    python image_blender.py $data 0.75 'both' 1 180
fi

['image_blender.py', '/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31', '0.75', 'both', '1', '180']
Building composite for image EC1_I5.0_F1.9_L100_142252_2_0_6_rgb.jpg
Building composite for image EC1_I5.0_F1.9_L100_142348_3_1_2_rgb.jpg
Building composite for image EC1_I5.0_F1.9_L100_142501_4_1_0_rgb.jpg
Building composite for image EC1_I5.0_F1.9_L100_143328_0_2_6_rgb.jpg
(alignment) 

: 1

### 2.2. Classification of contaminated/missing explants

Plates are cropped into sub-images for each explant, and each sub-image is analyzed to determine whether the explant position should be excluded from analysis because the explant is missing or contaminated. Missing and contaminated explants are recognized using a trained DenseNet model (<a href="https://arxiv.org/abs/1608.06993" target="_blank">Huang, et al. 2018</a>). Our fork of the DenseNet repository is available on <a href="https://github.com/Contamination-Classification/DenseNet" target="_blank">GitHub</a>.

<img src="Figures/Densenet.png">
Figure: These are four examples of contaminated explants used in the training set for this pre-trained model

If the mode for missing explant data is automatic, prepare input file for script to detect missing explants and run this script.

In [35]:
img_list_path="${data}/rgb_list.txt"

(alignment) 

: 1

In [36]:
"${data}/output_a2.csv"

bash: /scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31/output_a2.csv: No such file or directory
(alignment) 

: 1

In [37]:
if [ "$missing_explants" = "Automatic" ]; then
    conda activate densenet
    echo "Missing explants will be inferred."
    cd $data
    ls -d $PWD/* $data | grep -i "rgb.jpg" > rgb_list.txt
    sed -i '/hroma/d' rgb_list.txt
    #sed -i 's/.jpg//g' rgb_list.txt
    #cat rgb_list.txt
    cd /scratch2/NSF_GWAS/Contamination
    img_list_path="${data}/rgb_list.txt"
    #echo $img_list_path
    python inference.py --img-list=$img_list_path --output_file=output.csv
    mv output.csv "${data}/output_a2.csv"
    missing_explants="${data}/output_a2.csv"
    conda deactivate
else
    echo "Missing explants input manually by user, in file: "
    echo $missing_explants
fi

Missing explants input manually by user, in file: 
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC_ETA_EL_EN_callus_missing_scores.csv
(alignment) 

: 1

Started running at 1:42

## 3. Alignment of RGB and hyperspectral layers

To match the frame and angle of RGB and hyperspectral image layers, we apply an alignment based on the scale-invariant feature transform (SIFT; <a href="https://github.com/NSF-Image-alignment/ImageAlignment" target="_blank">GitHub</a>). Using a pair of standard images, a homography matrix is calculated for the necessary transformation of RGB images to align with hyperspectral images. The transformation can then be applied to large batches of images rapidly, as long as the RGB and hyperspectral cameras remain in the same positions relative to one another (as they do in the macroPhor Array platform).

<img src="Figures/Alignment.png">
Figure: To enable precise calculation of a homography matrix for transformation of RGB images to match hyperspectral images, we used images of a piece of paper with grid marks. These images are provided by the user inputs to --hyper-img and --rgb-img in the below call to the alignment script. If using a phenotyping platform other than the macroPhor Array, or using updated camera settings, these variables will need to be replaced.
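
As general background (not a description of the ImageAlignment code itself), a homography is a 3×3 matrix $\mathbf{H}$ that maps each RGB pixel coordinate $(x, y)$ to a hyperspectral coordinate in homogeneous form:

$$
\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}
= \mathbf{H} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
(x_{\text{hyper}},\, y_{\text{hyper}}) = \left( \frac{x'}{w'},\, \frac{y'}{w'} \right)
$$

Because $\mathbf{H}$ depends only on the relative geometry of the two cameras, it can be estimated once from the grid images and reused for every plate imaged with the same setup.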

##### 3.1. Prepare input file for alignment

In [39]:
cd $data
ls | grep -i 'rgb\.jpg' > file_list_part1.csv
ls | grep -i 'segment_uncropped\.png' > file_list_part2.csv
cat file_list_part* > file_list.csv
sed -i '/hroma/d' file_list.csv
cwd=$(pwd)/
awk -v prefix="$cwd" '{print prefix $0}' file_list.csv > temp
mv -f temp file_list.csv
echo 'rgb_images' | cat - file_list.csv > temp && mv -f temp file_list.csv

(alignment) (alignment) (alignment) (alignment) (alignment) (alignment) (alignment) (alignment) (alignment) 

: 1

##### 3.2 Run alignment

In [40]:
conda activate alignment
cd /scratch2/NSF_GWAS/ImageAlignment/
file_list_input="${data}/file_list.csv"
python main.py \
--hyper-img Grids_to_align-selected/20itemgrid_F1.9_I3.0_L50_cyan_114229_0_0_0_index130.csv \
--rgb-img Grids_to_align-selected/20itemgrid_F1.9_I3.0_L50_cyan_114229_0_0_0_rgb.jpg \
--img-csv $file_list_input \
--mode 2

(alignment) (alignment) (alignment) [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107
 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161
 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179
 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197
 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215
 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233
 234 235 236 23

: 1

In [41]:
conda deactivate
conda deactivate

(deeplab) (base) 

: 1

## 4. Cross-analyze deep segmentation and regression results

Scripts in the <a href="https://github.com/naglemi/GMOlabeler" target="_blank">GMOlabeler repository</a> are used to cross-reference results from deep segmentation of RGB images and regression of hyperspectral imaging, apply thresholding parameters to classify tissues as transgenic or escapes, and produce plots.

<img src="Figures/GMOlabeler.png">

Figure: The various steps of data processing in GMOlabeler are illustrated for an example explant from an experiment on CIM optimization for cottonwood. Images of plates are cropped to a sub-image for each explant. RGB segmentation results and hyperspectral regression results are cross-referenced to quantify fluorescent proteins in specific tissues and infer whether these tissues are transgenic.
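
One way to picture how the two thresholds interact is sketched below with made-up per-pixel values for a single tissue. The comparison rules (a pixel passes if its statistic is greater than `reporter_threshold`; a tissue is called transgenic if at least `pixel_threshold` pixels pass) are assumptions for illustration, and GMOlabeler's actual logic may differ.

```shell
#!/usr/bin/env bash
# Sketch only: toy per-pixel reporter statistics for one tissue of one explant.
# The comparison rules (">" for signal, ">=" for pixel count) are assumptions.
reporter_threshold=8.76
pixel_threshold=5
pixel_signals="1.2 9.1 10.4 8.8 3.3 12.0 9.9 0.4"   # made-up values

# Count pixels whose statistic exceeds the reporter threshold.
# $pixel_signals is left unquoted on purpose so each value becomes its own line.
n_passing=$(printf '%s\n' $pixel_signals |
    awk -v thr="$reporter_threshold" '$1 > thr { n++ } END { print n + 0 }')

# Call the tissue transgenic only if enough pixels pass; otherwise treat it as an escape.
if [ "$n_passing" -ge "$pixel_threshold" ]; then
    call="transgenic"
else
    call="escape"
fi
echo "pixels passing: $n_passing -> $call"
```

Raising `reporter_threshold` makes each pixel harder to pass, while raising `pixel_threshold` demands a larger passing area, so the two parameters trade off sensitivity in different ways.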

##### 4.1. Prepare sample datasheet input

Prepare input file we will use for making plots. This file contains paths to CLS results, RGB images, and hyperspectral images.

In [43]:
conda deactivate
cd /scratch2/NSF_GWAS/GMOdetectoR/
Rscript wrappers/pre_label_a2.R \
-r "${data}/" \
-R "/scratch2/NSF_GWAS/GMOdetectoR/output/" \
-i 1 \
-d $datestamp

[1] "”2020-07-21”"
[1] "2020-07-21"
[1] "Looking for CLS files in directory /scratch2/NSF_GWAS/GMOdetectoR/output/T16_DEV_genes/EC/wk31/intercept1/2020-07-21/CLS_tables"
[1] "How many CLS files? 5"
[1] "Writing 5 rows to /scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//samples_pre_labeling.csv"


##### 4.2. Cross-reference RGB and hyperspectral data

In [45]:
cat "${data}/samples_pre_labeling.csv"

segment,CLS_data,rgb,mean_callus_signal,mean_shoot_signal,callus_signal_total,shoot_signal_total,n_pixels_callus_transgenic,n_pixels_callus_escape,n_pixels_shoot_transgenic,n_pixels_shoot_escape,threshold
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_143328_0_2_6_segment_uncropped_processed.png,/scratch2/NSF_GWAS/GMOdetectoR/output/T16_DEV_genes/EC/wk31/intercept1/2020-07-21/CLS_tables/EC1_I5.0_F1.9_L100_143328_0_2_6_GridItemWholePlate.csv,/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_143328_0_2_6_rgb_processed.jpg,,,,,,,,,
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_142501_4_1_0_segment_uncropped_processed.png,/scratch2/NSF_GWAS/GMOdetectoR/output/T16_DEV_genes/EC/wk31/intercept1/2020-07-21/CLS_tables/EC1_I5.0_F1.9_L100_142501_4_1_0_GridItemWholePlate.csv,/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_142501_4_1_0_rgb_processed.jpg,,,,,,,,,
/scratch2/NSF_GWAS/ma

: 1

In [46]:
echo $grid

12
(base) 

: 1

In [49]:
identify $grid_file

/scratch2/NSF_GWAS/macroPhor_Array/grids/grids_left_facing_125208_1_0_1_rgb_processed.jpg JPEG 1419x1566 1419x1566+0+0 8-bit sRGB 346442B 0.000u 0:00.000
(base) 

: 1

In [50]:
conda activate
cd /scratch2/NSF_GWAS/GMOlabeler/
python main.py \
"${data}/samples_pre_labeling.csv" \
$grid_file \
$reporter_threshold \
$reporter \
$grid

(base) (base) Grid type 12
Loading plate 0 of 5
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_143328_0_2_6_segment_uncropped_processed.png
Plate segment loaded
Plate RGB loaded
Loading CLS data from path/scratch2/NSF_GWAS/GMOdetectoR/output/T16_DEV_genes/EC/wk31/intercept1/2020-07-21/CLS_tables/EC1_I5.0_F1.9_L100_143328_0_2_6_GridItemWholePlate.csv
Loading plate 1 of 5
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_142501_4_1_0_segment_uncropped_processed.png
Plate segment loaded
Plate RGB loaded
Loading CLS data from path/scratch2/NSF_GWAS/GMOdetectoR/output/T16_DEV_genes/EC/wk31/intercept1/2020-07-21/CLS_tables/EC1_I5.0_F1.9_L100_142501_4_1_0_GridItemWholePlate.csv
Loading plate 2 of 5
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_142348_3_1_2_segment_uncropped_processed.png
Plate segment loaded
Plate RGB loaded
Loading CLS data from path/scratch2/NSF_GWAS/GMOdetectoR/output/T16_DEV_genes/EC/w

: 1

##### 4.3. Calculate sums of statistics over combined segments

We are interested in all regenerated tissue (callus + shoot) as well as all tissue (including stem as well). We will calculate aggregate statistics over these groups.
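
The grouping can be illustrated with a toy table for one explant. The column names `total_signal` and `n_pixels_passing_threshold` are borrowed from the plotting log further below, but the three-row layout and the values are hypothetical, and the R script's real input format is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: toy per-tissue totals for a single explant; the layout is hypothetical.
stats="$(mktemp)"
cat > "$stats" <<'EOF'
tissue,total_signal,n_pixels_passing_threshold
stem,10.5,2
callus,120.0,30
shoot,45.5,12
EOF

# Regenerated tissue = callus + shoot; all tissue additionally includes stem.
summary=$(awk -F, 'NR > 1 {
        all_sig += $2; all_px += $3
        if ($1 == "callus" || $1 == "shoot") { regen_sig += $2; regen_px += $3 }
    }
    END {
        printf "regenerated,%.1f,%d\n", regen_sig, regen_px
        printf "all,%.1f,%d\n", all_sig, all_px
    }' "$stats")
echo "$summary"
```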

In [51]:
conda deactivate
Rscript calculate_sum_stats_over_combined_segments.R \
--datapath "${data_folder}/"

[1] "Writing output with sums statistics calculated over combined tissue segments to: /scratch2/NSF_GWAS/GMOlabeler/output//T16_DEV_genes/EC/wk31/stats_with_sums_over_tissues.csv"


##### 4.4. Make plots of results

In [59]:
echo "${data_folder}/"
echo $randomization_datasheet
echo $pixel_threshold
echo $variable_type
echo 1
echo $missing_explants
echo $grid
echo 1
echo $height
echo $width

/T16_DEV_genes/EC/wk31/
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC_randomized.xlsx
5
categorical
1
/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC_ETA_EL_EN_callus_missing_scores.csv
12
1
5
6


In [60]:
cd /scratch2/NSF_GWAS/GMOlabeler/
Rscript grid_item_plots_a10.R \
-d "${data_folder}/" \
-r $randomization_datasheet \
-p $pixel_threshold \
-v $variable_type \
-m 1 \
-M $missing_explants \
-g $grid \
--sort 1 \
--height $height \
--width $width

1: replacing previous import ‘data.table::melt’ by ‘reshape2::melt’ when loading ‘GMOdetectoR’ 
2: replacing previous import ‘data.table::dcast’ by ‘reshape2::dcast’ when loading ‘GMOdetectoR’ 
In storage.mode(default) <- type : NAs introduced by coercion
[1] "Reading in output from GMOlabeler at path: /scratch2/NSF_GWAS/GMOlabeler/output//T16_DEV_genes/EC/wk31/stats_with_sums_over_tissues.csv"

[1] "Rows in output from GMOlabeler: 301"

[1] "Max n_pixels_passing_threshold in output from GMOlabeler: 798"

[1] "Max total_signal in output from GMOlabeler: 199420.551575988"

[1] "Look at the top of output from GMOlabeler"
[1] ""                                                                                             
[2] "/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_143328_0_2_6.png"
[3] "/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F1.9_L100_143328_0_2_6.png"
[4] "/scratch2/NSF_GWAS/macroPhor_Array/T16_DEV_genes/EC/wk31//EC1_I5.0_F

: 1

## 5. Email plots to user

##### 5.1. ZIP results

In [None]:
cd "/scratch2/NSF_GWAS/GMOlabeler/plots${data_folder}"

In [None]:
rm -f ./plants_over_plates.csv

In [None]:
cp "/scratch2/NSF_GWAS/GMOlabeler/output/${data_folder}/plants_over_plates.csv" ./

In [None]:
rm -f Rplots.pdf

In [None]:
cd ../

This messy substitution is explained here: https://superuser.com/questions/1068031/replace-backslash-with-forward-slash-in-a-variable-in-bash

In [None]:
data_folder_Compress="${data_folder////-}.zip"
data_folder_Compress=${data_folder_Compress#?};
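
To make the substitution easier to follow, the sketch below steps through both expansions using the example `data_folder` from the top of the notebook. The intermediate names `step1` and `step2` are for illustration only, and the escaped form `\/` is used as an equivalent, more readable spelling of the slash pattern.

```shell
#!/usr/bin/env bash
# Illustrative only: step through the two expansions with the example folder.
data_folder="/T16_DEV_genes/EA/wk7"

# ${var//pattern/replacement} replaces every match; the pattern here is an
# escaped "/", so each slash becomes a hyphen.
step1="${data_folder//\//-}.zip"

# ${var#?} removes the first character, which is the hyphen left by the leading slash.
step2="${step1#?}"

echo "$step1"
echo "$step2"
```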

In [None]:
ls

In [None]:
cd $timepoint
zip -r $data_folder_Compress ./*

##### 5.2. Write email

In [None]:
duration=$(( SECONDS - start ))

https://unix.stackexchange.com/questions/53841/how-to-use-a-timer-in-bash

In [None]:
rm -f /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt
cp /scratch2/NSF_GWAS/GMOlabeler/email_to_send_template.txt /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt

In [None]:
echo "" >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt
echo "Number of samples run: " >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt

In [None]:
cat "${data}/test.csv" | wc -l >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt
echo "" >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt

In [None]:
# SECONDS is a bash builtin counting seconds since the shell started.
# Use >= so that runs of exactly 3600 s or 60 s are reported in the right unit.
if (( SECONDS >= 3600 )) ; then
    let "hours=SECONDS/3600"
    let "minutes=(SECONDS%3600)/60"
    let "seconds=(SECONDS%3600)%60"
    echo "Completed in $hours hour(s), $minutes minute(s) and $seconds second(s)" >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt
elif (( SECONDS >= 60 )) ; then
    let "minutes=(SECONDS%3600)/60"
    let "seconds=(SECONDS%3600)%60"
    echo "Completed in $minutes minute(s) and $seconds second(s)" >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt
else
    echo "Completed in $SECONDS seconds" >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt
fi
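
The same hour/minute/second split can be written as a small function that is easy to test on fixed inputs. `format_duration` is a hypothetical helper introduced for this sketch and is not part of GMOlabeler; it takes the elapsed seconds as an argument instead of reading bash's `SECONDS` builtin.

```shell
#!/usr/bin/env bash
# Sketch only: "format_duration" is a hypothetical helper, not part of GMOlabeler.
format_duration() {
    local total=$1
    local hours=$(( total / 3600 ))
    local minutes=$(( (total % 3600) / 60 ))
    local seconds=$(( total % 60 ))
    if (( hours > 0 )); then
        echo "Completed in $hours hour(s), $minutes minute(s) and $seconds second(s)"
    elif (( minutes > 0 )); then
        echo "Completed in $minutes minute(s) and $seconds second(s)"
    else
        echo "Completed in $seconds second(s)"
    fi
}

format_duration 3725
format_duration 3600
format_duration 59
```

Computing all three parts once from a single value avoids the elapsed time changing between successive reads of `SECONDS`.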

In [None]:
echo "" >> /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt

##### 5.3. Send email with results to user

In [None]:
pwd

In [None]:
mail -a "$data_folder_Compress" -s "$data_folder" "$email" < /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt

In [None]:
du -sh $data_folder_Compress

In [None]:
cat /scratch2/NSF_GWAS/GMOlabeler/email_to_send.txt

##### Time analysis ends

In [None]:
echo $(date)