In [None]:
%autosave 5

# RGB Plot Algorithm

## Authors

* Author1 = {"name": "Chris Schnaufer", "affiliation": "Cyverse/University of Arizona Data Scientist", "email": "schnaufer@arizona.edu", "orcid": "0000-0002-6150-4558"}
* Author2 = {"name": "Jacob van der Leeuw", "affiliation": "Cyverse Intern", "email": "jvanderleeuw@email.arizona.edu", "orcid": "0000-0003-0892-9837"}


## Purpose

Develop and test an algorithm to process RGB images that correspond to experiment plots.

This notebook is intended to help researchers integrate their algorithms into Drone Processing Pipeline workflows through an easy-to-use interface.

## Technical contributions

* Based on the [template-rgb-plot](https://github.com/AgPipeline/template-rgb-plot) template repository of the University of Arizona [AgPipeline](https://github.com/AgPipeline) project

## Methodology

It is assumed that:

* You have an algorithm that processes RGB data to produce one or more measurements

* You have a working knowledge of Python

* The Python environment is version 3.5 or newer

* You are familiar with [GitHub template repositories](https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-template-repository), or know how to use [git](https://git-scm.com)

* For testing purposes, a folder named *sample_plot_images*, containing sample plot images, is located in the same directory as this Jupyter notebook. An archive of sample plot images can be [downloaded from CyVerse](https://de.cyverse.org/dl/d/4108BB75-AAA3-48E1-BBD4-E10B06CADF54/sample_plot_images.zip)

The notebook walks through the following steps:
1. Clone the GitHub template to your organization or git repository
2. Declare parameters describing the algorithm
3. Write the algorithm that returns the calculated values
4. Export the algorithm to a Python file which can be tested and added to a repository
5. Test the algorithm to confirm it works as expected

Steps for building and testing a Docker image containing the algorithm follow.

## Results

This notebook enables users to develop their own RGB image-processing algorithms in a structured way, validate them with a testing script, and add them to a repository. Optional steps are included for building a Docker image of the algorithm.


## Funding

* Award1 = {"agency": "USDA National Institute of Food and Agriculture, Hatch General Administration of Federal-Grant Fund Research 30152", "award_code":"30152"}


## Keywords

keywords=["plot-level", "rgb", "docker"]


## Citations


## Acknowledgements

This project is funded by [CyVerse](https://cyverse.org) and in turn the National Science Foundation Grant Nos. DBI-0735191, DBI-1265383, and DBI-1743442

template-rgb-plot is licensed under a [BSD 3-Clause License](https://opensource.org/licenses/BSD-3-Clause)

# Setup

This notebook expects to be run from a local copy of the [template-rgb-plot](https://github.com/AgPipeline/template-rgb-plot) repository. The folder does not need to be under source control, although that is recommended.

The following Python modules are used and are expected to be available (listed in alphabetic order):
- datetime
- importlib
- json
- numpy
- os
- pathlib
- PIL
- re
- textwrap

## Top-level Docstring

Change this to a meaningful docstring for your algorithm (replace 'My nifty plot-level RGB algorithm')

## Python module imports for running the algorithm

Please add any additional import statements that will be needed for your algorithm in the block below

## Parameter definitions

In this section we define information that gets used to identify and describe the algorithm, and declare what calculated values will be returned

#### Define the version number of your algorithm. Consider using [Semantic Versioning](https://semver.org/)

If you are updating the algorithm, be sure to change the version number to a larger value
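
Comparing version strings as plain text can mislead, because `'1.10.0'` sorts before `'1.9.0'` alphabetically. The following is a minimal sketch for checking that a new version really is larger, assuming plain dotted numbers with no pre-release suffix. `parse_version` and `is_newer` are hypothetical helpers and are not part of the template.

```python
def parse_version(version: str) -> tuple:
    """Split a dotted version string such as '1.10.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_newer(new: str, old: str) -> bool:
    """Return True when `new` is a strictly larger version than `old`."""
    return parse_version(new) > parse_version(old)


print(is_newer("1.10.0", "1.9.0"))  # True: compared number by number
print("1.10.0" > "1.9.0")           # False: compared character by character
```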

#### Provide information on the creator and contributors of this algorithm

This provides a way for the author of the algorithm to be contacted and for proper attribution

#### Name and describe your algorithm

Give the algorithm a meaningful name and provide a short description of what it does. The algorithm name shouldn't contain any spaces or tabs

*Example:* \
ALGORITHM_NAME = 'pixel-counter' \
ALGORITHM_DESCRIPTION = 'Counts the number of pixels in an image'
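
A quick way to catch a name that contains whitespace is sketched below. `is_valid_algorithm_name` is a hypothetical helper written for this illustration; the template itself may validate names differently or not at all.

```python
import re


def is_valid_algorithm_name(name: str) -> bool:
    """Return True for a non-empty name that contains no whitespace."""
    return bool(name) and re.search(r"\s", name) is None


print(is_valid_algorithm_name("pixel-counter"))  # True
print(is_valid_algorithm_name("pixel counter"))  # False
```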

#### Provide citation information for algorithm publication. This includes the citation author, the citation title, and the citation year

This allows the algorithm to be cited correctly. Multiple citations are allowed

#### Include the names of variables

Add the names of the variables returned by the algorithm, separated by commas. Variable names cannot contain commas; use a different separator within a name instead. All whitespace is kept intact, so don't add extra whitespace, since it may cause name comparisons to fail. These variable names are used as part of the formatted results.

Replace and/or remove any names that aren't part of your algorithm.

*Example:* VARIABLE_NAMES = 'size of image channels'

#### Include the units and labels of the variables

Entries must appear in the same order as the names in VARIABLE_NAMES. Multiple entries are also separated by commas. A VARIABLE_UNITS entry must be made for each name in VARIABLE_NAMES, even if it's empty. VARIABLE_LABELS is an optional field and can be left empty.

*Example:* VARIABLE_UNITS = 'pixels'
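
The one-unit-per-name rule can be checked before exporting the algorithm. The sketch below is only an illustration: `check_variable_definitions` is a hypothetical helper, and the multi-variable names in the usage lines are made up.

```python
def split_entries(value: str) -> list:
    """Split a comma-separated definition, keeping all whitespace intact."""
    return value.split(",")


def check_variable_definitions(names: str, units: str) -> None:
    """Raise ValueError unless there is exactly one units entry per name."""
    name_count = len(split_entries(names))
    unit_count = len(split_entries(units))
    if name_count != unit_count:
        raise ValueError(
            "{} variable name(s) but {} units entry(ies)".format(name_count, unit_count)
        )


# One name, one unit
check_variable_definitions("size of image channels", "pixels")
# Two hypothetical names with empty units still need the separating comma
check_variable_definitions("greenness,excess green", ",")

try:
    check_variable_definitions("greenness,excess green", "pixels")
except ValueError as error:
    print(error)
```

Splitting with `str.split(",")` rather than stripping each entry mirrors the note above that whitespace in names is significant.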

#### Optional override for the generation of a BETYdb compatible csv file

Set to `True` to generate a compatible file

#### Optional override for the generation of a TERRA REF Geostreams compatible csv file

Set the variable to `True` to generate a compatible file

# Write the Algorithm

In this section you define the calculate() function, which generates values from the RGB images. Fill in the calculate() function below with your algorithm.

The **pxarray** parameter contains the pre-loaded image available for processing. This function is called once per plot-image

Save this notebook after completing the algorithm
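
As an illustration, here is a minimal sketch of a calculate() function matching the 'size of image channels' example above. It assumes **pxarray** is a numpy array shaped (rows, columns, channels), and it assumes that returning a list with one value per entry in VARIABLE_NAMES is acceptable; check the template documentation for the return types it actually supports.

```python
import numpy as np


def calculate(pxarray: np.ndarray) -> list:
    """Return the number of pixels in one channel of a plot image.

    Assumes pxarray is shaped (rows, columns, channels).
    """
    rows, columns = pxarray.shape[:2]
    return [rows * columns]


# Stand-in for a loaded plot image: 4 rows, 6 columns, 3 channels
sample = np.zeros((4, 6, 3), dtype=np.uint8)
print(calculate(sample))  # [24]
```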

## Generate algorithm_rgb.py

This will generate the algorithm file (named *algorithm_rgb.py*) containing the parameters as well as the calculate() function. This file can then be used by [itself or built into a Docker image](https://github.com/AgPipeline/template-rgb-plot/blob/main/HOW_TO.md) to be used as a part of a workflow.

**NOTE:** Be sure to save this notebook before running the following cell (the saved notebook is used to generate the file)

In [None]:
import json
from pathlib import Path
import re
import textwrap

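# Name of this notebook file; update it if the notebook is renamed
NOTEBOOK_NAME = "JV_01_template-rgb-plot.ipynb"

def write_algorithm_rgb_file():
    # Read the saved notebook and close the file once it has been parsed
    with open(Path.cwd() / NOTEBOOK_NAME) as infile:
        cells = json.load(infile)["cells"]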
    with open("algorithm_rgb.py", "w") as outfile:
        for key in cells:
            toWrite = ""
            if key["cell_type"] == "markdown":
                for entry in key["source"]:
                    if entry[0:4] == "####":
                        entry = re.sub('####', '', entry).lstrip()
                        toWrite = toWrite + entry
                        toWrite = format_string(toWrite)
                        outfile.write("\n\n" + str(toWrite) + "\n")
            elif key["cell_type"] == "raw":
                for entry in key["source"]:
                    if entry[0:4] == 'def ':
                        toWrite = toWrite + '\n\n'
                    toWrite = toWrite + entry
                    if entry[-3:] == '"""':
                        toWrite = toWrite + '\n'
                outfile.write(str(toWrite))
        outfile.write("\n")
                
def format_string(toWrite):
    # Wrap the text and prefix every wrapped line with a comment marker
    lines = textwrap.wrap(toWrite, width=115, break_long_words=False)
    return "\n".join("# " + line for line in lines)

write_algorithm_rgb_file()
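
The generator above relies on the saved notebook being a JSON document with a `cells` list, where each cell has a `cell_type` and a `source` list of text lines. The sketch below shows that structure on a tiny in-memory notebook instead of a real file. `summarize_cells` is a hypothetical helper, and the cell contents are made up for the illustration.

```python
import json

# A tiny stand-in for a saved notebook file
notebook_text = json.dumps({
    "cells": [
        {"cell_type": "markdown",
         "source": ["#### Define the version number\n", "\n", "More detail"]},
        {"cell_type": "raw", "source": ["VERSION = '1.0'"]},
        {"cell_type": "code", "source": ["print('not exported')"]},
    ]
})


def summarize_cells(text: str) -> list:
    """Return (kind, text) pairs for the content the generator copies."""
    exported = []
    for cell in json.loads(text)["cells"]:
        if cell["cell_type"] == "markdown":
            # Only '####' heading lines become comments in the output file
            for line in cell["source"]:
                if line.startswith("####"):
                    exported.append(("comment", line[4:].strip()))
        elif cell["cell_type"] == "raw":
            # Raw cells hold the Python source that is written out verbatim
            exported.append(("code", "".join(cell["source"])))
    return exported


for kind, content in summarize_cells(notebook_text):
    print(kind, "->", content)
```

Ordinary code cells, such as the generator cell itself, are skipped, which is why the algorithm and its parameters live in raw cells.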

## Test the function

Test the function using the sample plot images located in the *sample_plot_images* folder. It is assumed that the files in the folder are supported image files (no other types of files are in the folder).

This testing is used to determine if your algorithm is producing the correct results.

In [None]:
import os
import numpy as np
from pathlib import Path
from PIL import Image
from importlib import reload

import algorithm_rgb
reload(algorithm_rgb)

for filename in os.listdir("sample_plot_images"):
    img = Image.open(Path.cwd() / "sample_plot_images" / filename)
    img_arr = np.array(img)
    print(algorithm_rgb.calculate(img_arr))


Once you are satisfied with the results, commit the generated *algorithm_rgb.py* file to source control.
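
If the *sample_plot_images* folder may contain other files (hidden system files, for example), the test loop can be made more forgiving by filtering on file extension first. This is a minimal sketch; `list_image_files` is a hypothetical helper and the suffix list is an assumption to adjust for your data.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to match your sample images
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def list_image_files(folder: Path) -> list:
    """Return the image files in `folder`, sorted by name, skipping all others."""
    return sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Demonstrate on a temporary folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("plot_1.png", "plot_2.TIF", ".DS_Store", "notes.txt"):
        (folder / name).touch()
    print([path.name for path in list_image_files(folder)])
```

In the notebook, the loop would then iterate over `list_image_files(Path.cwd() / "sample_plot_images")` instead of `os.listdir(...)`.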

## (OPTIONAL) Local Production Testing

It's possible to test your algorithm as it would be run in a production setting, without building a Docker image (see below to build a Docker image).

### Python Modules

The following additional libraries need to be installed on the testing system. The name of the library may be different for your system.
- libgdal-dev

The following additional python modules need to be installed on the testing system.
- agpypeline


### Other Projects

Create a clone of the project at https://github.com/AgPipeline/plot-base-rgb.git into an empty directory

```
git clone https://github.com/AgPipeline/plot-base-rgb.git ./
```

This downloads the supporting files needed to run the algorithm outside of Docker

Copy the *algorithm_rgb.py* file to the directory and run the following command

```
python3 transformer.py <path-to-image> <path-to-image> ...
```

where the term __\<path-to-image\>__ is replaced with the path to one or more image files.

The *transformer.py* file handles all the preparation necessary to run the algorithm

You can run the following command to see what options are available

```
python3 transformer.py --help
```

# (OPTIONAL) Build a Docker Image

The following steps generate, run, and test a Docker image built from your algorithm. Run them in an environment with Docker installed. The resulting Docker image will be ready for a production environment.

## Generate your Dockerfile

Run the *generate.py* script to create a [Dockerfile](https://docs.docker.com/engine/reference/builder/).

You should see something like the following:

> Confirming the environment  
 Continuing to generate files ...  
 Configuring files

The result of this command should be `0`.

When correcting any reported problems, be sure to re-save the *algorithm_rgb.py* file before trying again

In [None]:
import os

cmd0 = "python3 generate.py"
os.system(cmd0)
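
On Unix, `os.system` returns an encoded wait status rather than the plain exit code, although `0` still means success. Where the exact exit code or the captured output matters, `subprocess.run` is an alternative. The sketch below uses a hypothetical `run_command` helper and a stand-in command so it can run without *generate.py* present.

```python
import subprocess
import sys


def run_command(args: list) -> int:
    """Run a command without a shell, echo its output, and return its exit code."""
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    print(completed.stdout, end="")
    print(completed.stderr, end="", file=sys.stderr)
    return completed.returncode


# Stand-in command; in this notebook it would be
# run_command([sys.executable, "generate.py"])
exit_code = run_command([sys.executable, "-c", "print('Configuring files')"])
print(exit_code)
```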

## Cleanup previous runs

If there are leftover files from previous runs, we remove them here

In [None]:
filelist = ["result.json", "rgb_plot.csv", "rgb_plot_betydb.csv", "rgb_plot_geo.csv"]
for file in filelist:
    if os.path.isfile(file):
        os.remove(file)

## Build the Docker Image

Now build the Docker image. The created image will have the project name and project version

In [None]:
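import os
from importlib import reload

import algorithm_rgb
# Pick up any changes made to algorithm_rgb.py since it was first imported
reload(algorithm_rgb)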

cmd = "docker build -t " + algorithm_rgb.ALGORITHM_NAME + ":" + algorithm_rgb.VERSION + " ."
os.system(cmd)

## Test the Docker Image

The following command will run the built Docker image and create the output files.

In [None]:
cmd = 'docker run --rm -v "`pwd`:/mnt" ' + algorithm_rgb.ALGORITHM_NAME + ":" + algorithm_rgb.VERSION + ' --working_space "/mnt"'
for filename in os.listdir("sample_plot_images"):
    cmd += ' "/mnt/sample_plot_images/' + filename + '"'
os.system(cmd)
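
The command string above depends on the shell expanding `` `pwd` `` and on manual quoting of file names. A shell-independent variant can assemble the same arguments as a list and pass them to `subprocess.run`. This is a sketch only: `build_docker_run_args` is a hypothetical helper, the image name and tag in the usage line are placeholders, and sorting the file names is a choice made here for repeatable runs.

```python
from pathlib import Path


def build_docker_run_args(image, tag, host_dir, filenames):
    """Assemble the `docker run` argument list used to test the image."""
    args = [
        "docker", "run", "--rm",
        "-v", "{}:/mnt".format(host_dir),   # mount the project folder
        "{}:{}".format(image, tag),         # image name and version
        "--working_space", "/mnt",
    ]
    # Sorted for repeatable runs (os.listdir order is arbitrary)
    args.extend("/mnt/sample_plot_images/" + name for name in sorted(filenames))
    return args


# Placeholder image name and tag; the notebook would use
# algorithm_rgb.ALGORITHM_NAME and algorithm_rgb.VERSION
print(build_docker_run_args("pixel-counter", "1.0", Path.cwd(), ["b.png", "a.png"]))
```

The resulting list can be passed to `subprocess.run(args)` without `shell=True`, so file names containing spaces or quotes need no extra escaping.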

## Confirm Results

Make sure that the correct files are generated and contain appropriate results. If WRITE_BETYDB_CSV or WRITE_GEOSTREAMS_CSV were set to `True`, additional CSV files will be generated.

In [None]:
import json
import os

# Minimal set of expected files
filelist = ["result.json", "rgb_plot.csv"]
sample_count = str(len(os.listdir("sample_plot_images")))

for filename in filelist:
    assert os.path.isfile(filename), filename + " was not generated"
    # Compare against the loop variable so the JSON checks actually run
    if filename == "result.json":
        with open(filename) as infile:
            result = json.load(infile)[algorithm_rgb.ALGORITHM_NAME]
        assert result['version'] == algorithm_rgb.VERSION
        assert result['traits'] == algorithm_rgb.VARIABLE_NAMES
        assert result['units'] == algorithm_rgb.VARIABLE_UNITS
        assert result['labels'] == algorithm_rgb.VARIABLE_LABELS
        assert result['files_processed'] == sample_count
        assert result['lines_written'] == sample_count
        # The summary reports "Yes" or "No" for each optional CSV file
        assert result['wrote_geostreams'] == ("Yes" if algorithm_rgb.WRITE_GEOSTREAMS_CSV else "No")
        assert result['wrote_betydb'] == ("Yes" if algorithm_rgb.WRITE_BETYDB_CSV else "No")
print("Success")

# References

Schnaufer et al.

## Example of an RGB image processing algorithm that uses this template

https://github.com/AgPipeline/transformer-rgb-indices/blob/main/algorithm_rgb.py