# Tutorial 4: Setting up and managing projects


A central aspect of working efficiently with phenopype is the use of a `project`. A phenopype project is a directory tree in which each folder contains a copy of, or a link to, a single raw image file. Alongside the images to be processed, users can store configuration files for the `pype` routine covered in [Tutorial 3](tutorial_3_phenopype_workflows.ipynb): configuration files can be created from preconfigured templates, which can easily be modified. Once raw images have been added and configuration files are in place, the `pype` function is run on each image until the results are satisfactory (Step 3 below). Together, the raw images, the `pype` configuration files and the saved results files are all that is needed to completely reproduce any phenotypic data collected in the process.

## Creating a project directory structure and adding images

A phenopype project directory can be initialized with the `project` function. The *phenopype project root folder* should be separate from the raw data, e.g. a folder inside your *main project folder*:

<center>
<div style="width:500px; text-align: left" >
    
![Step 1](_assets/workflow_high_step1.png)
    
**Step. 1**: Create a phenopype project and organize raw images into separate folders where all relevant data, attributes and results are stored. 
    
</div>
</center>

In [1]:
import phenopype as pp
import os

'E:\\git_repos\\phenopype\\tutorials'

In [2]:
myproj = pp.project(root_dir=r"../_temp/my_project") ## doesn't have to be "myproj", can be named anything

--------------------------------------------
Phenopype will create a new project at
E:\git_repos\phenopype\_temp\my_project

Proceed? (y/n)
y

"E:\git_repos\phenopype\_temp\my_project" created (overwritten)

project attributes written to E:\git_repos\phenopype\_temp\my_project\attributes.yaml
--------------------------------------------


The next step is to add images to the project. You can do so with the `add_files` method of the created project (a *method* is a function that belongs to an existing *object*, in this case `myproj`). The method offers some flexibility in terms of which files to import. The most important arguments are `include`, `exclude` and `filetypes`. For example, given the following list of images:

In [6]:
images = "images"
os.listdir(images)

['cichlid1.jpg',
 'cichlid2.jpg',
 'cichlid3.jpg',
 'cichlid_multi1.jpg',
 'cichlid_multi2.jpg',
 'cichlid_multi3.jpg',
 'isopods.jpg',
 'isopods_fish.mp4',
 'phyto_445.jpg',
 'phyto_469.jpg',
 'phyto_586.jpg',
 'phyto_bright.jpg',
 'snails1.jpg',
 'snails2.jpg',
 'stickle1.JPG',
 'stickle2.JPG',
 'stickle3.JPG',
 'stickleback_side.jpg',
 'stickleback_top.jpg',
 'worms.jpg']

If we want to import "stickle1", "stickle2", and "stickle3", we can use a combination of `include` and `exclude` (the output also prints all other default settings):

In [7]:
myproj.add_files(image_dir=images,
                 include="stickle",       ## can be type "str" or type "list"
                 exclude=["side","top"]   ## can be type "str" or type "list"
                ) 

--------------------------------------------
phenopype will search for image files at

E:\git_repos\phenopype\tutorials\images

using the following settings:

filetypes: ['jpg', 'JPG', 'jpeg', 'JPEG', 'tif', 'png', 'bmp'], include: stickle, exclude: ['side', 'top'], mode: copy, recursive: False, resize: False, unique: path

Found image stickle1.JPG - phenopype-project folder 0__stickle1 created
Found image stickle2.JPG - phenopype-project folder 0__stickle2 created
Found image stickle3.JPG - phenopype-project folder 0__stickle3 created

Found 3 files
--------------------------------------------


The three images have the same (nonstandard) file ending, so we can also use the `filetypes` argument (and the `overwrite` argument, because we have already added them above):

In [8]:
myproj.add_files(image_dir=images,
                 filetypes="JPG",             ## can be type "str" or type "list"
                 exclude=["side","top"],      ## can be type "str" or type "list"
                 overwrite=True
                )

--------------------------------------------
phenopype will search for image files at

E:\git_repos\phenopype\tutorials\images

using the following settings:

filetypes: JPG, include: [], exclude: ['side', 'top'], mode: copy, recursive: False, resize: False, unique: path

Found image stickle1.JPG - phenopype-project folder 0__stickle1 created (overwritten)
Found image stickle2.JPG - phenopype-project folder 0__stickle2 created (overwritten)
Found image stickle3.JPG - phenopype-project folder 0__stickle3 created (overwritten)

Found 3 files
--------------------------------------------
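
The effect of `include`, `exclude` and `filetypes` can be pictured with a small standard-library sketch. It is not phenopype's implementation: `select_files` is a hypothetical helper, and it assumes that file endings are matched case-sensitively and that a single matching `include` pattern is enough.

```python
def select_files(filenames, filetypes, include=(), exclude=()):
    """Hypothetical helper approximating the include/exclude/filetypes selection."""

    def as_list(value):
        # each argument may be a single string or a list of strings
        return [value] if isinstance(value, str) else list(value)

    filetypes, include, exclude = as_list(filetypes), as_list(include), as_list(exclude)
    selected = []
    for name in filenames:
        stem, _, ending = name.rpartition(".")
        if ending not in filetypes:
            continue  # wrong file ending (assumed case-sensitive: "JPG" != "jpg")
        if include and not any(pattern in stem for pattern in include):
            continue  # assumption: one matching include pattern is enough
        if any(pattern in stem for pattern in exclude):
            continue  # any exclude pattern removes the file
        selected.append(name)
    return selected


# a subset of the file names listed in the tutorial's image folder
filenames = ["cichlid1.jpg", "isopods_fish.mp4", "stickle1.JPG", "stickle2.JPG",
             "stickle3.JPG", "stickleback_side.jpg", "stickleback_top.jpg"]
defaults = ["jpg", "JPG", "jpeg", "JPEG", "tif", "png", "bmp"]

by_name = select_files(filenames, defaults, include="stickle", exclude=["side", "top"])
by_ending = select_files(filenames, "JPG", exclude=["side", "top"])
print(by_name)
print(by_ending)
```

Both calls select the same three `stickle*.JPG` files, which mirrors the two `add_files` calls above.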


The remaining settings are `mode`, `recursive`, and `unique`. `mode` determines whether raw files should be copied to each folder in the phenopype directory tree (`copy`; default), or whether only their file path is stored (`link`), which can be useful if data sets contain many or very large images. A third option is `mod`, which opens the images and saves them again in TIF format. This mode also allows images to be resized. `recursive` indicates whether only the top directory (`False`; default) or also all subdirectories (`True`) should be included in the search. `unique` indicates whether files should be unique by their path (`path`; the default according to the printed settings above) or only by their name (`filename`); duplicate files will be skipped. For more information on `add_files`, [refer to the API](api.html#phenopype.main.project.add_files), or use `help`:

In [9]:
help(pp.project.add_files)

Help on function add_files in module phenopype.main:

add_files(self, image_dir, filetypes=['jpg', 'JPG', 'jpeg', 'JPEG', 'tif', 'png', 'bmp'], include=[], include_all=True, exclude=[], mode='copy', extension='tif', recursive=False, overwrite=False, resize_factor=1, unique='path', **kwargs)
    Add files to your project from a directory, can look recursively. 
    Specify in- or exclude arguments, filetypes, duplicate-action and copy 
    or link raw files to save memory on the harddrive. For each found image,
    a folder will be created in the "data" folder within the projects root
    directory. If found images are in subfolders and "recursive==True", 
    the respective phenopype directories will be created with 
    flattened path as prefix. 
    
    E.g., with "raw_files" as folder with the original image files 
    and "phenopype_proj" as rootfolder:
    
    - raw_files/file.jpg ==> phenopype_proj/data/file.jpg
    - raw_files/subdir1/file.jpg ==> phenopype_proj/data/1__subdir
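
How `recursive` and `unique` interact can be illustrated with a short sketch that uses only the standard library. `find_images` is a hypothetical helper, not the code phenopype runs; it only assumes that duplicates are detected either by full path or by bare file name, and that the first occurrence is kept.

```python
import os
import tempfile


def find_images(image_dir, recursive=False, unique="path"):
    """Hypothetical helper: list files, optionally recursing, skipping duplicates."""
    found, seen = [], set()
    for root, dirs, files in os.walk(image_dir):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(files):
            path = os.path.join(root, name)
            key = path if unique == "path" else name
            if key in seen:
                continue  # duplicate under the chosen criterion: skipped
            seen.add(key)
            found.append(os.path.relpath(path, image_dir))
        if not recursive:
            break  # os.walk yields the top directory first
    return found


with tempfile.TemporaryDirectory() as image_dir:
    # two files with the same name, one of them in a subdirectory
    os.makedirs(os.path.join(image_dir, "subdir1"))
    for rel in ["file.jpg", os.path.join("subdir1", "file.jpg")]:
        open(os.path.join(image_dir, rel), "w").close()

    top_only = find_images(image_dir)
    all_paths = find_images(image_dir, recursive=True, unique="path")
    by_name = find_images(image_dir, recursive=True, unique="filename")

print(top_only, all_paths, by_name)
```

With uniqueness by path both files are found in the recursive search, whereas uniqueness by name keeps only the first `file.jpg`.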

## Adding `pype`-configuration files 

In the next step we prepare the files we added for use with the `pype` routine by adding a configuration file with the `add_config` method. Instead of adding the functions one by one, we can load a template that is appropriate for the given computer vision analysis.

Currently, the templates are stored as *yaml* files inside the phenopype package. Use `pp.pype_config_templates` to list all existing templates, and `pp.show_config_template("ex1")` to show the contents of one of them.

<center>
<div style="width:500px; text-align: left" >
    
![Step 2](_assets/workflow_high_step2.png)
    
**Step. 2**: Create configuration files and store them alongside the raw images.
    
</div>
</center>

In [10]:
pp.pype_config_templates

{'ex1.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex1.yaml',
 'ex2.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex2.yaml',
 'ex3.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex3.yaml',
 'ex5_1.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex5_1.yaml',
 'ex5_2.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex5_2.yaml',
 'ex6.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex6.yaml',
 'ex7.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex7.yaml',
 'ex8_1.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex8_1.yaml',
 'ex8_2.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\ex8_2.yaml',
 'landmarks1.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\landmarks1.yaml',
 'tut3.yaml': 'e:\\git_repos\\phenopype\\phenopype\\templates\\tut3.yaml'}

In [11]:
pp.show_config_template("ex1")

SHOWING BUILTIN PHENOPYPE TEMPLATE ex1.yaml


- preprocessing:
  - create_mask
  - create_reference:
      mask: true
- segmentation:
  - blur:
      kernel_size: 15
  - threshold:
      method: adaptive
      blocksize: 49
      constant: 5
      channel: green
  - morphology:
      operation: open
      shape: cross
      kernel_size: 9
      iterations: 2
  - find_contours:
      retrieval: ccomp
      min_diameter: 0
      min_area: 250
- visualization:
  - select_canvas:
      canvas: image
  - draw_contours:
      line_width: 2
      label_width: 1
      label_size: 1
      fill: 0.3
- export:
  - save_contours:
      overwrite: true
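
The template is a nested list: each top-level entry names a step, and each step lists functions, either as a bare name or as a name with arguments. The sketch below writes the first part of the `ex1` template as the Python structure a YAML parser would return for it, and walks through it in order. `iter_calls` is a hypothetical helper for illustration and says nothing about how `pype` processes the file internally.

```python
# The beginning of the "ex1" template as nested Python lists and dicts.
config = [
    {"preprocessing": ["create_mask", {"create_reference": {"mask": True}}]},
    {"segmentation": [
        {"blur": {"kernel_size": 15}},
        {"threshold": {"method": "adaptive", "blocksize": 49,
                       "constant": 5, "channel": "green"}},
    ]},
]


def iter_calls(config):
    """Yield (step, function, arguments) in the order listed in the config."""
    for step in config:
        for step_name, entries in step.items():
            for entry in entries:
                if isinstance(entry, str):
                    yield step_name, entry, {}  # function listed without arguments
                else:
                    for function_name, arguments in entry.items():
                        yield step_name, function_name, arguments


calls = list(iter_calls(config))
for step_name, function_name, arguments in calls:
    print(f"{step_name}.{function_name}({arguments})")
```

Reading the template this way makes it easier to see which arguments belong to which function when editing a configuration file by hand.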


For example, if we want to place landmarks, we can use one of the corresponding presets.  


In [12]:
myproj.add_config(name = "lm", template="landmarks1")

New pype configuration created (landmarks1.yaml) from phenopype template:
e:\git_repos\phenopype\phenopype\templates\landmarks1.yaml
pype_lm.yaml created for 0__stickle1
pype_lm.yaml created for 0__stickle2
pype_lm.yaml created for 0__stickle3


Now all image folders contain a configuration file in *yaml* format (see [Tutorial 2](tutorial_2_phenopype_workflow.ipynb#yaml-syntax) and the [resources section](resources.html) of the Documentation for details).

An important feature of `add_config` is the opportunity to evaluate and edit the template before it gets saved to the folders. This is done by setting the flag `interactive=True` in the arguments. For example, if we want to change the point and label size of the landmark template globally, we can do:

<center>
<div style="width:650px; text-align: left" >
    
![Edit template](_assets/change_template.png)
    
Edit the templates before saving them to the image folders.
    
</div>
</center>

**NOTE 1:** In interactive mode, the `pype` function opens a text editor and a Python window. So that the template can be modified, the first image in your project directory tree is by default copied over to the phenopype project root directory. After the windows have opened, they can be controlled as described in [Tutorial 2](tutorial_2_phenopype_workflow.ipynb#pype-behavior).

**NOTE 2:** If you have issues with this step, e.g. no text editor window is popping up, make sure you have set the default app for opening *yaml* files. Furthermore, consult the [Installation Instructions](installation.html#choose-a-text-editor) and check if your text editor is configured correctly. 


In [13]:
myproj.add_config(name = "lm", 
                  template="landmarks1",
                  interactive=True,
                  overwrite=True                 ## needed because config with the name "lm" already exists in the folders
                 )

New pype configuration created (landmarks1.yaml) from phenopype template:
e:\git_repos\phenopype\phenopype\templates\landmarks1.yaml
Succesfully loaded existing pype config (pype_config_MOD_lm.yaml) from:
E:\git_repos\phenopype\_temp\my_project\pype_config_MOD_lm.yaml 


------------+++ new pype iteration 2021:03:07 20:17:08 +++--------------


MEASUREMENT
landmarks
- setting landmarks
- terminated polyline creation
RESTART


------------+++ new pype iteration 2021:03:07 20:17:25 +++--------------


MEASUREMENT
landmarks
- setting landmarks
VISUALIZATION
- modifed image
- autoselect canvas
draw_landmarks


------------+++ finished pype iteration +++--------------
-------(End with Ctrl+Enter or re-run with Enter)--------




TERMINATE
Entered interactive config mode using first image (first).
pype_lm.yaml created for 0__stickle1 (overwritten)
pype_lm.yaml created for 0__stickle2 (overwritten)
pype_lm.yaml created for 0__stickle3 (overwritten)


## Saving and loading a project

Project objects can be saved using the static method `save` (static = not bound to any object). This will save the project data to the project's root directory. Currently, the only useful information stored in the project object is the list of all contained directories. Future releases will make more use of the project object.

**NOTE:** `pp.project.save` saves ONLY the project data. All data collected with the `pype` method or any of the other workflows needs to be saved inside the folders using the appropriate [export](api.html#export) functions.

In [14]:
pp.project.save(myproj, overwrite=True)

Project data saved under E:\git_repos\phenopype\_temp\my_project\project.data.


To load the project again, provide the path of the project root folder, which contains the `project.data` file, to the `load` method:

In [16]:
import phenopype as pp

myproj = pp.project.load("../_temp/my_project")
myproj.dirpaths

--------------------------------------------
Project loaded from 
E:\git_repos\phenopype\_temp\my_project

Project has 3 image folders
--------------------------------------------


['E:\\git_repos\\phenopype\\_temp\\my_project\\data\\0__stickle1',
 'E:\\git_repos\\phenopype\\_temp\\my_project\\data\\0__stickle2',
 'E:\\git_repos\\phenopype\\_temp\\my_project\\data\\0__stickle3']
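
To check what a project contains on disk, a small standard-library helper can print its directory tree. The sketch below builds a mock project whose layout follows the log messages in this tutorial (`attributes.yaml` and `project.data` in the root, one folder per image under `data`). `tree_lines` is a hypothetical helper and not part of phenopype, and the mock folders hold only the config file, not the images.

```python
import os
import tempfile


def tree_lines(root):
    """Return an indented listing of all folders and files below `root`."""
    lines = []
    for current, dirs, files in os.walk(root):
        dirs.sort()  # stable order of subfolders
        if current == root:
            depth = 0
        else:
            depth = os.path.relpath(current, root).count(os.sep) + 1
        indent = "    " * depth
        lines.append(f"{indent}{os.path.basename(current)}/")
        for name in sorted(files):
            lines.append(f"{indent}    {name}")
    return lines


with tempfile.TemporaryDirectory() as tmp:
    # mock layout, assumed from the messages printed in this tutorial
    root = os.path.join(tmp, "my_project")
    for folder in ["0__stickle1", "0__stickle2", "0__stickle3"]:
        path = os.path.join(root, "data", folder)
        os.makedirs(path)
        open(os.path.join(path, "pype_config_lm.yaml"), "w").close()
    for name in ["attributes.yaml", "project.data"]:
        open(os.path.join(root, name), "w").close()

    lines = tree_lines(root)

print("\n".join(lines))
```

On a real project, the same function can be called with the project's root directory.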

## Using `pype` with project folders

<center>
<div style="width:800px; text-align: left" >
    
![Step 3](_assets/workflow_high_step3.png)
    
**Step. 3**: Apply `pype` function image by image.
    
</div>
</center>

After adding images and configuration files, everything is set to process your dataset with high throughput. Using a simple `for` loop, we go through all directories one by one. You can modify the configuration file and control the window as described in [Tutorial 2](tutorial_2_phenopype_workflow.ipynb#yaml-syntax). The `skip` argument lets you skip folders that already contain results for a given config name, so you can return to the point where you left off.

**NOTE 1:** Make sure to specify the name of the config file you added before, in this case "lm". The config file name serves two purposes: it tells the `pype` function which configuration to load if a directory contains several, and it gets appended to the names of all results files produced with this configuration.

**NOTE 2:** Consult [Tutorial 2](tutorial_2_phenopype_workflow.ipynb#pype-behavior) to understand `pype` behavior. For example, `pype` automatically saves all collected data, but it overwrites existing results files only if this is indicated in the config file.

In [18]:
for folder in myproj.dirpaths:
    directory = os.path.join(myproj.root_dir, folder)
    print(directory)

E:\git_repos\phenopype\_temp\my_project\data\0__stickle1
E:\git_repos\phenopype\_temp\my_project\data\0__stickle2
E:\git_repos\phenopype\_temp\my_project\data\0__stickle3


In [19]:
os.path.isdir(directory)

True

In [20]:
for folder in myproj.dirpaths:
    directory = os.path.join(myproj.root_dir, folder)
    pp.pype(directory, 
            name="lm",         ## loads the config file "pype_config_lm.yaml". "lm" gets appended to all results files
            skip=True          ## skip=True will skip over any directories that already contain results files with "lm"
           )

Succesfully loaded existing pype config (pype_config_lm.yaml) from:
E:\git_repos\phenopype\_temp\my_project\data\0__stickle1\pype_config_lm.yaml 


------------+++ new pype iteration 2021:03:07 20:19:16 +++--------------


Nothing loaded.
MEASUREMENT
landmarks
- setting landmarks
VISUALIZATION
- modifed image
- autoselect canvas
draw_landmarks
EXPORT
save_landmarks
- landmarks saved under E:\git_repos\phenopype\_temp\my_project\data\0__stickle1\landmarks_lm.csv.
=== AUTOSAVE ===
save_canvas
- canvas saved under E:\git_repos\phenopype\_temp\my_project\data\0__stickle1\canvas_lm.jpg.


------------+++ finished pype iteration +++--------------
-------(End with Ctrl+Enter or re-run with Enter)--------




TERMINATE
Succesfully loaded existing pype config (pype_config_lm.yaml) from:
E:\git_repos\phenopype\_temp\my_project\data\0__stickle2\pype_config_lm.yaml 


------------+++ new pype iteration 2021:03:07 20:19:20 +++--------------


Nothing loaded.
MEASUREMENT
landmarks
- setting landmark
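
The idea behind `skip=True` can be sketched without phenopype: a folder counts as done if it already holds a results file whose name ends in the config name, such as `landmarks_lm.csv`. The helper `has_results` below is hypothetical, and the rule that the config file itself does not count as a result is an assumption made for this illustration.

```python
import os
import tempfile


def has_results(folder, name):
    """Hypothetical check: does `folder` hold results files ending in "_<name>"?"""
    for filename in os.listdir(folder):
        if filename.startswith("pype_config_"):
            continue  # assumption: the config file itself is not a result
        stem, _ = os.path.splitext(filename)
        if stem.endswith("_" + name):
            return True
    return False


with tempfile.TemporaryDirectory() as root:
    done = os.path.join(root, "0__stickle1")
    todo = os.path.join(root, "0__stickle2")
    # the first folder was already processed, the second one was not
    for folder, files in [(done, ["pype_config_lm.yaml", "landmarks_lm.csv"]),
                          (todo, ["pype_config_lm.yaml"])]:
        os.makedirs(folder)
        for filename in files:
            open(os.path.join(folder, filename), "w").close()

    remaining = [os.path.basename(f) for f in (done, todo)
                 if not has_results(f, "lm")]

print(remaining)
```

Only the folder without results remains in the list, which is what allows an interrupted session to resume where it stopped.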

<center>
<div style="width:500px; text-align: left" >
    
![Step 4](_assets/workflow_high_step4.png)
    
**Step. 4**: Each folder contains all information necessary to reproduce the collected phenotypic data. Output from different `pype` runs can be stored side by side in the same folders. 
    
</div>
</center>

As mentioned above, it's possible to have multiple configuration files side by side in phenopype folders. For example, if we want to implement an alternative set of landmarks, we can simply do:

In [None]:
myproj.add_config(name = "lm2",                  ## a different name (may not contain underscores or other special characters)
                  template="landmarks1"          ## same template as before
                 )

In [None]:
for img in myproj.dirpaths:
    pp.pype(img, 
            name="lm2",         ## loads the config file "pype_config_lm2.yaml". "lm2" gets appended to all results files
            skip=True          ## skip=True will skip over any directories that already contain results files with "lm2"
           )

In [None]:
os.listdir(myproj.dirpaths[0])  ## contents of the first image folder (0__stickle1)

## Collecting results

Using `collect_results` one can search the project folder for results, and copy them to a folder in the root directory ("results" is the default, but can be changed). 

In [21]:
myproj.collect_results(name="lm2",          # these two arguments create the search string for "landmarks_lm2.csv"
                       files=["landmarks"], # 
                       overwrite=True)

Created E:\git_repos\phenopype\_temp\my_project\results


SystemExit: No files found under the given location that match given criteria.
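
In the run shown above no files matched, presumably because the `lm2` cells (marked `In [None]`) had not been executed, so no `landmarks_lm2.csv` existed yet. The search-and-copy idea itself can be sketched with the standard library. `collect` is a hypothetical stand-in for `collect_results`, and prefixing each copy with its folder name is an assumption made here to keep the copied files apart.

```python
import os
import shutil
import tempfile


def collect(root_dir, name, files, folder="results"):
    """Hypothetical helper: copy e.g. "landmarks_lm.csv" from every image folder."""
    data_dir = os.path.join(root_dir, "data")
    results_dir = os.path.join(root_dir, folder)
    os.makedirs(results_dir, exist_ok=True)
    copied = []
    for image_folder in sorted(os.listdir(data_dir)):
        for filename in sorted(os.listdir(os.path.join(data_dir, image_folder))):
            stem, _ = os.path.splitext(filename)
            # search string: "<file prefix>_<config name>", e.g. "landmarks_lm"
            if any(stem == f"{prefix}_{name}" for prefix in files):
                # assumption: prefix with the folder name so copies do not collide
                target = f"{image_folder}_{filename}"
                shutil.copy(os.path.join(data_dir, image_folder, filename),
                            os.path.join(results_dir, target))
                copied.append(target)
    return copied


with tempfile.TemporaryDirectory() as root_dir:
    # mock project in which only the "lm" configuration has produced results
    for image_folder in ["0__stickle1", "0__stickle2"]:
        path = os.path.join(root_dir, "data", image_folder)
        os.makedirs(path)
        open(os.path.join(path, "landmarks_lm.csv"), "w").close()

    found_lm = collect(root_dir, name="lm", files=["landmarks"])
    found_lm2 = collect(root_dir, name="lm2", files=["landmarks"])

print(found_lm)
print(found_lm2)
```

Searching for `lm` finds one file per folder, while searching for `lm2` returns an empty list, the situation in which the real method stops with the message shown above.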

