Segmentation mask convertor #117

hlydecker · 2020-10-09T03:04:21Z

WIP. Some questions need answering.

Currently this only works with hard-coded categories, on single-category images. It will need to be adapted to link categories to masks by colour code.
The license and info objects are currently missing.

- added some more documentation and TODOs
- changes "contours" to "segmentation" to fit within COCO terminology

codecov · 2020-10-09T03:05:24Z

Codecov Report

Merging #117 into master will decrease coverage by 5.28%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #117      +/-   ##
==========================================
- Coverage   51.15%   45.86%   -5.29%     
==========================================
  Files           8        9       +1     
  Lines         477      532      +55     
==========================================
  Hits          244      244              
- Misses        233      288      +55

Flag	Coverage Δ
#weedcoco	`45.86% <0.00%> (-5.29%)`	⬇️

Impacted Files	Coverage Δ
weedcoco/importers/mask.py	`0.00% <0.00%> (ø)`

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4d57ff5...6eed412.

hlydecker · 2020-10-09T03:06:25Z

@jnothman this has been built based on working with the ginger images/masks. These are binary (black = nothing, white = plant), and there are usually one or two main objects with a bunch of super tiny ones, created as a result of the masking/annotation process. How should we deal with these? Subtract the tiny polygons, or potentially subsume them with the main ones?

jnothman · 2020-10-09T03:30:15Z

this has been built based on working with the carrots images/masks

Do you mean ginger?
Why not add a test image and test mask to the repo, and design a test case?

These are binary (black = nothing, white = plant),

I had imagined a config file color-category-map.yml:

FFFFFF: "weed: UNSPECIFIED"

or

FF0000: "weed: lolium perenne"
00FF00: "weed: rapistrum rugosum"
0000FF: "weed: sonchus oleraceus"
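As a rough illustration of how such a map might be consumed, the sketch below turns hex colour keys into RGB tuples that can be compared against mask pixels. The function names `hex_to_rgb` and `build_color_lookup` are hypothetical, and the dict stands in for whatever a YAML loader would return for the file above; it assumes the keys arrive as six-character hex strings.

```python
def hex_to_rgb(hex_color):
    """Convert a six-digit hex string such as 'FF0000' to an (R, G, B) tuple."""
    hex_color = hex_color.strip().lstrip("#")
    if len(hex_color) != 6:
        raise ValueError(f"Expected six hex digits, got {hex_color!r}")
    # int(..., 16) raises ValueError on non-hex characters
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))


def build_color_lookup(color_category_map):
    """Map (R, G, B) tuples to category names (hypothetical helper)."""
    return {hex_to_rgb(key): name for key, name in color_category_map.items()}


# Stand-in for the parsed contents of color-category-map.yml (assumption)
color_category_map = {
    "FF0000": "weed: lolium perenne",
    "00FF00": "weed: rapistrum rugosum",
    "0000FF": "weed: sonchus oleraceus",
}
color_lookup = build_color_lookup(color_category_map)
```

Keying the lookup by RGB tuple means any mask colour that is absent from the map can be detected with a plain membership test and reported, rather than silently dropped.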

Subtract the tiny polygons, or potentially subsume them with the main ones?

I don't think so. We should be authentic to the input. On this, it's not our job to be opinionated.

COCO assumes that there are multiple (or, more precisely, one or more) polygons. #90 codifies this in the schema: https://github.com/Sydney-Informatics-Hub/Weed-ID-Interchange/blob/6ef3a168215627c039b74224a73f2782a98a4b63/weedcoco/schema/Annotation.yaml#L33-L42.

Note that an alternative representation is as a mask (a 2d binary array) encoded with RLE and special encoding that only seems to be handled by COCO API (https://github.com/cocodataset/cocoapi/blob/8c9bcc3cf640524c4c20a9c40e89cb6a2f2fa0e9/common/maskApi.c#L204-L231).

Turning the image into a mask, based on known annotation colours, and then using pycocotools, may provide more straightforward solutions than thresholding and opencv, which is designed more for photography than discrete masking.
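To make the mask-based route concrete, here is a minimal pure-Python sketch of the two steps: exact colour matching to get a binary mask, then run-length counting in the column-major order that COCO's uncompressed RLE uses (runs alternate between 0s and 1s, starting with 0s). It is only an illustration under assumptions: the image is a non-empty nested list of RGB tuples, the function names are hypothetical, and a real converter would more likely use numpy arrays with pycocotools to produce the compressed string.

```python
def binary_mask_for_color(pixels, rgb):
    """Return a 2D list of 0/1 marking pixels that exactly match rgb."""
    return [[1 if px == rgb else 0 for px in row] for row in pixels]


def uncompressed_rle(mask):
    """Run-length encode a non-empty binary mask, scanning column by column."""
    height, width = len(mask), len(mask[0])
    counts = []
    previous = 0  # by convention the first run counts zeros (it may be 0 long)
    run = 0
    for x in range(width):
        for y in range(height):
            value = mask[y][x]
            if value != previous:
                counts.append(run)
                run = 0
                previous = value
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
pixels = [
    [BLACK, WHITE, WHITE],
    [BLACK, WHITE, BLACK],
]
mask = binary_mask_for_color(pixels, WHITE)
rle = uncompressed_rle(mask)
```

Because matching is exact, every annotated pixel is preserved as given, including the tiny regions discussed above; no thresholding or contour simplification is involved.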

hlydecker · 2020-10-09T03:32:16Z

I have no idea why I wrote carrots; yes I meant ginger!!!

jnothman

I don't think coco_from_mask.json should be included in the repo. Rather:

- there should be at least one pytest test case checking that the converter works;
- we might add a script search/scripts/index_rds_images.sh which is given the path to the RDS root, and converts and loads data from there.

jnothman · 2020-10-12T11:40:56Z

weedcoco/importers/mask.py

+
+
+def generate_masks_contours(mask_path):
+


This blank line violates PEP257. I'm surprised black lets it through

jnothman · 2020-10-12T11:41:45Z

weedcoco/importers/mask.py

+
+    image_id = 0
+    for filename in os.listdir(image_dir):
+        if filename.endswith(".png") or filename.endswith(".jpg"):


there should be an else clause that warns or raises an error if the file type is unexpected

jpeg and tif might also be possible extensions

A neat shorthand

Suggested change

-        if filename.endswith(".png") or filename.endswith(".jpg"):
+        if filename.endswith((".png", ".jpg", ".jpeg", ".tif", ".tiff")):
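Putting the two suggestions together (the tuple form of `str.endswith` and an else clause that warns), one possible shape is sketched below. The helper name `select_image_files` and the constant `IMAGE_EXTENSIONS` are hypothetical, and lower-casing the filename before the check is an extra assumption not raised in the review.

```python
import warnings

# Extensions mentioned in the review; the tuple form checks them all at once
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".tif", ".tiff")


def select_image_files(filenames):
    """Return recognised image filenames in sorted order, warning about the rest."""
    selected = []
    for filename in sorted(filenames):
        # .lower() makes the check case-insensitive (an assumption of this sketch)
        if filename.lower().endswith(IMAGE_EXTENSIONS):
            selected.append(filename)
        else:
            warnings.warn(f"Skipping file with unexpected type: {filename}")
    return selected
```

Called with the result of `os.listdir(image_dir)`, this would skip stray entries such as `.DS_Store` with a warning instead of ignoring them silently; raising an error instead is the stricter alternative mentioned above.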

jnothman · 2020-10-12T11:46:31Z

weedcoco/tests/importers/mask_data/agcontext-ginger.yaml

+### crop_type ###
+# description: 'Crop type.
+
+#   One of several strings describing the crop grown in the image.'


these comments aren't needed in a test file. You can do sed -E '/^$|^#/d' on the file to get only the content lines

Similar in concept to the original convertor. Bootstrapped from a blog post with changes. Still not completely functional, but parts of it will run and seem to behave how we want.

hlydecker · 2020-10-13T11:51:07Z

Major changes are afoot. I'm using this blog post as a template to redesign this convertor to work with colour-category mapping. In some ways this reinvents aspects of the opencv contour generator, but it may be a better direction for our use case.

weedcoco/tests/importers/mask_data/category_name_chobbitty

weedcoco/importers/masks2.py

don't look

jnothman · 2021-01-11T11:55:23Z

I'm not yet happy with the sufficiency of the tests, but this is otherwise ready for review.

jnothman · 2021-01-11T11:56:07Z

Maybe I should open a new PR so that Henry can review. @hlydecker would you like to and are you available to do so?

hlydecker · 2021-01-11T22:43:31Z

Happy to review this sometime this afternoon!

hlydecker

This looks good and the existing tests are sensible enough. Most of my comments are just some suggestions for making warnings and errors more clear to potential users.

I do wonder what sort of other tests could be included. test_basic is indeed basic but it does test the basic functionality!

hlydecker · 2021-01-12T00:42:42Z

weedcoco/importers/mask.py

+                "image_id": len(images),
+                "category_id": cat_idx,
+                "segmentation": rle,
+                # "is_crowd": 0,  # TODO: how should we define this?


When using RLE, we should probably set is_crowd: 1. From my understanding, RLE is really made for situations where we have a field of several objects of the same category but we aren't annotating each individual one. So in terms of our data, we aren't annotating individual plants; instead we are simply annotating any visible stuff that falls within that category, which could potentially be multiple plants.
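For reference, the COCO data format spells this field `iscrowd` and pairs RLE segmentations with `iscrowd=1` (polygons go with `iscrowd=0`). A minimal sketch of an annotation record along those lines follows; `make_rle_annotation` is a hypothetical helper, not the function in this PR, and the RLE value is a placeholder.

```python
def make_rle_annotation(annotation_id, image_id, category_id, rle, iscrowd=1):
    """Build a COCO-style annotation dict for an RLE segmentation (hypothetical helper)."""
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": rle,
        # RLE describes a region rather than one instance, hence the default of 1
        "iscrowd": iscrowd,
    }


# Placeholder RLE in the uncompressed form: size is [height, width]
annotation = make_rle_annotation(
    annotation_id=0,
    image_id=0,
    category_id=1,
    rle={"size": [2, 3], "counts": [2, 3, 1]},
)
```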

hlydecker · 2021-01-12T03:12:36Z

Ah of course now I realise the awkwardness here; it makes sense that I cannot be a reviewer for my own pull request even if the actual content is not my progeny.

Co-authored-by: Henry Lydecker <henry.lydecker@gmail.com>

jnothman · 2021-01-25T04:01:14Z

Want to check out the changes since last review, @hlydecker and gimme a tick if possible?

hlydecker · 2021-01-25T04:06:04Z

Will take a look this afternoon!

hlydecker

LGTM; not much was changed. Documentation changes have improved clarity.

The testing plan sounds good to me as well.

hlydecker · 2021-01-25T04:35:09Z

weedcoco/importers/mask.py

@@ -30,7 +30,7 @@ def generate_segmentations(mask_path, color_map, colors_not_found):
    Yields
    ------
    segmentation : str
-        COCO segmentation string
+        COCO segmentation string in compressed RLE format


Good addition to the documentation

hlydecker · 2021-01-25T04:38:29Z

weedcoco/tests/importers/test_mask.py

+# * check segmentation RLE string can be read back in and reproduces the mask
+# * check handling of missing correspondence between mask and image files
+# * check handling of different image file formats
+# * check handling of agcontext


This would be good. The agcontext ingestion + testing would probably make sense as something that is shared as a utility called by each individual converter.
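The first item in that plan, a round-trip check, could look roughly like the sketch below. It uses uncompressed, column-major run-length counts in pure Python purely to illustrate the shape of the assertion; the real test would presumably decode the compressed RLE string with pycocotools. `encode_rle`, `decode_rle` and the test name are hypothetical.

```python
from itertools import groupby


def encode_rle(mask):
    """Encode a non-empty binary mask as column-major run lengths, zeros first."""
    height, width = len(mask), len(mask[0])
    flat = [mask[y][x] for x in range(width) for y in range(height)]
    counts = []
    for value, group in groupby(flat):
        if not counts and value == 1:
            counts.append(0)  # the first run always counts zeros
        counts.append(len(list(group)))
    return {"size": [height, width], "counts": counts}


def decode_rle(rle):
    """Rebuild the 2D mask from run lengths produced by encode_rle."""
    height, width = rle["size"]
    flat = []
    value = 0
    for count in rle["counts"]:
        flat.extend([value] * count)
        value = 1 - value
    # flat is in column-major order, so index as column * height + row
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]


def test_rle_round_trip():
    mask = [
        [0, 1, 1],
        [0, 1, 0],
        [1, 1, 0],
    ]
    assert decode_rle(encode_rle(mask)) == mask
```

A test of this shape catches row-major versus column-major mix-ups, which otherwise produce a plausible-looking but transposed mask.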

hlydecker · 2021-01-25T04:39:59Z

weedcoco/importers/mask.py

@@ -163,7 +163,7 @@ def _image_name_to_mask(name):
        warnings.warn(
            f"{len(categories)} categories defined, but only "
            f"{len(categories_found)} of these are present in masks. "
-            f"Missing are {missing_category_colors}"
+            f"These categories were not found: {missing_category_colors}"


Both this change and the one at 118 are great improvements in the clarity of the messages to users.

Both this change and the one at 118 are great improvements in the clarity of the messages to users.

They were both your explicit contributions! :D

jnothman · 2021-01-25T04:48:42Z

Okay to merge as is, despite TODOs?

hlydecker · 2021-01-25T04:51:18Z

I'd say so :)

jnothman · 2021-01-25T04:54:23Z

Thanks for the review @hlydecker

Elevn Li and others added 4 commits October 9, 2020 13:26

converter for images and masks

5e2c145

remove voc specific components

a0add67

documentation upgrade

d0d38b6

- added some more documentation and TODOs
- changes "contours" to "segmentation" to fit within COCO terminology

added a todo

0afb3e8

hlydecker linked an issue Oct 9, 2020 that may be closed by this pull request

Convertor for dataset with images and segmentation mask #34

Closed

hlydecker added this to the October TCG milestone Oct 9, 2020

hlydecker added this to To Do in Current priorities via automation Oct 9, 2020

format fix

6eed412

jnothman reviewed Oct 12, 2020

work in progress second draft of a segmentation mask convertor

b66e4d6

Similar in concept to the original convertor. Bootstrapped from a blog post with changes. Still not completely functional, but parts of it will run and seem to behave how we want.

jnothman reviewed Oct 14, 2020

hlydecker and others added 8 commits November 17, 2020 13:49

broken stuff

01fe756

don't look

Partially tested segmentation mask conversion

5be2b7a

Merge branch 'master' into segmentation-mask-convertor

85c5f08

Minor fixes

0534e74

Better handling of no images

eec79ea

Add images for testing

ad7b45f

sort filenames for determinism

aea263c

Update tests for sorted paths, and fix issue with .DS_Store

e3bf583

jnothman marked this pull request as ready for review January 11, 2021 11:55

hlydecker commented Jan 12, 2021

jnothman and others added 4 commits January 20, 2021 18:01

Improve error message

6633aad

Co-authored-by: Henry Lydecker <henry.lydecker@gmail.com>

Improve warning message

4f731ba

Co-authored-by: Henry Lydecker <henry.lydecker@gmail.com>

DOC

bf9df18

Co-authored-by: Henry Lydecker <henry.lydecker@gmail.com>

Describe testing plan

5658e62

hlydecker commented Jan 25, 2021

jnothman merged commit 63b8af5 into master Jan 25, 2021

Current priorities automation moved this from To Do to Done Jan 25, 2021

jnothman deleted the segmentation-mask-convertor branch January 25, 2021 04:54
