Proposal to add terms to define spatial regions of interest within a media item #207

baskaufs · 2021-06-07T12:00:32Z

This proposal is the result of an extended discussion by the Maintenance Group about developing a system to define and demarcate portions of a media item. For details, see the meeting notes. For examples of how to use the new terms, see the Regions of Interest (ROI) Recipes document.

Proposed terms

Term name: ac:xFrac
Type: rdf:Property
Label: Fractional X
Definition: The horizontal position of a reference point, measured from the left side of the media item and expressed as a decimal fraction of the width of the media item.
Usage: A valid value MUST be greater than or equal to zero and less than or equal to one. The precision of this value SHOULD be great enough that when the ac:xFrac value is multiplied by the exif:PixelXDimension of the Best Quality variant of the Service Access point, rounding to the nearest integer results in the same horizontal pixel location originally used to define the point.
Notes: This point can serve as the horizontal position of the upper left corner of a bounding rectangle, or as the center of a circle.

Term name: ac:yFrac
Type: rdf:Property
Label: Fractional Y
Definition: The vertical position of a reference point, measured from the top of the media item and expressed as a decimal fraction of the height of the media item.
Usage: A valid value MUST be greater than or equal to zero and less than or equal to one. The precision of this value SHOULD be great enough that when the ac:yFrac value is multiplied by the exif:PixelYDimension of the Best Quality variant of the Service Access point, rounding to the nearest integer results in the same vertical pixel originally used to define the point.
Notes: This point can serve as the vertical position of the upper left corner of a bounding rectangle, or as the center of a circle.

Term name: ac:widthFrac
Type: rdf:Property
Label: Fractional Width
Definition: The width of the bounding rectangle, expressed as a decimal fraction of the width of the media item.
Usage: The sum of a valid value plus ac:xFrac MUST be greater than zero and less than or equal to one. The precision of this value SHOULD be great enough that when ac:widthFrac and ac:xFrac are used with the exif:PixelXDimension of the Best Quality variant of the Service Access point to calculate the lower right corner of the rectangle, rounding to the nearest integer results in the same horizontal pixel originally used to define the point. This term MUST NOT be used with ac:radius to define a region of interest.
Notes: Zero-sized bounding rectangles are not allowed. To designate a point, use the radius option with a zero value.

Term name: ac:heightFrac
Type: rdf:Property
Label: Fractional Height
Definition: The height of the bounding rectangle, expressed as a decimal fraction of the height of the media item.
Usage: The sum of a valid value plus ac:yFrac MUST be greater than zero and less than or equal to one. The precision of this value SHOULD be great enough that when ac:heightFrac and ac:yFrac are used with the exif:PixelYDimension of the Best Quality variant of the Service Access point to calculate the lower right corner of the rectangle, rounding to the nearest integer results in the same vertical pixel originally used to define the point. This term MUST NOT be used with ac:radius to define a region of interest.
Notes: Zero-sized bounding rectangles are not allowed. To designate a point, use the radius option with a zero value.

Term name: ac:radius
Type: rdf:Property
Label: Radius
Definition: The radius of a bounding circle or arc, expressed as a fraction of the width of the media item.
Usage: A valid value MUST be greater than or equal to zero. A valid value MAY cause the designated circle to extend beyond the bounds of the media item. In that case, the arc within the media item plus the bounds of the media item specify the region of interest. This term MUST NOT be used with ac:widthFrac or ac:heightFrac to define a region of interest.
Notes: This term may be used with ac:xFrac and ac:yFrac to define a point. In that case, the implication is that the point falls on some object of interest within the media item, but nothing more can be assumed about the bounds of that object.
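
The MUST-level constraints in the Usage statements above can be checked mechanically. The following is a minimal sketch, not part of the proposal: the function name `validate_roi` and its parameter names are hypothetical, and it checks only the literal Usage rules (it does not attempt to enforce the "zero-sized bounding rectangles" note).

```python
from typing import List, Optional


def validate_roi(x_frac: float, y_frac: float,
                 width_frac: Optional[float] = None,
                 height_frac: Optional[float] = None,
                 radius: Optional[float] = None) -> List[str]:
    """Return the violated Usage constraints (empty list if none).

    Hypothetical helper; the names are illustrative, not from Audubon Core.
    """
    problems = []
    # ac:xFrac and ac:yFrac must each lie in the closed interval [0, 1].
    if not 0 <= x_frac <= 1:
        problems.append("ac:xFrac must be >= 0 and <= 1")
    if not 0 <= y_frac <= 1:
        problems.append("ac:yFrac must be >= 0 and <= 1")
    # ac:radius must not be combined with ac:widthFrac or ac:heightFrac.
    if radius is not None and (width_frac is not None or height_frac is not None):
        problems.append("ac:radius must not be used with ac:widthFrac or ac:heightFrac")
    # The sum of the width (height) and the x (y) position must be in (0, 1].
    if width_frac is not None and not 0 < x_frac + width_frac <= 1:
        problems.append("ac:xFrac + ac:widthFrac must be > 0 and <= 1")
    if height_frac is not None and not 0 < y_frac + height_frac <= 1:
        problems.append("ac:yFrac + ac:heightFrac must be > 0 and <= 1")
    # ac:radius may be zero (a point) and may extend past the media item.
    if radius is not None and radius < 0:
        problems.append("ac:radius must be >= 0")
    return problems
```

For example, `validate_roi(0.5, 0.5, radius=0)` describes a point and reports no problems, while a rectangle whose right edge falls outside the media item (`x_frac=0.8, width_frac=0.3`) is flagged.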

Rationale

These terms are described using relative rather than absolute dimensions because a Region of Interest applies to all Service Access Points defined for an abstract media item. Specifying ROIs in absolute units (i.e., pixels) would add complexity, because each region would have to be attached to a specific representation. Using fractional proportions allows a region to be defined once for a media item while remaining applicable to multiple representations.

To determine the absolute position and bounds, multiply the relative values by the values of exif:PixelXDimension and exif:PixelYDimension for the particular Service Access Point.
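
As a minimal sketch of that multiplication (the function names `rect_to_pixels` and `circle_to_pixels` are hypothetical, and the pixel dimensions in the usage note are made-up example values), the same fractional ROI can be resolved against any Service Access Point:

```python
def rect_to_pixels(x_frac, y_frac, width_frac, height_frac,
                   pixel_x_dim, pixel_y_dim):
    """Return (left, top, right, bottom) in pixels for one Service Access Point.

    pixel_x_dim and pixel_y_dim stand for the exif:PixelXDimension and
    exif:PixelYDimension values of that Service Access Point.
    Note: Python's round() resolves exact .5 ties to the nearest even integer.
    """
    left = round(x_frac * pixel_x_dim)
    top = round(y_frac * pixel_y_dim)
    right = round((x_frac + width_frac) * pixel_x_dim)
    bottom = round((y_frac + height_frac) * pixel_y_dim)
    return left, top, right, bottom


def circle_to_pixels(x_frac, y_frac, radius, pixel_x_dim, pixel_y_dim):
    """Return (center_x, center_y, radius) in pixels.

    ac:radius is a fraction of the media item's width, so it is scaled by
    the horizontal dimension only.
    """
    return (round(x_frac * pixel_x_dim),
            round(y_frac * pixel_y_dim),
            round(radius * pixel_x_dim))
```

With a rectangle at `xFrac=0.25, yFrac=0.5, widthFrac=0.5, heightFrac=0.25`, a 4000 x 3000 variant yields corners (1000, 1500) and (3000, 2250), and an 800 x 600 variant yields (200, 300) and (600, 450), without the ROI itself being redefined.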

Although this proposal does not currently include terms for a third dimension (z), they could be added in the future to define three-dimensional ROIs.

afuchs1 · 2021-06-07T23:13:46Z

Hi Steve

Can I make a small suggestion: that the issue description say it relates to regions within a media item, to distinguish this from the usual use of "spatial" for the location where the media was taken?

eg. Proposal to add terms to define spatial regions of interest within a media item

cheers Anne

danstowell · 2021-06-09T20:53:46Z

I suggest defining somewhere which rounding mode is intended for "rounded" - I would assume rounded to the nearest integer (as opposed to rounded up, rounded down, or rounded towards zero).

baskaufs · 2021-06-10T14:26:54Z

@danstowell I revised the text to clarify this. See if it's better.

timrobertson100 · 2021-06-23T13:54:55Z

Super nit suggestion:

This term MUST NOT be used with ac:widthFrac and ac:heightFrac to define a region of interest

This term MUST NOT be used with ac:widthFrac or ac:heightFrac to define a region of interest

baskaufs · 2021-06-24T20:20:55Z

We love nit suggestions!

I was pondering whether it would be possible to use only one of ac:widthFrac or ac:heightFrac without the other. I guess we don't prohibit that, but I'm not sure that it would be meaningful since it wouldn't bound a box. So I think that's why it seemed to make sense to use "and" originally.

But I also don't think there is any harm in using "or", which would probably be simpler and cleaner.

baskaufs · 2021-07-20T20:36:03Z

Updated original proposal to include @timrobertson100's suggestion to use "or" instead of "and".

ben-norton · 2021-07-23T14:56:40Z

Machines will have a challenging time utilizing relative positioning. This may also be problematic for interoperability with GIS formats and processes. For example, this solution isn't interoperable with geotiffs, where a raster image has been georeferenced using absolute positioning. Computer vision processes bounding boxes using absolute positioning (x,y coordinates).
The problem is certainly a challenging one. I need to test it further, but I would have a hard time using relative positioning for any hard data processing such as GIS or Computer Vision. With that said, the alternative may be just as problematic. However, I can say that preservation of the original image that has been annotated with a region of interest is critically important for reuse. There's an incentive to preserve it.

baskaufs · 2021-07-23T17:47:01Z

Hi @ben-norton. Thanks for your thoughtful comments about the proposal and for taking the time to make them.

I just wanted to note that the primary purpose of Audubon Core is as a data exchange standard to facilitate discovery, evaluate fitness-for-use, and to lower the barrier to gathering and serving multimedia resources (see the Motivation and Rationale behind Audubon Core). As such, it doesn't prescribe how providers and consumers maintain their own databases (i.e. their own field names, data format, etc.). Thus there is not necessarily an assumption that an Audubon Core record could directly be produced or consumed by users without some processing to make it conform to the term names and structure specified by the standard.

Given that the transformation from relative to absolute coordinates involves a single multiplication or division, can you elaborate more about the problems you foresee in making the transformation between absolute and relative coordinates? The proposal does specify that the precision of the relative values should be great enough that the exact pixel values could be reconstructed for the highest resolution image available. Thus the process should not be lossy if this prescription is followed.
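
To illustrate the precision point, here is a minimal sketch, not taken from the proposal, of the round trip between pixel and fractional values. The helper names (`to_frac`, `to_pixel`, `min_decimal_places`) are hypothetical, and expressing precision as a number of decimal places is an assumption made for the example.

```python
def to_frac(pixel: int, pixel_dim: int, places: int) -> float:
    """Pixel position -> fraction of the dimension, kept to `places` decimals."""
    return round(pixel / pixel_dim, places)


def to_pixel(frac: float, pixel_dim: int) -> int:
    """Fraction -> pixel position, rounded to the nearest integer."""
    return round(frac * pixel_dim)


def min_decimal_places(pixel_dim: int, max_places: int = 12) -> int:
    """Smallest number of decimals at which every pixel position from 0 to
    pixel_dim survives the pixel -> fraction -> pixel round trip."""
    for places in range(1, max_places + 1):
        if all(to_pixel(to_frac(p, pixel_dim, places), pixel_dim) == p
               for p in range(pixel_dim + 1)):
            return places
    raise ValueError("no lossless precision found up to max_places")
```

Under these assumptions, a 4000-pixel dimension needs four decimal places: with only three, pixel 1 (fraction 0.00025) rounds to 0.0 and is reconstructed as pixel 0.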

tucotuco · 2021-08-09T21:35:15Z

> Machines will have a challenging time utilizing relative positioning. This may also be problematic for interoperability with GIS formats and processes. For example, this solution isn't interoperable with geotiffs, where a raster image has been georeferenced using absolute positioning. Computer vision processes bounding boxes using absolute positioning (x,y coordinates).
> The problem is certainly a challenging one. I need to test it further, but I would have a hard time using relative positioning for any hard data processing such as GIS or Computer Vision. With that said, the alternative may be just as problematic. However, I can say that preservation of the original image that has been annotated with a region of interest is critically important for reuse. There's an incentive to preserve it.

@ben-norton I am curious why you say this solution is not interoperable with GeoTIFFs, or why it would be expected to be. The information embedded in the GeoTIFF allows that image to be aligned properly with other layers in GIS. The RoI here isn't expected to be used to create a georeference (sensu Chapman & Wieczorek 2020) without someone creating a tool to do so, but that seems way out of scope anyway. Can you elaborate? And can you also indicate if your comments constitute an objection to adopting the terms?

ben-norton · 2021-08-10T16:33:44Z

> Machines will have a challenging time utilizing relative positioning. This may also be problematic for interoperability with GIS formats and processes. For example, this solution isn't interoperable with geotiffs, where a raster image has been georeferenced using absolute positioning. Computer vision processes bounding boxes using absolute positioning (x,y coordinates).
> The problem is certainly a challenging one. I need to test it further, but I would have a hard time using relative positioning for any hard data processing such as GIS or Computer Vision. With that said, the alternative may be just as problematic. However, I can say that preservation of the original image that has been annotated with a region of interest is critically important for reuse. There's an incentive to preserve it.
>
> @ben-norton I am curious why you say this solution is not interoperable with GeoTIFFs, or why it would be expected to be. The information embedded in the GeoTIFF allows that image to be aligned properly with other layers in GIS. The RoI here isn't expected to be used to create a georeference (sensu Chapman & Wieczorek 2020) without someone creating a tool to do so, but that seems way out of scope anyway. Can you elaborate? And can you also indicate if your comments constitute an objection to adopting the terms?

@tucotuco

My comments shouldn't be interpreted as an objection to the adoption. I joined this conversation at the last minute, which forfeits any standing to make a formal objection. I'm not going to derail the substantial amount of work and discussion that predates my participation.
I do have questions. Some of these are based on assumptions (see item 1) that should be clarified.
Under ideal circumstances, areas of interest on an image are defined by absolute positioning. This leaves no ambiguity or distortion. The problem is that absolute coordinates are only relevant within the original context. If you don't have access to the original image, absolute positions are no longer useful. In general, relative positions have a lower value than absolute ones. However, in workflows with a risk of losing the original image, relative positions may be a necessary compromise to prevent the worst-case scenario of data loss. Based on that logic alone, relative coordinates make sense. This is where my questions come in: situations where this compromise doesn't work. Your response and linked publication make it clear that georeferenced raster images are not an issue. Bounding boxes for computer vision may still be an issue. I don't think this is sufficient grounds to object, but the shortcomings of the proposed solution are worth noting.

tucotuco · 2021-08-10T20:21:25Z

I am curious, but not proposing, if a combination of absolute coordinates of original source and dimensions of original source would overcome the issues you have identified. In that alternate view, the quantities that are being proposed could be calculated for any resolution derivative that maintains the same limits of content (not cropped).
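
A minimal sketch of that alternate view, assuming a box is stored as absolute pixel values together with the pixel dimensions of the original source (the function names and the tuple layout are hypothetical, not proposed terms):

```python
def fractions_from_absolute(box_px, source_dims):
    """Derive the proposed fractional quantities from an absolute box.

    box_px is (x, y, width, height) in pixels of the original source;
    source_dims is (pixel_width, pixel_height) of that source.
    """
    x, y, w, h = box_px
    src_w, src_h = source_dims
    return x / src_w, y / src_h, w / src_w, h / src_h


def scale_box(box_px, source_dims, target_dims):
    """Re-express the box in pixels of an uncropped derivative of the source."""
    x_frac, y_frac, w_frac, h_frac = fractions_from_absolute(box_px, source_dims)
    tgt_w, tgt_h = target_dims
    return (round(x_frac * tgt_w), round(y_frac * tgt_h),
            round(w_frac * tgt_w), round(h_frac * tgt_h))
```

For instance, a box of (1000, 1500, 2000, 750) on a 4000 x 3000 original maps to (200, 300, 400, 150) on an 800 x 600 derivative. As noted in the comment, this only holds when the derivative is not cropped.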

baskaufs · 2021-08-11T18:51:53Z

I've been holding off on closing the comment period until we've determined that @ben-norton's comment didn't constitute an objection. It appears that it doesn't, so we'll move forward in the process.

I should also mention that this thread prompted the AC Maintenance Group to discuss further how absolute coordinate terms could be accommodated if there were sufficient demand for them. I put together a document that looks at how absolute coordinates might fit into the picture if we had terms for them. The complication comes from the Audubon Core model, which differentiates between an abstract media item and the service access points that can represent size variants of it. Without that complication, the situation is simple (Strategy 2 in the document, basically what @tucotuco was talking about), but with it the situation gets messier (Strategies 3 and 4).

Anyway, we have that document and this discussion on the shelf for future reference if we come back to the issue of absolute coordinates.

baskaufs · 2021-08-11T19:32:55Z

Updated proposal to fix incorrect capitalization of rdf:Property

baskaufs · 2021-08-11T19:50:24Z

Updated proposal to correct an error in the Usage of ac:heightFrac:

The sum of a valid value plus ac:xFrac MUST be greater than zero and less than or equal to one...

changed to

The sum of a valid value plus ac:yFrac MUST be greater than zero and less than or equal to one...

ben-norton · 2021-08-12T17:20:41Z

> I am curious, but not proposing, if a combination of absolute coordinates of original source and dimensions of original source would overcome the issues you have identified. In that alternate view, the quantities that are being proposed could be calculated for any resolution derivative that maintains the same limits of content (not cropped).

I realize that comments are closed, but just for the record:
@tucotuco Yes. An alternative to the original dimensions field is an absolute reference to the original image. Although less likely than a resize due to basic mechanics, cropping an image would render both absolute and relative positioning obsolete. An absolute persistent reference to the original image prevents this from occurring.

baskaufs · 2021-10-07T20:47:20Z

Ratified by the Executive on 2021-10-05 and implemented in tdwg/rs.tdwg.org#79 and #212

baskaufs added the term proposal label Jun 7, 2021

baskaufs added this to the Region of Interest proposals milestone Jun 7, 2021

baskaufs changed the title ~~Proposal to add terms to define spatial regions of interest~~ Proposal to add terms to define spatial regions of interest within a media item Jun 8, 2021

baskaufs mentioned this issue Aug 11, 2021

Proposal to add the terms ac:RegionOfInterest and related terms #206

Closed

baskaufs closed this as completed Oct 7, 2021
