New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Go-style return values for gocv functions that have result Mat #137

Closed
willdurand opened this Issue Mar 13, 2018 · 30 comments

Comments

Projects
None yet
5 participants
@willdurand
Contributor

willdurand commented Mar 13, 2018

When reviewing #135, I was like "wait, why is this test actually passing?". The test was about flipping a Mat into a brand new Mat, both passed to a function that looks roughly like this:

func Flip(src Mat, dst Mat, flipCode int) {}

In Go, parameters passed by value (default like above) should not be modified (and in any case, it will never change the value in the parent block). In gocv, this would works though because we manipulate a C pointer under the hood but this is a bit misleading.

Question is: could we agree on a "convention" stating that all arguments modified should be passed by pointer?

If we take the Flip() function as an example:

src := ...
dst := gocv.NewMat()

// wait, how does `dst` can be updated? No return-value either? Hmmm.
gocv.Flip(src, dst, 0);

// versus

// oh right so `Flip` takes a `src` Mat and puts the result in `dst`. Lovely <3
gocv.Flip(src, &dst, 0);
@berak

This comment has been minimized.

Contributor

berak commented Mar 13, 2018

imho, somehow, you've hit a spot on the general concept here.

the gocv wrappers all go like this(pun intended):

type someclass {
     p pointer_to_some_CGO_wrapper_around_some_cpp_implementation
}
func( src1, src2, dst someclass) {
     (do some op on dst.p, but it won't get reassigned !)
}

so, actually, whatever dst.p points to, will get changed, neither the go instance, nor dst.p.
(and taking the address of the go one, won't mend it.
hard to explain, hard to propagate that kind of knowledge to actual users (that would be my problem with it))

somehow, returning dst, instead of taking it by a (invisible) reference, like it is done in python, would be a better idea. (too late, too change it for all of the core bindings ?)

imho, it's already a problem in the underlying opencv c++ code, for further mindbending fun, have a look here, it's turtles, all the way down

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 13, 2018

somehow, returning dst, instead of taking it by a (invisible) reference, like it is done in python, would be a better idea. (too late, too change it for all of the core bindings ?)

I think returning values would be indeed better.

@berak

This comment has been minimized.

Contributor

berak commented Mar 13, 2018

right, yet it still works, like it is, now (also it is close to the c++ version, which the current docmentation heavily relies on)

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 16, 2018

There was a discussion about this at some point on slack or something like that. The question is, should we be more like the C++ API (the current way) or more like the Python API (return a new Mat in function that has a 'dest' that the dev needs to close).

I am still not pulled either way with great conviction, the C++ style seemed the best at the time, but we could decide to change the interface. Seems like doing so sooner would be best if we are going to do so.

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 19, 2018

As I recall it was @221bytes that I was discussing this with on the Golang Slack channel.

In any case, we can decide to follow one of 2 approaches here:

  • continue using the dst Mat assigned pattern as currently implemented. It does make it more like the C++ code, so the C++ docs would match up, but is a little different than the typical Golang code.
  • switch to returning a new gocv.Mat for any returned values, more like how the Python code handles. It makes it a little less of a match to the C++ docs, but is certainly a little more like what go programmers expect.

I don't like to break the API, but this conversation keeps coming up for good reason. Seems like it would be better to be a little more idiomatic for Go.

To use the example above:

src := ...

// dst is now return-value is all cases that there is a Mat that is modified.
dst := gocv.Flip(src, 0);

There is, however, a use case that is handled by OpenCV's C++ interface that this would not accommodate, which is when the src and dst are the same, what openCV calls "inplace". There are a number of OpenCV functions which allow this, for example dilate. This is rather inconsistent, as only some OpenCV functions support this.

I am starting to get the feeling we should probably just ignore the inplace return assignment option and always return new Mat values from any functions that have an output. This would eliminate an entire category for potential confusion for the consumers of this package.

Ideally, I would like to hear from a few more community members before revamping the current API. But so far everyone has been in favor of changing to a more idiomatic Go approach of returning new values.

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 19, 2018

continue using the dst Mat assigned pattern as currently implemented. It does make it more like the C++ code, so the C++ docs would match up, but is a little different than the typical Golang code.

I'd be OK with this if we could add pointers for arguments that are modified.

I don't like to break the API, but this conversation keeps coming up for good reason. Seems like it would be better to be a little more idiomatic for Go.

The good thing is 1.0.0 has not been released yet, so we can still BC break for good reasons.

I am starting to get the feeling we should probably just ignore the inplace return assignment option and always return new Mat values from any functions that have an output. This would eliminate an entire category for potential confusion for the consumers of this package.

I like this proposal a lot! I believe a package like gocv should provide a uniform/Go-oriented API on top of opencv. The current package is already good but this suggestion would make it even better.

(my 2cents)

@deadprogram deadprogram changed the title from Pass mutable parameters by pointer to Use Go-style return values for gocv functions that have result Mat Mar 19, 2018

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 19, 2018

I changed the title of this to match the change I am proposing. @berak @221bytes anyone else have any thoughts on this change?

@221bytes

This comment has been minimized.

Contributor

221bytes commented Mar 19, 2018

I remember this talk, you said it was easier to stick to the C++ API for documentation and ease of implementation. Indeed, as a total newbie, it was for me easy to contribute (very small contribution though) to the project.
Except the reason, we wanna do it the gopher way, what are the other pros of refactoring now?

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 19, 2018

Except the reason, we wanna do it the gopher way, what are the other pros of refactoring now?

see my initial issue description, mutation happens by accident.

@221bytes

This comment has been minimized.

Contributor

221bytes commented Mar 19, 2018

By accident ? I can't understand what you mean sorry

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 19, 2018

By accident ? I can't understand what you mean sorry

Go passes variables by value (copy) not by reference to functions. In Gocv, we should not expect dst to be modified after having called gocv.Flip(src, dst, 0) since dst is passed by value. Yet, it works because Mat only attribute is a C pointer. This is very misleading.

@221bytes

This comment has been minimized.

Contributor

221bytes commented Mar 19, 2018

Hum, the name his dest and every functions as a nice comment associated with it. I can't see how someone can be misguided to be honest. I am not a super user and it was clear for me. Refactoring all the library only for this purpose sounds a bit overkill to me.
Ron took this decision at the beginning and it's working for us. Sticking to the c++ api is nice for documenting and for moving code from c++ to golang so for me we should keep the way it.
Please remember that I was a supporter of a more golang style at first ^^

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 19, 2018

Here is a basic example that shows that any approach has its perils.

The current approach reuses the same memory allocated for the Mat without the Golang developer having to worry about releasing the OpenCV Mat object underneath:

webcam, err := gocv.VideoCaptureDevice(int(deviceID))
img := gocv.NewMat()

for {
    ok := webcam.Read(img);
}

My proposed approach would require a release of the OpenCV Mat object underneath at the end of each loop iteration, or risk running our of memory:

webcam, err := gocv.VideoCaptureDevice(int(deviceID))

for {
    img, ok := webcam.Read()
    ...
    img.Close()
}

I think the proposal being made by @willdurand is to do this:

webcam, err := gocv.VideoCaptureDevice(int(deviceID))
img := gocv.NewMat()

for {
    ok := webcam.Read(&img);
}

This would still have the implementation just using the C pointer underneath as it is now, but makes code look more like Go programmers expect it to work, if I have it correct.

We certainly cannot avoid the reality that, as Rob Pike put it "CGo is not Go". We can certainly do our best to be as Go-like as possible.

More thoughts/feelings/opinions from the world-at-large?

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 20, 2018

I'd love to hear your opinions on this @berak @maruel @drnic or anyone else!

@maruel

This comment has been minimized.

Contributor

maruel commented Mar 20, 2018

Hi! Thanks for the call for opinions. I have opinions! :)

Since I was paged in, I'll state my opinion. On the other hand I haven't used the API in any significant way so I'm probably not the best person to weight in, so I'll state in from the prospect of an API designer, which I happen to have experience in.

I use two goals when designing an API:

  1. Reading a snippet of code should make things obvious to an unfamiliar reader, without having to resort to documentation.
  2. It shall be designed for one of speed, simplicity of use or consistency with an external API. Know what you optimize for first.

About 1), nobody reads the doc unless they get stuck, you don't want users to get stuck. They'll prefer to copy paste a snippet and start from there. Having an input being mutated in Go does happen in some cases, io.Reader is a notorious example as @deadprogram noted above. In practice, it is important to note that the method Read() accepts a slice, not an struct instance. What's confusing here is that Mat is an struct instance, not an interface. Take as an example io.Copy(), what would be io.Copy(&bytes.Buffer{}, ...) is simply Read(gocv.Mat{}), a pass by value instead of a pointer. I think it's the primary source of confusion.

About 2), it really depends. People in the thread, including Ron himself, noted the value of sticking as close as possible to the C API for documentation purposes. If you want to aim for speed, you want to reduce heap allocations, so you want to accept outputs as arguments, instead of returning them (like io.Reader). If you aim for simplicity, returning a new gocov.Mat instance is likely the most idiomatic way. (I personally value speed, otherwise why using cgo at all?)

Then there's the trade off of breaking the API. In my projects, I "fixed" it by immediately releasing major version 1 and being quite open to releasing new major versions, around once per year or so. I try to batch the breaking API changes into one short amount of time (a ~3 months window), and each breakage is done for a real purpose that users can understand why.

I'm in the camp that trying to be "as close to the C API" when writing a C API wrapper in Go is counter-productive, but the cost of breaking current gocv users may not make doing structural changes like the one I describe below be worth it.

I'll still give my opinion since I was asked about it but feel free to dump it to /dev/null. :)

Examples of things I'd change:

  • I'd make a MatFloat64, MatFloat32, MatInt32, MatUint16 (and whatever other types) so that the methods GetXXX() would not be a mess. I'm using image.Image.At() as an example here; https://golang.org/pkg/image/#Image. Then, Mat could be an interface! You get strong typing for free. As a matter of fact, it would enable you to make the MatXX to implement https://golang.org/pkg/image/draw/#Drawer, which then would make it possible to decode a jpeg into a gocv.Mat or things like that.
  • Make most functions that "returns" a Mat be methods of Mat, so that AbsDiff(src1, src2, dst Mat) would become a method Mat.AbsDiff(src1, src2 Mat). Same for stuff like PutText(). That would make it really more obvious IMHO. Right now https://godoc.org/gocv.io/x/gocv is a mess (sorry) and bucketting the functions into methods would make the API tremendously more readable. Example: https://golang.org/pkg/time/
  • I'd change bool return values to be error, in particular Read().
  • I'd move UI stuff (Window) into a subpackage gocvui to make it clear it's for ui, not algorithm.
  • Same for I/O (IMRead/IMWrite/VideoCapture) into gocvio.
  • Remove the implicit dependency on the C API documentation. Make yours be self-standing and be free! :)

Then I think the API would be much more idiomatic, more welcoming to new users by being more focused, at the cost of being less C-like.

Is it worth? Is it desirable? I can't make the call but I it's worth taking a deep look and experimenting on this.

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 20, 2018

Thanks for the comments @maruel however some of these items are counter the design of not trying to re-architect the internals of OpenCV itself. For example, the UI stuff is in the OpenCV highgui module in the main package, so as to allow people to find where to add things within GoCV. I think we discussed this via email early on.

In any case, lots of ideas in your thoughtful response. So there is a possible new option. To restate all the possibilities:

Current code:

dst := gocv.NewMat()
gocv.Flip(src, dst, 0);

Pass dst as a pointer to avoid "surprise" to Go programmers:

dst := gocv.NewMat()
gocv.Flip(src, &dst, 0);

Return a new Mat as dst:

dst := gocv.Flip(src, 0);

Change functions that mutate dst into methods off of the Mat they change:

dst := gocv.NewMat()
dst.Flip(src, 0);
@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 20, 2018

Another use case to consider is when using any algorithm's Apply() function.

Current code

src := ...
dst := gocv.NewMat()

mog2 := gocv.NewBackgroundSubtractorMOG2()
mog2.Apply(src, dst)

Pass dst as a pointer to avoid "surprise" to Go programmers

src := ...
dst := gocv.NewMat()

mog2 := gocv.NewBackgroundSubtractorMOG2()
mog2.Apply(src, &dst)

Return a new Mat as dst

src := ...

mog2 := gocv.NewBackgroundSubtractorMOG2()
dst := mog2.Apply(src)

Change functions that mutate dst into methods off of the Mat they change:

// not sure how to use this pattern here...

Seems like the mismatch with option 4 in this use case is problematic.

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 20, 2018

At the moment, I'm starting to get the feeling that the @willdurand original suggestion of passing Mat* to indicate output param might be the best balance between Go-like idiom (anything passed by pointer is subject to mutation) and performance (we do not need to allocate/de-allocate extra cv::Mat objects within loops).

This would also probably be the smallest change to the API needed for consumers of the package (adding & in front of any dst params) vs. the other proposed changes.

@maruel

This comment has been minimized.

Contributor

maruel commented Mar 20, 2018

Makes sense given the trade offs.

(Only for completeness)

The form would look like:

Change functions that mutate dst into methods off of the Mat they change:

dst := gocv.NewMat()
mog2 := gocv.NewBackgroundSubtractorMOG2()
dst.Apply(mog2)
@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 20, 2018

Actually I think that would be

Change functions that mutate dst into methods off of the Mat they change:

src := ...
dst := gocv.NewMat()
mog2 := gocv.NewBackgroundSubtractorMOG2()
dst.Apply(mog2, src)
@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 20, 2018

This is my WIP branch implementing the Mat* approach:

https://github.com/hybridgroup/gocv/tree/feature/mat-pointer

@maruel

This comment has been minimized.

Contributor

maruel commented Mar 21, 2018

I think it definitely helps, especially for the cases where dst and src are reversed.

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 21, 2018

Thanks for working on this, let me know if you need help, I have some spare time for this lib.

I skipped @maruel's comment about documentation, I also think that it would be better for end users to have the documentation as part of the godoc, not links to opencv documentation. What if the urls change for some reason?

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 21, 2018

The OpenCV docs have hashes that do not change between versions, and now that we adopted earlier @berak suggestion as discussed in #109 to just link to master branch version, the links will always be the latest.

I like linking to the OpenCV docs as it gives us the full benefit of the OpenCV community's docs. But I do not like requiring devs to look at both places to do anything useful.

We ideally should have "good enuf'" GoDocs that do not require always going to the OpenCV docs. That will require a lot more effort on our parts to do so but for sure will be worth it. To achieve this we need to include a little more info for each function, what would the ideal function look like?

Regarding the @maruel comment on moving things to be methods of Mat. The problem is that since nearly everything uses Mat it would not really help much as far as "unflattening" the docs. I do not want to use sub-packages that do not match the OpenCV modules layout for reasons already expressed. Hence our options are limited on that.

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 21, 2018

Roger that, I forgot about the links.

I do not want to use sub-packages that do not match the OpenCV modules layout for reasons already expressed. Hence our options are limited on that.

The highgui stuff could be put in a sub-package, maybe? It is part of the opencv "highgui" module. WDYT?

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 21, 2018

In addition to helping newcomers to this project but who know OpenCV, or people who are looking to add some specific unimplemented functionality, the current structure mapping directly to OpenCV's modules works well. Are those the designs patterns we might otherwwise choose? IDK but OpenCV is the target, so we want to reduce any cognitive load needed to find things.

Also, CGo does not work well with subpackages in all my previous experiments.

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 21, 2018

I am very close to merging the https://github.com/hybridgroup/gocv/tree/feature/mat-pointer branch into the dev branch, and committing to this style of API for the 1.x GoCV.

Any last comments/feedback/critiques/requests before this becomes "official"?

@willdurand

This comment has been minimized.

Contributor

willdurand commented Mar 21, 2018

do you have a PR opened?

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 21, 2018

Here it is #142

@deadprogram

This comment has been minimized.

Member

deadprogram commented Mar 26, 2018

This has now been released as part of v0.11.0 thanks again everyone! Now closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment