New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrency issues in OpenGL renderer prevent proper garbage collection #3384

Closed
neilcsmith-net opened this Issue Jun 14, 2015 · 8 comments

Comments

Projects
None yet
3 participants
@neilcsmith-net

neilcsmith-net commented Jun 14, 2015

I'm currently trying to track down an occasional issue in Praxis LIVE when PGraphicsOpenGL and/or PShader instances are disposed of. Every now and again this breaks rendering entirely. That this isn't consistently repeatable would point to a concurrency issue.

A possible cause might be the finalize*() methods in PGraphicsOpenGL. The comments correctly recognise that finalization happens on a separate thread, so the methods are synchronized. However, none of the code on the GL thread appears to take that lock at any point before touching those collections!

Another issue is that using GC / finalize() to manage native resources in the first place is a big Java anti-pattern, for good reason - there is no guarantee that GL resources will be released in a timely manner.

First (and easiest) solution would be to enforce manual disposal. This would require an API change, although for many users only creating GL resources in setup() there would be no change.

The alternative might be changing the GLResource class to extend WeakReference, and remove all finalize() methods. This still doesn't handle the second problem, but would help get rid of concurrency issues. Java2D uses a similar approach.

ie. something like

static protected Set<GLResource<PShader>> glslPrograms = new HashSet<>();
// ...
protected static void deleteFinalizedGLSLProgramObjects(PGL pgl) {
  Iterator<GLResource> iter = glslPrograms.iterator();
  while (iter.hasNext()) {
    GLResource<PShader> res = iter.next();
    if (r.get() == null) {
      pgl.deleteProgram(res.id);
      iter.remove();
    }
  }
}
// ...
protected static class GLResource<T> extends WeakReference<T> {

  GLResource(T obj, int id, int content) {
// ...

@codeanticode codeanticode self-assigned this Jun 15, 2015

@codeanticode codeanticode added the opengl label Jun 15, 2015

@codeanticode codeanticode added this to the 3.0 beta 1 milestone Jun 15, 2015

@codeanticode

This comment has been minimized.

Show comment
Hide comment
@codeanticode

codeanticode Jun 18, 2015

Member

Not sure how the resource disposal is causing the concurrency errors you are observing. It would be very useful to have a simple sketch that reproduces the issue, even if it is only some of the time, or at least a log of the errors. The deleteFinalize*() methods do check that the GL thread is current before before deleting the resource, but not sure if this is related to your error without seeing any code or log.

As for manual resource disposal, PGraphics does have a public dispose() method, which in PGraphicsOpenGL marks all GL resources as finalized, and then deletes them. Similarly, PShader has a dispose() method that releases the resources used by the shader. It is currently marked as protected, but would make sense to make it public, at least this would make it consistent: the surface one renders to (PGraphics) and the shader making the rendering (PShader) can both be manually disposed (@benfry any thought about this?)

On the other hand, the fields holding the GL IDs of all the resources in PShader (glProgram, glVertex, glFragment) are public, so you can use the GL functions in PGL to delete them.

Member

codeanticode commented Jun 18, 2015

Not sure how the resource disposal is causing the concurrency errors you are observing. It would be very useful to have a simple sketch that reproduces the issue, even if it is only some of the time, or at least a log of the errors. The deleteFinalize*() methods do check that the GL thread is current before before deleting the resource, but not sure if this is related to your error without seeing any code or log.

As for manual resource disposal, PGraphics does have a public dispose() method, which in PGraphicsOpenGL marks all GL resources as finalized, and then deletes them. Similarly, PShader has a dispose() method that releases the resources used by the shader. It is currently marked as protected, but would make sense to make it public, at least this would make it consistent: the surface one renders to (PGraphics) and the shader making the rendering (PShader) can both be manually disposed (@benfry any thought about this?)

On the other hand, the fields holding the GL IDs of all the resources in PShader (glProgram, glVertex, glFragment) are public, so you can use the GL functions in PGL to delete them.

@benfry

This comment has been minimized.

Show comment
Hide comment
@benfry

benfry Jun 18, 2015

Member

The dispose() is a PGraphics method since it's necessary to shut down the renderer safely (PDF and OpenGL in particular), and it's handled by the PSurface.

Generally speaking, we should (and do) bend over backwards to not put the burden on users to call dispose() methods. It's a bad thing to add now, and on our side, it can lead to lazy implementations rather than fixing issues that we should be sorting out ourselves.

Member

benfry commented Jun 18, 2015

The dispose() is a PGraphics method since it's necessary to shut down the renderer safely (PDF and OpenGL in particular), and it's handled by the PSurface.

Generally speaking, we should (and do) bend over backwards to not put the burden on users to call dispose() methods. It's a bad thing to add now, and on our side, it can lead to lazy implementations rather than fixing issues that we should be sorting out ourselves.

@neilcsmith-net

This comment has been minimized.

Show comment
Hide comment
@neilcsmith-net

neilcsmith-net Jun 18, 2015

On 18 June 2015 at 19:03, codeanticode notifications@github.com wrote:

Not sure how the resource disposal is causing the concurrency errors you
are observing. It would be very useful to have a simple sketch that
reproduces the issue, even if it is only some of the time, or at least a
log of the error.

Unfortunately, as with most concurrency issues, it's not generally
reproducible. It's an occasional error - I'll work on getting an accurate
log for you if possible. However, in tracing through the operations in the
code that happen when the bug is triggered, there's an obvious race
condition in finalization.

Just a quick note on how GL resources get deleted: the deleteFinalized*()

methods are always called when a new resource of the same kind is created,
so this ensures that, at least for those objects that were finalized by the
GC, they are released by the GL driver

The issue is occurring at a time when some resources are being replaced -
some created, some disposed. None of the deleteFinalized*() methods called
on the GL thread seem to lock on anything before manipulating the
collections that are being written to by the finalization thread. It's
possible that the collection is being corrupted in some fashion.

You could keep this all single threaded anyway by using WeakReferences and
removing any need for finalization - this may improve GC too.

PGraphics does have a public dispose() method, which in PGraphicsOpenGL
marks all GL resources as finalized, and then deletes them.

PShader doesn't though, for example. The "correct" way to fix this would
be to force people to use dispose() on PShader, etc. and remove finalize()
entirely - a memory leak is possibly better than memory corruption? I'm a
little wary that finalization() here is more akin to "do magic" - anyone
getting that advanced should understand the underlying hardware and
implications.

neilcsmith-net commented Jun 18, 2015

On 18 June 2015 at 19:03, codeanticode notifications@github.com wrote:

Not sure how the resource disposal is causing the concurrency errors you
are observing. It would be very useful to have a simple sketch that
reproduces the issue, even if it is only some of the time, or at least a
log of the error.

Unfortunately, as with most concurrency issues, it's not generally
reproducible. It's an occasional error - I'll work on getting an accurate
log for you if possible. However, in tracing through the operations in the
code that happen when the bug is triggered, there's an obvious race
condition in finalization.

Just a quick note on how GL resources get deleted: the deleteFinalized*()

methods are always called when a new resource of the same kind is created,
so this ensures that, at least for those objects that were finalized by the
GC, they are released by the GL driver

The issue is occurring at a time when some resources are being replaced -
some created, some disposed. None of the deleteFinalized*() methods called
on the GL thread seem to lock on anything before manipulating the
collections that are being written to by the finalization thread. It's
possible that the collection is being corrupted in some fashion.

You could keep this all single threaded anyway by using WeakReferences and
removing any need for finalization - this may improve GC too.

PGraphics does have a public dispose() method, which in PGraphicsOpenGL
marks all GL resources as finalized, and then deletes them.

PShader doesn't though, for example. The "correct" way to fix this would
be to force people to use dispose() on PShader, etc. and remove finalize()
entirely - a memory leak is possibly better than memory corruption? I'm a
little wary that finalization() here is more akin to "do magic" - anyone
getting that advanced should understand the underlying hardware and
implications.

@benfry

This comment has been minimized.

Show comment
Hide comment
@benfry

benfry Jun 18, 2015

Member

No, you're not hearing me: dispose() methods have no place for our target audience.

Member

benfry commented Jun 18, 2015

No, you're not hearing me: dispose() methods have no place for our target audience.

@neilcsmith-net

This comment has been minimized.

Show comment
Hide comment
@neilcsmith-net

neilcsmith-net Jun 18, 2015

No, you're not hearing me:

@benfry That would have been difficult! :-) Sorry, problem of replying via email - hadn't even seen your post when I replied to Andres.

I quite understand your viewpoint, hence suggesting WeakReference as an alternative as well. There is a minor issue that it's impossible to know when the Java object will be GC'd and release the GL resources as there's no pressure on Java memory. This is related but tangential to the concurrency issue.

neilcsmith-net commented Jun 18, 2015

No, you're not hearing me:

@benfry That would have been difficult! :-) Sorry, problem of replying via email - hadn't even seen your post when I replied to Andres.

I quite understand your viewpoint, hence suggesting WeakReference as an alternative as well. There is a minor issue that it's impossible to know when the Java object will be GC'd and release the GL resources as there's no pressure on Java memory. This is related but tangential to the concurrency issue.

@codeanticode

This comment has been minimized.

Show comment
Hide comment
@codeanticode

codeanticode Jun 21, 2015

Member

Ok, I found a better way to do resource disposal without relying on finalization, see the section "An Alternative to Finalization" in this article. Implementing it now, should solve the issues you have described since the glDelete*() functions are always called in the animation thread, and relies on weak references.

Member

codeanticode commented Jun 21, 2015

Ok, I found a better way to do resource disposal without relying on finalization, see the section "An Alternative to Finalization" in this article. Implementing it now, should solve the issues you have described since the glDelete*() functions are always called in the animation thread, and relies on weak references.

@neilcsmith-net

This comment has been minimized.

Show comment
Hide comment
@neilcsmith-net

neilcsmith-net Jun 21, 2015

Great! I meant to share a link to that article with you. ;-) It's pretty much the suggestion at the top - the places I use that pattern I'm tending to use the "explicit" get() check rather than the "implicit" queue check - either would suit though.

Still think that making dispose() public (even if non-official API) would be a good move - GC is just not deterministic.

This should solve the concurrency issue. Whether this is the cause of the problem that caused me to hunt for race conditions is another matter! :-\

neilcsmith-net commented Jun 21, 2015

Great! I meant to share a link to that article with you. ;-) It's pretty much the suggestion at the top - the places I use that pattern I'm tending to use the "explicit" get() check rather than the "implicit" queue check - either would suit though.

Still think that making dispose() public (even if non-official API) would be a good move - GC is just not deterministic.

This should solve the concurrency issue. Whether this is the cause of the problem that caused me to hunt for race conditions is another matter! :-\

@codeanticode

This comment has been minimized.

Show comment
Hide comment

@benfry benfry changed the title from Concurrency issues in OpenGL renderer to Concurrency issues in OpenGL renderer prevent proper garbage collection Sep 4, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment