
OpenGL Renderer: Before rendering, determine the list of clipped polygons, and then only render the clipped polygons, just like how SoftRasterizer does it. Most 3D games will see a significant performance improvement. For certain games with very high polygon count scenes, those games will see a massive performance boost.
rogerman committed Jan 22, 2019
1 parent bb93a0a commit 4cd19ce52276ca6d2e193605bdb1d90c6b3f287d
@@ -1264,12 +1264,12 @@ OpenGLRenderer::OpenGLRenderer()

OpenGLRenderer::~OpenGLRenderer()
{
-free_aligned(_framebufferColor);
-free_aligned(_workingTextureUnpackBuffer);
+free_aligned(this->_framebufferColor);
+free_aligned(this->_workingTextureUnpackBuffer);

// Destroy OpenGL rendering states
-delete ref;
-ref = NULL;
+delete this->ref;
+this->ref = NULL;
}

bool OpenGLRenderer::IsExtensionPresent(const std::set<std::string> *oglExtensionSet, const std::string extensionName) const
@@ -1833,9 +1833,9 @@ size_t OpenGLRenderer::DrawPolygonsForIndexRange(const POLYLIST *polyList, const
{
OGLRenderRef &OGLRef = *this->ref;

-if (lastIndex > (polyList->count - 1))
+if (lastIndex > (this->_clippedPolyCount - 1))
{
-lastIndex = polyList->count - 1;
+lastIndex = this->_clippedPolyCount - 1;
}

if (firstIndex > lastIndex)
@@ -1860,7 +1860,7 @@ size_t OpenGLRenderer::DrawPolygonsForIndexRange(const POLYLIST *polyList, const
};

// Set up the initial polygon
-const POLY &initialPoly = polyList->list[indexList->list[firstIndex]];
+const POLY &initialPoly = *this->_clipper.GetClippedPolyByIndex(firstIndex).poly;
TEXIMAGE_PARAM lastTexParams = initialPoly.texParam;
u32 lastTexPalette = initialPoly.texPalette;
u32 lastViewport = initialPoly.viewport;
@@ -1874,7 +1874,7 @@ size_t OpenGLRenderer::DrawPolygonsForIndexRange(const POLYLIST *polyList, const

for (size_t i = firstIndex; i <= lastIndex; i++)
{
-const POLY &thePoly = polyList->list[indexList->list[i]];
+const POLY &thePoly = *this->_clipper.GetClippedPolyByIndex(i).poly;

// Set up the polygon if it changed
if (lastPolyAttr.value != thePoly.attribute.value)
@@ -1914,7 +1914,7 @@ size_t OpenGLRenderer::DrawPolygonsForIndexRange(const POLYLIST *polyList, const
// the same and we're not drawing a line loop or line strip.
if (i+1 <= lastIndex)
{
-const POLY &nextPoly = polyList->list[indexList->list[i+1]];
+const POLY &nextPoly = *this->_clipper.GetClippedPolyByIndex(i+1).poly;

if (lastPolyAttr.value == nextPoly.attribute.value &&
lastTexParams.value == nextPoly.texParam.value &&
@@ -4051,7 +4051,7 @@ Render3DError OpenGLRenderer_1_2::ZeroDstAlphaPass(const POLYLIST *polyList, con
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_FALSE);
glStencilFunc(GL_NOTEQUAL, 0x40, 0x40);

-this->DrawPolygonsForIndexRange<OGLPolyDrawMode_ZeroAlphaPass>(polyList, indexList, polyList->opaqueCount, polyList->count - 1, indexOffset, lastPolyAttr);
+this->DrawPolygonsForIndexRange<OGLPolyDrawMode_ZeroAlphaPass>(polyList, indexList, this->_clippedPolyOpaqueCount, this->_clippedPolyCount - 1, indexOffset, lastPolyAttr);

// Restore OpenGL states back to normal.
this->_geometryProgramFlags = oldGProgramFlags;
@@ -4255,10 +4255,14 @@ Render3DError OpenGLRenderer_1_2::BeginRender(const GFX3D &engine)
OGLRef.vtxPtrColor = (this->isShaderSupported) ? (GLvoid *)&engine.vertList[0].color : OGLRef.color4fBuffer;
}

+// Generate the clipped polygon list.
+this->_PerformClipping<ClipperMode_DetermineClipOnly>(engine.vertList, engine.polylist, &engine.indexlist);

this->_renderNeedsDepthEqualsTest = false;
-for (size_t i = 0, vertIndexCount = 0; i < engine.polylist->count; i++)
+for (size_t i = 0, vertIndexCount = 0; i < this->_clippedPolyCount; i++)
{
-const POLY &thePoly = engine.polylist->list[engine.indexlist.list[i]];
+const POLY &thePoly = *this->_clipper.GetClippedPolyByIndex(i).poly;

const size_t polyType = thePoly.type;
const VERT vert[4] = {
engine.vertList[thePoly.vertIndexes[0]],
@@ -4428,7 +4432,7 @@ Render3DError OpenGLRenderer_1_2::BeginRender(const GFX3D &engine)

Render3DError OpenGLRenderer_1_2::RenderGeometry(const GFX3D_State &renderState, const POLYLIST *polyList, const INDEXLIST *indexList)
{
-if (polyList->count > 0)
+if (this->_clippedPolyCount > 0)
{
glEnable(GL_DEPTH_TEST);
glEnable(GL_STENCIL_TEST);
@@ -4448,29 +4452,29 @@ Render3DError OpenGLRenderer_1_2::RenderGeometry(const GFX3D_State &renderState,

size_t indexOffset = 0;

-const POLY &firstPoly = polyList->list[indexList->list[0]];
+const POLY &firstPoly = *this->_clipper.GetClippedPolyByIndex(0).poly;
POLYGON_ATTR lastPolyAttr = firstPoly.attribute;

-if (polyList->opaqueCount > 0)
+if (this->_clippedPolyOpaqueCount > 0)
{
this->SetupPolygon(firstPoly, false, true);
-this->DrawPolygonsForIndexRange<OGLPolyDrawMode_DrawOpaquePolys>(polyList, indexList, 0, polyList->opaqueCount - 1, indexOffset, lastPolyAttr);
+this->DrawPolygonsForIndexRange<OGLPolyDrawMode_DrawOpaquePolys>(polyList, indexList, 0, this->_clippedPolyOpaqueCount - 1, indexOffset, lastPolyAttr);
}

-if (polyList->opaqueCount < polyList->count)
+if (this->_clippedPolyOpaqueCount < this->_clippedPolyCount)
{
if (this->_needsZeroDstAlphaPass && this->_emulateSpecialZeroAlphaBlending)
{
-if (polyList->opaqueCount == 0)
+if (this->_clippedPolyOpaqueCount == 0)
{
this->SetupPolygon(firstPoly, true, false);
}

this->ZeroDstAlphaPass(polyList, indexList, renderState.enableAlphaBlending, indexOffset, lastPolyAttr);

-if (polyList->opaqueCount > 0)
+if (this->_clippedPolyOpaqueCount > 0)
{
-const POLY &lastOpaquePoly = polyList->list[indexList->list[polyList->opaqueCount - 1]];
+const POLY &lastOpaquePoly = *this->_clipper.GetClippedPolyByIndex(this->_clippedPolyOpaqueCount - 1).poly;
lastPolyAttr = lastOpaquePoly.attribute;
this->SetupPolygon(lastOpaquePoly, false, true);
}
@@ -4485,12 +4489,12 @@ Render3DError OpenGLRenderer_1_2::RenderGeometry(const GFX3D_State &renderState,
glStencilMask(0xFF);
}

-if (polyList->opaqueCount == 0)
+if (this->_clippedPolyOpaqueCount == 0)
{
this->SetupPolygon(firstPoly, true, true);
}

-this->DrawPolygonsForIndexRange<OGLPolyDrawMode_DrawTranslucentPolys>(polyList, indexList, polyList->opaqueCount, polyList->count - 1, indexOffset, lastPolyAttr);
+this->DrawPolygonsForIndexRange<OGLPolyDrawMode_DrawTranslucentPolys>(polyList, indexList, this->_clippedPolyOpaqueCount, this->_clippedPolyCount - 1, indexOffset, lastPolyAttr);
}

glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
@@ -5558,10 +5562,14 @@ Render3DError OpenGLRenderer_2_0::BeginRender(const GFX3D &engine)
// Only copy as much vertex data as we need to, since this can be a potentially large upload size.
glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(VERT) * engine.vertListCount, engine.vertList);

+// Generate the clipped polygon list.
+this->_PerformClipping<ClipperMode_DetermineClipOnly>(engine.vertList, engine.polylist, &engine.indexlist);

this->_renderNeedsDepthEqualsTest = false;
-for (size_t i = 0, vertIndexCount = 0; i < engine.polylist->count; i++)
+for (size_t i = 0, vertIndexCount = 0; i < this->_clippedPolyCount; i++)
{
-const POLY &thePoly = engine.polylist->list[engine.indexlist.list[i]];
+const POLY &thePoly = *this->_clipper.GetClippedPolyByIndex(i).poly;

const size_t polyType = thePoly.type;
const VERT vert[4] = {
engine.vertList[thePoly.vertIndexes[0]],
@@ -309,7 +309,8 @@ enum OGLTextureUnitID

enum OGLBindingPointID
{
-OGLBindingPointID_RenderStates = 0
+OGLBindingPointID_RenderStates = 0,
+OGLBindingPointID_PolyStates = 1
};

enum OGLErrorCode
@@ -488,6 +489,7 @@ struct OGLRenderRef

// UBO / TBO
GLuint uboRenderStatesID;
+GLuint uboPolyStatesID;
GLuint tboPolyStatesID;
GLuint texPolyStatesID;

@@ -715,6 +717,7 @@ class OpenGLRenderer : public Render3D

Render3DError FlushFramebuffer(const FragmentColor *__restrict srcFramebuffer, FragmentColor *__restrict dstFramebufferMain, u16 *__restrict dstFramebuffer16);
OpenGLTexture* GetLoadedTextureFromPolygon(const POLY &thePoly, bool enableTexturing);

template<OGLPolyDrawMode DRAWMODE> size_t DrawPolygonsForIndexRange(const POLYLIST *polyList, const INDEXLIST *indexList, size_t firstIndex, size_t lastIndex, size_t &indexOffset, POLYGON_ATTR &lastPolyAttr);
template<OGLPolyDrawMode DRAWMODE> Render3DError DrawAlphaTexturePolygon(const GLenum polyPrimitive,
const GLsizei vertIndexCount,

4 comments on commit 4cd19ce

@Jules-A

Contributor

replied Jan 22, 2019

Just gave this a quick try. It causes a ~20% perf increase in one scene that looked rather simple but performed terribly, and a few other scenes see smaller increases. Thanks again for your work on optimization.

EDIT: When enabling all OpenGL compatibility options, I've seen increases of up to 38%!!
Is it directly because of culling that depth testing becomes significantly cheaper?

@rogerman

Collaborator Author

replied Jan 24, 2019

@Jules-A: When clipping polygons, EVERYTHING becomes cheaper, including depth testing. This commit only helps the OpenGL renderer, since it never clipped any polygons before rendering, whereas SoftRasterizer already did clip its polygons before rendering.

On the NDS, there are many games that process lots of polygons, but render fewer than half of them in any given scene. This is why a lot of 3D games will see significant performance boosts.
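As a rough illustration of why determining the clipped list first helps, here is a minimal C++ sketch of a "determine only" clipping pass. It is not the DeSmuME implementation: `Vec4`, `Poly`, `ClippedPolyList`, `IsTriviallyRejected` and `BuildClippedList` are hypothetical names, the test is a plain clip-space trivial rejection, and the input list is assumed to be ordered with opaque polygons first. The point is only that every later stage loops over the survivors instead of the full list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the emulator's own types.
struct Vec4 { float x, y, z, w; };

struct Poly
{
    Vec4 vert[4];          // clip-space vertices
    std::size_t vertCount; // 3 for a triangle, 4 for a quad
    bool isTranslucent;
};

struct ClippedPolyList
{
    std::vector<const Poly *> polys; // only polygons that survive the test
    std::size_t opaqueCount = 0;     // survivors that are opaque
};

// Returns true when every vertex lies outside the same clip plane,
// which means no part of the polygon can be visible.
static bool IsTriviallyRejected(const Poly &p)
{
    assert(p.vertCount <= 4);

    unsigned allOutside = 0x3F; // one bit per clip plane
    for (std::size_t i = 0; i < p.vertCount; i++)
    {
        const Vec4 &v = p.vert[i];
        unsigned outside = 0;
        if (v.x < -v.w) outside |= 0x01;
        if (v.x >  v.w) outside |= 0x02;
        if (v.y < -v.w) outside |= 0x04;
        if (v.y >  v.w) outside |= 0x08;
        if (v.z < -v.w) outside |= 0x10;
        if (v.z >  v.w) outside |= 0x20;
        allOutside &= outside;
    }
    return allOutside != 0;
}

// Builds the compact list once per frame. No vertices are modified here;
// the pass only decides which polygons are worth submitting.
static ClippedPolyList BuildClippedList(const std::vector<Poly> &all)
{
    ClippedPolyList out;
    out.polys.reserve(all.size());

    for (const Poly &p : all)
    {
        if (IsTriviallyRejected(p))
        {
            continue; // never reaches polygon setup or a draw call
        }

        out.polys.push_back(&p);
        if (!p.isTranslucent)
        {
            out.opaqueCount++;
        }
    }
    return out;
}
```

With a list like this, a renderer would draw the opaque range `[0, opaqueCount)` and then the translucent range `[opaqueCount, polys.size())`, so state changes, index buffer writes and draw calls all scale with the number of surviving polygons rather than with the number submitted by the game.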

And then you have games like Sands of Destruction, which take this to the extreme. Sands of Destruction processes several thousand polygons, but only renders about 15% of them after clipping. When the user runs the OpenGL renderer, this commit literally makes Sands of Destruction go from "totally unplayable" to "fully playable". The performance benefit for that game is, well, game-changing. Users are no longer stuck resorting to SoftRasterizer in order to run this game in HD.

Please try commit e06d11f, which is available on our Downloads page. It finalizes the work on this polygon clipping stuff, and should yield a universal performance improvement for all 3D renderers.

@tabnk


replied Jan 24, 2019

The Legend of Zelda: Spirit Tracks performance improves significantly. Nice.

@Jules-A

Contributor

replied Jan 24, 2019

@rogerman

> When clipping polygons, EVERYTHING becomes cheaper, including depth testing.

Yes, but it seems to affect the Polygon Facing depth testing the most. On my system and in the games I've tested, its performance impact has dropped from up to 35% to just 3-5%. If that is replicated across other systems and games, it might even be worth enabling by default?

> Please try commit e06d11f. It finalizes the work on this polygon clipping stuff, and should yield a universal performance improvement for all 3D renderers.

With that commit I'm seeing up to 6% gain over this commit, nice work!
