drm/nouveau/gr: enable memory loads on helper invocation on all channels
commit 1cb9e2e upstream.

We have a lurking bug where fragment shader helper invocations can't load
from memory. OpenGL requires these loads to work, and their absence causes
seemingly random hangs or failures in affected shaders.

It is unknown how widespread this issue is, but shaders hitting this can
end up with infinite loops.

We enable those loads on all Kepler and newer GPUs where we use our own
firmware.

Nvidia's firmware provides a way to set a kernelspace-controlled list of
MMIO registers in the gr space from push buffers via MME macros.

v2: drop code for gm200 and newer.

Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: David Airlie <airlied@gmail.com>
Cc: nouveau@lists.freedesktop.org
Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230622152017.2512101-1-kherbst@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
karolherbst authored and gregkh committed Aug 16, 2023
1 parent 061fbf6 commit 1092c92
Showing 6 changed files with 17 additions and 1 deletion.
1 change: 1 addition & 0 deletions drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
@@ -117,6 +117,7 @@ void gk104_grctx_generate_r418800(struct gf100_gr *);

extern const struct gf100_grctx_func gk110_grctx;
void gk110_grctx_generate_r419eb0(struct gf100_gr *);
void gk110_grctx_generate_r419f78(struct gf100_gr *);

extern const struct gf100_grctx_func gk110b_grctx;
extern const struct gf100_grctx_func gk208_grctx;
4 changes: 3 additions & 1 deletion drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk104.c
@@ -906,7 +906,9 @@ static void
 gk104_grctx_generate_r419f78(struct gf100_gr *gr)
 {
 struct nvkm_device *device = gr->base.engine.subdev.device;
-nvkm_mask(device, 0x419f78, 0x00000001, 0x00000000);
+
+/* bit 3 set disables loads in fp helper invocations, we need it enabled */
+nvkm_mask(device, 0x419f78, 0x00000009, 0x00000000);
 }

 void
10 changes: 10 additions & 0 deletions drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110.c
@@ -820,6 +820,15 @@ gk110_grctx_generate_r419eb0(struct gf100_gr *gr)
nvkm_mask(device, 0x419eb0, 0x00001000, 0x00001000);
}

void
gk110_grctx_generate_r419f78(struct gf100_gr *gr)
{
struct nvkm_device *device = gr->base.engine.subdev.device;

/* bit 3 set disables loads in fp helper invocations, we need it enabled */
nvkm_mask(device, 0x419f78, 0x00000008, 0x00000000);
}

const struct gf100_grctx_func
gk110_grctx = {
.main = gf100_grctx_generate_main,
@@ -854,4 +863,5 @@ gk110_grctx = {
.gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
.r418800 = gk104_grctx_generate_r418800,
.r419eb0 = gk110_grctx_generate_r419eb0,
.r419f78 = gk110_grctx_generate_r419f78,
};
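
The mask calls in the gk104 and gk110 hunks above differ only in which bits they clear: 0x00000009 covers bit 0 and bit 3, while 0x00000008 covers bit 3 alone. The following is a minimal sketch of that arithmetic, assuming the usual read-modify-write meaning of a mask helper (clear the bits in the mask, then OR in the data). `apply_mask` and `apply_mask_selfcheck` are hypothetical names for illustration and are not part of the driver; the starting value 0x0000000f is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the driver's API: computes the value a
 * read-modify-write would store, assuming "clear mask bits, then OR data".
 */
static uint32_t apply_mask(uint32_t old, uint32_t mask, uint32_t data)
{
	return (old & ~mask) | data;
}

/* Checks the three masks from the diff against an arbitrary start value. */
static void apply_mask_selfcheck(void)
{
	/* New gk104 mask: bits 0 and 3 are cleared. */
	assert(apply_mask(0x0000000f, 0x00000009, 0x00000000) == 0x00000006);
	/* New gk110 mask: only bit 3 is cleared. */
	assert(apply_mask(0x0000000f, 0x00000008, 0x00000000) == 0x00000007);
	/* Old gk104 mask: only bit 0 is cleared, so bit 3 stays set. */
	assert(apply_mask(0x0000000f, 0x00000001, 0x00000000) == 0x0000000e);
}
```

Under that assumption, the old gk104 call left bit 3 untouched, which is the bit the added comment says disables loads in helper invocations.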
1 change: 1 addition & 0 deletions drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110b.c
@@ -103,4 +103,5 @@ gk110b_grctx = {
.gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
.r418800 = gk104_grctx_generate_r418800,
.r419eb0 = gk110_grctx_generate_r419eb0,
.r419f78 = gk110_grctx_generate_r419f78,
};
1 change: 1 addition & 0 deletions drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk208.c
@@ -568,4 +568,5 @@ gk208_grctx = {
.dist_skip_table = gf117_grctx_generate_dist_skip_table,
.gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
.r418800 = gk104_grctx_generate_r418800,
.r419f78 = gk110_grctx_generate_r419f78,
};
1 change: 1 addition & 0 deletions drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm107.c
@@ -988,4 +988,5 @@ gm107_grctx = {
.r406500 = gm107_grctx_generate_r406500,
.gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
.r419e00 = gm107_grctx_generate_r419e00,
.r419f78 = gk110_grctx_generate_r419f78,
};
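
Each chipset table above gains an `.r419f78` entry pointing at a shared function. The sketch below illustrates that pattern of an optional per-chipset hook in a function table. Every name prefixed with `fake_` is hypothetical, and the NULL check in the caller is an assumption about how such optional hooks are typically dispatched, not something this diff shows.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for device state: one register value. */
struct fake_gr {
	uint32_t r419f78;
};

/* Hypothetical per-chipset table with one optional hook. */
struct fake_grctx_func {
	void (*r419f78)(struct fake_gr *gr);
};

/* Clears bit 3, mirroring the 0x00000008 mask in the diff. */
static void fake_gk110_r419f78(struct fake_gr *gr)
{
	gr->r419f78 &= ~UINT32_C(0x00000008);
}

/* A chipset that installs the hook, and one that leaves it NULL. */
static const struct fake_grctx_func fake_gk110 = {
	.r419f78 = fake_gk110_r419f78,
};
static const struct fake_grctx_func fake_no_hook = {
	.r419f78 = NULL,
};

/* Assumed dispatch: call the hook only when the chipset provides one. */
static void fake_generate(const struct fake_grctx_func *func, struct fake_gr *gr)
{
	if (func->r419f78)
		func->r419f78(gr);
}
```

With this layout, sharing one function across several chipsets is a matter of assigning the same pointer in each table, which is what the one-line additions to the gk110b, gk208 and gm107 tables do.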
