BF4, BF Hardline, DA:I, Plants vs. Zombies: Garden Warfare / Frostbite 3 Engine #10

Closed
xatornet opened this issue Mar 14, 2021 · 38 comments

Comments

@xatornet

xatornet commented Mar 14, 2021

Games status:

  • Battlefield 4 --> Not working
  • Battlefield Hardline --> Untested/Unknown
  • Dragon Age Inquisition --> Not working
  • Plants vs. Zombies: Garden Warfare --> Untested/Unknown

Details:

BF4 and Dragon Age Inquisition error message (both the same)

Here's BF 4 log:

=== GRVK 0.3.0 ===
I/grInitAndEnumerateGpus: app "Battlefield" (01000000), engine "Frostbite" (00C00000), api 00018000
W/grInitAndEnumerateGpus: unhandled alloc callbacks
W/grGetExtensionSupport: STUB
W/grGetExtensionSupport: STUB

Here's DAI log:

=== GRVK 0.3.0 ===
I/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/grInitAndEnumerateGpus: unhandled alloc callbacks
W/grGetExtensionSupport: STUB
W/grGetExtensionSupport: STUB

@xatornet xatornet changed the title Battlefield 4 Not working Battlefield 4 & Dragon Age Inquisition Not working Mar 14, 2021
@libcg
Owner

libcg commented Mar 14, 2021

Thanks for the report, I haven't tried any games other than Star Swarm. Fortunately the GR_BORDER_COLOR_PALETTE extension can be implemented using VK_EXT_custom_border_color (spec)

@xatornet
Author

No, thank you for this amazing project. Keep up the good work :-)

@xatornet xatornet changed the title Battlefield 4 & Dragon Age Inquisition Not working Battlefield 4, Dragon Age Inquisition, Sniper Elite III Not working Mar 14, 2021
@xatornet xatornet changed the title Battlefield 4, Dragon Age Inquisition, Sniper Elite III Not working Battlefield 4, Dragon Age Inquisition, (maybe)Sniper Elite III Not working Mar 14, 2021
@Cherser-s
Contributor

Did you try to launch BF4 from Wine? It doesn't detect the Mantle API at all, I think because the client tries to find amdmantle64.dll first instead of mantle64.dll.
It probably needs a function called IcdInit, which isn't included in the API docs.

@xatornet
Author

I tried it on Windows 10, without Wine. For games to detect Mantle on an Nvidia GPU, you have to place amdmantle64.dll and mantleaxl64.dll in the executable's folder, and then add GRVK's DLLs, in this case mantle64.dll.

You can get those DLLs by downloading the Adrenalin 19.4.3 driver from AMD.

@xatornet xatornet changed the title Battlefield 4, Dragon Age Inquisition, (maybe)Sniper Elite III Not working Battlefield 4, Dragon Age Inquisition, Thief and (maybe)Sniper Elite III Not working Mar 14, 2021
@Cherser-s
Contributor

I hope there will be at least some explanation of the function parameters for such libraries (this function, for example, takes a ton of parameters), so we could implement these functions ourselves and avoid using proprietary libraries.

@libcg
Owner

libcg commented Mar 15, 2021

Tracking these two issues here: #11 #12

@libcg
Owner

libcg commented Mar 15, 2021

@xatornet can you restrict this issue to Frostbite games, and create separate issues for UE3 and other engines?

@xatornet xatornet changed the title Battlefield 4, Dragon Age Inquisition, Thief and (maybe)Sniper Elite III Not working [Frostbite Engine based games] Battlefield 4, Dragon Age Inquisition - Not working Mar 15, 2021
@xatornet xatornet changed the title [Frostbite Engine based games] Battlefield 4, Dragon Age Inquisition - Not working [Frostbite Engine based games] - Not working Mar 15, 2021
@xatornet xatornet changed the title [Frostbite Engine based games] - Not working [Frostbite Engine based games] Mar 15, 2021
@xatornet xatornet changed the title [Frostbite Engine based games] [Frostbite 3 Engine based games] Mar 15, 2021
@libcg libcg changed the title [Frostbite 3 Engine based games] BF4, BF Hardline, DA:I, Plants vs. Zombies: Garden Warfare / Frostbite 3 Engine Mar 15, 2021
@Cherser-s
Contributor

Well, considering it's possible to select the Mantle backend in BF4 now, does the game even work with it (at least in the menu)?

@libcg
Owner

libcg commented Apr 17, 2021

It looks like this with some additional hacks. I get a crash after the grWsiWinSetMaxQueuedFrames call for some reason.

T/0000094C/grInitAndEnumerateGpus: 000000006101f3c8 000000006101f3b8 0000000142739998
I/0000094C/grInitAndEnumerateGpus: app "Battlefield" (01000000), engine "Frostbite" (00C00000), api 00018000
W/0000094C/grInitAndEnumerateGpus: unhandled alloc callbacks
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6100 000000006101f3b0 000000006101f3f0
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6100 000000006101e8e8 0000000087ff45c0
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6101 000000006101e8e8 0000000087ff59c0
W/0000094C/grGetExtensionSupport: STUB GR_WSI_WINDOWS
W/0000094C/grGetExtensionSupport: STUB GR_BORDER_COLOR_PALETTE
W/0000094C/grGetExtensionSupport: STUB GR_DMA_QUEUE
W/0000094C/grGetExtensionSupport: STUB GR_ADVANCED_MSAA
W/0000094C/grGetExtensionSupport: STUB GR_TIMER_QUEUE
T/0000094C/grCreateDevice: 0000000000e242a0 000000006101e948 0000000087ff5b00
I/0000094C/grCreateDevice: 1002:7300 "AMD RADV FIJI (ACO)" (Vulkan 1.2.145, driver 21.0.2)
T/0000094C/grWsiWinGetDisplays: 0000000000e28230 000000006101e8f4 000000006101ea00
T/0000094C/grGetObjectInfo: 0000000000e28070 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e280b0 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e280f0 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28130 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28170 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e281b0 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28910 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grGetObjectInfo: 0000000000e28950 0x206801 000000006101e9a8 000000006101ea88
T/0000094C/grWsiWinGetDisplayModeList: 0000000000e28070 000000006101e8f0 0000000000000000
T/0000094C/grWsiWinGetDisplayModeList: 0000000000e28070 000000006101e8f0 0000000008ee0450
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6102 000000006101e8c8 0000000000000000
T/0000094C/grGetGpuInfo: 0000000000e242a0 0x6102 000000006101e8c8 0000000008eb0080
T/0000094C/grGetDeviceQueue: 0000000000e28230 0x1000 0 0000000008e70148
T/0000094C/grGetObjectInfo: 0000000000e28c90 0x206800 000000006101e8c8 0000000008e70160
T/0000094C/grGetDeviceQueue: 0000000000e28230 0x1001 0 0000000008e701e8
T/0000094C/grGetObjectInfo: 0000000000e4bba0 0x206800 000000006101e8c8 0000000008e70200
T/0000094C/grGetMemoryHeapCount: 0000000000e28230 0000000087ff5d38
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 0 0x6200 000000006101e9a0 0000000087ff5d40
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 1 0x6200 000000006101e9a0 0000000087ff5d70
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 2 0x6200 000000006101e9a0 0000000087ff5da0
T/0000094C/grGetMemoryHeapInfo: 0000000000e28230 3 0x6200 000000006101e9a0 0000000087ff5dd0
W/0000094C/grWsiWinSetMaxQueuedFrames: STUB

@Cherser-s
Contributor

Cherser-s commented Apr 17, 2021

Interesting, might look into it as well.

I think it's probably due to T/0000094C/grGetObjectInfo: 0000000000e28070 0x206801 000000006101e9a8 000000006101ea88. The client requests object info with a weird info type, 0x206801. It's probably 0x6801, which is the parent device (seems to be a hack), but it's not handled at all. There is also a request with info type GR_WSI_WIN_INFO_TYPE_QUEUE_PROPERTIES for device queues, which is not documented but exists in the include files.

UPD: the info type really is 0x206801, which is GR_WSI_WIN_INFO_TYPE_DISPLAY_PROPERTIES (used for GR_WSI_WIN_DISPLAY_PROPERTIES), so the game is querying display info. I think it's better to launch BF4 in windowed mode, by the way. Well, it will probably require grWsiWinGetDisplays anyway.
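
To make the two WSI info types in the trace easier to tell apart, here is a minimal C sketch of a lookup helper. The value 0x206801 and its name come from this comment; pairing 0x206800 with GR_WSI_WIN_INFO_TYPE_QUEUE_PROPERTIES is an assumption based on the grGetObjectInfo calls that follow grGetDeviceQueue in the log. The helper name and the local enum are hypothetical and are not part of GRVK or the Mantle headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values observed in the trace. The pairing of 0x206800 with the queue
 * properties name is an assumption made for this sketch. */
enum {
    SKETCH_INFO_TYPE_WSI_QUEUE_PROPERTIES   = 0x206800,
    SKETCH_INFO_TYPE_WSI_DISPLAY_PROPERTIES = 0x206801
};

/* Hypothetical helper: returns a readable name for a WSI info type,
 * or NULL when the value is not one of the two seen in the trace. */
const char* wsiInfoTypeName(uint32_t infoType)
{
    switch (infoType) {
    case SKETCH_INFO_TYPE_WSI_QUEUE_PROPERTIES:
        return "GR_WSI_WIN_INFO_TYPE_QUEUE_PROPERTIES";
    case SKETCH_INFO_TYPE_WSI_DISPLAY_PROPERTIES:
        return "GR_WSI_WIN_INFO_TYPE_DISPLAY_PROPERTIES";
    default:
        return NULL;
    }
}
```

A NULL result would correspond to the "unsupported info type" case, which is where a stub would log a warning instead of filling in the output structure.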

@libcg
Owner

libcg commented Apr 17, 2021

@Cherser-s correct, I'm handling that and using windowed mode, but it still crashes.

@Cherser-s
Contributor

Cherser-s commented Apr 17, 2021

Wait, have you already implemented the code that handles these 3 features? Can you push it, at least to another branch, so I can look into it? I've partially implemented it myself, though.

Also, do you handle the "present" queue flags info as well?

@Cherser-s
Contributor

Cherser-s commented Apr 17, 2021

Hm, weird, it doesn't initialize on my machine at all: it doesn't get past grGetGpuInfo and falls back to DXVK. The game probably didn't like the GPU info.

@libcg
Owner

libcg commented Apr 18, 2021

@Cherser-s That's because BF4 checks the driver version. I pushed a fix earlier today, I'll post another branch tomorrow so you can take a look

@Cherser-s
Contributor

Cherser-s commented Apr 18, 2021

Yeah, I got those commits and managed to reproduce the same problem. Interestingly, the crash happens much later, perhaps during game initialization, as there are no calls to the Mantle libraries at all at that point.

It's really weird that the game itself (that is, the code in the executable) crashes.

@xatornet
Author

v0.4.0 has the same problem on DA:I


Here's grvk.log:
=== GRVK 0.4.0 ===
I/00003ADC/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/00003ADC/grInitAndEnumerateGpus: unhandled alloc callbacks
I/00003ADC/grCreateDevice: 10DE:2204 "NVIDIA GeForce RTX 3090" (Vulkan 1.2.168, driver 466.11.0)
W/00003ADC/grWsiWinGetDisplays: semi-stub
W/00003ADC/grGetObjectInfo: unsupported info type 0x206801
E/00003ADC/grGetGpuInfo: unsupported info type 0x6102

@libcg
Owner

libcg commented Apr 22, 2021

I have patches for this that I haven't posted yet.

@xatornet
Author

> I have patches for this that I haven't posted yet.

I'll be looking out for those in later releases.

@Osyfe

Osyfe commented May 11, 2021

For me, DA:I is working just fine (Windows 10, Radeon RX 570) with versions 0.2.0 (about 30h of gameplay without any issues) and 0.3.0, although it seems that multisampling is not working in 0.2.0. However, with the newest version I get:

=== GRVK 0.4.0 ===
I/00006174/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/00006174/grInitAndEnumerateGpus: unhandled alloc callbacks
I/00006174/grCreateDevice: 1002:67DF "Radeon RX 570 Series" (Vulkan 1.2.159, driver 2.0.168)
W/00006174/decodeInstruction: unhandled opcode 258
W/00006174/emitInstr: unhandled instruction 258
E/00006174/loadSource: source register 4 4099 not found
E/00006174/loadSource: source register 4 4096 not found
E/00006174/emitStructuredSrvLoad: resource 0 not found
E/00006174/loadSource: source register 4 4099 not found
E/00006174/loadSource: source register 4 4097 not found
E/00006174/emitStructuredSrvLoad: resource 0 not found
E/00006174/loadSource: source register 4 4099 not found
E/00006174/loadSource: source register 4 4098 not found
E/00006174/emitStructuredSrvLoad: resource 0 not found

and the game crashes silently.

@libcg
Owner

libcg commented May 12, 2021

@Osyfe there's no way the game actually runs on Mantle with 0.2.0, it's most likely falling back to DX11 at boot.

@Osyfe

Osyfe commented May 12, 2021

> @Osyfe there's no way the game actually runs on Mantle with 0.2.0, it's most likely falling back to DX11 at boot.

Oh, interesting. Is there a way to check which API an application is actually using?

@Cherser-s
Contributor

Cherser-s commented May 13, 2021

> @Osyfe there's no way the game actually runs on Mantle with 0.2.0, it's most likely falling back to DX11 at boot.

I can confirm this: if mantleaxl64.dll isn't present, or the API version isn't supported by the client (as is the case with FB3 games), the game just falls back to D3D11.
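
The fallback rule described in this comment can be written down as a small decision function. This is only a hypothetical model of what the game appears to do at start-up; the real check lives in the game executable and is not documented, and the names below are invented for the sketch.

```c
#include <stdbool.h>

typedef enum {
    SKETCH_BACKEND_MANTLE,
    SKETCH_BACKEND_D3D11
} SketchBackend;

/* Hypothetical model of the start-up decision: Mantle is only used when
 * mantleaxl64.dll is present and the reported API version is accepted by
 * the game; otherwise the game silently uses D3D11. */
SketchBackend chooseBackend(bool axlDllPresent, bool apiVersionSupported)
{
    if (!axlDllPresent || !apiVersionSupported) {
        return SKETCH_BACKEND_D3D11;
    }
    return SKETCH_BACKEND_MANTLE;
}
```

Under this model, a game that "works" with an older GRVK release may never have reached the Mantle path at all.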

> W/00006174/decodeInstruction: unhandled opcode 258
> W/00006174/emitInstr: unhandled instruction 258

Hmmm, more UAV operations also have to be implemented. Interesting how it doesn't crash after calling the WSI functions here...

@Cherser-s
Contributor

Cherser-s commented May 13, 2021

Try applying this patch to the latest commit from master branch to avoid those errors in shader translation:

diff --git a/src/amdilc/amdilc_compiler.c b/src/amdilc/amdilc_compiler.c
index 52e8e97..5b752d8 100644
--- a/src/amdilc/amdilc_compiler.c
+++ b/src/amdilc/amdilc_compiler.c
@@ -2125,13 +2125,26 @@ static void emitUavAtomicOp(
     IlcSpvId src1Id = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, vecTypeId);
     IlcSpvId valueId = emitVectorTrim(compiler, src1Id, vecTypeId, COMP_INDEX_X, 1);

-    if (instr->opcode == IL_OP_UAV_ADD || instr->opcode == IL_OP_UAV_READ_ADD) {
-        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
-                                   texelPtrId, scopeId, semanticsId, valueId);
-    } else {
+    IlcSpvWord operation;
+    switch (instr->opcode) {
+    case IL_OP_UAV_ADD:
+    case IL_OP_UAV_READ_ADD:
+        operation = SpvOpAtomicIAdd;
+        break;
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+        operation = SpvOpAtomicSMax;
+        break;
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
+        operation = SpvOpAtomicSMin;
+        break;
+    default:
         assert(false);
+        break;
     }
-
+    readId = ilcSpvPutAtomicOp(compiler->module, operation, resource->texelTypeId,
+                               texelPtrId, scopeId, semanticsId, valueId);
     if (instr->dstCount > 0) {
         IlcSpvId resId = emitVectorGrow(compiler, readId, resource->texelTypeId, 1);
         storeDestination(compiler, &instr->dsts[0], resId, vecTypeId);
@@ -2407,6 +2420,10 @@ static void emitInstr(
         break;
     case IL_OP_UAV_ADD:
     case IL_OP_UAV_READ_ADD:
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
         emitUavAtomicOp(compiler, instr);
         break;
     case IL_OP_DCL_STRUCT_SRV:
diff --git a/src/amdilc/amdilc_decoder.c b/src/amdilc/amdilc_decoder.c
index 309fb1e..2003b73 100644
--- a/src/amdilc/amdilc_decoder.c
+++ b/src/amdilc/amdilc_decoder.c
@@ -99,6 +99,10 @@ static const OpcodeInfo mOpcodeInfos[IL_OP_LAST] = {
     [IL_OP_UAV_STORE] = { IL_OP_UAV_STORE, 0, 2, 0, false },
     [IL_OP_UAV_ADD] = { IL_OP_UAV_ADD, 0, 2, 0, false },
     [IL_OP_UAV_READ_ADD] = { IL_OP_UAV_READ_ADD, 1, 2, 0, false },
+    [IL_OP_UAV_MAX] = { IL_OP_UAV_MAX, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MAX] = { IL_OP_UAV_READ_MAX, 1, 2, 0, false },
+    [IL_OP_UAV_MIN] = { IL_OP_UAV_MIN, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MIN] = { IL_OP_UAV_READ_MIN, 1, 2, 0, false },
     [IL_OP_DCL_STRUCT_SRV] = { IL_OP_DCL_STRUCT_SRV, 0, 0, 1, false },
     [IL_OP_SRV_STRUCT_LOAD] = { IL_OP_SRV_STRUCT_LOAD, 1, 1, 0, false },
     [IL_DCL_STRUCT_LDS] = { IL_DCL_STRUCT_LDS, 0, 0, 2, false },

P.S.: I think the UAV atomics should be typed.
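
The opcode-to-atomic mapping in the patch above can also be sketched as a standalone function. In the patch, `operation` stays uninitialized in the `default` branch when `assert` is compiled out (for example with `NDEBUG`), so this variant reports failure through its return value instead. The IL opcode enum below uses placeholder values invented for the sketch, not the real AMD IL encoding; the SPIR-V opcode numbers are the ones listed in the SPIR-V specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder IL opcodes: names mirror the patch, but the numeric values
 * are made up for this sketch and do not match the real AMD IL encoding. */
typedef enum {
    SKETCH_IL_UAV_ADD,
    SKETCH_IL_UAV_READ_ADD,
    SKETCH_IL_UAV_MAX,
    SKETCH_IL_UAV_READ_MAX,
    SKETCH_IL_UAV_MIN,
    SKETCH_IL_UAV_READ_MIN,
    SKETCH_IL_UAV_STORE /* not an atomic operation */
} SketchIlOpcode;

/* SPIR-V opcode numbers for the atomic instructions used by the patch. */
enum {
    SKETCH_SPV_OP_ATOMIC_IADD = 234,
    SKETCH_SPV_OP_ATOMIC_SMIN = 236,
    SKETCH_SPV_OP_ATOMIC_SMAX = 238
};

/* Writes the SPIR-V atomic opcode for an IL UAV atomic and returns true,
 * or returns false (leaving *spvOp untouched) for any other opcode. */
bool uavAtomicToSpv(SketchIlOpcode opcode, uint32_t* spvOp)
{
    switch (opcode) {
    case SKETCH_IL_UAV_ADD:
    case SKETCH_IL_UAV_READ_ADD:
        *spvOp = SKETCH_SPV_OP_ATOMIC_IADD;
        return true;
    case SKETCH_IL_UAV_MAX:
    case SKETCH_IL_UAV_READ_MAX:
        *spvOp = SKETCH_SPV_OP_ATOMIC_SMAX;
        return true;
    case SKETCH_IL_UAV_MIN:
    case SKETCH_IL_UAV_READ_MIN:
        *spvOp = SKETCH_SPV_OP_ATOMIC_SMIN;
        return true;
    default:
        return false;
    }
}
```

A caller could log the unhandled opcode and skip emission when the function returns false, rather than emitting an atomic with an undefined operation.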

@Osyfe

Osyfe commented May 14, 2021

> Try applying this patch to the latest commit from master branch to avoid those errors in shader translation:

I still get the same error.

@Cherser-s
Contributor

Cherser-s commented May 14, 2021

Ah damn it, it seems that I assumed the wrong opcode.
Sorry, it wasn't the UAV operations; it was RAW_SRV resource handling, which is not implemented yet...

@Cherser-s
Contributor

Ok, I have added this missing instruction handling, please try it out. I also added dump handling for these instructions.

diff --git a/src/amdilc/amdilc_compiler.c b/src/amdilc/amdilc_compiler.c
index 52e8e97..b549301 100644
--- a/src/amdilc/amdilc_compiler.c
+++ b/src/amdilc/amdilc_compiler.c
@@ -1078,6 +1078,41 @@ static void emitTypedUav(
     addResource(compiler, &resource);
 }
 
+static void emitRawSrv(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint16_t id = GET_BITS(instr->control, 0, 13);
+
+    IlcSpvId arrayId = ilcSpvPutRuntimeArrayType(compiler->module, compiler->floatId, true);
+    IlcSpvId structId = ilcSpvPutStructType(compiler->module, 1, &arrayId);
+    IlcSpvId pointerId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                              structId);
+    IlcSpvId resourceId = ilcSpvPutVariable(compiler->module, pointerId,
+                                            SpvStorageClassStorageBuffer);
+
+    IlcSpvWord arrayStride = sizeof(float);
+    IlcSpvWord memberOffset = 0;
+    ilcSpvPutDecoration(compiler->module, arrayId, SpvDecorationArrayStride, 1, &arrayStride);
+    ilcSpvPutDecoration(compiler->module, structId, SpvDecorationBlock, 0, NULL);
+    ilcSpvPutMemberDecoration(compiler->module, structId, 0, SpvDecorationOffset, 1, &memberOffset);
+    ilcSpvPutDecoration(compiler->module, resourceId, SpvDecorationNonWritable, 0, NULL);
+
+    ilcSpvPutName(compiler->module, arrayId, "rawSrv");
+    emitBinding(compiler, resourceId, ILC_BASE_RESOURCE_ID + id, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER);
+
+    const IlcResource resource = {
+        .id = resourceId,
+        .typeId = arrayId,
+        .texelTypeId = compiler->floatId,
+        .ilId = id,
+        .ilType = IL_USAGE_PIXTEX_UNKNOWN,
+        .strideId = 0,
+    };
+
+    addResource(compiler, &resource);
+}
+
 static void emitStructuredSrv(
     IlcCompiler* compiler,
     const Instruction* instr)
@@ -2125,13 +2160,38 @@ static void emitUavAtomicOp(
     IlcSpvId src1Id = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, vecTypeId);
     IlcSpvId valueId = emitVectorTrim(compiler, src1Id, vecTypeId, COMP_INDEX_X, 1);
 
-    if (instr->opcode == IL_OP_UAV_ADD || instr->opcode == IL_OP_UAV_READ_ADD) {
-        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
-                                   texelPtrId, scopeId, semanticsId, valueId);
-    } else {
+    IlcSpvWord operation;
+    switch (instr->opcode) {
+    case IL_OP_UAV_ADD:
+    case IL_OP_UAV_READ_ADD:
+        operation = SpvOpAtomicIAdd;
+        break;
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+        operation = SpvOpAtomicSMax;
+        break;
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
+        operation = SpvOpAtomicSMin;
+        break;
+    case IL_OP_UAV_OR:
+    case IL_OP_UAV_READ_OR:
+        operation = SpvOpAtomicOr;
+        break;
+    case IL_OP_UAV_AND:
+    case IL_OP_UAV_READ_AND:
+        operation = SpvOpAtomicAnd;
+        break;
+    case IL_OP_UAV_XOR:
+    case IL_OP_UAV_READ_XOR:
+        operation = SpvOpAtomicXor;
+        break;
+    default:
         assert(false);
+        break;
     }
-
+    readId = ilcSpvPutAtomicOp(compiler->module, operation, resource->texelTypeId,
+                               texelPtrId, scopeId, semanticsId, valueId);
     if (instr->dstCount > 0) {
         IlcSpvId resId = emitVectorGrow(compiler, readId, resource->texelTypeId, 1);
         storeDestination(compiler, &instr->dsts[0], resId, vecTypeId);
@@ -2202,6 +2262,65 @@ static void emitStructuredSrvLoad(
     storeDestination(compiler, dst, loadId, compiler->float4Id);
 }
 
+static void emitRawSrvLoad(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint8_t ilResourceId = GET_BITS(instr->control, 0, 7);
+    bool indexedResourceId = GET_BIT(instr->control, 12);
+
+    if (indexedResourceId) {
+        LOGW("unhandled indexed resource ID\n");
+    }
+
+    const IlcResource* resource = findResource(compiler, ilResourceId);
+    const Destination* dst = &instr->dsts[0];
+
+    if (resource == NULL) {
+        LOGE("resource %d not found\n", ilResourceId);
+        return;
+    }
+
+    IlcSpvId srcId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+    IlcSpvId byteAddrId = emitVectorTrim(compiler, srcId, compiler->int4Id, COMP_INDEX_X, 1);
+
+    const IlcSpvId divIds[] = {
+        byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+    };
+    IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+
+    // Read up to four components based on the destination mask
+    IlcSpvId zeroId = ilcSpvPutConstant(compiler->module, compiler->intId, ZERO_LITERAL);
+    IlcSpvId oneId = ilcSpvPutConstant(compiler->module, compiler->intId, 1);
+    IlcSpvId ptrTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                              resource->texelTypeId);
+    IlcSpvId fZeroId = ilcSpvPutConstant(compiler->module, compiler->floatId, ZERO_LITERAL);
+    IlcSpvWord constituents[] = { fZeroId, fZeroId, fZeroId, fZeroId };
+
+    for (unsigned i = 0; i < 4; i++) {
+        if (dst->component[i] == IL_MODCOMP_NOWRITE) {
+            break;
+        }
+
+        if (i > 0) {
+            // Increment address
+            const IlcSpvId incrementIds[] = { wordAddrId, oneId };
+            wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId,
+                                      2, incrementIds);
+        }
+
+        const IlcSpvId indexIds[] = { zeroId, wordAddrId };
+        IlcSpvId ptrId = ilcSpvPutAccessChain(compiler->module, ptrTypeId, resource->id,
+                                              2, indexIds);
+        constituents[i] = ilcSpvPutLoad(compiler->module, resource->texelTypeId, ptrId);
+    }
+
+    IlcSpvId loadId = ilcSpvPutCompositeConstruct(compiler->module, compiler->float4Id,
+                                                  4, constituents);
+    storeDestination(compiler, dst, loadId, compiler->float4Id);
+}
+
+
 static void emitImplicitInput(
     IlcCompiler* compiler,
     SpvBuiltIn spvBuiltIn,
@@ -2407,6 +2526,16 @@ static void emitInstr(
         break;
     case IL_OP_UAV_ADD:
     case IL_OP_UAV_READ_ADD:
+    case IL_OP_UAV_MAX:
+    case IL_OP_UAV_READ_MAX:
+    case IL_OP_UAV_MIN:
+    case IL_OP_UAV_READ_MIN:
+    case IL_OP_UAV_AND:
+    case IL_OP_UAV_READ_AND:
+    case IL_OP_UAV_OR:
+    case IL_OP_UAV_READ_OR:
+    case IL_OP_UAV_XOR:
+    case IL_OP_UAV_READ_XOR:
         emitUavAtomicOp(compiler, instr);
         break;
     case IL_OP_DCL_STRUCT_SRV:
@@ -2415,6 +2544,12 @@ static void emitInstr(
     case IL_OP_SRV_STRUCT_LOAD:
         emitStructuredSrvLoad(compiler, instr);
         break;
+    case IL_OP_DCL_RAW_SRV:
+        emitRawSrv(compiler, instr);
+        break;
+    case IL_OP_SRV_RAW_LOAD:
+        emitRawSrvLoad(compiler, instr);
+        break;
     case IL_DCL_STRUCT_LDS:
         emitStructuredLds(compiler, instr);
         break;
diff --git a/src/amdilc/amdilc_decoder.c b/src/amdilc/amdilc_decoder.c
index 309fb1e..10ff8c4 100644
--- a/src/amdilc/amdilc_decoder.c
+++ b/src/amdilc/amdilc_decoder.c
@@ -99,8 +99,20 @@ static const OpcodeInfo mOpcodeInfos[IL_OP_LAST] = {
     [IL_OP_UAV_STORE] = { IL_OP_UAV_STORE, 0, 2, 0, false },
     [IL_OP_UAV_ADD] = { IL_OP_UAV_ADD, 0, 2, 0, false },
     [IL_OP_UAV_READ_ADD] = { IL_OP_UAV_READ_ADD, 1, 2, 0, false },
+    [IL_OP_UAV_MAX] = { IL_OP_UAV_MAX, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MAX] = { IL_OP_UAV_READ_MAX, 1, 2, 0, false },
+    [IL_OP_UAV_MIN] = { IL_OP_UAV_MIN, 0, 2, 0, false },
+    [IL_OP_UAV_READ_MIN] = { IL_OP_UAV_READ_MIN, 1, 2, 0, false },
+    [IL_OP_UAV_AND] = { IL_OP_UAV_AND, 0, 2, 0, false },
+    [IL_OP_UAV_READ_AND] = { IL_OP_UAV_READ_AND, 1, 2, 0, false },
+    [IL_OP_UAV_OR] = { IL_OP_UAV_OR, 0, 2, 0, false },
+    [IL_OP_UAV_READ_OR] = { IL_OP_UAV_READ_OR, 1, 2, 0, false },
+    [IL_OP_UAV_XOR] = { IL_OP_UAV_XOR, 0, 2, 0, false },
+    [IL_OP_UAV_READ_XOR] = { IL_OP_UAV_READ_XOR, 1, 2, 0, false },
     [IL_OP_DCL_STRUCT_SRV] = { IL_OP_DCL_STRUCT_SRV, 0, 0, 1, false },
     [IL_OP_SRV_STRUCT_LOAD] = { IL_OP_SRV_STRUCT_LOAD, 1, 1, 0, false },
+    [IL_OP_DCL_RAW_SRV] = { IL_OP_DCL_RAW_SRV, 0, 0, 0, false },
+    [IL_OP_SRV_RAW_LOAD] = { IL_OP_SRV_RAW_LOAD, 1, 1, 0, false },
     [IL_DCL_STRUCT_LDS] = { IL_DCL_STRUCT_LDS, 0, 0, 2, false },
     [IL_OP_U_BIT_EXTRACT] = { IL_OP_U_BIT_EXTRACT, 1, 3, 0, false },
     [IL_OP_U_BIT_INSERT] = { IL_OP_U_BIT_INSERT, 1, 4, 0, false },
diff --git a/src/amdilc/amdilc_dump.c b/src/amdilc/amdilc_dump.c
index 6d2d173..28827ce 100644
--- a/src/amdilc/amdilc_dump.c
+++ b/src/amdilc/amdilc_dump.c
@@ -721,6 +721,36 @@ static void dumpInstruction(
     case IL_OP_UAV_READ_ADD:
         fprintf(file, "uav_read_add_id(%u)", GET_BITS(instr->control, 0, 13));
         break;
+    case IL_OP_UAV_MAX:
+        fprintf(file, "uav_max_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_MAX:
+        fprintf(file, "uav_read_max_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_MIN:
+        fprintf(file, "uav_min_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_MIN:
+        fprintf(file, "uav_read_min_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_OR:
+        fprintf(file, "uav_or_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_OR:
+        fprintf(file, "uav_read_or_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_AND:
+        fprintf(file, "uav_and_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_AND:
+        fprintf(file, "uav_read_and_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_XOR:
+        fprintf(file, "uav_xor_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_UAV_READ_XOR:
+        fprintf(file, "uav_read_xor_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
     case IL_OP_DCL_STRUCT_SRV:
         fprintf(file, "dcl_struct_srv_id(%u) %u",
                 GET_BITS(instr->control, 0, 13), instr->extras[0]);
@@ -729,6 +759,13 @@ static void dumpInstruction(
         fprintf(file, "srv_struct_load%s_id(%u)",
                 GET_BIT(instr->control, 12) ? "_ext" : "", GET_BITS(instr->control, 0, 7));
         break;
+    case IL_OP_DCL_RAW_SRV:
+        fprintf(file, "dcl_raw_srv_id(%u)", GET_BITS(instr->control, 0, 13));
+        break;
+    case IL_OP_SRV_RAW_LOAD:
+        fprintf(file, "srv_raw_load%s_id(%u)",
+                GET_BIT(instr->control, 12) ? "_ext" : "", GET_BITS(instr->control, 0, 7));
+        break;
     case IL_DCL_STRUCT_LDS:
         fprintf(file, "dcl_struct_lds_id(%u) %u, %u",
                 GET_BITS(instr->control, 0, 13), instr->extras[0], instr->extras[1]);

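
To make the addressing in `emitRawSrvLoad` easier to follow, here is a CPU-side C model of what the generated shader code appears to do: the byte address is divided by 4 to get a word index, and consecutive words are read until the first destination component that is masked out. This is a sketch for illustration only and not GRVK code; the function name, the boolean write mask, and the bounds check are assumptions added here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPU-side model of the raw SRV load in the patch above.
 * words/wordCount: the buffer viewed as 32-bit floats.
 * byteAddr:        byte offset taken from the x component of the source.
 * writeMask:       true for each destination component that is written.
 * out:             receives up to four components; unread ones stay 0.0f.
 * The bounds check is an addition for this sketch. */
void rawSrvLoadModel(const float* words, size_t wordCount, int32_t byteAddr,
                     const bool writeMask[4], float out[4])
{
    int32_t wordAddr = byteAddr / 4;

    for (unsigned i = 0; i < 4; i++) {
        out[i] = 0.0f;
    }

    for (unsigned i = 0; i < 4; i++) {
        if (!writeMask[i]) {
            break; /* same early exit as the loop in the patch */
        }
        int32_t index = wordAddr + (int32_t)i;
        if (index >= 0 && (size_t)index < wordCount) {
            out[i] = words[index];
        }
    }
}
```

Because the loop stops at the first masked-out component, a mask such as x and z only (without y) would read just one word in this model, which matches the `break` in the patch.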
@Osyfe

Osyfe commented May 16, 2021

> Ok, I have added this missing instruction handling, please try it out. I also added dump handling for these instructions.

The instruction errors have disappeared, but the game still crashes:

=== GRVK 0.4.0 ===
I/0000029C/grInitAndEnumerateGpus: app "DragonAgeInquisition" (01000000), engine "Frostbite" (00FDE001), api 00018000
W/0000029C/grInitAndEnumerateGpus: unhandled alloc callbacks
I/0000029C/grCreateDevice: 1002:67DF "Radeon RX 570 Series" (Vulkan 1.2.170, driver 2.0.179)
E/0000029C/loadSource: source register 4 4099 not found
E/0000029C/loadSource: source register 4 4096 not found
E/0000029C/loadSource: source register 4 4099 not found
E/0000029C/loadSource: source register 4 4097 not found
E/0000029C/loadSource: source register 4 4099 not found
E/0000029C/loadSource: source register 4 4098 not found

@Cherser-s
Contributor

Cherser-s commented May 16, 2021

Now that's weird, especially considering that there are no longer missing instructions in the log...

@Cherser-s
Contributor

Cherser-s commented Sep 5, 2021

I'm getting these errors while launching BF4 (and, obviously, graphical errors as well). I've resolved the AMD IL opcode names myself:

  • unhandled instruction IL_OP_PREFIX
  • unhandled instruction IL_DCL_LDS
  • unhandled instruction IL_OP_UAV_STRUCT_STORE
  • unhandled instruction IL_OP_UAV_READ_UMAX
  • unhandled instruction IL_OP_LDS_READ_ADD
  • unhandled instruction IL_OP_APPEND_BUF_ALLOC (requires adding an extra atomic counter per resource and atomicAdd'ing it on invocation)
  • W/00000968/getVkAccessFlagsImage: unsupported image state 0x130F (I don't know what that enumeration value means, as it isn't present in the docs; it should probably be GR_IMAGE_STATE_DISCARD, but its value is 0x131f, and it's weird that 0x130F falls between 0x130e and 0x1310)
  • W/00000978/grCreateShader: unhandled Re-Z flag
  • W/0000096C/ilcCompileKernel: unhandled hull/domain shader type
  • W/00000958/grCreateImageView: non-identity swizzle 3.4.1.2 for storage image 103
  • W/00000958/grCreateDepthStencilView: unhandled flags 0x1
  • W/00000958/grCreateImageView: non-cube image view created for cube image 00007fffff4a2f10
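
The IL_OP_APPEND_BUF_ALLOC item in the list above mentions one atomic counter per resource that is incremented on each invocation. A minimal CPU-side sketch of that idea with C11 atomics follows. The struct and function names are hypothetical, and it assumes the instruction hands back the counter value from before the add, which this thread does not confirm.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-resource state: one counter for each append buffer. */
typedef struct {
    atomic_uint counter;
} AppendBufferModel;

/* Reserves one slot and returns its index. atomic_fetch_add returns the
 * value held before the addition, so concurrent callers get unique slots. */
uint32_t appendBufAlloc(AppendBufferModel* buf)
{
    return (uint32_t)atomic_fetch_add(&buf->counter, 1u);
}
```

In a shader translation the same pattern would map to an atomic add on a counter stored next to the buffer, with the returned value used as the element index for the following store.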

@Cherser-s
Contributor

Cherser-s commented Sep 5, 2021

OK, so I've implemented changes for some of these instructions:

diff --git a/src/amdilc/amdilc_compiler.c b/src/amdilc/amdilc_compiler.c
index a150d26..5d042d8 100644
--- a/src/amdilc/amdilc_compiler.c
+++ b/src/amdilc/amdilc_compiler.c
@@ -48,6 +48,7 @@ typedef struct {
     uint32_t ilId;
     uint8_t ilType;
     IlcSpvId strideId;
+    bool structured;
 } IlcResource;
 
 typedef struct {
@@ -1144,6 +1145,7 @@ static void emitResource(
         .ilId = id,
         .ilType = type,
         .strideId = 0,
+        .structured = false,
     };
 
     addResource(compiler, &resource);
@@ -1210,6 +1212,7 @@ static void emitTypedUav(
         .ilId = id,
         .ilType = type,
         .strideId = 0,
+        .structured = false,
     };
 
     addResource(compiler, &resource);
@@ -1219,6 +1222,7 @@ static void emitUav(
     IlcCompiler* compiler,
     const Instruction* instr)
 {
+    bool isStructured = instr->opcode == IL_OP_DCL_STRUCT_UAV;
     uint16_t id = GET_BITS(instr->control, 0, 13);
 
     IlcSpvId arrayId = ilcSpvPutRuntimeArrayType(compiler->module, compiler->floatId, true);
@@ -1234,7 +1238,7 @@ static void emitUav(
     ilcSpvPutDecoration(compiler->module, structId, SpvDecorationBlock, 0, NULL);
     ilcSpvPutMemberDecoration(compiler->module, structId, 0, SpvDecorationOffset, 1, &memberOffset);
 
-    ilcSpvPutName(compiler->module, arrayId, "structUav");
+    ilcSpvPutName(compiler->module, arrayId, isStructured ? "structUav" : "rawUav");
     emitBinding(compiler, resourceId, ILC_BASE_RESOURCE_ID + id, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER);
 
     const IlcResource resource = {
@@ -1244,7 +1248,9 @@ static void emitUav(
         .texelTypeId = compiler->floatId,
         .ilId = id,
         .ilType = IL_USAGE_PIXTEX_UNKNOWN,
-        .strideId = instr->extras[0],
+        .strideId = ilcSpvPutConstant(compiler->module, compiler->intId,
+                                      isStructured ? instr->extras[0] : 4),
+        .structured = isStructured,
     };
 
     addResource(compiler, &resource);
@@ -1283,6 +1289,35 @@ static void emitSrv(
         .ilType = IL_USAGE_PIXTEX_UNKNOWN,
         .strideId = ilcSpvPutConstant(compiler->module, compiler->intId,
                                       isStructured ? instr->extras[0] : 4),
+        .structured = isStructured,
+    };
+
+    addResource(compiler, &resource);
+}
+
+static void emitLds(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint16_t id = GET_BITS(instr->control, 0, 13);
+    unsigned length = instr->extras[0];
+
+    IlcSpvId lengthId = ilcSpvPutConstant(compiler->module, compiler->uintId, length);
+    IlcSpvId arrayId = ilcSpvPutArrayType(compiler->module, compiler->uintId, lengthId);
+    IlcSpvId pArrayId = ilcSpvPutPointerType(compiler->module, SpvStorageClassWorkgroup, arrayId);
+    IlcSpvId resourceId = ilcSpvPutVariable(compiler->module, pArrayId, SpvStorageClassWorkgroup);
+
+    ilcSpvPutName(compiler->module, arrayId, "rawLds");
+
+    const IlcResource resource = {
+        .resType = RES_TYPE_LDS,
+        .id = resourceId,
+        .typeId = arrayId,
+        .texelTypeId = compiler->uintId,
+        .ilId = id,
+        .ilType = IL_USAGE_PIXTEX_UNKNOWN,
+        .strideId = ilcSpvPutConstant(compiler->module, compiler->intId, 4),
+        .structured = false,
     };
 
     addResource(compiler, &resource);
@@ -1311,6 +1346,7 @@ static void emitStructuredLds(
         .ilId = id,
         .ilType = IL_USAGE_PIXTEX_UNKNOWN,
         .strideId = ilcSpvPutConstant(compiler->module, compiler->intId, stride),
+        .structured = true,
     };
 
     addResource(compiler, &resource);
@@ -2450,30 +2486,169 @@ static void emitUavStore(
     ilcSpvPutImageWrite(compiler->module, resourceId, addressId, elementId);
 }
 
-static void emitUavAtomicOp(
+static void emitStructUavStore(
     IlcCompiler* compiler,
     const Instruction* instr)
 {
     uint8_t ilResourceId = GET_BITS(instr->control, 0, 14);
 
     const IlcResource* resource = findResource(compiler, RES_TYPE_GENERIC, ilResourceId);
+    const Destination* dst = &instr->dsts[0];
 
     if (resource == NULL) {
         LOGE("resource %d not found\n", ilResourceId);
         return;
     }
 
-    IlcSpvId vecTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
-    IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassImage,
+    //IlcSpvId resourceId = ilcSpvPutLoad(compiler->module, resource->typeId, resource->id);
+    IlcSpvId srcId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+    IlcSpvId indexId = emitVectorTrim(compiler, srcId, compiler->int4Id, COMP_INDEX_X, 1);
+    IlcSpvId offsetId = emitVectorTrim(compiler, srcId, compiler->int4Id, COMP_INDEX_Y, 1);
+
+    IlcSpvId elementTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
+    IlcSpvId elementId = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, elementTypeId);
+
+    // addr = (index * stride + offset) / 4
+    const IlcSpvId mulIds[] = { indexId, resource->strideId };
+    IlcSpvId baseId = ilcSpvPutAlu(compiler->module, SpvOpIMul, compiler->intId, 2, mulIds);
+    const IlcSpvId addIds[] = { baseId, offsetId };
+    IlcSpvId byteAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId, 2, addIds);
+    const IlcSpvId divIds[] = {
+        byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+    };
+    IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+
+    IlcSpvId oneId = ilcSpvPutConstant(compiler->module, compiler->intId, 1);
+    IlcSpvId ptrTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                              resource->texelTypeId);
+    // Write up to four components based on the destination mask
+    for (unsigned i = 0; i < 4; i++) {
+        if (dst->component[i] == IL_MODCOMP_NOWRITE) {
+            break;
+        }
+
+        if (i > 0) {
+            // Increment address
+            const IlcSpvId incrementIds[] = { wordAddrId, oneId };
+            wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId,
+                                      2, incrementIds);
+        }
+
+        IlcSpvId ptrId = ilcSpvPutAccessChain(compiler->module, ptrTypeId, resource->id,
+                                              1, &wordAddrId);
+        IlcSpvId componentId = emitVectorTrim(compiler, elementId, elementTypeId, i, 1);
+        ilcSpvPutStore(compiler->module, ptrId, componentId);
+    }
+}
+
+static void emitLdsAtomicOp(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint8_t ilResourceId = GET_BITS(instr->control, 0, 4);
+
+    const IlcResource* resource = findResource(compiler, RES_TYPE_LDS, ilResourceId);
+
+    if (resource == NULL) {
+        LOGE("resource %d not found\n", ilResourceId);
+        return;
+    }
+
+    IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassWorkgroup,
                                                   resource->texelTypeId);
     IlcSpvId addressId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
-    IlcSpvId trimAddressId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X,
-                                            getResourceDimensionCount(resource->ilType));
-    IlcSpvId zeroId = ilcSpvPutConstant(compiler->module, compiler->intId, ZERO_LITERAL);
-    IlcSpvId texelPtrId = ilcSpvPutImageTexelPointer(compiler->module, pointerTypeId, resource->id,
-                                                     trimAddressId, zeroId);
+    IlcSpvId byteAddrId;
+    if (resource->structured) {
+        IlcSpvId indexId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+        IlcSpvId offsetId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_Y, 1);
+        // addr = (index * stride + offset) / 4
+        const IlcSpvId mulIds[] = { indexId, resource->strideId };
+        IlcSpvId baseId = ilcSpvPutAlu(compiler->module, SpvOpIMul, compiler->intId, 2, mulIds);
+        const IlcSpvId addIds[] = { baseId, offsetId };
+        byteAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId, 2, addIds);
+    } else {
+        byteAddrId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+    }
+    const IlcSpvId divIds[] = {
+        byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+    };
+    IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+    IlcSpvId bufferPtrId = ilcSpvPutAccessChain(compiler->module, pointerTypeId, resource->id,
+                                      1, &wordAddrId);
+    IlcSpvId readId = 0;
+    IlcSpvId vecTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
+    IlcSpvId scopeId = ilcSpvPutConstant(compiler->module, compiler->intId, SpvScopeDevice);
+    IlcSpvId semanticsId = ilcSpvPutConstant(compiler->module, compiler->intId,
+                                             SpvMemorySemanticsAcquireReleaseMask |
+                                             SpvMemorySemanticsImageMemoryMask);
+    IlcSpvId src1Id = loadSource(compiler, &instr->srcs[1], COMP_MASK_XYZW, vecTypeId);
+    IlcSpvId valueId = emitVectorTrim(compiler, src1Id, vecTypeId, COMP_INDEX_X, 1);
+
+    if (instr->opcode == IL_OP_LDS_ADD || instr->opcode == IL_OP_LDS_READ_ADD) {
+        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
+                                   bufferPtrId, scopeId, semanticsId, valueId);
+    } else if (instr->opcode == IL_OP_LDS_UMAX || instr->opcode == IL_OP_LDS_READ_UMAX) {
+        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicUMax, resource->texelTypeId,
+                                   bufferPtrId, scopeId, semanticsId, valueId);
+    } else {
+        assert(false);
+    }
 
+    if (instr->dstCount > 0) {
+        IlcSpvId resId = emitVectorGrow(compiler, readId, resource->texelTypeId, 1);
+        storeDestination(compiler, &instr->dsts[0], resId, vecTypeId);
+    }
+}
+
+static void emitUavAtomicOp(
+    IlcCompiler* compiler,
+    const Instruction* instr)
+{
+    uint8_t ilResourceId = GET_BITS(instr->control, 0, 14);
+
+    const IlcResource* resource = findResource(compiler, RES_TYPE_GENERIC, ilResourceId);
+
+    if (resource == NULL) {
+        LOGE("resource %d not found\n", ilResourceId);
+        return;
+    }
+
+    IlcSpvId texelPtrId;
+
+    if (resource->strideId == 0) {
+        IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassImage,
+                                                      resource->texelTypeId);
+        IlcSpvId addressId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+        IlcSpvId trimAddressId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X,
+                                                getResourceDimensionCount(resource->ilType));
+        IlcSpvId zeroId = ilcSpvPutConstant(compiler->module, compiler->intId, ZERO_LITERAL);
+        texelPtrId = ilcSpvPutImageTexelPointer(compiler->module, pointerTypeId, resource->id,
+                                                trimAddressId, zeroId);
+    } else {
+        IlcSpvId pointerTypeId = ilcSpvPutPointerType(compiler->module, SpvStorageClassStorageBuffer,
+                                                      resource->texelTypeId);
+        IlcSpvId addressId = loadSource(compiler, &instr->srcs[0], COMP_MASK_XYZW, compiler->int4Id);
+        IlcSpvId byteAddrId;
+        if (resource->structured) {
+            IlcSpvId indexId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+            IlcSpvId offsetId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_Y, 1);
+            // addr = (index * stride + offset) / 4
+            const IlcSpvId mulIds[] = { indexId, resource->strideId };
+            IlcSpvId baseId = ilcSpvPutAlu(compiler->module, SpvOpIMul, compiler->intId, 2, mulIds);
+            const IlcSpvId addIds[] = { baseId, offsetId };
+            byteAddrId = ilcSpvPutAlu(compiler->module, SpvOpIAdd, compiler->intId, 2, addIds);
+        } else {
+            byteAddrId = emitVectorTrim(compiler, addressId, compiler->int4Id, COMP_INDEX_X, 1);
+        }
+        const IlcSpvId divIds[] = {
+            byteAddrId, ilcSpvPutConstant(compiler->module, compiler->intId, 4)
+        };
+        IlcSpvId wordAddrId = ilcSpvPutAlu(compiler->module, SpvOpSDiv, compiler->intId, 2, divIds);
+        texelPtrId = ilcSpvPutAccessChain(compiler->module, pointerTypeId, resource->id,
+                                              1, &wordAddrId);
+    }
     IlcSpvId readId = 0;
+    IlcSpvId vecTypeId = ilcSpvPutVectorType(compiler->module, resource->texelTypeId, 4);
     IlcSpvId scopeId = ilcSpvPutConstant(compiler->module, compiler->intId, SpvScopeDevice);
     IlcSpvId semanticsId = ilcSpvPutConstant(compiler->module, compiler->intId,
                                              SpvMemorySemanticsAcquireReleaseMask |
@@ -2484,6 +2659,9 @@ static void emitUavAtomicOp(
     if (instr->opcode == IL_OP_UAV_ADD || instr->opcode == IL_OP_UAV_READ_ADD) {
         readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicIAdd, resource->texelTypeId,
                                    texelPtrId, scopeId, semanticsId, valueId);
+    } else if (instr->opcode == IL_OP_UAV_UMAX || instr->opcode == IL_OP_UAV_READ_UMAX) {
+        readId = ilcSpvPutAtomicOp(compiler->module, SpvOpAtomicUMax, resource->texelTypeId,
+                                   texelPtrId, scopeId, semanticsId, valueId);
     } else {
         assert(false);
     }
@@ -2777,7 +2955,9 @@ static void emitInstr(
     case IL_OP_DCL_TYPED_UAV:
         emitTypedUav(compiler, instr);
         break;
+    case IL_OP_DCL_STRUCT_UAV:
     case IL_OP_DCL_TYPELESS_UAV:
+    case IL_OP_DCL_RAW_UAV:
         emitUav(compiler, instr);
         break;
     case IL_OP_UAV_LOAD:
@@ -2786,10 +2966,21 @@ static void emitInstr(
     case IL_OP_UAV_STORE:
         emitUavStore(compiler, instr);
         break;
+    case IL_OP_UAV_STRUCT_STORE:
+        emitStructUavStore(compiler, instr);
+        break;
     case IL_OP_UAV_ADD:
     case IL_OP_UAV_READ_ADD:
+    case IL_OP_UAV_UMAX:
+    case IL_OP_UAV_READ_UMAX:
         emitUavAtomicOp(compiler, instr);
         break;
+    case IL_OP_LDS_ADD:
+    case IL_OP_LDS_READ_ADD:
+    case IL_OP_LDS_UMAX:
+    case IL_OP_LDS_READ_UMAX:
+        emitLdsAtomicOp(compiler, instr);
+        break;
     case IL_OP_DCL_RAW_SRV:
     case IL_OP_DCL_STRUCT_SRV:
         emitSrv(compiler, instr);
@@ -2800,6 +2991,9 @@ static void emitInstr(
     case IL_DCL_STRUCT_LDS:
         emitStructuredLds(compiler, instr);
         break;
+    case IL_DCL_LDS:
+        emitLds(compiler, instr);
+        break;
     case IL_DCL_GLOBAL_FLAGS:
         emitGlobalFlags(compiler, instr);
         break;

But some changes must be made:

  1. According to the docs, atomic operations compute addresses differently for structured, raw, and typed UAVs. Currently only typed UAV atomics work properly.
  2. An optional atomic counter has to be added to each resource.
  3. LDS atomic operations have to be implemented as well.
    UPD: items 1 and 3 were implemented in the patch above.
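
For readers following the address calculation, here is a minimal CPU-side sketch of the math the patch emits as SPIR-V for buffer atomics (`addr = (index * stride + offset) / 4` for structured buffers, and the first address component divided by 4 for raw ones). The helper name `wordAddress` is hypothetical and does not exist in GRVK; it only mirrors the arithmetic and assumes the buffer is viewed as an array of 32-bit words. Typed UAVs do not go through this path in the patch: they use an image texel pointer instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GRVK: mirrors the address math that the
 * patch emits for structured and raw buffer atomics.
 *
 * structured: byte address = x * stride + y  (x = element index, y = byte offset)
 * raw:        byte address = x               (y and stride are ignored)
 *
 * The byte address is then divided by 4, because the buffer is accessed as an
 * array of 32-bit words. The division is signed, like the SpvOpSDiv in the patch. */
static int32_t wordAddress(bool structured, int32_t stride, int32_t x, int32_t y)
{
    int32_t byteAddr = structured ? x * stride + y : x;
    return byteAddr / 4;
}
```

With a 16-byte stride, element 3 at byte offset 8 maps to word `(3 * 16 + 8) / 4 = 14`; a raw access at byte address 20 maps to word 5.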

@bazookaben

bazookaben commented Dec 6, 2021

Tried out 0.5.0 on a Radeon 5700 XT with a Ryzen 1700X on Windows 10.

For some reason, the game only runs in exclusive fullscreen mode. If I try to run it in borderless mode, it crashes to desktop.

Also, I noticed v-sync is broken.

Beyond that, the only other bug I noticed is that some post-effects seem to break the graphics. For example, if I go out of bounds or underwater, some or all parts of the world go black.

As far as performance goes, compared to DX11 it looks like the CPU is 20% slower but the GPU seems about 10% faster, just based on a quick look at the game's built-in performance graph.

The settings I use in DX11 walk a tightrope between GPU and CPU though, so on some maps I would be GPU limited and on others CPU limited. With GRVK it has become completely CPU limited (unless there is some memory bandwidth bottleneck that the in-game performance graph wouldn't show).

By the way, the Frostbite engine's performance graph is super useful for analyzing CPU/GPU performance. You can enable it in the dev console with PerfOverlay.DrawGraph 1

grvk.log

Also, I noticed that no Mantle cache was created in Documents/Battlefield 4/cache.

@libcg
Owner

libcg commented Dec 6, 2021

@bazookaben Thanks for the feedback. It's a small miracle that the game is running at all. Looking at the logs, some image format is not supported. You're correct that Vsync is not implemented right now; I kinda forgot about it because I'm using FreeSync. I don't expect the native pipeline cache to ever be implemented, because Mantle assumes that the pipeline can be compiled on the spot, which isn't the case with the current state of Vulkan. I'll check out the performance graph for sure!

@niobium93

I'm not sure if I'm missing something obvious, but the Graphics API setting seems to be missing in my game?
2021-12-07-23:15:39-screenshot

@libcg
Owner

libcg commented Dec 7, 2021

@niobium93 Check that the DLLs are in the correct place, and upload grvk.log if you see it in the game folder.

@niobium93

They are right next to bf4.exe
No grvk.log is created by the game.

@libcg
Owner

libcg commented Dec 7, 2021

@niobium93 What's your setup like? Are you using BF4 from Steam?

@niobium93

I'm on wine-tkg 6.22.r11.g61c3c024-326 and mesa-git-22.0.0_devel.147797.92d84f189c7. DXVK v1.9.2-62-gc13395db is also present. No Steam.

@libcg libcg closed this as completed Dec 10, 2021
@libcg
Owner

libcg commented Dec 10, 2021

Moving to #37 #38 #39 #40
