New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
push_constant not working in my case? #168
Comments
Thank you for reporting this @aliPMPAINT - could you try running the specific tests for the C++ push constants? Also can you give me a bit more info on your current AMD card + drivers? I'll be able to see whether it's actually an issue on the drivers. |
Ah just realised the logs are from the tests, not from the readme - ok that's strange, did you clear your cmake build directory and rebuilt from scratch for this? |
Also can you compile it with debug logs and post the logs here? That will show us whether the push consts are being set. |
I just had a quick look and it seems that push constants are always available in drivers, but that it's important to see what are the memory limits for the push constants. It sounds like it should be supported so it may be something to do with the build process |
Thanks for the response!
I'm using AMD Radeon HD 7770(in Linux), and I have two vulkan driver's, AMDVLK and mesa's Vulkan(RADV). |
Yeah... Will completely uninstall it and try again |
The tests? Ok |
So I completely uninstalled Vulkan Kompute, then recloned the whole repository(master branch) and built it from source with these CMake options:
The tests with spdlog set to
With AMDVLK:
|
When removed traces of push_constant, AMDVLK's segfault error also disappeared and the code worked fine. |
@aliPMPAINT thank you for diving deeper - not sure I understand the issue/solution tho. What do you mean by removing traces of push_cosnstant? |
When I removed the push_constant section from the code(below): int main() {
spdlog::set_level(
static_cast<spdlog::level::level_enum>(SPDLOG_LEVEL_TRACE));
// 1. Create Kompute Manager with default settings (device 0 and first compute compatible queue)
kp::Manager mgr(0);
// 2. Create and initialise Kompute Tensors through manager
auto tensorInA = mgr.tensor({ 2., 2., 2. });
auto tensorInB = mgr.tensor({ 1., 2., 3. });
auto tensorOutA = mgr.tensor({ 0., 0., 0. });
auto tensorOutB = mgr.tensor({ 0., 0., 0. });
std::vector<std::shared_ptr<kp::Tensor>> params = {tensorInA, tensorInB, tensorOutA, tensorOutB};
// 3. Create algorithm based on shader (supports buffers & push/spec constants)
std::string shader = (R"(
#version 450
layout (local_size_x = 1) in;
// The input tensors bind index is relative to index in parameter passed
layout(set = 0, binding = 0) buffer buf_in_a { float in_a[]; };
layout(set = 0, binding = 1) buffer buf_in_b { float in_b[]; };
layout(set = 0, binding = 2) buffer buf_out_a { float out_a[]; };
layout(set = 0, binding = 3) buffer buf_out_b { float out_b[]; };
// Kompute supports push constants updated on dispatch
// Kompute also supports spec constants on initalization
layout(constant_id = 0) const float const_one = 0;
void main() {
uint index = gl_GlobalInvocationID.x;
out_a[index] += in_a[index] * in_b[index];
out_b[index] += const_one;
}
)");
kp::Workgroup workgroup({3, 1, 1});
kp::Constants specConsts({ 2 });
auto algorithm = mgr.algorithm(params, kp::Shader::compile_source(shader), workgroup, specConsts);
// 4. Run operation synchronously using sequence
mgr.sequence()
->record<kp::OpTensorSyncDevice>(params)
->record<kp::OpAlgoDispatch>(algorithm)
->record<kp::OpAlgoDispatch>(algorithm)
->eval();
// 5. Sync results from the GPU asynchronously
auto sq = mgr.sequence();
sq->evalAsync<kp::OpTensorSyncLocal>(params);
// ... Do other work asynchronously whilst GPU finishes
sq->evalAwait();
// Prints the first output which is: { 4, 8, 12 }
for (const float& elem : tensorOutA->data()) std::cout << elem << " ";
// Prints the second output which is: { 10, 10, 10 }
for (const float& elem : tensorOutB->data()) std::cout << elem << " ";
} The segfault error disappeared(with AMDVLK's driver) and worked fine. |
Ah ok I see what you mean now, ok interesting so it actually is the step that adds the push_const into the command buffer that seems to cause the issue... I have just tried this locally with the nvidia card and it seems like there wasn't any issue. Would you be able to try again with a different push const value and share the code here? |
Also if you can share the version of your Vulkan SDK that should give us the full picture, basically run |
Which the results were similar to the previous one:
AMDVLK:
|
|
I ran kp python in Windows and git ErrorOutOfHostMemory error: Traceback (most recent call last):
File "welp.py", line 46, in <module>
algo = mgr.algorithm(params, kp.Shader.compile_source(shader), workgroup, spec_consts)
RuntimeError: vk::Device::createComputePipeline: ErrorOutOfHostMemory
WARNING:kp:Kompute Tensor destructor reached with null Device pointer
WARNING:kp:Kompute Tensor destructor reached with null Device pointer
WARNING:kp:Kompute Tensor destructor reached with null Device pointer
WARNING:kp:Kompute Tensor destructor reached with null Device pointer |
So, I connected my monitor to onboard GPU and then got a really interesting result(with mesa's driver):
No matter what value push_constants hold, I get these huge number as a result: 9.48375e+27 9.48375e+27 9.48375e+27 |
Ummm I got a funny news, my 8 year old onboard Graphics card(Intel HD 2500), which somehow supports Vulkan, yields the expected results and works perfectly:
So, I guess the problem is with the GPU/Vulkan drivers. |
I think it's with the GPU, here is something similar: |
@aliPMPAINT this is so strange - I think probably best to raise an issue for this, not sure if best is in the Vulkan forums... This definitely looks like some strange / obscure issue with the driver... I am looking at the vkPhysicalDeviceLimits and it says that the maxPushConstantsSize is 128bytes so definitely the provided variables wouldn't be an issue. One thing to consider is that by default it uses float32 as push constants - if you get those numbers I wonder if there may be a different size block that is being passed into the GPU. Although if it works on an older GPU I'm not sure why it would fail in the AMD one - it does sound like it may be a bug with the driver. |
@axsaucedo I've got a bad news. I tested the code with my laptop, which has an AMD intergrated GPU and an AMD dedicated GPU, and the same behavior was observed. Mesa's vulkan yielded zeros, with AMDVLK segfault, and with Windows, with kp python bindings(coudn't figure out how to compile and build the pure C++ version on Windows), I got the same error:
Now I find it unlikely that the issue is just about the drivers, as I tested different driver implemantations with 3 AMD GPUs. |
This is really strange... I am not sure what would be the best way to verify this, but would you be able to try out the SachaWilliems examples, particularly the one about push constants? It may be that potentially some type of push constants could be supported but not others (although I find it not completely likely) - if this is the case thos we could have a better understanding on what the issue is and proceed to add the fix accordingly |
Actually I've tested it before, it works |
Hmm ok, looking at the sachawilliems example on // Color and position data for each sphere is uploaded using push constants
struct SpherePushConstantData {
glm::vec4 color;
glm::vec4 position;
}; From what I can see in the glm package, the vec4 values can be of type float32 or float64 (when high precision is enabled). I guess what I can suspect is that perhaps your GPU may only accept high precision? Is there a way you could try this and re-run the sachawilliams example on push constant with high precision and non-high precision enabled? If this is the issue, then the solution may be at looking towards supporting multiple types of constants, but looking at the code, I can't see much that is different, namely the vkCmdPushConstants(
drawCmdBuffers[i],
pipelineLayout,
VK_SHADER_STAGE_VERTEX_BIT,
0,
sizeof(SpherePushConstantData),
&spheres[j]); Whereas in kompute we add it as: commandBuffer.pushConstants(*this->mPipelineLayout,
vk::ShaderStageFlagBits::eCompute,
0,
pushConstants.size() * sizeof(float),
pushConstants.data()); Which is pretty much the same, only using the vulkan C++ interface instead - I can't really see anything different except the potential type of float64 that could be as part of the type... |
Actually, I am just looking at the pushConstant example from SachaWilliems and I actually can see there is something different - it seems in their example the push constants are described in the PipelineLayout, whereas Kompute doesn't have this, so this actually may be a bug that for some reason doesn't come up in NVIDIA cards! Let me open a quick PR so we can test it, but it seems that it's an actual bug in Kompute - good shout on diving into it further |
Thank you for checking out! Regarding the first concern, I've actually tested changing the constants type but got the same results anyways: |
@aliPMPAINT thank you for the update, I have just added #174 - could you test out that PR to see if that fixes your issue? Basically running the tests in |
Yeah sure, will do after I get back home |
Adds push const ranges in pipelinelayout to fix #168
I tried both codes from README and the test (TestPushConstant.cpp), but apparently push_constant values all get zero value for some reasons?
Here is the result for
TestPushConstant.cpp)
:(specConsts work fine btw)
I don't know where the problem exactly is... Since the tests have passed in Github actions and I have clones the same code, I guess it's local. I have two vulkan drivers, one is AMDVLK and the second one AMD RADV VERDE from mesa.
What I just described happens with AMD RADV VERDE . In AMDVLK, I get a segfault(I still haven't investigated whether it's the push_constant or not since I'm running out of time) in both codes. So, I guess it has something to do with my Vulkan drivers?
The text was updated successfully, but these errors were encountered: