Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Note] Running cuDLA-samples on 6.0.6.0 #5

Open
zerollzeng opened this issue Sep 8, 2023 · 4 comments
Open

[Note] Running cuDLA-samples on 6.0.6.0 #5

zerollzeng opened this issue Sep 8, 2023 · 4 comments
Assignees
Labels
enhancement New feature or request fixed issue has been fixed

Comments

@zerollzeng
Copy link
Collaborator

This help users running cuDLA-samples on some old Drive-OS release.

Problem Description

If you see following error when import a external semaphore to cuDLA on 6.0.6.0.

    // Import semaphore to cudla
    memset(&m_WaitEventContext.cudla_ext_sema_mem_desc, 0, sizeof(m_WaitEventContext.cudla_ext_sema_mem_desc));
    m_WaitEventContext.cudla_ext_sema_mem_desc.extSyncObject = m_WaitEventContext.sync_obj;
    m_cudla_err = cudlaImportExternalSemaphore(m_DevHandle, &(m_WaitEventContext.cudla_ext_sema_mem_desc),
                                               &m_WaitEventContext.nvsci_sync_obj_reg_ptr, 0);
    CHECK_CUDLA_ERR(m_cudla_err, "cudla import external semaphore");

image

Root Cause

cuDLA doesn't support regular system semaphores. cuDLA supports sync point and deterministic semaphores. In 6.0.8, CUDA has changed the priority to prioritize NvSciSyncInternalAttrValPrimitiveType_Syncpoint over NvSciSyncInternalAttrValPrimitiveType_SysmemSemaphore. Hence, NvSciSync object of syncpoint type will be created when CUDA and cuDLA are participating entities as both prefer NvSciSyncInternalAttrValPrimitiveType_Syncpoint over NvSciSyncInternalAttrValPrimitiveType_SysmemSemaphore. But we didn't port the priority change to 6.0.6(previous releases may have the same issue but we didn't test) so it use NvSciSyncInternalAttrValPrimitiveType_SysmemSemaphore by default and cause the error.

Fix info:

NvSciSyncAttrKey_RequireDeterministicFences attribute will create NvSciSync object of determinitic semaphore, please apply the following patch on 6.0.6 manually.

diff --git a/src/cudla_context_standalone.cpp b/src/cudla_context_standalone.cpp
index 0252dac..ceaf7fb 100644
--- a/src/cudla_context_standalone.cpp
+++ b/src/cudla_context_standalone.cpp
@@ -283,6 +283,13 @@ bool cuDLAContextStandalone::initialize()
     CHECK_CUDLA_ERR(m_cudla_err, "get NvSci waiter sync attributes");
     m_cuda_err = cudaDeviceGetNvSciSyncAttributes(m_WaitEventContext.signaler_attr_list, 0, cudaNvSciSyncAttrSignal);
     CHECK_CUDA_ERR(m_cuda_err, "cuda get NvSci signal list");
+
+    memset(m_KeyValue, 0, sizeof(m_KeyValue));
+    keyValue[0].attrKey = NvSciSyncAttrKey_RequireDeterministicFences;
+    keyValue[0].value = (void*)&m_RequireDeterFences;
+    keyValue[0].len = sizeof(m_RequireDeterFences);
+    NvSciSyncAttrListSetAttrs(m_WaitEventContext.signaler_attr_list, keyValue, 1);
+
     NvSciSyncAttrList wait_event_attrs[2] = {m_WaitEventContext.signaler_attr_list, m_WaitEventContext.waiter_attr_list};
     m_nvsci_err = NvSciSyncAttrListReconcile(
         wait_event_attrs,
@@ -330,8 +337,10 @@ bool cuDLAContextStandalone::initialize()
         CUDLA_NVSCISYNC_ATTR_SIGNAL);
     CHECK_CUDLA_ERR(m_cudla_err, "get NvSci sync attributes");
     m_cuda_err = cudaDeviceGetNvSciSyncAttributes(m_SignalEventContext.waiter_attr_list, 0, cudaNvSciSyncAttrWait);
+    NvSciSyncAttrListSetAttrs(m_SignalEventContext.signaler_attr_list, keyValue, 1);
     CHECK_CUDA_ERR(m_cuda_err, "cuda get NvSci wait attribute list");
     NvSciSyncAttrList signal_event_attrs[2] = {m_SignalEventContext.signaler_attr_list, m_SignalEventContext.waiter_attr_list};
+
     m_nvsci_err = NvSciSyncAttrListReconcile(
         signal_event_attrs,
         2,
diff --git a/src/cudla_context_standalone.h b/src/cudla_context_standalone.h
index a002de5..5a9c343 100644
--- a/src/cudla_context_standalone.h
+++ b/src/cudla_context_standalone.h
@@ -123,6 +123,9 @@ class cuDLAContextStandalone
     NvSciSyncContext m_SignalEventContext;
     cudaExternalSemaphore_t m_SignalSem;
     cudaExternalSemaphoreSignalParams m_SignalParams;
+    
+    bool m_RequireDeterFences = true;
+    NvSciSyncAttrKeyValuePair m_KeyValue[1];
 
     cudlaTask m_Task;
@zerollzeng zerollzeng pinned this issue Sep 8, 2023
@zerollzeng zerollzeng self-assigned this Sep 8, 2023
@zerollzeng zerollzeng added the enhancement New feature or request label Sep 8, 2023
@lynettez lynettez unpinned this issue Sep 8, 2023
@zerollzeng
Copy link
Collaborator Author

The above patch still have some problems and we are trying to fix it.

@yaoshw
Copy link

yaoshw commented Nov 6, 2023

The above patch still have some problems and we are trying to fix it.

@zerollzeng hello, has the problem with this patch been resolved?

@zerollzeng
Copy link
Collaborator Author

Yes the patch is under review, I'll try to public ASAP.

@zerollzeng
Copy link
Collaborator Author

The patch has been merge in a2d645b

Please pull the latest code and rebuild.

@zerollzeng zerollzeng added the fixed issue has been fixed label Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request fixed issue has been fixed
Projects
None yet
Development

No branches or pull requests

2 participants