GH Unified memory error #2749

@zohimchandani

Required prerequisites

  • Consult the security policy. If reporting a security vulnerability, do not report the bug using this form. Use the process described in the policy to report the issue.
  • Make sure you've read the documentation. Your issue may be addressed there.
  • Search the issue tracker to verify that this hasn't already been reported. +1 or comment there if it has.
  • If possible, make a PR with a failing test to give us a starting point to work on!

Describe the bug

In the latest container, running on a GH (Grace Hopper) system, the following script fails:

import os 
os.environ["CUDAQ_MAX_CPU_MEMORY_GB"] = "NONE"
print("CUDAQ_MAX_CPU_MEMORY_GB:", os.environ.get("CUDAQ_MAX_CPU_MEMORY_GB"))

import cudaq

cudaq.set_target('nvidia', option='mgpu,fp32')

# cudaq.set_target('nvidia')

cudaq.mpi.initialize()

print('mpi initialized?', cudaq.mpi.is_initialized())

n_qubits = 35

@cudaq.kernel
def kernel(n_qubits:int):
    
    qubits = cudaq.qvector(n_qubits)
    
    x(qubits)
    h(qubits)    
    y(qubits)

print('starting exp val calculation')

expectation_value = cudaq.observe(kernel, cudaq.spin.z(0), n_qubits)
# expectation_value = cudaq.sample(kernel, n_qubits)

print("num qubits", n_qubits, 'expectation_value', expectation_value)

cudaq.mpi.finalize()

cudaq.sample() works, but cudaq.observe() throws the following error:


python3 hello_world.py --cudaq-full-stack-trace
CUDAQ_MAX_CPU_MEMORY_GB: NONE
[lego-cg1-qs-141:1715226] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
mpi initialized? True
starting exp val calculation
Traceback (most recent call last):
  File "/demo/hello_world.py", line 28, in <module>
    expectation_value = cudaq.observe(kernel, cudaq.spin.z(0), n_qubits)
  File "/opt/nvidia/cudaq/cudaq/runtime/observe.py", line 136, in observe
    cudaq_runtime.resetExecutionContext()
RuntimeError: cudaErrorMemoryAllocation
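
For context on the cudaErrorMemoryAllocation above, here is a rough sketch of how large the state vector alone is for this run. It assumes a dense state vector of 2^n complex amplitudes, with 8 bytes per amplitude for fp32 and 16 bytes for fp64. The helper name `statevector_bytes` is hypothetical and not part of CUDA-Q. The estimate ignores simulator scratch buffers and how the `mgpu` option splits the state across ranks, so it is a lower bound on total demand rather than a diagnosis.

```python
# Hypothetical helper (not a CUDA-Q API): size of a dense state vector.
# Assumption: one complex amplitude per basis state, stored as
# complex64 (8 bytes) for fp32 or complex128 (16 bytes) for fp64.
BYTES_PER_AMPLITUDE = {"fp32": 8, "fp64": 16}


def statevector_bytes(n_qubits: int, precision: str = "fp32") -> int:
    """Return the bytes needed to hold 2**n_qubits complex amplitudes."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE[precision]


if __name__ == "__main__":
    n_qubits = 35  # same value as in the script above
    size = statevector_bytes(n_qubits, "fp32")
    print(f"{n_qubits} qubits, fp32: {size / 2**30:.0f} GiB for the state vector alone")
```

At 35 qubits in fp32 this comes to 2^38 bytes (256 GiB) in total across all participating devices, which may help when comparing against the GPU and unified memory actually available on the node.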

Steps to reproduce the bug

NA

Expected behavior

NA

Is this a regression? If it is, put the last known working version (or commit) here.

Not a regression

Environment

  • CUDA-Q version:
  • Python version:
  • C++ compiler:
  • Operating system:

Suggestions

No response
