[BUG] Unclear runtime error when running GPU binary without CUDA #262

Closed
waltsims opened this issue Jan 15, 2024 · 2 comments · Fixed by #297

Comments

@waltsims
Owner

Describe the bug

k-wave-python does not give an informative error message when the simulation binary fails. It reports the return code, but because it can suppress the binary's output, the cause of the failure is hidden from the user.

When running the photoacoustic waveform example on a Google Colab instance without a GPU, k-wave-python raises the following error:

WARNING:root:DeprecationWarning: Attributes will soon be typed when saved and not saved 

---------------------------------------------------------------------------

CalledProcessError                        Traceback (most recent call last)

<ipython-input-24-0596841266ea> in <cell line: 2>()
      1 # run the simulation
----> 2 sensor_data_2D = kspaceFirstOrder2D(
      3     medium=medium2,
      4     kgrid=kgrid2,
      5     source=source2,

2 frames

/usr/local/lib/python3.10/dist-packages/kwave/kspaceFirstOrder2D.py in kspaceFirstOrder2D(kgrid, source, sensor, medium, simulation_options, execution_options)
    450         executor = Executor(simulation_options=simulation_options, execution_options=execution_options)
    451         executor_options = execution_options.get_options_string(sensor=k_sim.sensor)
--> 452         sensor_data = executor.run_simulation(k_sim.options.input_filename, k_sim.options.output_filename,
    453                                               options=executor_options)
    454         return sensor_data

/usr/local/lib/python3.10/dist-packages/kwave/executor.py in run_simulation(self, input_filename, output_filename, options)
     31         stdout = None if self.execution_options.show_sim_log else subprocess.DEVNULL
     32         try:
---> 33             subprocess.run(command, stdout=stdout, shell=True, check=True)
     34         except subprocess.CalledProcessError as e:
     35             if isinstance(e.returncode, unittest.mock.MagicMock):

/usr/lib/python3.10/subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    524         retcode = process.poll()
    525         if check and retcode:
--> 526             raise CalledProcessError(retcode, process.args,
    527                                      output=stdout, stderr=stderr)
    528     return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command 'OMP_PLACES=cores  OMP_PROC_BIND=SPREAD  /usr/local/lib/python3.10/dist-packages/kwave/bin/linux/kspaceFirstOrder-CUDA -i /tmp/15-Jan-2024-23-26-47_kwave_output.h5 -o /tmp/15-Jan-2024-23-26-47_kwave_input.h5  --verbose 2 --p_raw -s 1' returned non-zero exit status 1.

When running the binary manually, the CUDA executable returns the following:



┌───────────────────────────────────────────────────────────────┐
│                  kspaceFirstOrder-CUDA v1.3                   │
├───────────────────────────────────────────────────────────────┤
│ Git hash:            468dc31c2842a7df5f2a07c3a13c16c9b0b2b770 │
├───────────────────────────────────────────────────────────────┤
│ Reading simulation configuration:                        Done │
│ File format version:                                      1.2 │
│ Selected GPU device id:                                Failed │
└───────────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────────┐
│            !!! K-Wave experienced a fatal error !!!           │
├───────────────────────────────────────────────────────────────┤
│ Error: Insufficient CUDA driver version. The code needs CUDA  │
│        version 12.0 but 0.0 is installed.                     │
├───────────────────────────────────────────────────────────────┤
│                      Execution terminated                     │
└───────────────────────────────────────────────────────────────┘

This output makes it clear that CUDA is not installed on the instance/machine, which is what causes the execution error.
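Since the binary prints its fatal error inside a boxed section, the message could be pulled out of captured output and surfaced to the user. The following is only a rough sketch that assumes the box layout shown above; `extract_kwave_error` is a hypothetical helper and not part of k-wave-python.

```python
from typing import Optional


def extract_kwave_error(output: str) -> Optional[str]:
    """Return the text of the 'Error:' section from the boxed output, if present.

    Assumes the layout shown in this issue: the message starts on a line
    containing 'Error:' and continues until the next box border line.
    """
    message = []
    collecting = False
    for line in output.splitlines():
        # Drop the vertical box borders and the padding around the text.
        text = line.strip("│ \t")
        if text.startswith("Error:"):
            collecting = True
            text = text[len("Error:"):].strip()
        elif collecting and line.startswith(("├", "└")):
            # A horizontal border ends the error section.
            break
        if collecting and text:
            message.append(text)
    return " ".join(message) if message else None
```

Applied to the output above, this would yield the sentence about the insufficient CUDA driver version as a single string, which could then be included in the raised exception.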

@waltsims added the bug label Jan 15, 2024
@faridyagubbayli
Collaborator

Surprised to see that the error is not displayed to the user even when we use stdout and stderr redirections for the child process.

@waltsims added this to the v0.3.2 milestone Jan 26, 2024
@waltsims self-assigned this Feb 8, 2024
@waltsims
Owner Author

waltsims commented Feb 8, 2024

Solution: Always capture the binary output and only print to the terminal on failure or when verbosity is high.
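A minimal sketch of this proposal, not the code that was eventually merged: `run_binary` is a hypothetical name, and the `show_sim_log` parameter only mirrors the option visible in the traceback above. It assumes the command is passed as an argument list rather than a shell string.

```python
import subprocess
import sys


def run_binary(command, show_sim_log=False):
    """Run `command`, always capturing its output.

    The captured output is printed when `show_sim_log` is set or when the
    process exits with a non-zero status, so a failure is never silent.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stdout
        text=True,
    )
    if show_sim_log or result.returncode != 0:
        print(result.stdout, end="")
    if result.returncode != 0:
        # Attach the captured output so callers can inspect it as well.
        raise subprocess.CalledProcessError(
            result.returncode, command, output=result.stdout
        )
    return result.stdout
```

With this shape, a caller that catches `subprocess.CalledProcessError` can read the binary's own message from `e.output` instead of seeing only the exit status.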

waltsims added a commit that referenced this issue Feb 15, 2024
* capture stdout and show regardless on exception

* stream output to command line instead of capturing it

* update logic

* cleanup

* bugfix