Add methods to query GPU hardware properties (utilization percentage, temperature, core/memory clocks, power draw) #6820

Open
Calinou opened this issue May 6, 2023 · 0 comments

Describe the project you are working on

The Godot editor 🙂

Describe the problem or limitation you are having in your project

Some games like Returnal display a preview of GPU resources when the settings menu is open:

[Image: performance readout shown in the Returnal settings menu]
From left to right: frames per second, CPU utilization, GPU utilization, GPU temperature, GPU core clock speed, video memory utilization.

This is useful to know whether the GPU is being utilized correctly, without having to install any third-party software such as RTSS or MangoHud.

From the Godot side of things, displaying this information in the editor when the View Frame Time panel is visible would be useful. In the editor, the GPU may downclock as a result of low utilization, which can cause the displayed FPS to be lower than what the hardware can actually deliver (sometimes less than half). This is especially common on high-end GPUs.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Add methods to query GPU hardware properties (utilization percentage, temperature, core/memory clocks, power draw).

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Unfortunately, there is no standard API to retrieve this information (not even an OS-specific one). The implementation will therefore need to be specific to each OS and GPU vendor, although some code can be shared between Windows and Linux for NVIDIA (since both offer nvidia-smi out of the box).

The set of methods I propose is based on the most commonly needed metrics. In theory, these should be universally available in one way or another on all GPUs (integrated or dedicated). This is the list of methods I propose adding:

  • OS.get_gpu_utilization_ratio() (returns a float between 0.0 and 1.0, with 1.0 being 100% utilization)
  • OS.get_gpu_memory_used() (returns memory used in bytes [1] – see #4871 for total memory)
  • OS.get_gpu_temperature() (returns an integer GPU core temperature in degrees Celsius)
  • OS.get_gpu_current_core_clock() (returns an integer current GPU core clock in MHz)
  • OS.get_gpu_max_core_clock() (returns an integer maximum GPU core clock in MHz permitted by the hardware)
  • OS.get_gpu_current_memory_clock() (returns an integer current GPU memory clock in MHz)
  • OS.get_gpu_max_memory_clock() (returns an integer maximum GPU memory clock in MHz permitted by the hardware)
  • OS.get_gpu_power_draw() (returns an integer current power draw in watts)
  • OS.get_gpu_power_limit() (returns an integer power limit in watts – can be overridden by software)

Methods should return -1 if the value cannot be queried for any reason, so that the project developer can act accordingly. This also applies to unsupported platforms (Android, iOS, HTML5).
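As a rough illustration of how calling code might act on the -1 sentinel, here is a minimal sketch. It is written in Python purely for illustration; the helper names (`format_metric`, `format_utilization`) are hypothetical and not part of this proposal.

```python
UNAVAILABLE = -1  # Sentinel proposed above for values that cannot be queried.


def format_metric(label: str, value: float, unit: str) -> str:
    """Return a display string, hiding the number when the metric is unavailable."""
    if value == UNAVAILABLE:
        return f"{label}: N/A"
    return f"{label}: {value} {unit}"


def format_utilization(ratio: float) -> str:
    """Convert a 0.0-1.0 utilization ratio into a percentage string."""
    if ratio == UNAVAILABLE:
        return "GPU: N/A"
    return f"GPU: {round(ratio * 100)}%"
```

A settings menu could call these helpers for each metric and simply show "N/A" on platforms or GPUs where the query is not supported.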

Some of the return values may be approximate, as the hardware may limit the sampling frequency or precision. For example, memory used may be rounded to the nearest megabyte. This should be noted in the class reference.

The query backend to use should be chosen based on the GPU vendor name returned by RenderingServer.get_video_adapter_vendor(). For instance, this avoids querying NVIDIA GPU information when Godot is running on the integrated graphics of a system with hybrid graphics.
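The vendor-based selection could look roughly like the sketch below. The `pick_backend` function, the backend names, the platform strings and the substring matching on the vendor name are all assumptions made for illustration; the exact strings returned by RenderingServer.get_video_adapter_vendor() may differ between drivers. Combinations this proposal has no concrete approach for yet (AMD/Intel on Windows, macOS) are mapped to "none" here.

```python
def pick_backend(vendor_name: str, platform: str) -> str:
    """Return a (hypothetical) backend name, or "none" if metrics are unsupported."""
    vendor = vendor_name.lower()
    # Platforms listed as unsupported in the proposal always return "none".
    if platform in ("android", "ios", "web"):
        return "none"
    # nvidia-smi is only considered when the active adapter is an NVIDIA GPU.
    if "nvidia" in vendor and platform in ("windows", "linux"):
        return "nvidia-smi"
    # Mesa drivers on Linux: read kernel-exposed files instead.
    if platform == "linux" and ("amd" in vendor or "intel" in vendor):
        return "linux-kernel-files"
    return "none"
```

Keying on the active adapter's vendor rather than on what hardware is installed means a laptop running on its integrated GPU never spawns nvidia-smi.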

NVIDIA on Windows/Linux

You can use the nvidia-smi command line tool. A one-off invocation is fairly slow (~15 ms per call), so calling it in a blocking manner will cause stuttering during gameplay (and therefore skew the readouts). Instead, the tool should be started once in "live" (looping) mode, with each line of its output read as it is printed. As a result, this proposal effectively requires #216 in order to be implemented efficiently.

For instance, to get all the fields proposed here as a set of comma-separated values that is continuously printed every 500 ms:

nvidia-smi -lms 500 \
    --query-gpu=utilization.gpu,memory.used,temperature.gpu,clocks.current.graphics,clocks.max.graphics,clocks.current.memory,clocks.max.memory,power.draw,power.limit \
    --format=csv,noheader,nounits

The output can then be used to populate various properties in the OS singleton, which can then be returned directly by the getter methods (so that querying them in _process() has no performance penalty).
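To make the mapping from one output line to the proposed return values concrete, here is a minimal parsing sketch in Python. It assumes that, with `nounits`, nvidia-smi reports utilization in percent, memory in MiB, clocks in MHz and power in watts, and that unsupported fields are printed as non-numeric placeholders such as `[N/A]`. The `GpuMetrics` class and `parse_line` function are hypothetical names, not an existing API.

```python
from dataclasses import dataclass
from typing import Optional

UNAVAILABLE = -1
MIB = 1024 * 1024


@dataclass
class GpuMetrics:
    """Cached values that the proposed getters would return directly."""
    utilization_ratio: float = UNAVAILABLE
    memory_used: int = UNAVAILABLE
    temperature: int = UNAVAILABLE
    current_core_clock: int = UNAVAILABLE
    max_core_clock: int = UNAVAILABLE
    current_memory_clock: int = UNAVAILABLE
    max_memory_clock: int = UNAVAILABLE
    power_draw: int = UNAVAILABLE
    power_limit: int = UNAVAILABLE


def _number(field: str) -> Optional[float]:
    """Parse one CSV field; non-numeric placeholders become None."""
    try:
        return float(field.strip())
    except ValueError:
        return None


def _int_or_unavailable(value: Optional[float], scale: int = 1) -> int:
    """Round to an integer and apply a unit scale, or fall back to -1."""
    return UNAVAILABLE if value is None else round(value) * scale


def parse_line(line: str) -> GpuMetrics:
    """Turn one line of the CSV output into a GpuMetrics instance."""
    fields = [_number(f) for f in line.split(",")]
    if len(fields) != 9:
        return GpuMetrics()  # Malformed line: report everything as unavailable.
    util, mem, temp, core, core_max, mclk, mclk_max, draw, limit = fields
    return GpuMetrics(
        utilization_ratio=UNAVAILABLE if util is None else util / 100.0,
        memory_used=_int_or_unavailable(mem, MIB),  # MiB -> bytes (assumed unit)
        temperature=_int_or_unavailable(temp),
        current_core_clock=_int_or_unavailable(core),
        max_core_clock=_int_or_unavailable(core_max),
        current_memory_clock=_int_or_unavailable(mclk),
        max_memory_clock=_int_or_unavailable(mclk_max),
        power_draw=_int_or_unavailable(draw),
        power_limit=_int_or_unavailable(limit),
    )
```

Because each field falls back to -1 independently, a GPU that does not report power draw would still expose its temperature and clocks.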

Since the tool must run continuously, I suggest only starting nvidia-smi once any of the metrics is queried for the first time, or gating the GPU measurement methods behind an OS.set_query_gpu_hardware_metrics(true) method (similar to viewport render time measurements in RenderingServer). This avoids running nvidia-smi when it's not needed.
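The "start on first query" idea can be sketched as follows, again in Python for illustration only. The `LiveMetricsReader` class is hypothetical: it launches a long-running command the first time a value is requested, reads its output on a background thread, and lets callers fetch the latest line without blocking. An engine-side implementation would need an equivalent non-blocking pipe read, which is what #216 is about.

```python
import subprocess
import threading
from typing import IO, List, Optional


class LiveMetricsReader:
    """Starts a long-running command lazily and caches its latest output line."""

    def __init__(self, command: List[str]) -> None:
        self._command = command
        self._process: Optional[subprocess.Popen] = None
        self._failed = False
        self._latest: Optional[str] = None
        self._lock = threading.Lock()

    def _pump(self, stdout: IO[str]) -> None:
        # Runs on a background thread; blocks on each line so callers never do.
        for line in stdout:
            with self._lock:
                self._latest = line.strip()

    def latest_line(self) -> Optional[str]:
        """Return the most recent line, starting the process on the first call."""
        if self._process is None and not self._failed:
            try:
                self._process = subprocess.Popen(
                    self._command,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.DEVNULL,
                    text=True,
                )
            except OSError:
                self._failed = True  # Tool missing: do not retry on every call.
                return None
            threading.Thread(
                target=self._pump, args=(self._process.stdout,), daemon=True
            ).start()
        with self._lock:
            return self._latest

    def stop(self) -> None:
        """Terminate the child process if it was ever started."""
        if self._process is not None:
            self._process.terminate()
            self._process.wait()
            self._process = None
```

Until the first line arrives, `latest_line()` returns `None`, which the getters would translate into the -1 "unavailable" value.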

AMD/Intel on Linux (Mesa drivers in general)

Some files can be read from kernel pseudo-filesystems such as /proc and /sys to retrieve these statistics. However, some of them may not be readable by non-root users due to security restrictions, and some may not be available at all without modifying kernel boot parameters.
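A minimal sketch of the file-based approach, in Python for illustration: read a single integer from a kernel-exposed file and fall back to -1 when the file is missing, unreadable or malformed. The function names are hypothetical. The path in the comment (the amdgpu driver's `gpu_busy_percent` file) is only an example; the card index and the set of files available vary by system and driver.

```python
from pathlib import Path

UNAVAILABLE = -1


def read_int_file(path: str) -> int:
    """Read one integer from a file, returning -1 on any failure."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # OSError covers missing files and permission errors (non-root users).
        return UNAVAILABLE


def gpu_utilization_ratio(busy_percent_path: str) -> float:
    """Convert a 0-100 busy percentage file into the proposed 0.0-1.0 ratio."""
    percent = read_int_file(busy_percent_path)
    if percent == UNAVAILABLE:
        return UNAVAILABLE
    return percent / 100.0


# Example (assumed path, amdgpu only):
# gpu_utilization_ratio("/sys/class/drm/card0/device/gpu_busy_percent")
```

Treating permission errors the same way as missing files keeps the behavior consistent with the "return -1 if the value cannot be queried for any reason" rule.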

AMD/Intel (Windows), macOS

This is likely the hardest part, and will require figuring out how various open source utilities have achieved this.

If this enhancement will not be used often, can it be worked around with a few lines of script?

The approach described here can be implemented by a script, but doing so efficiently requires #216 (reading the output of a running process without blocking). Such a script would also be fairly complex.

Is there a reason why this should be core and not an add-on in the asset library?

This is about improving the performance troubleshooting and optimization experience for players and developers alike.

Footnotes

  1. The return value should be in bytes for consistency with the return value proposed in #4871 (https://github.com/godotengine/godot-proposals/issues/4871).
