30 changes: 14 additions & 16 deletions tutorials/performance/cpu_optimization.rst
@@ -1,12 +1,10 @@
:article_outdated: True

.. _doc_cpu_optimization:

CPU optimization
================

Measuring performance
=====================
---------------------

We have to know where the "bottlenecks" are to know how to speed up our program.
Bottlenecks are the slowest parts of the program that limit the rate that
@@ -18,7 +16,7 @@ lead to small performance improvements.
For the CPU, the easiest way to identify bottlenecks is to use a profiler.

CPU profilers
=============
-------------

Profilers run alongside your program and take timing measurements to work out
what proportion of time is spent in each function.
@@ -31,7 +29,7 @@ slow down your project significantly.
After profiling, you can look back at the results for a frame.

.. figure:: img/godot_profiler.png
.. figure:: img/godot_profiler.png
:align: center
:alt: Screenshot of the Godot profiler

Results of a profile of one of the demo projects.
@@ -51,7 +49,7 @@ For more info about using Godot's built-in profiler, see
:ref:`doc_debugger_panel`.

External profilers
~~~~~~~~~~~~~~~~~~
------------------

Although the Godot IDE profiler is very convenient and useful, sometimes you
need more power, and the ability to profile the Godot engine source code itself.
@@ -98,7 +96,7 @@ batching, which greatly speeds up 2D rendering by reducing bottlenecks in this
area.

Manually timing functions
=========================
-------------------------

Another handy technique, especially once you have identified the bottleneck
using a profiler, is to manually time the function or area under test.
@@ -126,7 +124,7 @@ time them as you go. This will give you crucial feedback as to whether the
optimization is working (or not).
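
As a rough illustration of manual timing, here is a minimal, language-agnostic
sketch written in Python rather than in one of the languages discussed in this
guide. The helper ``time_call`` and the stand-in workload ``update_enemies`` are
hypothetical names chosen for this example. The call is repeated several times
because a single measurement tends to be noisy.

```python
import time


def time_call(func, *args, repeats=5):
    """Run func(*args) several times; return (result, best elapsed seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        # Keep the fastest run, which is the least disturbed by other activity.
        best = min(best, elapsed)
    return result, best


def update_enemies(count):
    # Hypothetical stand-in for the function under test.
    return sum(i * i for i in range(count))


if __name__ == "__main__":
    _, seconds = time_call(update_enemies, 100_000)
    print(f"update_enemies() took {seconds * 1e6:.0f} microseconds (best of 5)")
```

Timing the same call before and after each change gives a direct comparison of
whether an attempted optimization helped.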

Caches
======
------

CPU caches are something else to be particularly aware of, especially when
comparing timing results of two different versions of a function. The results
@@ -159,7 +157,7 @@ rendering and physics. Still, you should be especially aware of caching when
writing GDExtensions.

Languages
=========
---------

Godot supports a number of different languages, and it is worth bearing in mind
that there are trade-offs involved. Some languages are designed for ease of use
@@ -170,7 +168,7 @@ language you choose. If your project is making a lot of calculations in its own
code, consider moving those calculations to a faster language.

GDScript
~~~~~~~~
^^^^^^^^

:ref:`GDScript <toc-learn-scripting-gdscript>` is designed to be easy to use and iterate,
and is ideal for making many types of games. However, in this language, ease of
@@ -179,7 +177,7 @@ calculations, consider moving some of your project to one of the other
languages.

C#
~~
^^

:ref:`C# <toc-learn-scripting-C#>` is popular and has first-class support in Godot. It
offers a good compromise between speed and ease of use. Beware of possible
@@ -188,13 +186,13 @@ common approach to workaround issues with garbage collection is to use *object
pooling*, which is outside the scope of this guide.

Other languages
~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^

Third parties provide support for several other languages, including `Rust
<https://github.com/godot-rust/gdext>`_.

C++
~~~
^^^

Godot is written in C++. Using C++ will usually result in the fastest code.
However, on a practical level, it is the most difficult to deploy to end users'
@@ -203,7 +201,7 @@ GDExtensions and
:ref:`custom modules <doc_custom_modules_in_cpp>`.

Threads
=======
-------

Consider using threads when making a lot of calculations that can run in
parallel to each other. Modern CPUs have multiple cores, each one capable of
@@ -222,7 +220,7 @@ debugger doesn't support setting up breakpoints in threads yet.
For more information on threads, see :ref:`doc_using_multiple_threads`.

SceneTree
=========
---------

Although Nodes are an incredibly powerful and versatile concept, be aware that
every node has a cost. Built-in functions such as `_process()` and
@@ -247,7 +245,7 @@ You can avoid the SceneTree altogether by using Server APIs. For more
information, see :ref:`doc_using_servers`.

Physics
=======
-------

In some situations, physics can end up becoming a bottleneck. This is
particularly the case with complex worlds and large numbers of physics objects.
43 changes: 23 additions & 20 deletions tutorials/performance/general_optimization.rst
@@ -1,12 +1,10 @@
:article_outdated: True

.. _doc_general_optimization:

General optimization tips
=========================

Introduction
~~~~~~~~~~~~
------------

In an ideal world, computers would run at infinite speed. The only limit to
what we could achieve would be our imagination. However, in the real world, it's
@@ -48,7 +46,7 @@ But in reality, there are several different kinds of performance problems:
Each of these is annoying to the user, but in different ways.

Measuring performance
=====================
---------------------

Probably the most important tool for optimization is the ability to measure
performance - to identify where bottlenecks are, and to measure the success of
@@ -57,19 +55,24 @@ our attempts to speed them up.
There are several methods of measuring performance, including:

- Putting a start/stop timer around code of interest.
- Using the Godot profiler.
- Using external third-party CPU profilers.
- Using GPU profilers/debuggers such as
`NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__
or `apitrace <https://apitrace.github.io/>`__.
- Checking the frame rate (with V-Sync disabled).
- Using the :ref:`Godot profiler <doc_the_profiler>`.
- Using :ref:`external CPU profilers <doc_using_cpp_profilers>`.
- Using external GPU profilers/debuggers such as
`NVIDIA Nsight Graphics <https://developer.nvidia.com/nsight-graphics>`__,
`Radeon GPU Profiler <https://gpuopen.com/rgp/>`__ or
`Intel Graphics Performance Analyzers <https://www.intel.com/content/www/us/en/developer/tools/graphics-performance-analyzers/overview.html>`__.
- Checking the frame rate (with V-Sync disabled). Third-party utilities such as
`RivaTuner Statistics Server <https://www.guru3d.com/files-details/rtss-rivatuner-statistics-server-download.html>`__
(Windows) or `MangoHud <https://github.com/flightlessmango/MangoHud>`__
(Linux) can also be useful here.
- Using an unofficial `debug menu add-on <https://github.com/godot-extended-libraries/godot-debug-menu>`__.

Be very aware that the relative performance of different areas can vary on
different hardware. It's often a good idea to measure timings on more than one
device. This is especially the case if you're targeting mobile devices.

Limitations
~~~~~~~~~~~
^^^^^^^^^^^

CPU profilers are often the go-to method for measuring performance. However,
they don't always tell the whole story.
@@ -87,7 +90,7 @@ As a result of these limitations, you often need to use detective work to find
out where bottlenecks are.

Detective work
~~~~~~~~~~~~~~
--------------

Detective work is a crucial skill for developers (both in terms of performance,
and also in terms of bug fixing). This can include hypothesis testing, and
@@ -119,7 +122,7 @@ Once you know which of the two halves contains the bottleneck, you can
repeat this process until you've pinned down the problematic area.
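
The halving procedure can be sketched roughly as follows. This is a
language-agnostic illustration in Python, not part of any Godot API:
``find_bottleneck`` and ``measure`` are hypothetical names, and the sketch
assumes that a single dominant bottleneck exists and that ``measure`` can
report the cost of running any subset of stages in isolation.

```python
def find_bottleneck(stages, measure):
    """Narrow an ordered list of stages down to one suspect by repeated halving.

    measure(sublist) is assumed to return the cost of running only those stages.
    """
    if not stages:
        raise ValueError("stages must not be empty")
    lo, hi = 0, len(stages)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Keep whichever half is more expensive and discard the other.
        if measure(stages[lo:mid]) >= measure(stages[mid:hi]):
            hi = mid
        else:
            lo = mid
    return stages[lo]


if __name__ == "__main__":
    # Made-up per-stage costs, purely for demonstration.
    costs = {"input": 1, "ai": 2, "physics": 40, "audio": 1, "render": 5}
    suspect = find_bottleneck(list(costs), lambda names: sum(costs[n] for n in names))
    print(suspect)
```

If the cost is spread evenly across many stages instead of concentrated in one,
halving like this can point at the wrong stage, so treat the result as a
hypothesis to confirm with a profiler or manual timing.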

Profilers
=========
---------

Profilers allow you to time your program while running it. Profilers then
provide results telling you what percentage of time was spent in different
@@ -133,7 +136,7 @@ and lead to slower performance.
For more info about using Godot's built-in profiler, see :ref:`doc_the_profiler`.

Principles
==========
----------

`Donald Knuth <https://en.wikipedia.org/wiki/Donald_Knuth>`__ said:

@@ -163,7 +166,7 @@ optimization is (by definition) undesirable, performant software is the result
of performant design.

Performant design
~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^

The danger with encouraging people to ignore optimization until necessary, is
that it conveniently ignores that the most important time to consider
@@ -178,7 +181,7 @@ will often run many times faster than a mediocre design with low-level
optimization.

Incremental design
~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^

Of course, in practice, unless you have prior knowledge, you are unlikely to
come up with the best design the first time. Instead, you'll often make a series
@@ -195,7 +198,7 @@ structures and algorithms for *cache locality* of data and linear access, rather
than jumping around in memory.

The optimization process
~~~~~~~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^^^^^^^

Assuming we have a reasonable design, and taking our lessons from Knuth, our
first step in optimization should be to identify the biggest bottlenecks - the
@@ -212,7 +215,7 @@ The process is thus:
3. Return to step 1.

Optimizing bottlenecks
~~~~~~~~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^^^^^^^^

Some profilers will even tell you which part of a function (which data accesses,
calculations) are slowing things down.
@@ -237,10 +240,10 @@ positive effect will be outweighed by the negatives of more complex code, and
you may choose to leave out that optimization.

Appendix
========
--------

Bottleneck math
~~~~~~~~~~~~~~~
^^^^^^^^^^^^^^^

The proverb *"a chain is only as strong as its weakest link"* applies directly to
performance optimization. If your project is spending 90% of the time in