Assumptions
- probable culprits:
  - chunk generation
  - lighting
- we use multithreading
- we don't support dynamic thread scaling
- chunk generation phases:
  - Base terrain generation (noise)
  - Terrain Additions (facets?)
  - Chunk Mesh generation
  - Chunk Lighting calculation
Next Steps
- clarify chunk generation phase assumption using code
- measure duration of chunk generation phases
- confirm which threads are "ours"
- find out which threads are used for what
- visualize chunk generation flow
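To measure the phase durations, a small thread-safe timer could wrap each phase. The following is only a sketch: `PhaseTimer`, the phase labels and the placeholder workload are hypothetical and not part of the engine; the real phases would have to be located in the code first.

```java
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical helper: accumulates wall-clock time and call counts per named phase. */
final class PhaseTimer {
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();

    /** Runs the work, records how long it took under the given phase name, returns its result. */
    <T> T time(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            record(phase, System.nanoTime() - start);
        }
    }

    /** Adds one measured call; safe to use from several chunk threads at once. */
    void record(String phase, long nanos) {
        totalNanos.computeIfAbsent(phase, k -> new LongAdder()).add(nanos);
        calls.computeIfAbsent(phase, k -> new LongAdder()).increment();
    }

    long callCount(String phase) {
        LongAdder count = calls.get(phase);
        return count == null ? 0 : count.sum();
    }

    /** Mean duration per call in milliseconds (total time divided by call count). */
    double meanMillis(String phase) {
        long n = callCount(phase);
        return n == 0 ? 0.0 : totalNanos.get(phase).sum() / 1_000_000.0 / n;
    }

    String report() {
        StringBuilder sb = new StringBuilder();
        for (String phase : new TreeSet<>(calls.keySet())) {
            sb.append(String.format("%-20s calls=%d mean=%.3f ms%n",
                    phase, callCount(phase), meanMillis(phase)));
        }
        return sb.toString();
    }

    // Placeholder workload standing in for a real generation phase.
    private static double busyWork(int iterations) {
        double acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += Math.sqrt(i);
        }
        return acc;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // Phase labels mirror the assumed phases listed above; they are not engine identifiers.
        for (String phase : new String[] {"base-terrain", "terrain-additions", "mesh", "lighting"}) {
            timer.time(phase, () -> busyWork(200_000));
        }
        System.out.print(timer.report());
    }
}
```

Keeping both the call count and the mean makes it possible to tell a phase that is slow per call apart from one that is merely called often.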
Collected Insights
Past Insights
Related Issues
- World Generation is slow and errorneous in multiplayer #4877 (comment)
- Substantial delay in connecting a player to a multiplayer server running JoshariasSurvival #4154 (comment)
Concurrency Providers & Consumers
(current state as created by @skaldarnar for Reactor effort)

Time-consuming tasks
(as compiled by @DarkWeird):
These take a lot of time:
- generating/loading chunks.
- Exposure node
- Shadow map.
- Nui
Chunks take a lot of memory, but we can hardly shrink them. Bytewise operations take a lot of time, and any object structure (like an octree) takes so much memory that the current implementation is optimal. Java modules may (or may not) enable aggressive optimization if we hide the chunk code in a separate module. An octree could also become more memory-efficient once Java implements compact object headers. (Or we hide chunks in Rust.)
Most frequently called methods
(as compiled by @BenjaminAmos via JFR)
I don't know if this is right, but a quick JFR recording seems to indicate void org.terasology.core.world.generator.facetProviders.DensityNoiseProvider.process(GeneratingRegion, float) as being called an awful lot (24% of the time). That doesn't necessarily mean that it's a bottleneck though (sampling does not measure execution time).
Interestingly, on the slow server recording the most frequently sampled methods were:
- com.google.common.collect.MapMakerInternalMap$HashIterator.nextInTable()
- java.lang.invoke.VarHandleObjects$Array.getVolatile(VarHandleObjects$Array, Object, int)
The HashIterator method was generally (indirectly) called from:
- void org.terasology.engine.rendering.logic.LightFadeSystem.update(float)
- void org.terasology.engine.logic.behavior.BehaviorSystem.update(float)
- void org.terasology.engine.logic.characters.CharacterSystem.update(float)
- void org.terasology.engine.logic.common.lifespan.LifespanSystem.update(float)
- void org.terasology.engine.logic.behavior.CollectiveBehaviorSystem.update(float)
Actually, it's those systems for both frequent methods.
Inside of those methods, the stack generally goes:
- boolean com.google.common.collect.Iterators$ConcatenatedIterator.hasNext()
- boolean java.util.Spliterators$1Adapter.hasNext()
- boolean java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(Consumer)
Which implicates java.util.stream again (a known big issue in general).
DensityNoiseProvider is not as big of an issue on that machine though. It's only 1.52% of samples.
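For reference, sample shares like the ones above can be recomputed from a recording with the JDK's `jdk.jfr.consumer` API. This is a sketch, not the method used for the numbers in this issue: `TopFrameTally` is a hypothetical name, and it counts only the top frame of each `jdk.ExecutionSample` event. The shares say how often a method was on top of the stack when sampled, not how long a single call takes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

/** Hypothetical helper: counts how often each method is the top frame of an execution sample. */
final class TopFrameTally {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java TopFrameTally.java <recording.jfr>");
            return;
        }
        Map<String, Long> counts = tally(Path.of(args[0]));
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.comparingByValue(Comparator.reverseOrder()));
        // Print the ten most frequently sampled methods with their share of all samples.
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(10, entries.size()))) {
            System.out.printf("%6.2f%%  %s%n", share(e.getValue(), total), e.getKey());
        }
    }

    /** Reads the whole recording into memory; large recordings may need a streaming read instead. */
    static Map<String, Long> tally(Path recording) {
        List<RecordedEvent> events;
        try {
            events = RecordingFile.readAllEvents(recording);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Long> counts = new HashMap<>();
        for (RecordedEvent event : events) {
            if (!"jdk.ExecutionSample".equals(event.getEventType().getName())) {
                continue;
            }
            RecordedStackTrace trace = event.getStackTrace();
            if (trace == null || trace.getFrames().isEmpty()) {
                continue;
            }
            RecordedFrame top = trace.getFrames().get(0);
            String name = top.getMethod().getType().getName() + "." + top.getMethod().getName();
            counts.merge(name, 1L, Long::sum);
        }
        return counts;
    }

    /** Share of samples in percent; 0 when there are no samples at all. */
    static double share(long count, long total) {
        return total == 0 ? 0.0 : 100.0 * count / total;
    }
}
```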
Could these stream-heavy stacks possibly be related to the following code?
Lines 253 to 260 in f907533
```java
public final Iterable<EntityRef> getEntitiesWith(Class<? extends Component>... componentClasses) {
    if (isWorldPoolGlobalPool()) {
        return Iterables.concat(globalPool.getEntitiesWith(componentClasses),
                sectorManager.getEntitiesWith(componentClasses));
    }
    return Iterables.concat(globalPool.getEntitiesWith(componentClasses),
            getCurrentWorldPool().getEntitiesWith(componentClasses), sectorManager.getEntitiesWith(componentClasses));
}
```
Lines 287 to 300 in f907533
```java
public final Iterable<EntityRef> getEntitiesWith(Class<? extends Component>... componentClasses) {
    return () -> entityStore.keySet().stream()
            //Keep entities which have all of the required components
            .filter(id -> {
                for (Class<? extends Component> component : componentClasses) {
                    if (componentStore.get(id, component) == null) {
                        return false;
                    }
                }
                return true;
            })
            .map(id -> getEntity(id))
            .iterator();
}
```
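If the stream-backed iterator turns out to be the cost, one option to try is a hand-written filtering iterator that skips `java.util.stream` entirely. The sketch below uses simplified stand-ins: `EntityPoolSketch`, `FilteringIterator`, the `Long` ids and the `String` component names are all hypothetical and only mimic the shape of the engine code. Whether this actually helps would need to be measured.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Predicate;

/** Simplified stand-in for an entity pool: ids mapped to the names of their components. */
final class EntityPoolSketch {
    private final Map<Long, Set<String>> componentsByEntity = new LinkedHashMap<>();

    void add(long id, String... components) {
        componentsByEntity.put(id, Set.of(components));
    }

    /** Same contract as the stream version: lazily yields ids that have all required components. */
    Iterable<Long> getEntitiesWith(String... required) {
        List<String> requiredList = List.of(required);
        return () -> new FilteringIterator<>(componentsByEntity.keySet().iterator(),
                id -> componentsByEntity.get(id).containsAll(requiredList));
    }

    public static void main(String[] args) {
        EntityPoolSketch pool = new EntityPoolSketch();
        pool.add(1L, "Location", "Light");
        pool.add(2L, "Location");
        pool.add(3L, "Light", "Location", "Lifespan");
        for (Long id : pool.getEntitiesWith("Location", "Light")) {
            System.out.println("entity " + id);
        }
    }
}

/** Filters a source iterator with a plain loop instead of a stream pipeline. */
final class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> keep;
    private T nextItem;
    private boolean hasNextItem;

    FilteringIterator(Iterator<T> source, Predicate<? super T> keep) {
        this.source = source;
        this.keep = keep;
        advance();
    }

    // Moves to the next element accepted by the predicate, if there is one.
    private void advance() {
        hasNextItem = false;
        nextItem = null;
        while (source.hasNext()) {
            T candidate = source.next();
            if (keep.test(candidate)) {
                nextItem = candidate;
                hasNextItem = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNextItem;
    }

    @Override
    public T next() {
        if (!hasNextItem) {
            throw new NoSuchElementException();
        }
        T result = nextItem;
        advance();
        return result;
    }
}
```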
Multi-Threading
@BenjaminAmos found the following list of threads indicated by JFR (threads marked with '*' are assumed to be "ours"):
C1
C2
*Chunk-Processing-0
*Chunk-Processing-Reactor
*Chunk-Unloader-0
*Chunk-Unloader-1
*Chunk-Unloader-2
*Chunk-Unloader-3
Common-Cleaner
FileSystemWatchService
FileSystemWatchService
Finalizer
G1
Java2D
JFR
JFR
JFR
JFR:
Logging-Cleaner
*main
nioEventLoopGroup-2-1
nioEventLoopGroup-3-1
nioEventLoopGroup-3-2
nioEventLoopGroup-3-3
Reference
*Saving-0
Service
Signal
SIGTERM
StreamCloser
Sweeper
*Thread
*Thread-1
*Thread-2
VM
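To confirm which threads are "ours" at runtime instead of reading them off a recording, the live threads can be listed and grouped by name. A minimal sketch: `ThreadCensus` is a hypothetical helper, and the name prefixes are an assumption taken from the starred entries above. `Thread.getAllStackTraces()` may not report every JVM-internal thread that JFR shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: groups live threads into "ours" and "other" by name prefix. */
final class ThreadCensus {
    // Assumed prefixes of engine-owned threads, based on the starred names in the list above.
    static final List<String> OUR_PREFIXES =
            List.of("Chunk-Processing", "Chunk-Unloader", "Saving", "main", "Thread");

    static boolean isOurs(String threadName) {
        for (String prefix : OUR_PREFIXES) {
            if (threadName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Snapshot of the currently live threads, grouped under the keys "ours" and "other". */
    static Map<String, List<String>> census() {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            String key = isOurs(thread.getName()) ? "ours" : "other";
            String label = thread.getName() + (thread.isDaemon() ? " (daemon)" : "");
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(label);
        }
        return groups;
    }

    public static void main(String[] args) {
        census().forEach((group, names) -> System.out.println(group + ": " + names));
    }
}
```

Note that the unnamed `Thread`, `Thread-1` and `Thread-2` entries only match because of the generic "Thread" prefix; giving those threads descriptive names would make the classification more reliable.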
Code Areas with Longest Per-Call Durations
Based on the statistical info in https://benjaminamos.github.io/TerasologyPerformanceTracyView/tracy-profiler.html
TODO: Refactor the individual code areas to improve their performance and reduce their per-call run time.
- LwjglGraphics.java:91 (4.4ms per call)
- WorldRendererImpl.java:343 (3.83ms per call)
- VoxelWorldSystem.java:91 (2.71ms per call)
- StateIngame.java:247 (2.14ms per call)
- LocalChunkProvider.java:179 (1.08ms per call)
- GameThread.java:78 (0.85ms per call)
- TerasologyEngine.java:506 (0.76ms per call)
- TerasologyEngine.java:521 (0.60ms per call)
- LwjglGraphics.java:88 (0.54ms per call)
- ShadowMapNode.java:156 (0.46ms per call)
References
Reactor Effort:
- Umbrella Item: Refactor usage of concurrency with reactor #4798
- Project Board: Reactor Adoption (view)
Potentially Helpful Tooling
- Java Flight Recording
- in-game Performance Monitor (F3 to open, F4 to cycle through individual tools)
  - Means Mode
  - Spikes Mode
  - Memory Allocations Mode
  - Running Threads Mode
  - World Renderer Mode
  - Rendering Execution Mode
- Enable the Monitoring option in Settings -> Autoconfig -> System Settings to show this information in a separate window (requires restarting the game). The Performance tab will only function when the in-game performance monitor is open (F3) and F4 has been pressed once.
- YouTube video showing chunk view of advanced monitoring tool: https://www.youtube.com/watch?v=dXdL9KDQKSg
Information Sources
- World Generation Tutorial: https://terasology.github.io/TutorialWorldGeneration
- Profiling Tutorial: https://terasology.github.io/TutorialProfiling/#/Monitoring%20Metrics%20with%20JMC%20-%20JDK%20Mission%20Control
Performance-related issues:
- World Generation is slow and errorneous in multiplayer #4877
- world loading takes an extremely long time with high CPU when many source modules are present #4393
- Substantial delay in connecting a player to a multiplayer server running JoshariasSurvival #4154
- deadlocked threads won't die after crash from RenderingModulesSettingScreen #3996
- Memory leak in the rendering pipline #2461
- Performance issue when auto-save triggers in constrained memory situations #1929
Tooling-related issues:
- Re-introduce the "Advanced" menu (for monitoring) with multiplayer compatibility #692
- Tutorial: Performance profiling setup and how to hunt improvements #3572
- improve visibility of current threads and tasks #4637
Follow-Up Actions
- improve documentation of in-game debug/analytics tooling