Smart chunk unloading #3151
Conversation
SafwatHalaby
reviewed
Apr 19, 2016
SafwatHalaby
referenced this pull request
Apr 19, 2016
Closed
Save chunks more often to reduce RAM usage #3142
worktycho
reviewed
Apr 20, 2016
worktycho
reviewed
Apr 20, 2016
worktycho
Apr 20, 2016
Member
This looks good. Other than my two comments regarding the trade-offs taken, this looks like a good feature.
SafwatHalaby
referenced this pull request
Apr 20, 2016
Closed
What is the purpose of a cChunkLayer? #2559
SafwatHalaby
reviewed
Apr 20, 2016
Schwertspize
Apr 20, 2016
Contributor
But is it possible? Do chunks in RAM have a constant size, or a size without too wild a deviation?
I think it is always possible. If the chunk size (in memory) varies from chunk to chunk, I would just say we configure a threshold in the config; if it is exceeded, the best chunk to unload gets selected and unloaded, and if the memory limit is still exceeded, recurse. Doesn't that work well enough?
worktycho
Apr 20, 2016
Member
Chunks can vary in size from a couple of hundred bytes to ~160KiB. Average size and variance depend on the world generation. The problem with a loop is that you'd have to redo the total memory calculation for every iteration of the loop. You might get an estimate of per-chunk usage if we made cChunkData expose the amount of memory used for storing segments, though.
SafwatHalaby
Apr 20, 2016
Member
@Schwertspize The problem is that calculating the allocated RAM accurately seems hard. @worktycho wrote this rough formula:
LoadedChunks * (sizeof(cChunk) + (Internal Allocations Factor)) + (Allocated Segments) * sizeof(cChunkData)
@worktycho, the "Allocated segments" part confuses me. What is it exactly? Did you mean this?
LoadedChunks * (sizeof(cChunk) + (Internal Allocations Factor)) + (Allocated Sections) * sizeof(cChunkData::sChunkSection)
SafwatHalaby
Apr 20, 2016
Member
@worktycho we don't have to recalculate the total per loop iteration. We can do this incrementally, like I did with the chunk counting.
worktycho
Apr 20, 2016
Member
The thing about allocated segments is exactly what I meant.
The problem with doing it incrementally is the internal allocations factor. We can get an approximate memory usage by just having a constant value which estimates how much is allocated in internal allocations. This would probably give something like 10% error. If we are happy with that sort of error, then doing it incrementally is easy, but you need to expose some way of getting the chunkdata segment counts for each chunk.
If we want anything more precise, we'd need to start tracking STL memory allocations, which would mean a per-chunk allocator wrapper if we want per-chunk usage statistics.
What are "Allocated segments"?
worktycho
Apr 20, 2016
Member
Instances of cChunkData::cSection. They hold a 16x16x16 section of a chunk's block, meta and light data. At 10KiB each, they are the biggest single objects in a chunk.
SafwatHalaby
Apr 20, 2016
Member
I think we don't have to get more accurate than that. Totally ignoring the Internal Allocations Factor should be fine too. We just need to warn the user that the limit is only an estimate and that chunks will consume somewhat more.
Schwertspize
Apr 20, 2016
Contributor
I'd say a 10% error is good enough for checking whether the next chunk should be freed too. For the threshold itself we should use an accurate measurement, and check it only every 5 minutes or so, but within that "iteration loop" to get below the threshold we can keep the 10% estimate. I mean, it's not a maximum RAM usage; it's the average RAM usage, and it could peak below and above.
worktycho
Apr 20, 2016
Member
If we want an accurate measurement for the threshold, then we still need STL allocation tracking, but we only need to keep one per-world allocator, rather than a per-chunk allocator.
SafwatHalaby
Apr 20, 2016
Member
Got the initial RAM implementation in place and rebased. The chunk unload code still does not rely on RAM as of now. Stay tuned.
SafwatHalaby
Apr 20, 2016
Member
I have a working implementation. We should see how bad the estimation is, and potentially improve it by:
- Calling `Increment/DecrementApproximateChunkRAM` wherever appropriate
- Taking `Internal Allocations Factor` into account
- @worktycho's internal allocator tracking thing?
SafwatHalaby
Apr 20, 2016
Member
The CI error is strange.
In file included from /home/travis/build/cuberite/cuberite/src/ChunkData.cpp:8:
In file included from /home/travis/build/cuberite/cuberite/src/World.h:29:
/home/travis/build/cuberite/cuberite/src/ClientHandle.h:21:10: fatal error:
'json/json.h' file not found
#include "json/json.h"
SafwatHalaby
Apr 20, 2016
Member
Can someone help with the CI? I don't think it's my PR that's the problem.
Builds fine on my server (CentOS 6)
Seadragon91
Apr 21, 2016
Contributor
It compiles fine with clang 3.5. Compiling with clang 3.4 fails on my side.
Was the clang 3.4 error identical to the CI error described above?
yay, xoft isn't dead!
madmaxoft
reviewed
May 16, 2016
@@ -37,7 +35,7 @@ class cChunkData
     struct sChunkSection;
-    cChunkData(cAllocationPool<cChunkData::sChunkSection> & a_Pool);
+    cChunkData(cAllocationPool<cChunkData::sChunkSection> & a_Pool, cMemoryCounter & a_WorldMemoryCounter);
madmaxoft
May 16, 2016
Member
This could use a bit more commenting - what are the parameters, what they're used for etc.
madmaxoft
reviewed
May 16, 2016
@@ -513,6 +531,8 @@ class cChunkMap
     cWorld * m_World;
     size_t m_ChunkCounter;
madmaxoft
May 16, 2016
Member
xoft kinda wishes he was dead, rather than reading the entirety of the >220 comments here :P
Well, just you wait for #3115.
Can anything terrible happen when servers crash mid-save?
SafwatHalaby
May 17, 2016
Member
I've removed my posts discussing the other possible strategies. I think I'll remove strategy 2 and get the "minimum viable product" merged for now.
worktycho
May 18, 2016
Member
If a server crashes mid-save it can corrupt the save files. Unfortunately there is nothing we can do about that without completely redoing the chunk format.
madmaxoft
May 18, 2016
Member
@worktycho There is a way to make server crashes mid-save less catastrophic. Currently the chunk data is overwritten in the file in-place. If we changed that so that the chunk data always uses free space in the file, then we lower the risk: the server first writes the new data, then updates the "pointer" to that data in the file's header, leaving a much smaller surface for problems. This is, however, off-topic; let's not discuss it further here.
SafwatHalaby
May 19, 2016
Member
That's my main concern with this PR. It can save more often, which may increase the chance of chunk corruption, especially when maxRam is set low.
Theoretical questions:
The storage thread is seldom the one that crashes. Can we somehow make the storage thread "defer" the termination in the event of a crash, finish saving the current chunk, and only stop afterwards? I am not very knowledgeable about OS crash handling or multithreading, so this could be impossible. If the storage thread ran in an entirely different process, I'd imagine it would be possible? Enlighten me.
worktycho
May 19, 2016
Member
I'm not sure about Windows, but on Unix at least, you can intercept segfaults. But we would have to make sure that a segfault on a storage thread did not try to continue saving on that thread. And it's dangerous to do unless you are triggering segfaults on purpose, because lots of locks will be held by the dead thread.
Ideally the server doesn't crash, but I guess that's an ideal.
Any updates on this one? @LogicParrot
SafwatHalaby
Jul 27, 2016
Member
This isn't abandoned, just delayed. I'd like to study how serious the likelihood of chunk corruption upon crashes is when saves are done more often.
Fixing #2565 conflicts.
Sorry.
SafwatHalaby
Aug 5, 2016
Member
I decided to divert my attention to implementing the rest of the mobs and hunting cClienthandle crash bugs for now, because those seem far higher priority than this PR.
In the meantime, I propose we alleviate the concern outlined in #3142 by following @tigerw 's advice and simply making the save interval shorter (e.g. 3 minutes) or configurable.
SafwatHalaby
Aug 11, 2016
Member
And for Cuberite, already limited resources will be diverted to maintaining another extra component which is very unlikely to affect what a user sees, and very likely to cause more bugs.
The more I think about this, the more I lean toward @tigerw's opinion. This is perfect in theory and it utilizes RAM optimally, but it adds code complexity and we're limited in developer time, and a much simpler solution can do the job well enough and solve e.g. the Raspberry Pi fast-flying issues mentioned in #3142.
I don't think a user will notice any major difference between the solution outlined below and this PR, even though the solution below is extremely simple.
A difference will be noticeable on high-load, huge servers, but we're not there yet. And I think we should opt for the simpler solution for now. I'm closing this, but I'll keep the code, and perhaps it will be useful sometime in the future.
The alternative:
- `MaxUnusedChunks` in settings.ini
- If current unused chunks > `MaxUnusedChunks`, call a save cycle.
- `MaxUnusedChunks` is set low for Raspberry Pis.
SafwatHalaby
closed this
Aug 11, 2016
SafwatHalaby
Aug 11, 2016
Member
We should salvage the Chunk.cpp state changes though, because I think they make the chunk code clearer.
bearbin
Aug 11, 2016
Member
Your proposed system does have the downside of not allowing a chunk/RAM limit which would be attractive for GSPs.
SafwatHalaby
Aug 11, 2016
Member
What's GSP?
I am perfectly aware the proposed system is not as good performance-wise, but it's an order of magnitude simpler and that is attractive.
bearbin
Aug 11, 2016
Member
https://www.spigotmc.org/wiki/glossary/#gsp
Most GSPs sell packages by RAM, and chunk numbers are effectively proportional to RAM consumption.
SafwatHalaby
Aug 11, 2016
Member
Well, if MaxUnusedChunks is set to 0, freeing becomes automatically aggressive, so the absolute minimum RAM is used. Couple that with setting a low player limit, and you get a reasonable, though not numerically specified, RAM limit.
SafwatHalaby
referenced this pull request
Aug 29, 2016
Merged
Configurable dirty unused chunk cap to avoid RAM overuse #3359
SafwatHalaby
deleted the
SafwatHalaby:smartChunkRam
branch
Aug 31, 2016
SafwatHalaby
added
the
status/extract useful bits
label
Feb 24, 2017
SafwatHalaby
Oct 15, 2017
Member
Ping. The "useful bits" are: Chunk has an enum (saving, loading, etc) rather than several flags. Makes chunk states clearer. If someone has the time, this is useful to extract and merge separately.
SafwatHalaby commented Apr 19, 2016 (edited)
Work in progress. This is a smarter replacement for #3142.

- `MaxChunkRAM` setting, configurable in `settings.ini` (default: 30 MiB)
- `MinimizeRam` setting, configurable in `settings.ini` (default: 1)

Terminology:
- Dirtiness:
- Usage:
- Other:

The server has two saving/unloading strategies.

Strategy 1: `MinimizeRam` is 0:
The server is RAM-greedy, and uses all the RAM specified in `MaxChunkRAM` as a chunk cache in order to improve performance.
- Chunks are saved/unloaded as late as possible. Disk I/O is significantly reduced. Performance is improved. RAM usage is typically static and will not go below or above `MaxChunkRAM`.
- Upon reaching `MaxChunkRAM`, a minimum amount of clean unused chunks is unloaded in order to preserve the `MaxChunkRAM` limit. The chunks are chosen randomly.
- If `MaxChunkRAM` still cannot be maintained, a save cycle is forced, allowing more chunks to be unloaded.
- If the `MaxChunkRAM` limit cannot be maintained even after all measures were taken, the "aggressive" strategy is used. This happens when the RAM required by the used chunks exceeds `MaxChunkRAM`.

Strategy 2: `MinimizeRam` is 1: (Might remove this strategy)
The server is RAM-economic, and attempts to never reach `MaxChunkRAM`. Clean unused chunks are unloaded as soon as possible.
- This makes `MaxChunkRAM` much harder to reach in comparison with the previous scheme.
- Upon reaching `MaxChunkRAM`, a save cycle is forced, allowing more chunks to be unloaded.
- If the `MaxChunkRAM` limit cannot be maintained, the "aggressive" strategy is used. This happens when the RAM required by the used chunks exceeds `MaxChunkRAM`.

"Aggressive" strategy:
- Used when the RAM required by the used chunks exceeds `MaxChunkRAM`, in either strategy.

Additionally, this closes #3166.