refactor: extract ground generation into dedicated module#848
Conversation
Move the ~670-line ground layer generation loop (surface blocks, vegetation, shorelines, underground fill, region flushing) from data_processing.rs into a new ground_generation.rs module. This keeps data_processing focused on OSM element orchestration while ground_generation owns the terrain pass.
There was a problem hiding this comment.
Pull request overview
Refactors world generation by extracting the full “ground layer” terrain pass (surface/sub-surface selection, shoreline blending, vegetation placement, underground fill, and region flushing) out of data_processing.rs into a dedicated ground_generation module, leaving data_processing focused on OSM element orchestration.
Changes:
- Added a new
ground_generationmodule that owns the chunk-ordered ground/terrain generation loop and incremental region flushing. - Updated
data_processing::generate_world_with_optionsto callground_generation::generate_ground_layer(...)after dropping caches. - Registered the new module in
main.rs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/main.rs | Registers the new ground_generation module in the crate. |
| src/ground_generation.rs | New module containing the extracted ground generation pass and region flushing logic. |
| src/data_processing.rs | Removes inlined ground-generation logic and delegates to ground_generation::generate_ground_layer. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…rc<Ground> Move the MIN_Y constant from data_processing into world_editor::common (which already had a private duplicate) and re-export it. This removes the reverse dependency from ground_generation back into data_processing. Also narrow generate_ground_layer's ground parameter from &Arc<Ground> to &Ground since it only needs shared read access.
…data When flush_region() saves a region to disk and removes it from memory, subsequent block writes (e.g. tree canopy leaves crossing a region boundary) would silently re-create the region via entry().or_default(). The re-created region contained only the stray blocks, and save() would then overwrite the correct .mca file with this near-empty one — causing entire regions to appear missing in the generated world. Fix: track flushed region coordinates in a HashSet and skip all write operations (set_block, set_block_with_properties, set_block_if_absent, fill_column) targeting already-flushed regions. The skipped writes are cosmetic (tree leaves at region edges) so there is no visual impact. Closes the "partially empty world" bug introduced by the incremental region flushing in PR #793.
|
retrigger-benchmark |
|
⏱️ Benchmark run finished in 0m 44s 📈 Compared against baseline: 30s 📅 Last benchmark: 2026-03-26 20:19:05 UTC You can retrigger the benchmark by commenting |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…nopies Tree canopies extend up to 3 blocks (ROUND3_PATTERN) into neighboring regions. The previous flush logic serialized a region as soon as its last chunk_x completed, before the next chunk could place trees whose canopies spill back. Those spilled writes were silently dropped by the flushed_regions guard, causing truncated trees at every region boundary. Fix: instead of flushing at the last chunk of a region, flush when the first chunk of the NEXT region completes. This one-chunk (16-block) delay exceeds the maximum 3-block tree canopy radius, so all cross- region writes land before the flush. Also add a debug-only warning in the flushed_regions guard so unexpected writes to flushed regions are surfaced during development.
|
retrigger-benchmark |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The is_flushed() HashSet check on every set_block / set_block_if_absent / fill_column / set_block_with_properties call added measurable overhead on the hottest code path (hundreds of millions of invocations). Instead of guarding individual writes, let the rare cross-region writes (tree canopies) re-create stale region entries in memory — they hold only a few leaf blocks and cost negligible RAM. Before save(), call purge_stale_regions() once to remove these entries so the correct .mca files written during incremental flushing are never overwritten. This eliminates the per-write overhead entirely while preserving correctness.
|
retrigger-benchmark |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
retrigger-benchmark |
1 similar comment
|
retrigger-benchmark |
The incremental flush_region approach serialized regions sequentially during ground generation instead of in parallel at save time. Benchmarks show this causes a consistent ~47% time regression (30s → 44s) while actually INCREASING peak memory (~935 MB → ~1060 MB) due to serialization buffer overhead. The memory reduction only helps for extremely large areas where in-memory region data alone exceeds available RAM, but for those areas the time penalty is proportionally worse (linear scaling). This reverts the flush_region calls and all associated bookkeeping (flushed_regions tracking, purge_stale_regions, delayed flush). The flush_region method itself is removed as dead code. Regions are kept in memory during generation and saved in parallel via rayon in save_java(), restoring the original performance characteristics.
|
retrigger-benchmark |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/world_editor/mod.rs:734
- This PR removes WorldEditor::flush_region(), which previously enabled incremental region serialization to cap memory usage during ground generation. Since ground_generation now keeps all regions in memory until save_java(), large bboxes may regress in peak RAM (potentially OOM). If streaming/flush behavior is still required, keep flush_region (even as pub(crate)) and have ground generation call it when using JavaAnvil.
/// Sets a block only if no modification has been recorded yet at this
/// position (i.e. the in-memory overlay still holds AIR).
///
/// This is faster than `set_block_absolute` with `None` whitelists/blacklists
/// because it avoids the double HashMap traversal.
#[inline]
pub fn set_block_if_absent_absolute(&mut self, block: Block, x: i32, absolute_y: i32, z: i32) {
if !self.xzbbox.contains(&XZPoint::new(x, z)) {
return;
}
self.world.set_block_if_absent(x, absolute_y, z, block);
}
/// Returns true if a non-AIR block exists at the given absolute coordinates.
#[inline]
pub fn block_exists_absolute(&self, x: i32, absolute_y: i32, z: i32) -> bool {
self.world.get_block(x, absolute_y, z).is_some()
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Move the ~670-line ground layer generation loop (surface blocks, vegetation, shorelines, underground fill) from data_processing.rs into a new ground_generation.rs module. This keeps data_processing focused on OSM element orchestration while ground_generation owns the terrain pass.
Remove flushing since it won't prevent OOM crashes due to still existing memory spikes and massive generation time increase.