- Benchmark performance
- Profile benchmark
- Optimize performance (7.91s → 3.35s, 58% faster)
- Handle bad SRTM data
Profiling showed 27.5% of CPU time in HashMap hashing (hash_one 20.4% +
Hasher::write 7.1%). TileCoord is only 4 bytes (two i16s), but the default
HashMap hasher is SipHash. Switched to FxHashMap from the rustc-hash
crate, which uses a fast multiply-xor hash.
- Baseline: 7.91s
- After: 5.16s
- Improvement: 35%
The ring search visits many consecutive cells in the same tile, but each
cache.elevation() call did a hash lookup. Added a CachedTile that
stores a raw pointer to the current tile's elevation data and skips the
hash lookup when the tile coordinate hasn't changed. The pointer is
stable because Vec heap data doesn't move when the HashMap resizes.
- Baseline: 5.16s
- After: 4.61s
- Improvement: 11%
The per-element loop doing i16::from_be_bytes was not vectorized by
LLVM. Replaced with copy_nonoverlapping followed by an in-place
i16::from_be loop, which LLVM vectorizes into SIMD byte-swap
instructions.
- Baseline: 4.61s
- After: 3.61s
- Improvement: 22%
The iteration interleaved top and bottom ring edges, alternating between
opposite ends on every cell. This defeated the CachedTile optimization,
forcing a hash lookup on nearly every cell. Reordering to iterate each
edge contiguously keeps consecutive cells in the same tile.
- Baseline: 3.61s
- After: 3.30s
- Improvement: 9%
Added a max_elevation field to Tile. When the search enters a new
tile, if max_elevation ≤ min_elevation, all cells in that tile are
skipped without reading elevation data, avoiding cache misses entirely.
- Benchmark (35km searches): no significant change (tiles nearby have qualifying peaks)
- Bag 55 (207km search, min 2550m): ~5s → 2.7s (46% faster) — most tiles in the search area are below 2550m and get skipped
div_euclid / rem_euclid compile to idiv instructions (~30
cycles each). Added a fast path: if the global row and column are both in
[0, stride), the cell is in the center tile and no division is needed.
Uses an unsigned-comparison trick to check both bounds in one instruction.
- Baseline: 3.35s
- After: 3.27s
- Improvement: 2%
Attempted to reuse a single Vec across ring iterations instead of
allocating a new one per ring. This caused an 8% regression, likely
because Vec::with_capacity with the exact size produces a better-fitting
allocation than reusing a buffer that has grown past what each ring needs.