Proposal to optimize performance with cycle inversion #22

ghost · 2024-01-12T06:56:11Z

Apultra is an excellent compression algorithm with great compression ratios. As an optimization enthusiast, I would like to propose a way to significantly improve Apultra's speed while keeping the compressed output unchanged.

The idea is to invert certain loops, that is, to reverse their iteration order, in order to reduce processor cache misses. For example, the following loop can be inverted:

for (int n = 0; n < numElements; n++) {
  if (condition) {
    // ...
  }
}

Into:

for (int n = numElements - 1; n >= 0; n--) {
  if (condition) {
    // ... 
  }
}
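To make the transformation concrete, here is a minimal sketch in C. The function names and loop bodies are hypothetical and are not taken from Apultra. It assumes the index is an unsigned `size_t`, in which case the `n >= 0` test above would never become false, so the reversed loop uses `n-- > 0` instead. It also illustrates the condition under which inversion is safe: the counting loop gives the same result in either direction, while a loop that carries state between iterations (here, remembering the most recently seen match) does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical order-independent body: count elements above a threshold. */
size_t count_forward(const uint8_t *data, size_t numElements, uint8_t threshold) {
    assert(data != NULL || numElements == 0);
    size_t count = 0;
    for (size_t n = 0; n < numElements; n++) {
        if (data[n] > threshold) count++;
    }
    return count;
}

size_t count_inverted(const uint8_t *data, size_t numElements, uint8_t threshold) {
    assert(data != NULL || numElements == 0);
    size_t count = 0;
    /* `n-- > 0` visits numElements-1 down to 0 without unsigned wraparound. */
    for (size_t n = numElements; n-- > 0; ) {
        if (data[n] > threshold) count++;
    }
    return count;
}

/* Hypothetical order-dependent body: remember the most recently seen match. */
long last_hit_forward(const uint8_t *data, size_t numElements, uint8_t threshold) {
    long hit = -1;
    for (size_t n = 0; n < numElements; n++) {
        if (data[n] > threshold) hit = (long)n;
    }
    return hit; /* highest matching index, or -1 */
}

long last_hit_inverted(const uint8_t *data, size_t numElements, uint8_t threshold) {
    long hit = -1;
    for (size_t n = numElements; n-- > 0; ) {
        if (data[n] > threshold) hit = (long)n;
    }
    return hit; /* lowest matching index, or -1: a different answer */
}
```

The two `count_*` functions always agree, while the two `last_hit_*` functions disagree whenever more than one element matches, so each loop would need to be checked for such dependencies before it is inverted.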

Benefits of this approach:

Improves memory locality and cache utilization by accessing memory sequentially
Reduces branch mispredictions
Enables better instruction pipelining

With cycle inversion, we can optimize the hot loops that dominate compression time, such as the ones in apultra_optimize_forward, the hash search loops, and other auxiliary algorithms.

My benchmarks show 2-4x performance improvements with these optimizations while maintaining bit-exact output on multiple datasets.
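A claim like this could be checked with a small harness along the following lines. This is only a sketch under stated assumptions: `compress_fn` is a hypothetical function-pointer type and `copy_compress` is a placeholder that simply copies its input. Neither is part of the Apultra API. The real baseline and modified compressors would need to be wrapped to match this signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature, NOT the Apultra API: returns the number of bytes written. */
typedef size_t (*compress_fn)(const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_cap);

/* Placeholder "compressor" that copies its input, so the harness can be exercised. */
size_t copy_compress(const unsigned char *in, size_t in_len,
                     unsigned char *out, size_t out_cap) {
    if (in_len > out_cap) return 0;
    memcpy(out, in, in_len);
    return in_len;
}

/* Returns 1 when both implementations produce byte-identical output, else 0. */
int outputs_bit_exact(compress_fn baseline, compress_fn candidate,
                      const unsigned char *in, size_t in_len,
                      unsigned char *buf_a, unsigned char *buf_b, size_t cap) {
    assert(baseline != NULL && candidate != NULL);
    size_t len_a = baseline(in, in_len, buf_a, cap);
    size_t len_b = candidate(in, in_len, buf_b, cap);
    return len_a == len_b && memcmp(buf_a, buf_b, len_a) == 0;
}

/* CPU time in seconds spent on `runs` repetitions of `fn`. */
double time_runs(compress_fn fn, const unsigned char *in, size_t in_len,
                 unsigned char *out, size_t cap, int runs) {
    clock_t start = clock();
    for (int i = 0; i < runs; i++) {
        fn(in, in_len, out, cap);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Dividing the `time_runs` result for the baseline by the result for the candidate on the same input gives the speedup factor, and `outputs_bit_exact` confirms that the two outputs match byte for byte.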

It would be great if you could assess the viability of including cycle inversion in Apultra's core optimization pipeline. Please let me know if any other details are required. These micro-optimizations can significantly improve the real-world speed of compression without changing the output.

I have already optimized apultra_optimize_forward this way and got good results.

rasky · 2024-04-08T23:15:50Z

Do you have a link to a patch producing 2-4x performance improvements while maintaining bit-exact output? I'd be interested in testing it.
