T1 flag optimizations (#172) #945

Merged on Jun 13, 2017 (19 commits)


rouault
Collaborator

@rouault rouault commented Jun 2, 2017

This patch set consists of:

  • porting Carl Hetherington's T1 patch that optimized the encoder side
  • fixing/implementing VSC encoding in it
  • adapting C. Hetherington's tricks to the decoder side
  • using macros in MQC decoding and the sig/ref/clean passes for better assembly generation
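
To illustrate the last point, here is a minimal sketch of the general idea, not the code from this patch set: decoder state is copied into local variables before the hot loop and written back once afterwards, so the compiler can keep it in registers. The `toy_reader_t` type and the `TOY_*` macros are hypothetical names made up for this example, and the toy reader extracts plain bits rather than doing MQ decoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bit reader; this is NOT OpenJPEG's MQC structure. */
typedef struct {
    const uint8_t *bp;  /* next byte to read */
    uint32_t c;         /* current byte being consumed */
    uint32_t ct;        /* number of unread bits left in c */
} toy_reader_t;

/* Copy the state into locals so the hot loop can work on registers. */
#define TOY_DOWNLOAD(r) \
    const uint8_t *bp = (r)->bp; \
    uint32_t c = (r)->c; \
    uint32_t ct = (r)->ct

/* Write the locals back to the structure once, after the loop. */
#define TOY_UPLOAD(r) \
    do { (r)->bp = bp; (r)->c = c; (r)->ct = ct; } while (0)

/* Read one bit (MSB first) into 'bit', touching only the locals. */
#define TOY_GET_BIT(bit) \
    do { \
        if (ct == 0) { c = *bp++; ct = 8; } \
        ct--; \
        (bit) = (c >> ct) & 1u; \
    } while (0)

/* Count the set bits among the next nbits bits of the stream.
 * The caller must guarantee that enough input bytes are available. */
static uint32_t toy_count_ones(toy_reader_t *r, size_t nbits)
{
    uint32_t ones = 0;
    size_t i;
    TOY_DOWNLOAD(r);
    for (i = 0; i < nbits; i++) {
        uint32_t bit;
        TOY_GET_BIT(bit);
        ones += bit;
    }
    TOY_UPLOAD(r);
    return ones;
}
```

Compared with a helper function that takes the structure pointer on every call, the macro form avoids reloading and storing the fields on each bit. How much this helps depends on the compiler and its inlining decisions.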

Results on the performance test suite (ref time is current master):

../../data/input/nonregression/kodak_2layers_lrcp.j2c, 3 iterations, 1 threads, DECOMPRESS: ref_time 2226 ms, new_time 1967 ms, (improvement) -11.6 %
../../data/input/nonregression/kodak_2layers_lrcp.j2c, 5 iterations, 2 threads, DECOMPRESS: ref_time 3019 ms, new_time 2781 ms, (improvement) -7.9 %
../../data/input/nonregression/kodak_2layers_lrcp.j2c, 10 iterations, 4 threads, DECOMPRESS: ref_time 4864 ms, new_time 4548 ms, (improvement) -6.5 %
../../data/input/conformance/p0_07.j2k, 3 iterations, 1 threads, DECOMPRESS: ref_time 5647 ms, new_time 4959 ms, (improvement) -12.2 %
../../data/input/conformance/p0_04.j2k, 10 iterations, 1 threads, DECOMPRESS: ref_time 703 ms, new_time 622 ms, (improvement) -11.5 %
../../data/input/nonregression/X_4_2K_24_185_CBR_WB_000.tif, 3 iterations, 1 threads, COMPRESS: ref_time 6588 ms, new_time 5291 ms, (improvement) -19.7 %
TOTAL: ref_time 23049 ms, new_time 20172 ms, (improvement) -12.5 %
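
For reference, the "(improvement)" column appears to be the relative change of the new time against the reference time, so negative values mean faster. The helper below is only a sketch of that arithmetic; `improvement_percent` is a hypothetical name and not part of the test suite.

```c
/* Relative change of new_ms against ref_ms, in percent.
 * A negative result means the new code is faster. */
static double improvement_percent(double ref_ms, double new_ms)
{
    return (new_ms - ref_ms) / ref_ms * 100.0;
}
```

For example, `improvement_percent(2226.0, 1967.0)` is about -11.6, which matches the first line above.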

On the (private) image MAPA.jp2 (recoded with standard flags), decoding time goes from 50.168 s to 45.030 s, a reduction of about 10% in decoding time as well.

rouault added 19 commits May 23, 2017 16:16
…uclouvain#172)

Ported from Carl Hetherington's work (actually through Matthieu Darbois's port
on top of OpenJPEG 2.1.0).

Can reduce total encoding time by 10-15%.

WARNING: VSC mode is not implemented, which is a temporary regression
that must be fixed.
…cros, so as to get better register allocation
…th row.

Do not set them when updating flags of the 1st row
This saves comparing the current pointer with the end-of-buffer pointer.
This yields at least a tiny speed improvement for raw decoding, and
smaller code size for MQC as well.
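
One common way to avoid a per-byte end-of-buffer comparison is to terminate the data with a sentinel that the reader already has to recognise. The sketch below assumes that approach, with a `0xFF 0xFF` terminator that looks like a marker. Whether the commit above does exactly this is an assumption. All `toy_*` names are hypothetical, and the payload is assumed never to contain `0xFF` followed by a byte greater than `0x8F`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: extra bytes the caller must reserve after the payload. */
#define TOY_SENTINEL_LEN 2

/* Write a 0xFF 0xFF terminator right after the len payload bytes.
 * buf must have room for len + TOY_SENTINEL_LEN bytes. */
static void toy_add_sentinel(uint8_t *buf, size_t len)
{
    buf[len] = 0xFF;
    buf[len + 1] = 0xFF;
}

/* Return the next byte, or 0xFF forever once a marker is reached.
 * No end pointer is needed: the marker test doubles as the end check. */
static uint8_t toy_next_byte(const uint8_t **bp)
{
    const uint8_t *p = *bp;
    if (p[0] == 0xFF && p[1] > 0x8F) {
        return 0xFF;    /* marker (or sentinel): do not advance */
    }
    *bp = p + 1;
    return p[0];
}
```

The cost is that every input buffer needs `TOY_SENTINEL_LEN` spare bytes at its end.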

This removes the remains of the raw.h/.c files, which were only used for
decoding. Encoding already uses the mqc structure.
Collaborator Author

rouault commented Jun 13, 2017

Has been merged into master.
