T1 flag optimizations (#172) #945

Merged
merged 19 commits into from Jun 13, 2017
@rouault
Collaborator

@rouault rouault commented Jun 2, 2017

This patch set consists of:

  • porting Carl Hetherington's T1 patch, which optimized the encoder side
  • fixing/implementing VSC encoding in it
  • adapting C. Hetherington's tricks to the decoder side
  • using macros in MQC decoding and in the sig/ref/clean passes for better assembly generation
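
As a rough illustration of the last point, here is a minimal sketch of the general technique: copy the hot fields of a state struct into local variables through macros, work on the locals inside the loop, and write them back once at the end, so that the compiler has a better chance of keeping them in registers. This is a toy bit reader, not the MQ coder; `toy_state_t`, `TOY_LOAD`, `TOY_STORE`, `TOY_GET_BIT` and `toy_count_ones` are hypothetical names made up for this example and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder state, for illustration only (not an OpenJPEG type). */
typedef struct {
    uint32_t reg;        /* current byte being consumed */
    uint32_t bits_left;  /* unread bits remaining in reg */
    const uint8_t *next; /* next input byte */
} toy_state_t;

/* Copy the hot fields into locals so they can live in registers. */
#define TOY_LOAD(st, c, ct, bp)          \
    uint32_t c = (st)->reg;              \
    uint32_t ct = (st)->bits_left;       \
    const uint8_t *bp = (st)->next

/* Write the locals back once, after the loop. */
#define TOY_STORE(st, c, ct, bp)         \
    do {                                 \
        (st)->reg = (c);                 \
        (st)->bits_left = (ct);          \
        (st)->next = (bp);               \
    } while (0)

/* Read one bit (MSB first), refilling from the input when empty. */
#define TOY_GET_BIT(bit, c, ct, bp)      \
    do {                                 \
        if ((ct) == 0) {                 \
            (c) = *(bp)++;               \
            (ct) = 8;                    \
        }                                \
        (ct)--;                          \
        (bit) = ((c) >> (ct)) & 1u;      \
    } while (0)

/* Count the set bits among the next nbits bits.
 * The caller must guarantee that enough input is available. */
static unsigned toy_count_ones(toy_state_t *st, size_t nbits)
{
    unsigned ones = 0;
    size_t i;
    TOY_LOAD(st, c, ct, bp);

    for (i = 0; i < nbits; ++i) {
        uint32_t bit;
        TOY_GET_BIT(bit, c, ct, bp);
        ones += bit;
    }

    TOY_STORE(st, c, ct, bp);
    return ones;
}
```

Whether this actually helps depends on the compiler and on aliasing; the point of the sketch is only that the loop body touches locals rather than `st->...` on every iteration.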

Results on the performance test suite (ref time is current master):

../../data/input/nonregression/kodak_2layers_lrcp.j2c, 3 iterations, 1 threads, DECOMPRESS: ref_time 2226 ms, new_time 1967 ms, (improvement) -11.6 %
../../data/input/nonregression/kodak_2layers_lrcp.j2c, 5 iterations, 2 threads, DECOMPRESS: ref_time 3019 ms, new_time 2781 ms, (improvement) -7.9 %
../../data/input/nonregression/kodak_2layers_lrcp.j2c, 10 iterations, 4 threads, DECOMPRESS: ref_time 4864 ms, new_time 4548 ms, (improvement) -6.5 %
../../data/input/conformance/p0_07.j2k, 3 iterations, 1 threads, DECOMPRESS: ref_time 5647 ms, new_time 4959 ms, (improvement) -12.2 %
../../data/input/conformance/p0_04.j2k, 10 iterations, 1 threads, DECOMPRESS: ref_time 703 ms, new_time 622 ms, (improvement) -11.5 %
../../data/input/nonregression/X_4_2K_24_185_CBR_WB_000.tif, 3 iterations, 1 threads, COMPRESS: ref_time 6588 ms, new_time 5291 ms, (improvement) -19.7 %
TOTAL: ref_time 23049 ms, new_time 20172 ms, (improvement) -12.5 %
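
For reference, the "(improvement)" column appears to be the relative change of `new_time` against `ref_time`; this formula is inferred from the numbers above, not taken from the test script. A small sketch that reproduces it (`improvement_pct` is a hypothetical helper name):

```c
/* Relative change in percent; a negative value means the new build is faster.
 * Assumed formula: (new - ref) / ref * 100, inferred from the reported rows. */
static double improvement_pct(double ref_ms, double new_ms)
{
    return (new_ms - ref_ms) / ref_ms * 100.0;
}
```

For example, `improvement_pct(2226.0, 1967.0)` is about -11.6 and `improvement_pct(23049.0, 20172.0)` is about -12.5, matching the first row and the TOTAL row.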

On (private image) MAPA.jp2 (recoded with standard flags), decoding time goes from 50.168 s to 45.030 s, so decoding time is reduced by about 10% there as well.

rouault added 19 commits May 20, 2017
…#172)

Ported from Carl Hetherington's work (actually through Matthieu Darbois's port
on top of OpenJPEG 2.1.0)

Can reduce total encoding time by 10-15%

WARNING: VSC mode is not implemented yet; this is a temporary regression
that must be fixed.
…cros, so as to get better register allocation
…th row.

Do not set them when updating flags of the 1st row
This saves comparing the current pointer with the end of buffer pointer.
This results in at least a tiny speed improvement for raw decoding, and
smaller code size for MQC as well.

This removes the remains of the raw.h/.c files, which were only used for
decoding. Encoding already uses the mqc structure.
@detonin detonin added the in progress label Jun 2, 2017
@rouault rouault merged commit 9a9b069 into uclouvain:master Jun 13, 2017
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@detonin detonin removed the in progress label Jun 13, 2017
@rouault
Collaborator Author

@rouault rouault commented Jun 13, 2017

Has been merged into master.

@detonin detonin added the enhancement label Aug 3, 2017