Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Omnibus #683

Closed
wants to merge 8 commits into from
Closed

Omnibus #683

wants to merge 8 commits into from

Conversation

boxerab
Copy link
Contributor

@boxerab boxerab commented Jan 5, 2016

Omnibus

This branch is going to pull in all of my pending pull requests into one single branch.
It adds performance improvements and better memory management.
Also, memory mapped and memory buffered streams are supported.

Measurements

I compared master and omnibus branches for memory usage and performance.
(my system is a 4 year old 4-core i7 3770 with DDR3, running Windows 7 64)

Peak Memory Usage

For decode of 100 DCI frames:

  1. Master : 98 MB
  2. Omnibus 35 MB

So, memory usage has dropped to almost one third.

Performance:

Omnibus uses OpenMP to multi-thread the library.

For encode/decode of 100 DCI frames:

  1. Master (single threaded)

Encode: 0.85 FPS
Decode : 2.5 FPS

  1. Omnibus (multi threaded)

Encode: 5.5 FPS
Decode : 11.2 FPS

On upcoming 8 core FinFET desktop CPUs, this should get real time decode, and
~15 FPS encode.

@XenonofArcticus
Copy link

This is fantastic work and I look forward to being able to use it.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 5, 2016

Thanks, Xenon !

@homme
Copy link

homme commented Jan 5, 2016

This looks great - many thanks! I'm particularly interested in using these improvements in conjunction with GDAL.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 5, 2016

Glad to be of service, Homme. I've got a few more tricks up my sleeve, will be checking them in shortly.

@klokan
Copy link

klokan commented Jan 5, 2016

This looks like cool performance improvement of OpenJPEG for the first glimpse!

@boxerab
Copy link
Contributor Author

boxerab commented Jan 5, 2016

Thanks, Petr. 8 core CPUs are coming to the desktop this year; we need to make use of them.

@mayeut
Copy link
Collaborator

mayeut commented Jan 5, 2016

@boxerab Seems nice !
Could you edit your post to include performances before the patch on your system ? That would give a better idea of the improvement. (The same could be done for memory consumption)

BTW, there are many warnings on the build: http://my.cdash.org/viewBuildError.php?type=1&buildid=887331 Do you think you can get this down a bit ?

@boxerab
Copy link
Contributor Author

boxerab commented Jan 6, 2016

Thanks @mayeut. I will update perf and memory consumption stats from my system.

As for the warnings, yes, there are quite a lot. Will look into getting rid of them.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 6, 2016

By the way, my last commit on this branch made a small change to opj_image_compt_t, so it looks like I have changed the ABI. I need to do this because in certain cases, I am passing the tile data back inside the user image, and this tile data is allocated using opj_aligned_malloc, so it must
be freed with opj_aligned_free. This change is one of the keys to getting the memory down for certain work flows i.e. DCI decoding.

Currently, peak memory usage for DCI decoding on my branch is around 1/3 of master (!) so it is now much more reasonable. But, the price we pay is we need to track whether image data was allocated with opj_aligned_malloc.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 6, 2016

Also, @mayeut , thanks again for setting up Travis and Appveyor. It made working out the problems with the branch pretty easy, although time consuming. It helped me a lot.

@mayeut
Copy link
Collaborator

mayeut commented Jan 9, 2016

@boxerab, even though the API/ABI check passes, some new declarations in openjpeg.h make me think that behavior has changed. There seems to be a need to call at least some of those new functions even when using only "old" API.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 9, 2016

Do you mean these methods?

OPJ_API void OPJ_CALLCONV opj_image_all_components_data_free(opj_image_t* image);

OPJ_API void OPJ_CALLCONV opj_image_single_component_data_free(opj_image_comp_t* image);

Yes, I suppose you are right. The user needs to call these when freeing up image data, they can't just call free(image->comps[0].data), for example. Or the program will crash.

If this is a problem, I can make a temporary change so these methods are not required. We will lose around 10% performance gain and add 1/3 memory usage back for single tile, 3 component decode.

Also, there is an initialize call that is needed if OpenMP is turned on, but this is a new feature.

image->comps[2].data = (int*)calloc((size_t)max, sizeof(int));
new_image->comps[0].data= NULL;
new_image->comps[1].data = NULL;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Git reports a whitespace warning here (spaces at end of line).

@boxerab
Copy link
Contributor Author

boxerab commented Jan 9, 2016

@stweil Thanks.

I would like to get this merged in, and then we can worry about whitespace.
Or, you could make a patch to fix this for my omnibus branch, and then we can merge it in with whitespace fixed.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 9, 2016

@mayeut I have temporarily reversed some of my memory optimizations; now a user doesn't need to worry about allocating and de-allocating image memory using the library. So, nothing should break if a user upgrades to the upcoming release; no behaviour changes.

I will put these optimizations back in for the following release.

@detonin
Copy link
Contributor

detonin commented Jan 10, 2016

Hi Aaron, thanks for this great work ! I really would like to merge it before the upcoming release but I have a few comments / questions:

  • I see that although your changes do not break ABI, you added some functions to the API. This means we would go for version 2.2 rather than 2.1.1, according to http://semver.org/. This is ok for me, unless anyone has a comment on this.
  • You added 4 new files to the library. We had in the past an issue concerning the OpenJPEG copyright which is a complete mess right now. A proposal was made (Simplify openjpeg copyright #318) to change the copyright of all files to the two following lines:

Copyright (c) 2002-2016, Universite catholique de Louvain (UCL), Belgium
Copyright (c) 2002-2016, OpenJPEG contributors

UCL is there as initiator, host, maintainer, funder, and main contributor of the project. A file would list all OpenJPEG contributors. For now, we did not yet manage to achieve this simplification but concerning the 4 new files you added, would you agree to:

  1. Add the UCL copyright line
  2. Give your consent to the proposal in Simplify openjpeg copyright #318 (add your comment there) so that these new files will not prevent us to move forward on this later on.
  • There are plenty of changes in the modified files that are due to space / tab addition / removal. Do you have an easy way to fix this ? Otherwise I would merge your PR in a new branch and fix this from there. Or keep this for when we tackle Uniformize code syntax  #128 :-)

Cheers and thanks again.

@boxerab
Copy link
Contributor Author

boxerab commented Jan 10, 2016

Hi Antonin,

I will try to remove what I can of the spurious whitespace in the branch.

Also, I am happy to change the copyright notice on my new files.

Cheers,
Aaron

@boxerab
Copy link
Contributor Author

boxerab commented Jan 11, 2016

@detonin well, I've changed the copyright notices, and I've done what I could to fix the whitespace. How does it look now?

Aaron Boxer added 2 commits January 10, 2016 22:22
the user should NOT free image data on their own; they must call
opj_image_destroy.
@boxerab
Copy link
Contributor Author

boxerab commented Jan 15, 2016

This PR has been superseded by my decode_region PR. Closing.

@boxerab boxerab closed this Jan 15, 2016
@boxerab boxerab deleted the omnibus branch January 15, 2016 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants