Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the size of GIFs saved using saveGif() #4339

Merged
merged 3 commits into from
Mar 13, 2020

Conversation

akshay-99
Copy link
Member

@akshay-99 akshay-99 commented Mar 1, 2020

Resolves #3871

Changes:

  • Choose the most frequently occuring palette as global palette, and use it wherever possible to avoid having a local palette for each frame.
  • Slice the buffer at the end of the written GIF as actual filesize will almost always be less than the initial buffer size.
  • Also reintroduces transparency wherever possible to reduce size further.

The sizes produced by 3 variants of saveGif(), taking the same 8.7 MB GIF as input:

  • The one currently on master: 23 MB
  • The one with the buffer sliced at the end and using global palettes wherever possible: 12.9 MB
  • The one using transparency: 8.9 MB

In an effort to reduce the size of GIFs, I did increase the size of this function a lot 😅

PR Checklist

Choose the most frequently occuring palette as global palette,
and use it wherever possible to avoid having a local palette for
each frame. Slice the buffer at the end of the written GIF.
Also make pixels transparent wherever possible.
@stalgiag
Copy link
Contributor

stalgiag commented Mar 1, 2020

This is exciting @akshay-99. I will review this tomorrow. Thanks!

@stalgiag
Copy link
Contributor

stalgiag commented Mar 2, 2020

Looks good @akshay-99 . I am seeing a drastic reduction in GIF save size on this end.

A few quick questions and notes.

I didn't find a quick way to test a GIF with no disposal to previous frames when adding the transparency optimization, but this might be an issue with this approach. I think you would need to be able to change the disposal code at encoding for save. I don't know though, it is possible that omggif is just using a default disposal code that works with the transparency optimization.

This second point might make the first one inconsequential. I think the transparency optimization might push too far into compression at the sacrifice of speed. This is really nice for getting small file size and goes above & beyond the goal for the issue. The problem is that with longer GIFs, I am experiencing a grinding halt on save. Short stalls for computation are fine, but there should be some compromise between the speed of computation and compression. I think this current solution favors compression a bit too much.

@akshay-99
Copy link
Member Author

@stalgiag omggif uses disposal code 0 when none is passed to it. Ideally, the disposal code for transparency should be 1 but as it turns out 0 works as well.

As for the second point, almost all GIFs that I found on the internet use transparency to achieve their low size. I think the ideal behaviour for saveGif is to be able to preserve this optimization. Without this, the size of the downloaded GIF will always end up being a lot more than the original.

There are 3 main iterations that happen over all the pixels.

  • To build the global palette
  • To build the get the indices
  • For transparency.

I found that each of these takes about 30% of the entire time ( with addFrame from omggif taking the other 10% ).
However, I also noticed that using global palette instead of local palettes everywhere gave very little to a negligible difference in size ( almost all of the difference in the second variant was due slicing the buffer at the end )
So, even if this part is skipped we would obtain the same GIF size in two-thirds of the time.

While it is true that saveGif would finish in one-third of the time with no optimization at all, I don't think this is ideal.
We could have an optional argument fast which when passed as true, skip over all optimization to yield the result in one-third of the time

@stalgiag
Copy link
Contributor

stalgiag commented Mar 3, 2020

Thank you for the thorough breakdown and performance testing @akshay-99 .

With what you have said in mind, I think this approach is good. I understand that we would have a further cut if we skipped the local palettes but then I think we would have problems with specific but rare GIFS with large local palettes.

This solution runs relatively fast, and I think that saving a large file is one of those situations where a user is likely prepared for a small road bump in performance.

I wish there was an effective way to add unit tests to this PR. I have been looking around but I don't see any obvious way to test these operations with the testing libraries that we use currently.

@akshay-99
Copy link
Member Author

we would have problems with specific but rare GIFS with large local palettes.

@stalgiag I didn't get what this means 😅 . From what I have read, a GIF palette, local or global, can only have a maximum of 256 colors. So even a ridiculously large gif with 1000 frames can have 256000 colors spread across its local palettes at max. That translates to exactly a megabyte of extra space.

I wish there was an effective way to add unit tests to this PR. I have been looking around but I don't see any obvious way to test these operations with the testing libraries that we use currently.

I haven't worked with automated testing before but can I give it a shot with this? It would be a good opportunity for me to learn. I have been looking at p5 tests to get an idea

@stalgiag
Copy link
Contributor

stalgiag commented Mar 4, 2020

I think we have a misunderstanding about local palettes. No worries though the point that I am attempting to make is that this way of handling it is correct.

Yeah if you want to you should try adding a unit test. There may some hard limits with testing a file that needs to be downloaded but at the very least you can make sure that the functions run without errors. This would have caught the previous typo which is more helpful than nothing. If you can find a way to ensure the contents of the datablob are correct that would be ideal. You can get a feeling for how the unit testing works by reading the contributor doc for it. Those reference links at the top should be helpful as well.

@akshay-99
Copy link
Member Author

@stalgiag I couldn't find anything to test for the downloaded file.
But to make saveGif more testable, would it make sense to split it, creating a function that does all the work and returns a buffer, and saveGif only acts to call this function, creates the blob and downloads it using p5.prototype.downloadFile.

That way we can at least test the buffer to verify the contents of the output GIF and the only untested step remaining would be the actual download.

@stalgiag
Copy link
Contributor

stalgiag commented Mar 6, 2020

@akshay-99 yes good idea!

@akshay-99
Copy link
Member Author

@stalgiag I finished the initial part of checking if the functions runs without errors.

My idea was to test the buffer by passing it once again to _createGif and then comparing the color of each pixel of each frame with the original, but as _createGif is private, I can't call it from the tests.

I tried to use the code from _createGif in the tests by copying it, then again to use omggif in the tests, I have to import it using <script src="../node_modules/omggif/omggif.js" type="text/javascript"></script> in the html files. (test.html , test-minified.html )

Then I was finally able to run omggif directly inside the tests.

But this is all looking a bit too many modifications for this pull request 😅

@stalgiag
Copy link
Contributor

Hi @akshay-99 I think that for this particular pull request it is sufficient to just check to ensure that the functions run without errors. Thank you!

@akshay-99
Copy link
Member Author

@stalgiag I have added the basic tests for saveGif. Please take a look. I will continue to explore what can be done for more in-depth testing

@akshay-99
Copy link
Member Author

akshay-99 commented Mar 13, 2020

@stalgiag Also, kinda unrelated to this PR, but the testing framework seems to think that the downloading animated gifs lies in the p5.Element.prototype.position suite.

DOM
    p5.Element.prototype.position
      downloading animated gifs
        p5.prototype.saveGif
          ✓ should be a function
          ✓ should not throw an error

I think it's because of a format issue here, where the expect must be enclosed in a test inside the suite

@stalgiag stalgiag merged commit f5269f6 into processing:master Mar 13, 2020
@stalgiag
Copy link
Contributor

Great thank you @akshay-99 ! I am going to merge this now. Will you open a separate issue for the test format issue? You can just quote your comment.

@akshay-99
Copy link
Member Author

@stalgiag there is a minor issue with this patch that I just noticed. Since I am not setting the disposal value for any frame, it is set to 0 by default. Now, most gif readers treat 0 and 1 similarly so the difference is not noticed when we open the GIF. However, now if we load this output with loadImage(), as _createGif will perform bliting only if the disposal is 1, the frames are not built properly this time around. So downloading this second version would give us quite a wacky gif

This can be fixed either by setting the disposal as 1 for the frames before the transparent ones.
Another way is by using a <= instead of a === here

I think both of these should be done.
I can make these but would I need to create an issue first or just send a pull request?

@akshay-99
Copy link
Member Author

@stalgiag here is what I am talking about

Original Gif
earth

Downloaded with saveGif
eout

Loading the above output and downloading again

untitled (2)

@stalgiag
Copy link
Contributor

Hi @akshay-99 since it is related to work that you have already completed you can open a pull request directly. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GIF files often save larger than they load
2 participants