Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Scene caching #166

Merged
merged 59 commits into from
Aug 12, 2020
Merged

Conversation

huguesdevimeux
Copy link
Member

@huguesdevimeux huguesdevimeux commented Jun 27, 2020

Closes #8

First of all, big thank you to @Aathish04 who helped me and allowed me to get inspiration from his previous attempt to build a Scene Cacher.

Additions :

Scene-caching : if a play or wait invocation has already been rendered, manim now take the already rendreded video instead of rendering it again! If the cache is full (too much partial movie files), manim will remove some files. See below
--disable_caching flag : self explanatory. Will render the invocation anyway, and will save the partial movie file. This won't use any hash and name the partial movie file with a incremental index.
--flush_cache Will delete all the files cached.

How it works (up to date)

It computes three hashes and then it names the partial_movies files generated with the concatenation of these hashes (separated by "_").

  • The first hash corresponds to the arguments passed to scene.play (or scene.wait). It converts the list of the arguments in a JSON type array, and then create a hash (with crc32). Everything is included in this array, as the arguments are recursively "serialized" i.e the objects are converted to their __dict__, functions to their definition codes, etc. See get_hash_from_play_call and get_hash_from_wait_call in utils/hashing.py l.52.
  • The second hash is obtained with all the mobjets present on the scene, including those not involved in the current play/wait invokation. Same method as above, the mobjects are converted into JSON array.
  • The last one is obtained with the camera_config, with the same process describing above.
    Note: All of the process is -of course- recursive. So it's able to handle for example nested dicts.

I also have to serialize functions. To do this, I took the source code (taking dissassamelby of the function won't work, as it uses sometimes references (object <> at [memory address]) in the Bytecode that appear in the str representation of the function's instructions, and we want to avoid that as memory addresses are random so it will affect the hash every time. I also took the non-locals vars that are used by the function in the hash: So if something external to the function is modified but not its source code directly, the hash will be affected. (Note: As the whole process, this is recursive.)

To use these cached files, we just check if the file with the hash exists; if yes, we use it when combining the partial_movie files : if not, we generate the partial_movie file.

If the cache is full (if there are more than x files in the directory that contains the partial movie files, when x is a parameter that can be put by the user), then manim removes the partial movie that was used (i;e opened by manim) the longest ago (and not created the longest ago). Note that in a lot of system, the last_accessed_time of a file is not refreshed automatically, so I had to do this manually (see utils.file_ops.modify_atime (l.61)). Maybe this is too much hacky ?

If disable_caching is True: we don't even compute a hash. We just use an incremental index for the partial_movie_files as it used to be.

Testing

Given that we don't at the moment have a way to test video this PR won't come with tests implemented.
Nevertheless, this have been manually tested many times on many different configurations.

TODO :

  • Very probably some docs things. Pinging Captain Docs, (his real identity is @PgBiel)

@leotrs leotrs mentioned this pull request Jun 27, 2020
13 tasks
@huguesdevimeux huguesdevimeux added enhancement Additions and improvements in general new feature Enhancement specifically adding a new feature (feature request should be used for issues instead) labels Jun 27, 2020
@TonyCrane
Copy link
Contributor

TonyCrane commented Jun 27, 2020

Hello, I am "@tony_wong" :)

What I said that day was just a very small suggestion. Random scenes are rarely used in manim, such as randomly generating point sets and random movement when simulating sir.
It doesn't matter if you don't implement special options for this.

But I think it is necessary to add a command line option to clear the cache of the scene.
In addition, do we need to add a command line option to automatically transfer the cache folder to another (rename the cache folder when the scene name is changed)

btw, maybe it's better to add an argument in cli to set whether use cache or not.

manim/scene/scene_file_writer.py Outdated Show resolved Hide resolved
@huguesdevimeux
Copy link
Member Author

huguesdevimeux commented Jun 28, 2020

Hi :D

But I think it is necessary to add a command line option to clear the cache of the scene.

This is a very good idea. I will implement a such thing.

In addition, do we need to add a command line option to automatically transfer the cache folder to another (rename the cache folder when the scene name is changed)

I see it can be useful to render a huge scene with previously rendered ones, but this is IMO complicated to explain to user and I'm sure very few people will use this thing. But you are way more experienced with manim than I am, so tell me if it would really be useful. In fact, I think that transferring the cache should be not manim's job but the user's.

btw, maybe it's better to add an argument in cli to set whether use cache or not.

I'm aware of that, but as #98 will change a lot of things related to CLI I prefer to wait for it to be merged first

Thank you for those suggestions !

@TonyCrane
Copy link
Contributor

I see it can be useful to render a huge scene with previously rendered ones, but this is IMO complicated to explain to user and I'm sure very few people will use this thing. But you are way more experienced with manim than I am, so tell me if it would really be useful.

In fact, I mean that sometimes the same Scene is renamed in the source code, causing the previous cache to be invalid. But it doesn't matter if you don't implement this, this can be done manually. :)

@huguesdevimeux
Copy link
Member Author

I see it can be useful to render a huge scene with previously rendered ones, but this is IMO complicated to explain to user and I'm sure very few people will use this thing. But you are way more experienced with manim than I am, so tell me if it would really be useful.

In fact, I mean that sometimes the same Scene is renamed in the source code, causing the previous cache to be invalid. But it doesn't matter if you don't implement this, this can be done manually. :)

I see then. This is a good point. Tbh I don't know what to do, if it would be better to let the user do it or to implement a thing that does it automatically (which would be cool ngl, but maybe too much)

@TonyCrane
Copy link
Contributor

And I also have some small suggestions, such as checking the __dict__ dictionary in the cache, but I think it is enough to only pay attention to and render the content on the screen, something such as "name", "target", maybe we can delete if from __dict__ temporarily before hashing.

And I am not sure whether it is necessary to establish a global cache (that is, it does not depend on the file name and the scene name), maybe this’s more like caching?

@huguesdevimeux
Copy link
Member Author

huguesdevimeux commented Jun 28, 2020

And I also have some small suggestions, such as checking the __dict__ dictionary in the cache, but I think it is enough to only pay attention to and render the content on the screen, something such as "name", "target", maybe we can delete if from __dict__ temporarily before hashing.

Hmm I partially agree. My plan was to do a truncation of CONFIG as you said and only select a few things (like "camera_orientation", etc..) for the scene CONFIG because hashing everything is way more too long and heavy. For the two other hashes - mobjects and play args- I don't think this is worth it. Eventhough it is a beautiful optimization, the current hashing process is faster enough in my opinion and doing what you said could cause some issues when hashing everything assures us to never have issues. (although, I agree, this is not very beautiful .

And I am not sure whether it is necessary to establish a global cache (that is, it does not depend on the file name and the scene name), maybe this’s more like caching?

And I like this idea, but I'm not sure of the way to do it. Personnaly, I don't want a big folder that serves as a global cache with tons of partial movies, because this would quickly become a big mess.
What I thought was maybe to index all the hashes of the partial movies files in a file (in CSV, SQL, or TXT whatever) and then when manim uses the cache it looks in this file if the hash is contained, rather than looking directly if the file exists in the directory.

@huguesdevimeux
Copy link
Member Author

huguesdevimeux commented Jul 10, 2020

UPDATE : I think this is getting more and more stable.
I recently added an option to serialize functions to hash them, and it seems to work. It was tough ngl, so I would really like to get criticisms about what I tried to do.

Now what we have to do is to test it as much as we can, with every scene we can. In particular complex cases with rate_func, camera moves, updatefromalpha func, etc. The idea is the following: render a scene, render it again and see if it used cached data, change a little bit the scene and see if a new cache is generated.

PS : I didn't forget the suggestions and the TODO stuff :)

manim/config.py Outdated Show resolved Hide resolved
Comment on lines 1 to 2
INFO Read configuration files: ['C:\\Users\\User\\OneDrive\\Bureau\\Programmation\\PYTHON\\MANIM-DEV\\manim\\manim\\default.cfg', 'C:\\Users\\User\\AppData\\Roaming\\Manim\\manim.cfg', config.py:
'C:\\Users\\User\\OneDrive\\Bureau\\Programmation\\PYTHON\\MANIM-DEV\\manim\\tests\\tests_data\\manim.cfg']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the way the tests are run, the test requires that no file paths should be in the expected log.

This is because the paths will differ from user to user and operating system to operating system.

I suggest you remove all file paths completely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm dumb.
I removed that first, but idk what comes through my head but I forget to remove that again.

Thank you for the catch !

Comment on lines 3 to 9
DEBUG Animation : Partial movie file written in scene_file_writer.py:

DEBUG Animation : Partial movie file written in scene_file_writer.py:

DEBUG Animation : Partial movie file written in scene_file_writer.py:

INFO scene_file_writer.py:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it absolutely guaranteed that there will only be these many debug messages?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep.
hashing won't be triggered as the media_temp direcyoty (where partial movies file are contained) is cleaned after the test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. So, if I haven't forgotten anything, and if the expected.txt has been updated properly, the "logging" test must not fail.

I am not sure about the subcommand one. Does it work?

manim/default.cfg Show resolved Hide resolved
manim/scene/scene_file_writer.py Outdated Show resolved Hide resolved
Comment on lines -442 to -447
if file_writer_config["from_animation_number"] is not None:
kwargs["min_index"] = file_writer_config["from_animation_number"]
if file_writer_config["upto_animation_number"] not in [None, np.inf]:
kwargs["max_index"] = file_writer_config["upto_animation_number"]
else:
kwargs["remove_indices_greater_than"] = self.scene.num_plays - 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: where did these go?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been removed.
This is now depreceted as the system with integreerds index names for partial movie files (ie 0001.mp4, 0002.mp4) does not exist in a post 166 world.
As kwargs["min_index"] and all this stuff was used to handle the files named with that these names, it's now useless.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait so the names of the files in partial_movie_files are changing too?

manim/scene/scene.py Show resolved Hide resolved
manim/scene/scene.py Show resolved Hide resolved
manim/scene/scene.py Show resolved Hide resolved
manim/scene/scene.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
Fixes logging tests.
manim/scene/scene.py Outdated Show resolved Hide resolved
manim/scene/scene.py Outdated Show resolved Hide resolved
manim/utils/file_ops.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
Thank you Captain Docs !

Co-authored-by: Pg Biel <9021226+PgBiel@users.noreply.github.com>
@huguesdevimeux
Copy link
Member Author

huguesdevimeux commented Aug 11, 2020 via email

Copy link
Member

@PgBiel PgBiel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nitpicks

manim/utils/hashing.py Outdated Show resolved Hide resolved
manim/utils/hashing.py Outdated Show resolved Hide resolved
huguesdevimeux and others added 2 commits August 12, 2020 14:48
Co-authored-by: Pg Biel <9021226+PgBiel@users.noreply.github.com>
@huguesdevimeux huguesdevimeux merged commit 61d5e46 into ManimCommunity:master Aug 12, 2020
@huguesdevimeux huguesdevimeux deleted the scene-caching branch August 12, 2020 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Additions and improvements in general new feature Enhancement specifically adding a new feature (feature request should be used for issues instead)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adding a Scene Cacher
7 participants