Move cache context management to the builder #1112

tbekolay · 2016-06-26T16:34:11Z

Description:
This makes the cache more robust in a few ways. First, if the cache is used when not in the cache's context, it no longer crashes, just warns and does nothing. Second, I moved the cache context management (i.e., the with model.decoder_cache block) from Simulator.__init__ to build_network.

Motivation and context:
@PeterSuma noticed that his model would fail with nengo_ocl 1.0 and nengo 2.1.1. This is because nengo_ocl.Simulator.__init__ hasn't been updated to include the with model.decoder_cache block. However, there isn't really any reason why backends should have to update on account of our cache changes; this should be localized to the builder. So I moved it to the builder.

In doing so, I also noticed that we don't do anything defensive to ensure that the cache isn't used outside of a with block, so I added those things in too. I don't believe that invalidate or the other public methods require early exits, but let me know if I'm wrong about that.
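
To illustrate the defensive behaviour described above, here is a minimal sketch using a hypothetical `ToyCache` class (a stand-in, not nengo's actual `DecoderCache`). It assumes that "does nothing" means `wrap_solver` hands back the solver unwrapped and `shrink` returns early; the warning messages are invented for illustration.

```python
import warnings


class ToyCache:
    """Hypothetical stand-in for a decoder cache (not nengo's DecoderCache)."""

    def __init__(self):
        self._in_context = False
        self._store = {}

    def __enter__(self):
        self._in_context = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._in_context = False

    def wrap_solver(self, solver):
        # Assumption: outside the context we warn and return the solver as-is.
        if not self._in_context:
            warnings.warn("Cache used outside of a `with cache` block; "
                          "solver will not be cached.")
            return solver

        def cached_solver(key):
            if key not in self._store:
                self._store[key] = solver(key)
            return self._store[key]

        return cached_solver

    def shrink(self):
        # Assumption: outside the context we warn and leave the store alone.
        if not self._in_context:
            warnings.warn("Cannot shrink the cache outside of a "
                          "`with cache` block.")
            return
        self._store.clear()


# Intended use: everything happens inside the context.
cache = ToyCache()
with cache:
    double = cache.wrap_solver(lambda key: key * 2)
    double(3)  # computed, then stored
    double(3)  # served from the store
```

Calling `wrap_solver` or `shrink` on a `ToyCache` outside the `with` block emits a warning instead of raising.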

How has this been tested?
I ran all of the cache tests with

py.test nengo/tests/test_cache.py --slow --plots --analytics --logs

in both Python 2 and 3. The benchmarking test uses the actual cache so this should be a sufficient test.

Where should a reviewer start?
I would read through this commit-by-commit; each commit is pretty small and understandable. The first commit just switches the order of some things according to our style guide; no other changes were made, so that commit can be mostly ignored.

How long should this take to review?

Average (neither quick nor lengthy)

Note that, if people are okay with these changes, I'd like to do a quick 2.1.2 release after merging to fix compatibility with nengo_ocl 1.0. So prioritize this if you can!

Types of changes:

Bug fix (non-breaking change which fixes an issue)

Checklist:

I have read the CONTRIBUTING.rst document.
I have updated the documentation accordingly (not necessary).
I have included a changelog entry.
I have added tests to cover my changes.
All new and existing tests passed.

Still to do:

Not a todo per se, but a discussion point:

In my initial version of this PR, I did this with finer granularity:

Do with cache in builder/connection.py right before wrap_solver.
Do with cache in builder/network.py at the end (if it's the toplevel network) to shrink the cache.

In the end I switched to the current solution, which should behave exactly like what is in master now, only without requiring changes to backends that use the builder. However, would it be better to do this with the finer granularity just described? I figured that continually reacquiring the index lock would cause more conflicts when two builds happen simultaneously, and take more time when building a single model. Does that make sense, or are my assumptions wrong?
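
As a rough illustration of the granularity trade-off, the sketch below counts how often a lock-like context is entered under each scheme. `CountingIndex`, `build_coarse`, and `build_fine` are hypothetical names; the real index lock and build steps are not modelled, only the number of acquisitions.

```python
class CountingIndex:
    """Hypothetical index whose 'lock' just counts how often it is acquired."""

    def __init__(self):
        self.acquisitions = 0

    def __enter__(self):
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # never swallow exceptions


def build_coarse(index, n_connections):
    """One `with` block around the whole build (the approach in this PR)."""
    with index:
        for _ in range(n_connections):
            pass  # placeholder: solve decoders for one connection
        # placeholder: shrink the cache at the end of the build


def build_fine(index, n_connections):
    """One `with` block per connection, plus one more for the final shrink."""
    for _ in range(n_connections):
        with index:
            pass  # placeholder: solve decoders for one connection
    with index:
        pass  # placeholder: shrink the cache


coarse, fine = CountingIndex(), CountingIndex()
build_coarse(coarse, 10)
build_fine(fine, 10)
```

With ten connections the coarse variant enters the context once, while the fine-grained variant enters it eleven times, which is the extra lock traffic the question is about.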

tbekolay · 2016-06-26T16:36:35Z

Just to emphasize since the original message is pretty long:

If this looks OK to people I'd like to get it into master then immediately do a 2.1.2 release, as this will fix compatibility with nengo_ocl 1.0.

jgosmann · 2016-06-26T17:41:03Z

nengo/cache.py

@@ -321,6 +321,11 @@ def shrink(self, limit=None):
            Maximum size of the cache in bytes.
        """
        if self.readonly:
+            warnings.warn("Cannot shrink a readonly cache.")


This might give more warnings than we want. There are basically two situations where the cache might be readonly:

The lock cannot be acquired. In that case we already give a warning.

The user explicitly configures the cache to be readonly. In that case the simulator (or builder with this PR I assume) will still call shrink and the user cannot deactivate this call (and doesn't need to), but will get an irrelevant warning.

However, it might be a useful warning for developers when debugging?

Hmm, what do you think about changing it to a logger.info in that case? The benefit of the warning is that it'll only go off once, but you're right that it's not something that the user necessarily needs to see.

Sounds good to me. Not sure if I would go with logger.info or logger.debug, so either of those is fine with me. (Cache misses/hits are debug.)
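
A minimal sketch of the logging variant discussed in this thread, using only the standard `logging` module. `ReadonlyAwareCache` and the log message are hypothetical; only the `readonly` flag and the `shrink(limit=None)` signature come from the diff above.

```python
import logging

logger = logging.getLogger(__name__)


class ReadonlyAwareCache:
    """Hypothetical cache showing a logged (not warned) readonly shrink."""

    def __init__(self, readonly=False):
        self.readonly = readonly
        self.shrunk = False

    def shrink(self, limit=None):
        if self.readonly:
            # Logged rather than warned: users who configured a readonly
            # cache on purpose do not see it unless they enable logging.
            logger.info("Tried to shrink a readonly cache.")
            return
        self.shrunk = True  # placeholder for the actual shrinking work
```

Swapping `logger.info` for `logger.debug` would put this message at the same level as the cache hit/miss messages mentioned above.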

jgosmann · 2016-06-26T18:04:23Z

Added some discussion points inline.

Another thing to consider: For spaopt I'll (probably) need¹ some way to deactivate the shrink for individual Simulator or builder instances. My preliminary solution was to add an argument to the Simulator (which might not be the best solution considering that other backends have to do the same). Any thoughts on how this could be done with this PR?

Regarding the granularity: I think one with block per model (build) is best. That way all the decoders get written to the same file instead of creating a lot of smaller files (which has been an issue before). Also, with multiple builds in parallel performance should be better as you said. It also requires less loading and syncing of the index.

Note: I only looked at the code so far. I have yet to test it (running other stuff at the moment).

¹: Not really need, but it's pretty inefficient otherwise.

tbekolay · 2016-06-26T18:41:53Z

some way to deactivate the shrink for individual Simulator or builder instances

The two ways I could see it would be to store something on a Model instance that you pass in (as you would do to turn off the decoder cache programmatically), or in network.config. The builder is set up so that you have access to it, but we haven't used network.config for anything yet. Of course, that's not really Simulator local since it modifies the network, but you could modify the network before constructing the Simulator?

jgosmann · 2016-06-26T18:43:25Z

Because I'm constructing an entirely new network that seems like a good approach. :)

tbekolay · 2016-06-27T15:19:50Z

Made the discussed changes in 41bb52a

jgosmann · 2016-06-27T16:11:27Z

LGTM 🍰

It is possible that the cache can be used improperly; i.e., not in a `with cache` context. If that happened, certain methods -- namely `shrink` and `wrap_solver` -- would fail. This commit makes it so that we don't crash in those cases. Instead, we warn the user and do nothing.

Previously, the cache was entered in Simulator.__init__. This was causing the current nengo_ocl's Simulator to fail as it calls the builder without entering the cache. There's no reason why the cache couldn't be entered in build_network instead, so I moved it there so that nengo_ocl doesn't have to modify its Simulator.
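
The effect of moving the `with` block can be sketched with stand-in classes. Everything here (`FlagCache`, `ToyModel`, `ThirdPartySimulator`, and this simplified `build_network`) is hypothetical and only mirrors the structure described in the commit message, not nengo's or nengo_ocl's real code.

```python
class FlagCache:
    """Hypothetical cache that only tracks whether it is currently entered."""

    def __init__(self):
        self.active = False

    def __enter__(self):
        self.active = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.active = False


class ToyModel:
    """Hypothetical model holding a cache and a record of built connections."""

    def __init__(self):
        self.decoder_cache = FlagCache()
        self.built = []


def build_network(model, network):
    # The builder enters the cache itself, so callers do not have to.
    with model.decoder_cache:
        for connection in network:
            # Record whether the cache was entered while building.
            model.built.append((connection, model.decoder_cache.active))


class ThirdPartySimulator:
    """Stand-in for a backend whose __init__ never touches the cache."""

    def __init__(self, network):
        self.model = ToyModel()
        build_network(self.model, network)


sim = ThirdPartySimulator(["a->b", "b->c"])
```

In this sketch every connection is built while the cache is entered, even though `ThirdPartySimulator.__init__` contains no `with` block of its own.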

tbekolay · 2016-06-27T18:17:31Z

@drasmuss can you do the second review and merge if it looks good? I'll do a 2.1.2 release once it's merged (but no pressure / rush).

Better class member order for DecoderCache

baddaa3

tbekolay added bug needs review labels Jun 26, 2016

tbekolay force-pushed the cache-in-builder branch from bea6763 to d78cfec Compare June 26, 2016 16:52

jgosmann self-assigned this Jun 26, 2016

jgosmann reviewed Jun 26, 2016

jgosmann removed their assignment Jun 27, 2016

jgosmann added needs second review and removed needs review labels Jun 27, 2016

tbekolay added 2 commits June 27, 2016 14:16

tbekolay force-pushed the cache-in-builder branch from 41bb52a to e369625 Compare June 27, 2016 18:16

drasmuss assigned drasmuss and unassigned drasmuss Jun 27, 2016

drasmuss merged commit e369625 into master Jun 27, 2016

drasmuss deleted the cache-in-builder branch June 27, 2016 18:46

drasmuss added reviewed and removed needs second review labels Jun 27, 2016

jgosmann mentioned this pull request Jun 29, 2016

Release planning #1114

Closed

jgosmann mentioned this pull request Aug 30, 2016

Make PR template more concise. #1158

Merged

5 tasks
