Support non-disruptive serialization of Network #417

Closed
EronWright opened this issue May 1, 2016 · 14 comments

Comments

@EronWright
Contributor

The act of serializing the Network causes it to halt because shouldDoHalt is true by default. The ideal would be to not halt the network in preSerialize, or to allow shouldDoHalt to be set by the user.

The checkpoint-to-file functionality in HTM.java actually suppresses the halting behavior by briefly setting shouldDoHalt to false. In the flink-htm library, Flink takes care of higher-level checkpointing to a Flink-provided storage backend; we desire to use only the SerializerCore interface for flink-htm.
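
To make the difference between the two behaviors concrete, here is a minimal sketch of a halting store versus a checkpoint that briefly clears the flag and then restores it. `ToyNetwork` and `ToySerializer` are hypothetical stand-ins written for illustration; they are not the HTM.java classes, and the real checkpoint code may be structured differently.

```java
// Hypothetical stand-in for a network; NOT the HTM.java Network class.
class ToyNetwork {
    private boolean shouldDoHalt = true; // halting on serialization is the default
    private boolean halted = false;

    void setShouldDoHalt(boolean value) { shouldDoHalt = value; }
    boolean shouldDoHalt() { return shouldDoHalt; }
    boolean isHalted() { return halted; }
    void halt() { halted = true; }

    // Hook the serializer calls just before writing the object out.
    void preSerialize() {
        if (shouldDoHalt) {
            halt();
        }
    }
}

// Hypothetical serializer facade; the byte array is a placeholder payload.
class ToySerializer {
    // Disruptive: the network is halted as a side effect of serializing.
    static byte[] store(ToyNetwork network) {
        network.preSerialize();
        return new byte[] {1};
    }

    // Non-disruptive: clear the flag for the duration of the call, then restore it.
    static byte[] checkpoint(ToyNetwork network) {
        boolean previous = network.shouldDoHalt();
        network.setShouldDoHalt(false);
        try {
            network.preSerialize();
            return new byte[] {1};
        } finally {
            network.setShouldDoHalt(previous);
        }
    }
}
```

The `try`/`finally` restores the flag even if serialization throws, so a later `store` would still halt as designed.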

@cogmission
Collaborator

cogmission commented May 1, 2016

The store() operation halts the network before storing; this is by design. Use the checkpoint() method if halting is not desired.

@EronWright For flink-htm, please use the methods outlined in the Gist I sent you which use the SerializerCore specifically. You can refer to them here: https://gist.github.com/cogmission/25c4d5935aa0fc6e65ccafd26a4410a8

Unless there is any further concern, I will close this once I get your feedback.

@EronWright
Contributor Author

@cogmission yep I am using the KryoSerializer from the gist. Basically I need to serialize the network without halting it. Flink has some complicated state management code that uses the supplied serializer to checkpoint at a time of its choosing.

I cannot use the checkpoint method; I just need non-destructive serialization. If shouldDoHalt could be set to false, the problem would be solved. Can you help?

@cogmission
Collaborator

cogmission commented May 1, 2016

@EronWright Sure... Let me remind myself of what's going on, because I need to see why the KryoSerializer would be using the store() or checkpoint() method (it shouldn't). As far as I remember, the KryoSerializer serializes the Persistable directly? Give me a second to refresh my memory...

@EronWright
Contributor Author

EronWright commented May 2, 2016

I've prepared a PR to show how flink-htm will be updated to support 0.6.7 incl. checkpointing functionality.
htm-community/flink-htm#15

Here's the hack needed to keep the network alive after serialization occurs:
https://github.com/nupic-community/flink-htm/pull/15/files#diff-759340e2d25e9d01974e22f96f1cfc03R201

@cogmission
Collaborator

I have a question: since you aren't using Sensors, are you calling start() on the Network at all? I need to know whether you are running the Network in threaded mode.

@EronWright
Contributor Author

Correct, I am not using sensors and am not calling start().

@cogmission
Collaborator

cogmission commented May 2, 2016

@EronWright If I am correct, shouldDoHalt() shouldn't have any effect on your code then. Does it present a problem?

halt() seems to only do some preparation work for threaded operation, so it shouldn't affect you as far as I can see. Tell me if I'm wrong.

@EronWright
Contributor Author

EronWright commented May 2, 2016

The issue appears to be that Network.preSerialize calls halt if shouldDoHalt is true. So I would prefer not to halt the network during serialization. Note that access to the network instance is synchronized by Flink during checkpointing.

@cogmission
Collaborator

Yes, but have you tested it? halt() is only germane if you are running the Network in threaded mode; nothing is being halted if you aren't feeding it data. "halt" refers to the internal operation thread, so if that thread isn't running, calling "halt" should be a no-op.

@EronWright
Contributor Author

Yep this came after some investigation; if I override shouldDoHalt then everything seems to work. The observer between the layer and the algorithm appears to become disconnected during halt.
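
As an illustration of the symptom and the workaround described above, the following toy model shows a halt that drops subscribers and a subclass that overrides `shouldDoHalt`. All class names here are hypothetical; this is not the HTM.java implementation, and whether `halt()` really disconnects observers this way is an assumption made for the sake of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical model of a network whose halt() disconnects its observers.
class SketchNetwork {
    private final List<IntConsumer> observers = new ArrayList<>();

    void subscribe(IntConsumer observer) { observers.add(observer); }

    // Defaults to true, mirroring the default described in this issue.
    protected boolean shouldDoHalt() { return true; }

    // Assumption: halting tears down the observer wiring.
    void halt() { observers.clear(); }

    void preSerialize() {
        if (shouldDoHalt()) {
            halt();
        }
    }

    // Synchronous compute: push the input to every connected observer.
    void compute(int input) {
        for (IntConsumer observer : observers) {
            observer.accept(input);
        }
    }
}

// Workaround: override the flag so serialization leaves the wiring intact.
class NonHaltingSketchNetwork extends SketchNetwork {
    @Override
    protected boolean shouldDoHalt() { return false; }
}
```

With the base class, a `compute` call after `preSerialize` reaches no observer; with the subclass, it still does.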

@cogmission
Collaborator

I see. OK, what I'll do is check for a running thread in addition to checking shouldDoHalt. I really need to control serialization of the Network in threaded mode, and I don't want to circumvent that. Since you aren't using threaded mode, I can check shouldDoHalt && isThreadRunning, which should be an easy fix and should avoid calling halt() in your scenario.
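
A rough sketch of the proposed guard, assuming a network that owns a single worker thread. `GuardedNetwork` and its members are hypothetical names chosen for illustration; the actual change in HTM.java may look different.

```java
// Hypothetical model of the proposed check; NOT the HTM.java Network class.
class GuardedNetwork {
    private volatile boolean running = false;
    private Thread worker;
    private boolean shouldDoHalt = true;
    private int haltCalls = 0;

    // Threaded mode: start a worker that idles until halt() is called.
    void start() {
        running = true;
        worker = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    boolean isThreadRunning() {
        return worker != null && worker.isAlive();
    }

    // Stop the worker and wait for it to finish.
    void halt() {
        haltCalls++;
        running = false;
        if (worker != null) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // The guard: halt only when asked to AND a worker thread is actually alive.
    void preSerialize() {
        if (shouldDoHalt && isThreadRunning()) {
            halt();
        }
    }

    int haltCalls() { return haltCalls; }
}
```

In this model a network that never called `start()` passes through `preSerialize()` untouched, while a threaded one is still halted before serialization.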

@cogmission
Collaborator

I'm going to update the code, and push a new release... stay tuned... ;)

@cogmission
Collaborator

@EronWright Ok, no problem... I have a question then (please see issue #418)...

@cogmission
Collaborator

Fixed in #419

2 participants