Study and discuss design and implementation of online measurements and actions #44
X:
The essential differences are only of purpose, not of structure.
Having said that, it can be the case that whatever part of the algorithm is expected to act dynamically with a trial under evaluation could be thought of as a subclass of […]. So an alternative to what exists right now could be:
Example:
A certain algorithm, implementation of […]
I agree with all that. I agree there would be two different methods, one for final observations and one for runtime exchanges with the process. However, the documentation of the methods is not clear enough about that. It should be crystal clear that observe will take a result, which is what the process sends when it is completed, and might change the algorithm's internal state, while judge will take measurements, which are sent from the process during its lifetime, and might also change the algorithm's internal state. The doc of each method should also refer to the other to help understand the difference. I insist however on the fact that the method names […]
By the way, the discussion diverged from the initial point. The initial discussion was solely about what algorithms need to score trials. The current implementation passes hyper-parameters. I believe it should pass the entire history, including results and measurements.
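A minimal sketch of how the two methods could be split, assuming hypothetical names, signatures, and an early-stopping rule chosen purely for illustration (the project's actual API may differ). The algorithm also keeps a full per-trial history of results and measurements, matching the point above about what scoring should receive:

```python
class Algorithm:
    """Hypothetical sketch, not the project's real base class."""

    def __init__(self):
        # Full history per trial: final result AND runtime measurements,
        # as suggested above for scoring.
        self.history = {}

    def observe(self, trial_id, result):
        # `result` is what the process sends once it has COMPLETED.
        # Counterpart of `judge`, which handles measurements sent
        # during the process's lifetime. May change internal state.
        entry = self.history.setdefault(trial_id, {"measurements": []})
        entry["result"] = result

    def judge(self, trial_id, measurement):
        # `measurement` is sent by the process DURING its lifetime.
        # Counterpart of `observe`, which handles final results.
        # May change internal state and reply with an action, e.g. a
        # (made-up) request to stop a clearly bad trial early.
        entry = self.history.setdefault(trial_id, {"measurements": []})
        entry["measurements"].append(measurement)
        return {"stop": measurement > 10}
```

With this shape, the docstring of each method can point at the other, which is exactly the cross-referencing asked for above.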
Discussion closed. Implementation of dynamic algorithms is in progress.
Discussion from #36.
So what kind of interface do you propose for this? Consider that algorithms speak Python data structures and Numpy only. I am moving this to an issue, because we should study FreezeThaw and other possible client-server designs, and how we are going to save online measurements and replies (I suggest reusing `Trial` objects and the trials database).
Also, we should keep in mind that future exploitation by RL projects, like BabyAI Game, is possible. An environment (user script, acting as client) could then be used to train asynchronous and distributed agents (algorithm, acting as server). A static trial (usually a set of hyperparameters) means the training environment's variation factors and a game instance's initial state (this is `params`); `results` is possibly the episode's return. A dynamic trial means an observation tensor from the environment + a reward scalar + possibly a set of eligible actions (this is `results`), and an eligible action chosen as a response to sensing this information (this is `params`).
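The static/dynamic mapping above could be sketched as two message shapes. All field names and values here are illustrative assumptions (plain Python data structures only, per the constraint stated above), not an actual wire format:

```python
# Static trial: params are the environment's variation factors /
# a game instance's initial state; results are final, e.g. the
# episode's return.
static_trial = {
    "params": {"lr": 0.01, "seed": 7},
    "results": {"objective": 123.4},
}

# Dynamic trial: one client-server exchange during the trial's
# lifetime.  The environment (client) sends `results`; the
# algorithm (server) replies with `params`.
dynamic_exchange = {
    "results": {
        "observation": [0.1, 0.2, 0.3],   # observation tensor as a plain list
        "reward": 1.0,                    # reward scalar
        "eligible_actions": [0, 1, 2],    # possibly a set of eligible actions
    },
    "params": {"action": 1},              # an eligible action chosen in reply
}
```

One design consequence worth noting: if both shapes reuse the same `params`/`results` keys, the same `Trial` objects and trials database could store static and dynamic exchanges uniformly, as suggested above.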