Review Request : G. Viejo, B. Girard, M. Khamassi #14

Closed · wants to merge 4 commits · 4 participants

gviejo commented Jan 20, 2016

AUTHOR

Dear @ReScience/editors,

I request a review for the replication of the following paper:

Keramati, M., Dezfouli, A., & Piray, P. (2011). Speed/accuracy trade-off between the habitual and the goal-directed processes. PLoS computational biology, 7(5), e1002055.

I believe the original results have been faithfully replicated, as explained in the accompanying article.

The repository lives at https://github.com/gviejo/ReScience-submission/tree/VIEJO-GIRARD-KHAMASSI-2016

$ git clone https://github.com/gviejo/ReScience-submission.git
$ cd ReScience-submission
$ git checkout VIEJO-GIRARD-KHAMASSI-2016
$ cd code

EDITOR

  • Editor acknowledgement (@rougier) Jan 20, 2016
  • Reviewer 1 (@vitay) Jan 20, 2016
  • Reviewer 2 (@gdetor) Jan 21, 2016
  • Review 1 decision [accept] Feb 9, 2016
  • Review 2 decision [accept] Feb 8, 2016
  • Editor decision [accept] Feb 9, 2016

Member

rougier commented Jan 20, 2016

EDITOR

Thank you for the submission. I will soon assign two reviewers.

In the meantime, could you edit your first comment to copy the structure of #3?

And I think the link to the PDF you provided is wrong.

@rougier rougier changed the title from Review Request to Review Request : G. Viejo, B. Girard, M. Khamassi Jan 20, 2016


Member

rougier commented Jan 20, 2016

EDITOR

Julien Vitay (@vitay) has accepted to review your submission.


Member

rougier commented Jan 21, 2016

EDITOR

Georgios Detorakis (@gdetor) has accepted to review your submission.

@rougier rougier added the 02 - Review label Jan 21, 2016

Member

rougier commented Jan 23, 2016

EDITOR

@gviejo Before the actual review starts, you might consider cleaning your repo (you added some temporary files):

article/missfont.log
article/rescience-template.log
article/texput.log
article/viejo_girard_khamassi.bcf
article/viejo_girard_khamassi.log
article/viejo_girard_khamassi.out
article/viejo_girard_khamassi.run.xml
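
One way to keep these out in the future would be a .gitignore along these lines (a sketch; the patterns below only cover the files listed above):

# LaTeX build artifacts (hypothetical .gitignore entries)
article/*.log
article/*.bcf
article/*.out
article/*.run.xml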

gviejo commented Feb 3, 2016

AUTHOR

@rougier
The repository should be clean.


vitay commented Feb 8, 2016

REVIEWER 1
Alert: @gviejo

Article

The article describes precisely what the reproduced model does, what the main difficulties were in reimplementing the model (missing parameter values, typos), and where the reproduction differs from the original. The authors sufficiently justify the small difference in the probabilities of action.

The authors only reproduce the first figure of the original article, which is sufficient. If the authors find the time to implement the other tasks, or at least to hint at what should be changed in order to reproduce them, it would be of great value.

The manuscript has some formatting issues:

  1. The parameters table does not render correctly. After removing the first dashed line, it looks much nicer on my computer, though this may depend on the pandoc version:
Name               Description                               Value
------------------ ----------------------------------------- -------------------------
$\sigma$           Updating rate of the average reward       0.02
$\eta$             Variance of evolution noise               0.0001
$P_n$              Variance of observation noise             0.05
$\beta$            Rate of exploration                       1.0
$\rho$             Update rate of the reward function        0.1
$\gamma$           Discount factor                           0.95
$\tau$             Time step of graph exploration            0.08
depth              Depth of search in graph exploration      3
$\phi$             Update rate of the transition function    0.5
init cov           Initialisation of covariance matrice       1.0
$\kappa$           Unscentered transform parameters          0.1
------------------ ----------------------------------------- ------------------------- 

Notice the typo in the "Initialisation" cell for init cov.

  2. The algorithm should not be indented that much to the right; it feels almost centered. Some equations are centered in the initialization phase, which impairs reading. Below is a modification that looks more consistent:
> **Initialization**

>> $Q(s, a)^{Goal-Directed} = \{ 0, \ldots \}$

>> $Q(s, a)^{Habitual} = \{ 0, \dots \}$

>> $\Sigma = \left(
    \begin{array}{*{4}{c}}
    cov \times \eta &  0 &  \ldots & 0 \\
     0 &  cov \times \eta &  \ldots & \vdots \\
    \vdots &  \ldots &  \ddots & 0 \\
     0 &  \ldots &  0 & cov \times \eta \\
  \end{array}\right)$

>> $R(S1, EM) = 1$

>> $\bar{R} = 0$

>> $\hat{R}(s,a) = \{0,\ldots\}$

> **Main Loop**

>> $\textbf{FOR}\ i = 1:T$

>>> $s_t = S_0$

>>> $\textbf{IF}\ i = T_{devaluation}$

>>>> $R(S1, EM) = 0$

>>>> $\hat{R}(S1, EM) = -1$

>>> $\textbf{WHILE}\ s_t \neq S1 \bigwedge a_t \neq EM$

>>>> $a_t = \textbf{Selection}(s_t)$

>>>> $r_t = R(s_t,a_t)$

>>>> $s_{t+1} = transition(s_t, a_t)$

>>>> $\textbf{Update}(s_t,a_t, s_{t+1}, r_t)$

> **Selection** 

>> $\{a_1,\ldots,a_i,\ldots\} \leftarrow sort(Q(s_t,a_i))$

>> $VPI(s_t, a_1) = (Q(s_t,a_2)^{H}-Q(s_t,a_1)^{H})P(Q(s_t,a_1)^{H}<Q(s_t,a_2)^{H})+ \frac{\sigma(s_t,a_t)}{\sqrt{2\pi}} e^{-\frac{(Q(s_t,a_2)^H - Q(s_t,a_1)^H)^2}{2\sigma(s_t,a_t)^2}}$

>> $VPI(s_t, a_i) = (Q(s_t,a_i)^{H}-Q(s_t,a_1)^{H})P(Q(s_t,a_i)^{H}>Q(s_t,a_1)^{H}) + \frac{\sigma(s_t,a_t)}{\sqrt{2\pi}} e^{-\frac{(Q(s_t,a_1)^H - Q(s_t,a_i)^H)^2}{2\sigma(s_t,a_t)^2}}$

>> $\textbf{FOR}\ i \in \{a_1, a_2,\ldots, a_i, \ldots\}$

>>> $\textbf{IF}\ VPI(s_t, a_i) \geq \tau \bar{R}$

>>>> $Q(s_t,a_i) = \hat{R}(s_t,a_i) + \gamma \sum\limits_{s'}p_{T}(\{s,a\}\rightarrow s') \max\limits_{b \in A} Q(s',b)^{Goal-directed}$

>>> $\textbf{ELSE}$

>>>> $Q(s_t,a_i) = Q(s_t,a_i)^{Habitual}$

>> $a_t \leftarrow \textit{SoftMax}(Q(s_t,a), \beta)$

> **Update**

>> $\bar{R} = (1-\sigma) \bar{R} + \sigma r_t$

>> $\hat{R}(s_t,a_t) = (1 - \rho) \hat{R}(s_t,a_t) + \rho r_t$

>> $p_{T}(s_t, a_t, s_{t+1}) = (1 - \phi) p_{T}(s_t, a_t, s_{t+1}) + \phi$

>> Specific to Kalman Q-Learning

>> $\Theta = \{ \theta_j,\ 0 \leq j \leq 2|S.A|\}$

>> $\check{W} = \{ w_j,\ 0 \leq j \leq 2|S.A| \}$

>> $\check{R} = \{ \check{r}_j = \theta_j(s_t,a_t) - \gamma \max\limits_{b \in A} \theta_j(s_{t+1},b),\ 0 \leq j \leq 2|S.A|\}$

>> $r_{predicted} = \sum\limits_{j=0}^{2|S.A|} w_j \check{r}_j$

>> $P_{\theta_j \check{r}_j} = \sum\limits_{j=0}^{2|S.A|} w_j (\theta_j - Q^{Habitual}_t)(\check{r}_j - r_{predicted})$

>> $P_{\check{r}_j} = \sum\limits_{j=0}^{2|S.A|} w_j (\check{r}_j - r_{predicted})^2 + P_n$

>> $K_t = P_{\theta_j \check{r}_j} P_{\check{r}_j}^{-1}$

>> $\delta_t = r_t - r_{predicted}$

>> $Q_{t+1}^{Habitual} = Q_{t}^{H} + K_t \delta_t$

>> $P_{t+1}^{H} = P_{t}^{H} - K_t P_{\check{r}} K_t^T$

As the algorithm is quite complex, it would be good to add more comments inside the algorithm, such as:

>> # Sort the Q-values in descending order

>> $\{a_1,\ldots,a_i,\ldots\} \leftarrow sort(Q(s_t,a_i))$
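
To make the arbitration concrete, here is a minimal Python sketch of the VPI computation and the planning test described above. It is not the authors' implementation: it assumes a Gaussian posterior on the habitual Q-values, and the names vpi, q, sigma, tau and r_bar are hypothetical.

import numpy as np
from scipy.stats import norm

def vpi(q, sigma):
    # q[a], sigma[a]: posterior mean / standard deviation of the habitual
    # Q-value of each action in the current state (hypothetical names).
    order = np.argsort(q)[::-1]        # actions sorted by decreasing Q-value
    a1, a2 = order[0], order[1]        # best and second-best actions
    out = np.zeros(len(q))
    for a in range(len(q)):
        if a == a1:
            # Gain if the best action is in fact worse than the runner-up
            gain, ref = q[a2] - q[a1], a2
            p = norm.cdf(q[a2], loc=q[a1], scale=sigma[a1])      # P(Q(a1) < Q(a2))
        else:
            # Gain if a suboptimal action is in fact better than the best
            gain, ref = q[a] - q[a1], a1
            p = 1.0 - norm.cdf(q[a1], loc=q[a], scale=sigma[a])  # P(Q(a) > Q(a1))
        out[a] = gain * p + sigma[a] / np.sqrt(2 * np.pi) \
                 * np.exp(-(q[ref] - q[a]) ** 2 / (2 * sigma[a] ** 2))
    return out

# Plan (use the goal-directed value) only for actions whose information
# is worth the time cost of planning:
# plan = vpi(q, sigma) >= tau * r_bar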

Code

  • The code runs correctly on Linux Mint 17.3 (python 2.7.6, numpy 1.8.2, matplotlib 1.3.1, scipy 0.13.3), with only the mentioned dependencies required. It reproduces the figure of the article as expected.
  • At the small cost of importing from the __future__ module at the beginning of run.py:
from __future__ import print_function

and calling the print function like this:

print(exp, nb_trials, deval_time) # in run.py
print("Parameters not found") # in models.py

as well as using range() instead of xrange():

from __future__ import print_function
try:
    range = xrange  # Python 2: use the lazy xrange under the name range
except NameError:
    pass            # Python 3: range is already lazy
import numpy as np
...
for i in range(nb_blocs):
    for exp, nb_trials, deval_time in zip(['mod','ext'], [nb_iter_mod, nb_iter_ext], [deval_mod_time, deval_ext_time]):
        ...
        for j in range(nb_trials):
        ...

the code would run with both Python 2 and Python 3. I suggest making this change, as the README specifies "at least python 2.7.3", which includes Python 3.x.

  • run.py is well structured, with the parameters grouped and commented before the main loop is actually performed. However, the two other files contain many methods which are not documented at all. The method names are sufficiently well chosen that one understands what they are supposed to do. In order to allow users to adapt and extend your code, it would be great if you could add a docstring to each method, even if it sounds trivial, as in:
def createQValuesDict(states, actions):
    """
    Returns a dictionary containing the Q-values of all state-action pairs.

    The keys are either:
    * states, in which case the value is a list of action indices.
    * (state, action) tuples, in which case ...
    * (state, index) tuples, in which case...

    Parameters:
    * states: list of states
    * actions: list of actions
    """
    values = {0:np.zeros(len(states)*len(actions))}
    tmp = 0
    for s in states:
        values[s] = []
        n = 0
        for a in actions:
            values[(s,a)] = tmp
            values[s].append(tmp)
            values[(s,n)] = a
            tmp = tmp + 1
            n = n + 1
    return values

Even inside a single method, it would help comprehension to add comments stating what is currently being done, for example:

    def updateValue(self, reward):
        # Retrieve the current state, action and reward
        r = (reward==1)*1        
        s = self.state
        a = self.action
        # Compute the sigma points
        sigma_points, weights = computeSigmaPoints(self.values[0], self.covariance['cov'], self.kappa)
        # Predicted rewards
        rewards_predicted = (sigma_points[:,self.values[(s,a)]]-self.gamma*np.max(sigma_points[:,self.values[s]], 1)).reshape(len(sigma_points), 1)
        reward_predicted = np.dot(rewards_predicted.flatten(), weights.flatten())
        # Covariance
        cov_values_rewards = np.sum(weights*(sigma_points-self.values[0])*(rewards_predicted-reward_predicted), 0)
        cov_rewards = np.sum(weights*(rewards_predicted-reward_predicted)**2) + self.var_obs
        # Kalman gains
        kalman_gain = cov_values_rewards/cov_rewards
        # Update the value
        self.values[0] = self.values[0] + kalman_gain*(reward-reward_predicted)
        # Update the covariance
        self.covariance['cov'][:,:] = self.covariance['cov'][:,:] - (kalman_gain.reshape(len(kalman_gain), 1)*cov_rewards)*kalman_gain

This may be redundant with the method names, but it allows a much faster comprehension of the code.

  • The code is released under the BSD licence (as defined by the header of the python files), but no LICENSE.txt file is present in the directory. Not fully sure, but I think you should add one.

gdetor commented Feb 8, 2016

REVIEWER 2
Alert: @gviejo

Article

The text explains the main goal of this replication and provides all the necessary parameters as well as the algorithm of the implementation. The table is not so clear: it would be better to put the parameters on separate rows in the first column, the comments regarding each parameter in a second column, and the arithmetic values of the parameters in a third column. It would also be nice if the authors could explain (or speculate on) why there is a slight difference in time steps and probabilities of action (different parameters, a different implementation of Kalman Q-learning).

Source codes

The code runs smoothly on a Lenovo ThinkPad X250 laptop running Linux Fedora 22. It consumes ~44 MB of physical memory and takes ~274.0 s to finish execution and plot the results. The results are qualitatively similar to the ones in the original article.

Here are some comments regarding the source codes:

  • Use range instead of xrange.
  • The params dictionary in run.py is defined but never used.
  • The run.py file is well structured and commented.
  • models.py and fonctions.py are well structured, but there are no comments at all. It is good to comment at least the functions and the basic methods of your classes.
  • In models.py, for the two classes, you could use the configuration file parser (ConfigParser, see https://docs.python.org/2/library/configparser.html) for loading the parameters of the model; a minimal sketch follows this list.
  • In models.py, fonctions.py and run.py there are long lines of code. It is better to break them into multiple lines.
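
For illustration, here is a minimal sketch of what loading the parameters through ConfigParser could look like; the file name, section and option names below are hypothetical, not taken from the repository:

try:
    import configparser                  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2

# parameters.cfg would contain, for instance:
# [kalman]
# eta = 0.0001
# var_obs = 0.05
# kappa = 0.1
config = configparser.ConfigParser()
config.read('parameters.cfg')
eta = config.getfloat('kalman', 'eta')
var_obs = config.getfloat('kalman', 'var_obs')
kappa = config.getfloat('kalman', 'kappa')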

Typos

In the introduction:
"We replicate the results of the first task i.e. the devaluation experiment with two states
and two actions." Put a comma before i.e.

License

The authors use a BSD license for their source code, although they do not provide any LICENSE file along with their sources. In addition, they have not used the proper BSD statement within their *.py files (see https://opensource.org/licenses/BSD-3-Clause).


Member

rougier commented Feb 8, 2016

EDITOR

Dear @gviejo

Could you address the reviewers' comments?
For the code license, have a look at http://rescience.github.io/write/; you'll find an explanation of how to include a license in your code repository.


gviejo commented Feb 9, 2016

AUTHOR

@rougier

I changed the repository according to the reviewers' comments, except for the use of ConfigParser. I chose instead to remove some functions related to the parameters dictionary that were not used.
I added LICENSE.md to the code repository, but I am not sure I understand the comments about the license within the .py files.


Member

rougier commented Feb 9, 2016

EDITOR

I'm not sure either what @gdetor refers to exactly. I think you're supposed to include the full text, but a reference to the (new) BSD license might be OK. You may need to add the "all rights reserved" statement, though.

# Current header:
# Copyright (c) 2015, Guillaume Viejo
# Distributed under the (new) BSD License.

# Suggested header:
# Copyright (c) 2015, Guillaume Viejo. All rights reserved.
# Distributed under the (new) BSD License.

@gdetor, @vitay Could you have a look at the changes to check if they are OK for you?


vitay commented Feb 9, 2016

REVIEWER 1

The changes are great, thanks. It is good to publish.


gdetor commented Feb 9, 2016

REVIEWER 2
Everything's fine. It can be published. Thank you.


Member

rougier commented Feb 9, 2016

EDITOR

@gviejo Congratulations, your submission has been accepted. Please do not modify your repository until further notice.


Member

rougier commented Feb 10, 2016

This submission has been accepted for publication, and has been published at https://github.com/ReScience/ReScience/wiki/Current-Issue

DOI

@ReScience ReScience locked and limited conversation to collaborators Feb 10, 2016

@rougier rougier closed this Feb 10, 2016

@rougier rougier added the Python label Jul 3, 2016
