Question about reward function and __pack_samples
#22
Comments
Hang on a sec. In the paper, Eq. (1) says that In the definition of There seems to be a discrepancy between the code and the paper.
From my understanding,
Or are you confused about commission fee calculation? The
Well, if
No, the commission fee calculation is fine. I am trying to understand what exact periods you are using for [Feedback for the paper]
Yes, both t's in (1) and (18) mean the same thing: time, and both v_t's mean the same thing: the prices of all assets at time t. Update: please refer to Figure 1 in the paper; that should help clear up the confusion.
Good! Then is it correct to say that If it is, then in the code you are not applying the same loss function as in the paper.
Only for the last column of V_t (one of the three matrices in X_t), yes. Maybe the confusion here is that in (1) 'now' is t-1, but in X 'now' is t.
Oh, this is very relevant to the whole issue. So, to clarify: if the actual wall clock time is
Yes.
For the Reward/Loss function of the decision w_t at time t, we need y_{t+1}.
Ok, so what was confusing me is that in the paper the loss contains y_t, w_{t-1} and mu_t (so I thought that w_t enters this loss only because you need it to compute mu_t).
All correct. Thank you for your careful reading of our poorly written paper.
Thank you for explaining! 🙂
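The point made in the thread, that the reward for the decision w_t taken at time t needs y_{t+1}, can be illustrated with a small sketch. Everything here is hypothetical: the function name `period_log_return`, the toy prices, and the choice mu = 1.0 (no commission) are assumptions for illustration and are not taken from the repository.

```python
import math

def period_log_return(weights, price_relatives, mu=1.0):
    """Single-period reward log(mu * (y . w)).

    mu defaults to 1.0, i.e. no commission fee (an assumption for this sketch).
    """
    growth = sum(w * y for w, y in zip(weights, price_relatives))
    return math.log(mu * growth)

# Toy closing prices v_0, v_1, v_2 for [cash, asset A, asset B] (made-up numbers).
prices = [
    [1.0, 100.0, 50.0],
    [1.0, 110.0, 45.0],
    [1.0, 121.0, 54.0],
]

t = 1                    # 'now' in the code's convention: prices up to v_t are known
w_t = [0.2, 0.5, 0.3]    # portfolio weights chosen at time t

# y_{t+1} = v_{t+1} / v_t (element-wise): only known one period after w_t is chosen.
y_next = [after / now for now, after in zip(prices[t], prices[t + 1])]

reward = period_log_return(w_t, y_next)
print(reward)
```

Under this reading, the pairing of w_t with y_{t+1} is the same as the paper's pairing of w_{t-1} with y_t, with 'now' shifted by one period.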
I'm having trouble reconciling what I read in the paper and what I read in the code.
The reward function in a single period in the paper (Eq. (10)) is \log(\mu_t y_t \cdot w_{t-1}). But in the code, it seems that the reward is instead \log(\mu_t y_{t+1} \cdot w_{t}). Am I correct?
I ask because `__pack_samples` (in `datamatrices.py`) makes the price tensor `X` using `M[..., :-1]` and the relative price vector `y` using `M[...,-1]/M[...,-2]`, so `y` is one period ahead of `X`.
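The one-period offset between `X` and `y` can be reproduced on a toy array. This is a minimal sketch using only the two slicing expressions quoted above; the array `M`, its two-dimensional shape (assets, periods), and the numbers in it are assumptions for illustration, not the actual layout used in `datamatrices.py`.

```python
import numpy as np

# Hypothetical window of closing prices, shape (assets, periods).
M = np.array([
    [100.0, 110.0, 121.0, 133.1],
    [50.0, 45.0, 54.0, 27.0],
])

# Price tensor: every period in the window except the last one.
X = M[..., :-1]

# Relative price vector: last period divided by the one before it.
y = M[..., -1] / M[..., -2]

print(X.shape)   # the last column of M is not part of X
print(y)

# The most recent price in X is the denominator of y,
# so y describes the move from X's last period to the next one.
assert np.array_equal(X[..., -1], M[..., -2])
```

The final assertion is the offset in question: the numerator of `y` is a price that `X` never sees.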