This repository has been archived by the owner on Aug 4, 2020. It is now read-only.

Do you have any idea why model didn't learn much? #2

Closed
jd0713 opened this issue Aug 29, 2017 · 5 comments

@jd0713

jd0713 commented Aug 29, 2017

As you know, the portfolio weights appear to be static on the test set.
Why didn't the model learn much?

I can think of several possible reasons:

  1. there is no ensemble in the DDPG network
  2. the input doesn't include the previous portfolio weights
  3. there is no online stochastic batch learning when applying the network to the test set
  4. different universe
  5. no PVM (portfolio vector memory)
  6. DDPG was used instead of DPG

Because the author said that the EIIE is the core idea of the paper, I think there will be some improvement if I solve 1 and 2.
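
On point 2, here is a minimal sketch (an illustration only, not code from this repo; the function name and shapes are made up) of how the previous portfolio weights could be appended to the price observation, in the spirit of the paper's PVM:

```python
import numpy as np

def augment_observation(price_window, prev_weights):
    """Append the previous portfolio weights to the price observation.

    price_window: array of shape (n_assets, window_len, n_features),
        e.g. prices normalised by the last open price.
    prev_weights: array of shape (n_assets + 1,), the weights chosen at
        the previous step (cash is the extra element), as in the PVM idea.
    Returns a flat observation vector the actor/critic can consume.
    """
    return np.concatenate([price_window.ravel(), prev_weights])

# toy usage: 4 assets, 50-step window, 3 features per asset
obs = augment_observation(
    np.random.rand(4, 50, 3),
    np.array([0.2, 0.2, 0.2, 0.2, 0.2]),  # previous weights incl. cash
)
print(obs.shape)  # (605,)
```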

Could you share some of your ideas?

Thank you very much.

@wassname
Owner

wassname commented Aug 29, 2017

Are you using one of my notebooks or your own? And what do you mean by the test set? Are you applying it to recent data?

This happened for me during training when:

  1. The critic network wasn't large enough to model the reward, so I added more parameters.
  2. It learns from each action in sequence, so the algorithm can just learn the current market and forget past markets. Lots of newer algorithms try to fix this, either by having multiple actors learning at once (I think the original paper did this) or by using replay or prioritised replay memory. If this doesn't make sense, the prioritised experience replay paper explains it well (a minimal uniform replay buffer is sketched after this list).
  3. It failed to converge because SGD had the wrong parameters; in that case try RMSProp since it converges more easily (or Adam, but Adam isn't as good for moving targets).
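
On point 2, a plain (uniform) replay buffer is the simplest version of the idea; this is a generic sketch, not code from this repo:

```python
import random
from collections import deque

class ReplayBuffer:
    """Uniform experience replay: keep transitions from many past market
    periods and train on random mini-batches, so the agent doesn't just
    fit the most recent market and forget earlier ones."""

    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done):
        self.buffer.append((obs, action, reward, next_obs, done))

    def sample(self, batch_size=64):
        batch = random.sample(self.buffer, batch_size)
        obs, actions, rewards, next_obs, dones = map(list, zip(*batch))
        return obs, actions, rewards, next_obs, dones
```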

As for your list of ideas, I think 2 and 5 couldn't be the cause, since leaving them out makes it easier for the model to give varied predictions. It shouldn't be 3, since the model shouldn't be learning on the test set.

But yeah I reckon it could be 1., 4., or 6.

@jd0713
Author

jd0713 commented Aug 30, 2017

I used your notebook and the data you uploaded.
I mean that there is only a slight difference between the equal-weighted portfolio and the learned policy.
Also, there is almost no reward gained during training.
I think the portfolio weights, the accumulated return, and so on should look different if 'something' has been learned, but the results don't seem to.
I am wondering if there is a way to make the model learn 'something'.

In short, why is there no episode reward?

I am also curious whether this phenomenon comes from a limitation of the observation space.
Because the observation is prices divided by the last open price, it can be over 1.
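
For example (a toy illustration, not the repo's exact preprocessing), normalising a window of closes by the last open price gives relative prices that can exceed 1 whenever the price rose:

```python
import numpy as np

closes = np.array([100.0, 102.0, 105.0, 103.0])  # recent closing prices
last_open = 101.0                                 # most recent open price

obs = closes / last_open   # observation: prices relative to the last open
print(obs)                 # approx [0.99, 1.01, 1.04, 1.02] -> some values > 1
```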
Thanks.

Because I'm a newbie at deep learning, my question might be vague.

@wassname
Owner

I ran into similar problems. I found that the DDPG model was less likely to learn, but the VPG notebook was more stable. So my advice would be to try the VPG notebook, and if that doesn't work, maybe start off with a simpler project. Deep reinforcement learning is pretty new, lots of libraries are broken, and lots of algorithms are unstable. So maybe start with image classification projects if this is frustrating you?

If you're determined to do this, though, it's probably worth reading the papers behind the algorithms, since they explain all the parameters and limitations.

This project didn't get good results for me (although another guy did a pytorch implementation and got decent results, but I'm not sure if he plans on publishing it). I'm wondering if there is a problem in my data or my environment, since I tried many algorithms but kept the data and environment constant. Or maybe it's just a difficult dataset.

@jd0713
Author

jd0713 commented Aug 30, 2017

Thanks for your helpful comment.
I also read the conversation between you and goodulusaurs after your comment.
I should probably use DPG and ensemble the networks, just as in the paper.
I will post an update if I get good results! Thank you again!
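
A simple form of that ensembling (a hypothetical sketch, assuming each trained actor exposes a predict method that returns portfolio weights) is just to average the actors' outputs and renormalise:

```python
import numpy as np

def ensemble_weights(actors, observation):
    """Average the portfolio weights proposed by several independently
    trained actors, then renormalise so they sum to 1."""
    weights = np.mean([actor.predict(observation) for actor in actors], axis=0)
    return weights / weights.sum()
```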

@wassname
Owner

I'd be interested to know if you get any results using DPG! Good luck.

jd0713 closed this as completed Sep 1, 2017