Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deep Q Learning #23

Closed
wants to merge 7 commits into from
Closed

Deep Q Learning #23

wants to merge 7 commits into from

Conversation

tejank10
Copy link
Contributor

Implemented DQN model on CartPole-v1

@iblislin
Copy link
Contributor

There is a CartPole env implementation in pure Julia: Reinforce.jl
Consider changing to it?

@MikeInnes
Copy link
Member

Yes, the Python dependency is a bit of an issue here, but otherwise it looks great. Will review in more detail soon.

@iblislin
Copy link
Contributor

iblislin commented Feb 15, 2018

@MikeInnes Do you have the committer bit of JuliaML? I want to drop support of julia 0.5 of Reinforce.jl...

@tejank10
Copy link
Contributor Author

tejank10 commented Feb 15, 2018

@iblis17 , Reinforce.jl (and OpenAIGym.jl) was my first option but I guess that's not updated. There were some issues with old functions-new functions there.

@iblislin
Copy link
Contributor

That was my first option but I guess that's not updated. There were some issues with old functions-new functions there.

Could you file an issue on that repo (and we can move discussion)? I want to improve it.

@tejank10
Copy link
Contributor Author

Cool, I'll file an issue

@tejank10
Copy link
Contributor Author

tejank10 commented Feb 15, 2018

@iblis17 , #6 in OpenAIGym.jl, which is built on Reinforce.jl

@iblislin
Copy link
Contributor

iblislin commented Feb 15, 2018

OpenAIGym still requires PyCall 🤦‍.
Is there any environment spec of CartPole-v1?
There is an implement of CartPole-v0 in Reinforce.jl already. I wondering what the main difference is.

@tejank10
Copy link
Contributor Author

tejank10 commented Feb 15, 2018

Ohh yes, that's where I got the idea to use PyCall! I was getting confused between OpenAIGym.jl and Reinforce.jl as former is based on the latter. Sorry about that.

I have re-implemented the model using Reinforce.jl on CartPole-v0.

@iblislin
Copy link
Contributor

Ohh yes, that's where I got the idea to use PyCall! I was getting confused between OpenAIGym.jl and Reinforce.jl. Sorry about that.

Haha. I just tried to install OpenAIGym, and it surpised me.

And it will be nice if we port CartPole-v1. (but I still cannot found any detail from googling, maybe I should read the source code.

@tejank10
Copy link
Contributor Author

tejank10 commented Feb 15, 2018

CartPole-v0 defines "solving" as getting average reward of 195.0 over 100 consecutive trials.

Whereas

CartPole-v1 defines "solving" as getting average reward of 475.0 over 100 consecutive trials.

From CartPole-v0 and CartPole-v1

@iblislin
Copy link
Contributor

@tejank10 😎

gr()

#Define custom policy for choosing action
type CartPolePolicy <: Reinforce.AbstractPolicy end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mutable struct ?

Copy link
Contributor Author

@tejank10 tejank10 Feb 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the CartPolePolicy?
Also, would like to know the advantage that would offer. AFAIK (and I may be wrong, as I am new to Julia), mutable struct for that would not have any fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mutable struct CartPolePolicy <: Reinforce.AbstractPolicy end

Keyword type is deprecated syntax since Julia 0.6.

Copy link
Contributor Author

@tejank10 tejank10 Feb 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. Thanks for that. I've changed it now. :)

@MikeInnes
Copy link
Member

Is this still relevant, or did you move the code to flux-baselines?

[I actually don't think it will be as bad to have the extra dependency on Reinforce.jl once we have a Pkg3 setup, but we can figure that out later]

@tejank10
Copy link
Contributor Author

Yes, it has been moved to Flux-baselines.

@MikeInnes
Copy link
Member

Great, let's leave it there for now then.

@MikeInnes MikeInnes closed this Jun 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants