about the first version of the software #5

Open
smilesun opened this issue Dec 10, 2017 · 16 comments

Comments

@smilesun
Owner

@berndbischl @markdumke I don't think we have time to wait for a perfect design, so I would suggest the following:

  • I help Markus finish the first version as soon as possible (we wrote those class designs together, so there is at least a common agreement on them).
  • I keep my own design, and maybe at some point we can integrate the two. I strongly suggest we keep this design, as I want to use it in the future for automatic architecture search for neural networks, auto mlr, and high-dimensional feature selection.

What do you two think?

We could start next week and have something in some form soon.

@smilesun
Owner Author

Hi, I am not sure why there has been no feedback. I think we have the following aims:

  1. Markus needs a CRAN package
  2. Markus needs a publication
  3. We want a maintainable package to use for future projects

Based on those goals:

  1. We should push the CRAN package as soon as possible, since it always makes sense to be first, even if we cannot make it perfect. I think it would be a big loss for us if it were not the first R package that can do RL. The previous version, which Markus wrote for his Master's thesis, is already good enough to push to CRAN.

  2. For the publication, we need to use at least the design Markus presented last week; that way I am confident there will not be too much criticism.

  3. For future use, I would prefer something whose design I can influence, but this has nothing to do with our current goals, and I think it should not slow down the first two, right? Maybe at some point I could merge my branch into Markus's branch, or we just keep things as they are.

I need some feedback from you two.

@smilesun
Owner Author

@berndbischl @markdumke

@markusdumke
Collaborator

Good points!
I thought about merging the code from the thesis into the new design I presented to you and then pushing to CRAN. I probably need about two weeks for this (because I cannot work on it full-time).
But of course we can also push the Master's thesis version to CRAN right now.

@berndbischl

Waiting a few weeks is OK, but we should really target January for a first upload.

@smilesun
Owner Author

Thanks to both of you for the reply :)
I am not sure if the following is a good idea.

  1. Upload the Master's thesis version to CRAN, so we can be sure it is the first R package to do complex reinforcement learning and not just MDPs. This will be saved in the CRAN history anyway. CRAN does not care about the design of the software at all, but I think being first matters (for the R community, the Master's thesis version is already fantastic for most people). We could do this if it is not too much extra work for Markus?
  2. In January, we release another version of the software. Since nobody works over Christmas, nobody would be surprised if a lot of things changed, right? And also the user API could stay mostly the same although the internals become object-oriented.
  3. For the publication, we need to
    • Change the design of the software, because we want to reduce criticism from reviewers.
    • Benchmark a batch of algorithms over a batch of problems, which is needed in the paper.

Basically, my point is that we could produce the outputs mentioned above (CRAN, OO design, benchmark, paper) in stages, so that we can safely deliver them in time. Otherwise the project may become too complex for us to manage, and we might end up disappointed if it drags on too long.

@markdumke @berndbischl, what do you think?
This way, we could join forces and not waste time on repetitive work.

@markusdumke
Collaborator

I think it's a good plan.
The master thesis version works, so I can push it to CRAN anytime, that is no problem.

> and also the user API could stay mostly the same although the internals become object-oriented.

I think there will be a lot of changes to the user API as well. But that doesn't matter too much if it improves usability.

@markusdumke
Collaborator

So should I push to CRAN now?

@smilesun
Owner Author

@berndbischl, what do you think?

@markusdumke
Collaborator

@smilesun @berndbischl
I have an improved draft of the user interface for the first version online in the work branch of reinforcelearn. I would suggest that I finish it this week with at least some basic functionality (Q-learning with or without experience replay, table and Keras neural network) and then push this to CRAN as version 0.1.0.

For the second version we can then include a better internal object-oriented structure based on Xudong's ideas and add more algorithms etc., but we probably won't have to change too much of the user interface.

# Q-learning with eligibility traces
env = makeEnvironment("windy.gridworld")
val = makeValueFunction("table", n.states = env$n.states, n.actions = env$n.actions)
policy = makePolicy("epsilon.greedy", epsilon = 0.1)
alg = makeAlgorithm("qlearning", lambda = 0.9, traces = "accumulate")
agent = makeAgent(policy, val, alg)
interact(env, agent, n.episodes = 100L)

# shorthand: pass the components as character arguments
env = makeEnvironment("windy.gridworld")
agent = makeAgent("softmax", "table", "qlearning")
interact(env, agent, n.steps = 10L)
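
For anyone who wants to see what a table-based Q-learning agent with an epsilon-greedy policy does under the hood, here is a minimal base-R sketch. It is not the reinforcelearn implementation: the toy chain environment and every name in it (`step_env`, `choose_action`, `q_learning`, `n_states`, `n_actions`) are assumptions made up for illustration, and it leaves out eligibility traces and experience replay.

```r
# Minimal tabular Q-learning sketch in base R (illustrative only; this is
# NOT the reinforcelearn implementation). All names here are hypothetical.
set.seed(1)

n_states  <- 5L   # states 1..5 on a chain; state 5 is terminal
n_actions <- 2L   # action 1 = move left, action 2 = move right

# Hypothetical environment step: reward 1 on reaching the terminal state.
step_env <- function(state, action) {
  next_state <- if (action == 2L) min(state + 1L, n_states) else max(state - 1L, 1L)
  done <- next_state == n_states
  list(state = next_state, reward = if (done) 1 else 0, done = done)
}

# Epsilon-greedy action selection with random tie-breaking.
choose_action <- function(Q, state, epsilon) {
  if (runif(1) < epsilon) return(sample.int(n_actions, 1L))
  best <- which(Q[state, ] == max(Q[state, ]))
  best[sample.int(length(best), 1L)]
}

q_learning <- function(n_episodes = 200L, alpha = 0.1, gamma = 0.9, epsilon = 0.1) {
  Q <- matrix(0, nrow = n_states, ncol = n_actions)  # the "table" value function
  for (episode in seq_len(n_episodes)) {
    state <- 1L
    done <- FALSE
    while (!done) {
      action <- choose_action(Q, state, epsilon)
      res <- step_env(state, action)
      # Q-learning update: bootstrap from the greedy value of the next state,
      # or from 0 if the next state is terminal.
      future <- if (res$done) 0 else gamma * max(Q[res$state, ])
      target <- res$reward + future
      Q[state, action] <- Q[state, action] + alpha * (target - Q[state, action])
      state <- res$state
      done <- res$done
    }
  }
  Q
}

Q <- q_learning()
# Greedy action per non-terminal state according to the learned table.
greedy <- apply(Q[1:(n_states - 1L), ], 1, which.max)
```

In the draft interface above, the table, the policy and the update rule would be separate objects created with `makeValueFunction`, `makePolicy` and `makeAlgorithm`; the sketch just inlines all three to show the idea.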

@smilesun
Owner Author

@markdumke, there is nothing in the work branch right now except the Rxd folder I pushed. Did you push to the right branch?

@markusdumke
Collaborator

Ah, yes I've pushed to my own repo: https://github.com/markdumke/reinforcelearn/tree/work

@smilesun
Owner Author

I think that is a good plan

@markusdumke
Collaborator

Very good, I will upload tomorrow then

@markusdumke
Collaborator

@smilesun
Owner Author

@markusdumke Great work! So what is your plan for the paper now?

@markusdumke
Collaborator

markusdumke commented Jan 16, 2018

Thanks :)

I've just started working in a new job, so sadly there's not too much time I can spend on this project right now.

But I think it would be great to merge our code at some point, so that we have a maintainable and extensible package. So maybe you can make a list of the changes you'd like to make to the current code at https://github.com/markusdumke/reinforcelearn, so that I can review them, maybe on the weekend? And then we can merge and submit to JOSS?
