
Upgrade atari_wrapper to tf2 #452

Merged · 11 commits · Aug 9, 2020

Conversation

@MichaelSolotky (Author)

  1. There were two functions in the TFSummary class that used the tf1 API; I replaced them with the corresponding functions from the tf2 API (both are documented at https://www.tensorflow.org/api_docs/python/tf/summary/scalar). See the sketch after this list.
  2. There was an unnecessary import in the add_summary_scalar function: TensorFlow must already be imported by the time this method is called, because the import happens in the constructor of the TFSummary class.
  3. There was no way in nature_dqn to use the NumPy summary even when the corresponding argument was set -- bug fix.
  4. Inconsistent style of putting (or not putting) spaces between an argument name and its value in function calls within a single file -- code style fix.
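
A minimal sketch of the API change in item 1, assuming a plain scalar summary; the tag name and value here are illustrative, not taken from atari_wrappers.py:

```python
import tensorflow as tf  # TF 2.x

# tf1 style (removed): tf.summary.scalar returned a summary op that had
# to be evaluated in a session and passed to a FileWriter.
# tf2 style (current): tf.summary.scalar writes eagerly to whichever
# writer is set as default, keyed by an explicit step.
writer = tf.summary.create_file_writer("logs")
with writer.as_default():
    tf.summary.scalar("episode_reward", 42.0, step=0)
writer.flush()
```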


@MichaelSolotky (Author) commented Aug 4, 2020

If everything's fine, #181 can be closed

@dniku (Collaborator) commented Aug 5, 2020

LGTM at a glance. Two questions:

  1. Did you copy actor-critic theory directly from week08_pomdp/practice_pytorch.ipynb without any modifications?
  2. Have you tested this?

@MichaelSolotky (Author)

  1. Yep, there was one markdown cell; I think it's enough to explain the actor-critic theory.
  2. No. I thought someone else among the teachers has code that uses the TFSummary class, so you could run it and see whether everything's fine. I was also waiting for @justheuristic to say whether it's easy for you to test it or whether we should write those tests.

@dniku (Collaborator) commented Aug 5, 2020

I don't have any code for testing summaries, but those who worked on checking homework assignments this spring probably do, and @justheuristic should be able to handle this. Regarding writing tests: that would be very nice, and if you feel that you can get them working with little effort, then by all means please implement them. However, if that requires some effort, we'd better direct it towards getting other assignments working with TF2.

@MichaelSolotky (Author) commented Aug 5, 2020

Ok, I'll come back to that soon. Maybe it's not that difficult.

@MichaelSolotky (Author) commented Aug 7, 2020

Ok, I've tested the updated version. It doesn't work, but it seems the previous version didn't work either. It looks like the point of using TFSummaries is to eventually look at plots in tensorboard, but there is no writing to files anywhere in this class. Also, the step variable here https://github.com/yandexdataschool/Practical_RL/blob/master/week06_policy_based/atari_wrappers.py#L287 is never incremented, but it's the value on the x axis, so it should be. I've fixed these two things in my local version and everything started to work; a sketch of both fixes is below.
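
A hypothetical sketch of those two fixes, assuming a simplified TFSummaries shape; the real attribute and method names in atari_wrappers.py may differ:

```python
import tensorflow as tf

class TFSummaries:
    def __init__(self, logdir):
        # fix 1: create an actual file writer so summaries reach disk
        self.writer = tf.summary.create_file_writer(logdir)
        self.step = 0

    def add_summary_scalar(self, name, value):
        with self.writer.as_default():
            tf.summary.scalar(name, value, step=self.step)
        # fix 2: advance the step (the x-axis value in tensorboard);
        # without this, every point lands on the same step
        self.step += 1
```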

@MichaelSolotky (Author) commented Aug 7, 2020

@justheuristic and I decided to remove TFSummaries; it looks like NumpySummaries should be enough. Plots can be built in pyplot instead of tensorboard.

@mknbv (Collaborator) commented Aug 7, 2020

I was pretty sure the previous version worked. Are you sure you enabled recording and flushed the writer after the tf.contrib.summary.scalar function is called? (Notice also that it's only called at the end of episodes.)

IMO having a way to write summaries to tensorboard is good because it simplifies things quite a bit. Note that this task suggests plotting and monitoring many things during training, and it could get quite messy using only matplotlib.
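
For reference, a sketch of what "enabling recording and flushing the writer" means in the old tf.contrib.summary API (TF 1.x, eager mode); this is the generic workflow, not code from the repo:

```python
import tensorflow as tf  # TF 1.x

writer = tf.contrib.summary.create_file_writer("logs")
writer.set_as_default()

# summaries are silently skipped unless recording is enabled
with tf.contrib.summary.always_record_summaries():
    tf.contrib.summary.scalar("episode_reward", 42.0, step=1)

# without an explicit flush, the events may never reach disk
tf.contrib.summary.flush()
```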

@MichaelSolotky (Author)

Well, I'm not insisting on removal. It's easy enough to make a working tf2 version with some additions to the previous one. First of all, in the previous version summaries were only collected and never written anywhere (a logdir wasn't even mentioned), and I think that would stop people who wanted to use tensorboard. So I think the writer should be a member of the TFSummaries class.
I've also tried to run it once again, and now I'm not sure whether the previous version was working.

@MichaelSolotky (Author)

@michaelkonobeev what do you think about this version? (I've tested it.)

@mknbv (Collaborator) commented Aug 8, 2020

My reasoning for not including the writer in TFSummaries was that the caller might want to log other things, such as components of the loss function, and in earlier versions of tensorflow/tensorboard it was not possible to have two writers writing into the same directory (and I don't think that has changed). Maybe it would be better to keep the writer out of the TFSummaries class and document how it can be used? This would also make the interface a bit simpler, since there is no need to specify a log_dir argument for TFSummaries, which is consistent with NumpySummaries.

Also, global_step currently counts only the number of episodes finished by some batched environment, which is not very intuitive. I think it is quite a bit more useful to count the total number of interactions, since we typically limit that rather than the number of finished episodes, which can vary drastically with episode length (which in turn grows over time if the agent learns successfully). So I would suggest adding self.nenvs to global_step before checking self.should_write_summaries; a sketch is below.
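
A hypothetical sketch of that accounting change; the names (nenvs, global_step, should_write_summaries, add_summaries) follow the discussion, not necessarily the exact code in atari_wrappers.py:

```python
def step(self, action):
    obs, reward, done, info = self.env.step(action)
    # each batched step advances every sub-environment once, so count
    # nenvs new interactions rather than waiting for episodes to finish
    self.global_step += self.nenvs
    if self.should_write_summaries():
        self.add_summaries()
    return obs, reward, done, info
```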

@MichaelSolotky (Author) commented Aug 8, 2020

Mm, yeah, I think it's possible someone would want to log some additional stuff. But do you think it's OK to write the code for logging that stuff outside the TFSummaries class? To me that seems counterintuitive: you want to store some additional metrics via tf.summary, yet do it outside the TFSummaries class. You'd call env.step and then, after that, call some additional function like write_loss_components, and you also shouldn't forget about the writer. If I were a student solving this task, I wouldn't want to keep in mind that for every call of env.step I should manually write summaries to the log_dir. I would just redefine the TFSummary.add_summaries method right in the notebook, add a call to write_loss_components there, and that's it.
Yeah, I think we should document how TFSummaries can be used. A small tutorial for beginners can be linked there as well: https://www.tensorflow.org/tensorboard/get_started
And yeah, the interface becomes a bit more inconsistent with NumpySummaries, which isn't great, but not that bad I think.
And yeah, counting the total number of interactions is a good point, but how exactly do you want to measure it? What do you mean by adding self.nenvs to global_step?

@MichaelSolotky (Author) commented Aug 8, 2020

Also, this small PR has taken more time than I expected, so can we have one last round of discussion and changes and then do something with it?

@mknbv (Collaborator) commented Aug 8, 2020

I think it's totally OK to have additional code for summaries outside the class. The confusion seems to stem from how summaries in tensorboard work more generally. A student could call tf.summary.scalar in the definition of the A2C losses, then, when writing the training loop, define the writer and set it as default. This seems to me like the most typical way of writing summaries in tensorflow (to which I think you've already linked), so it should not be counterintuitive to anyone familiar with it, and there is no need to call env.step manually here; that should be done through EnvRunner. So I suggest removing the log_dir argument and noting, either in the docstring of TFSummaries or in the notebook itself, that other summaries can be added in A2C or elsewhere, and that a summary writer should be created and set as default before the training loop. A sketch of that workflow is below.
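
A sketch of the workflow described above, assuming plain TF2 summaries; a2c_loss and its tags are illustrative, not the assignment's actual code:

```python
import tensorflow as tf

def a2c_loss(policy_loss, value_loss, entropy, step):
    # summaries written inside the loss computation go to whichever
    # writer is currently set as default
    tf.summary.scalar("loss/policy", policy_loss, step=step)
    tf.summary.scalar("loss/value", value_loss, step=step)
    return policy_loss + 0.5 * value_loss - 0.01 * entropy

# create the writer once, before the training loop, and set it as
# default; TFSummaries then needs no log_dir argument of its own
writer = tf.summary.create_file_writer("logs/a2c")
writer.set_as_default()

for step in range(3):  # stand-in for the real training loop
    loss = a2c_loss(0.1, 0.2, 0.01, step)
writer.flush()
```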

@MichaelSolotky (Author)

Ok, I think the code is ready now. The only remaining problem should be the dirty commit history.

@dniku (Collaborator) commented Aug 9, 2020

Um, guys. This is some code to simplify Tensorboard logging. Its existence is not even documented in the notebook, and I guess 95% of students won't bother to read the entire atari_wrappers.py (I certainly wouldn't). I don't know what's causing this much heat here, but I think this PR is safe to merge. If there are any specific student-visible problems with this code, I would like to respectfully ask @michaelkonobeev to file issues for them, but I do not think any more effort should be spent on this particular PR.

@dniku merged commit edfdd1c into yandexdataschool:tf2.x on Aug 9, 2020
@mknbv (Collaborator) commented Aug 9, 2020

@dniku I understand that it's tempting to just merge whatever looks even a little bit reasonable and move on, filing new issues later, but inconsistent notation, confusing suggestions for hyperparameters, etc. are not going to help anybody solve this assignment. Your suggestions, for example about TFSummaries, would have been useful earlier, instead of no participation in the discussion and the previous work and then just merging its results. And there seemed to be none of what you call heat, to me.

@MichaelSolotky thanks for the PR!
