
Upgrade atari_wrapper to tf2 #452

Merged · 11 commits · Aug 9, 2020

Conversation

@MichaelSolotky (Author)

  1. There were two functions in the TFSummary class that used the tf1 API; I replaced them with the corresponding functions from the tf2 API (both are documented at https://www.tensorflow.org/api_docs/python/tf/summary/scalar). See the sketch after this list.
  2. There was an unnecessary import in the add_summary_scalar function: TensorFlow must already be imported by the time this method is called, because the import happens in the constructor of the TFSummary class.
  3. There was no way in nature_dqn to use the NumPy summary even when the corresponding argument was set -- bug fix.
  4. Inconsistent style of putting (or not putting) spaces between an argument name and its value in function calls within a single file -- code style fix.
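
A minimal sketch of the API change in item 1, assuming a plain scalar summary; the tag name and value here are illustrative, not taken from atari_wrappers.py:

```python
import tensorflow as tf  # TF 2.x

# tf1 style (removed): tf.summary.scalar returned a summary op that had
# to be evaluated in a session and passed to a FileWriter.
# tf2 style (current): tf.summary.scalar writes eagerly to whichever
# writer is set as default, keyed by an explicit step.
writer = tf.summary.create_file_writer("logs")
with writer.as_default():
    tf.summary.scalar("episode_reward", 42.0, step=0)
writer.flush()
```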


@MichaelSolotky (Author) commented Aug 4, 2020

If everything's fine, #181 can be closed

@dniku (Collaborator) commented Aug 5, 2020

LGTM at a glance. Two questions:

  1. Did you copy actor-critic theory directly from week08_pomdp/practice_pytorch.ipynb without any modifications?
  2. Have you tested this?

@MichaelSolotky (Author)

  1. Yep, there was one markdown cell; I think it's enough to explain the actor-critic theory.
  2. No. I thought someone else among the teachers has code that uses the TFSummary class, so you could run it and see whether everything's fine. I was also waiting for @justheuristic to say whether it's easy for you to test it or whether we should write those tests.

@dniku (Collaborator) commented Aug 5, 2020

I don't have any code for testing summaries, but those who worked on checking homework assignments this spring probably do, and @justheuristic should be able to handle this. Regarding writing tests: that would be very nice, and if you feel that you can get them working with little effort, then by all means please implement them. However, if that requires some effort, we'd better direct it towards getting other assignments working with TF2.

@MichaelSolotky (Author) commented Aug 5, 2020

Ok, I'll come back to that soon. Maybe it's not that difficult.

@MichaelSolotky (Author) commented Aug 7, 2020

Ok, I've tested the updated version. It doesn't work, but it seems the previous version didn't work either. It looks like the point of using TFSummaries is to eventually look at plots in tensorboard, but there is no writing to files anywhere in this class. Also, the step variable here https://github.com/yandexdataschool/Practical_RL/blob/master/week06_policy_based/atari_wrappers.py#L287 is never incremented, but it's the value on the x axis, so it should be. I've fixed these two things in my local version and everything started to work; a sketch of both fixes is below.
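
A hypothetical sketch of those two fixes, assuming a simplified TFSummaries shape; the real attribute and method names in atari_wrappers.py may differ:

```python
import tensorflow as tf

class TFSummaries:
    def __init__(self, logdir):
        # fix 1: create an actual file writer so summaries reach disk
        self.writer = tf.summary.create_file_writer(logdir)
        self.step = 0

    def add_summary_scalar(self, name, value):
        with self.writer.as_default():
            tf.summary.scalar(name, value, step=self.step)
        # fix 2: advance the step (the x-axis value in tensorboard);
        # without this, every point lands on the same step
        self.step += 1
```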

@MichaelSolotky (Author) commented Aug 7, 2020

@justheuristic and I decided to remove TFSummaries; it looks like NumpySummaries should be enough. Plots can be built in pyplot instead of tensorboard.

@mknbv (Collaborator) commented Aug 7, 2020

I was pretty sure the previous version worked. Are you sure you enabled recording and flushed the writer after the tf.contrib.summary.scalar function is called? (Notice also that it's only called at the end of episodes.)

IMO having a way to write summaries to tensorboard is good because it simplifies things quite a bit. Note that this task suggests plotting and monitoring many things during training, and it could get quite messy using only matplotlib.
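
For reference, a sketch of what "enabling recording and flushing the writer" means in the old tf.contrib.summary API (TF 1.x, eager mode); this is the generic workflow, not code from the repo:

```python
import tensorflow as tf  # TF 1.x

writer = tf.contrib.summary.create_file_writer("logs")
writer.set_as_default()

# summaries are silently skipped unless recording is enabled
with tf.contrib.summary.always_record_summaries():
    tf.contrib.summary.scalar("episode_reward", 42.0, step=1)

# without an explicit flush, the events may never reach disk
tf.contrib.summary.flush()
```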

@MichaelSolotky (Author)

Well, I'm not insisting on removal. It's easy enough to make a working tf2 version with some additions to the previous one. First of all, in the previous version summaries were only collected and never written anywhere (a logdir wasn't even mentioned), and I think that would stop people who wanted to use tensorboard. So I think the writer should be a member of the TFSummaries class.
I've also tried to run it once again, and now I'm not sure whether the previous version was working.

@MichaelSolotky (Author)

@michaelkonobeev what do you think about this version? (I've tested it.)

@mknbv (Collaborator) commented Aug 8, 2020

My reasoning for not including the writer in TFSummaries was that the caller might want to log other things, such as components of the loss function, and in earlier versions of tensorflow/tensorboard it was not possible to have two writers writing into the same directory (and I don't think that has changed). Maybe it would be better to keep the writer out of the TFSummaries class and document how it can be used? This would also make the interface a bit simpler, since there is no need to specify a log_dir argument for TFSummaries, which is consistent with NumpySummaries.

Also, global_step currently counts only the number of episodes finished by some batched environment, which is not very intuitive. I think it is quite a bit more useful to count the total number of interactions, since we typically limit that rather than the number of finished episodes, which can vary drastically with episode length (which in turn grows over time if the agent learns successfully). So I would suggest adding self.nenvs to global_step before checking self.should_write_summaries; a sketch is below.
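
A hypothetical sketch of that accounting change; the names (nenvs, global_step, should_write_summaries, add_summaries) follow the discussion, not necessarily the exact code in atari_wrappers.py:

```python
def step(self, action):
    obs, reward, done, info = self.env.step(action)
    # each batched step advances every sub-environment once, so count
    # nenvs new interactions rather than waiting for episodes to finish
    self.global_step += self.nenvs
    if self.should_write_summaries():
        self.add_summaries()
    return obs, reward, done, info
```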

@MichaelSolotky (Author) commented Aug 8, 2020

Mm, yeah, I think it's possible someone would want to log some additional stuff. But do you think it's OK to write the code for logging that stuff outside the TFSummaries class? To me that seems counterintuitive: you want to store some additional metrics via tf.summary, yet do it outside the TFSummaries class. You'd call env.step and then, after that, call some additional function like write_loss_components, and you also shouldn't forget about the writer. If I were a student solving this task, I wouldn't want to keep in mind that for every call of env.step I should manually write summaries to the log_dir. I would just redefine the TFSummary.add_summaries method right in the notebook, add a call to write_loss_components there, and that's it.
Yeah, I think we should document how TFSummaries can be used. A small tutorial for beginners can be linked there as well: https://www.tensorflow.org/tensorboard/get_started
And yeah, the interface becomes a bit more inconsistent with NumpySummaries, which isn't great, but not that bad I think.
And yeah, counting the total number of interactions is a good point, but how exactly do you want to measure it? What do you mean by adding self.nenvs to global_step?

@MichaelSolotky (Author) commented Aug 8, 2020

Also, this small PR has taken more time than I expected, so can we have one last round of discussion and changes and then do something with it?

@mknbv (Collaborator) commented Aug 8, 2020

I think it's totally OK to have additional code for summaries outside the class. The confusion seems to stem from how summaries in tensorboard work more generally. A student could call tf.summary.scalar in the definition of the A2C losses, then, when writing the training loop, define the writer and set it as default. This seems to me like the most typical way of writing summaries in tensorflow (to which I think you've already linked), so it should not be counterintuitive to anyone familiar with it, and there is no need to call env.step manually here; that should be done through EnvRunner. So I suggest removing the log_dir argument and noting, either in the docstring of TFSummaries or in the notebook itself, that other summaries can be added in A2C or elsewhere, and that a summary writer should be created and set as default before the training loop. A sketch of that workflow is below.
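
A sketch of the workflow described above, assuming plain TF2 summaries; a2c_loss and its tags are illustrative, not the assignment's actual code:

```python
import tensorflow as tf

def a2c_loss(policy_loss, value_loss, entropy, step):
    # summaries written inside the loss computation go to whichever
    # writer is currently set as default
    tf.summary.scalar("loss/policy", policy_loss, step=step)
    tf.summary.scalar("loss/value", value_loss, step=step)
    return policy_loss + 0.5 * value_loss - 0.01 * entropy

# create the writer once, before the training loop, and set it as
# default; TFSummaries then needs no log_dir argument of its own
writer = tf.summary.create_file_writer("logs/a2c")
writer.set_as_default()

for step in range(3):  # stand-in for the real training loop
    loss = a2c_loss(0.1, 0.2, 0.01, step)
writer.flush()
```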

@MichaelSolotky (Author)

Ok, I think the code is ready now. The only remaining problem should be the dirty commit history.

@dniku (Collaborator) commented Aug 9, 2020

Um, guys. This is some code to simplify Tensorboard logging. Its existence is not even documented in the notebook, and I guess 95% of students won't bother to read the entire atari_wrappers.py (I certainly wouldn't). I don't know what's causing this much heat here, but I think this PR is safe to merge. If there are any specific student-visible problems with this code, I would like to respectfully ask @michaelkonobeev to file issues for them, but I do not think any more effort should be spent on this particular PR.

@dniku merged commit edfdd1c into yandexdataschool:tf2.x on Aug 9, 2020
@mknbv (Collaborator) commented Aug 9, 2020

@dniku I understand that it's tempting to just merge whatever looks even a little bit reasonable and move on, filing new issues later, but inconsistent notation, confusing suggestions for hyperparameters, etc. are not going to help anybody solve this assignment. Your suggestions, for example about TFSummaries, would have been useful earlier, instead of no participation in the discussion and the previous work and then just merging its results. And there seemed to be none of what you call heat, to me.

@MichaelSolotky thanks for the PR!
