Skip to content

Conversation

nicolas-dufour
Copy link
Contributor

@nicolas-dufour nicolas-dufour commented Jul 11, 2022

Description

Here is a PR for the tensordictmodule tutorial

Motivation and Context

Tutorial to explain how to use TensorDictModule

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Implemented Tasks

  • Added initial notebook

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 11, 2022
@vmoens vmoens added the documentation Improvements or additions to documentation label Jul 12, 2022
Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure that the hierarchy is well defined (# => ## etc)
Let's make sure the lint is ok everywhere.
Let's use code format for things that are extracted from code: nn.Module and not nn.Module etc.
We need a section about functorch: make_functional_with_buffer, vmap.
Let's keep it all human-readable.

People in RL will be interested in Actor, ProbabilisticTensorModule, ProbabilisticActor, ActorCritic stuff.
You can pretty much copy-paste the docstring, but it's important they are in the tuto.

Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not gone through the whole nb, many comments from my previous review have not been addressed. Can you check that they are?
Thanks for this work!

Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the transformer part is slightly too long. Once we have pointed what TensorDictSequence does, do we gain anything specific in showing how the full architecture is coded? I'm afraid people will miss the last part which is more RL-oriented because the transformer is too long.

Maybe we can put the second part of the transformer (encoder and decoder) somewhere else? Or create a separate tuto (ie separate nb) for transformers?

@nicolas-dufour
Copy link
Contributor Author

I put the transformer part since it's a good way to prove that you can do any kind of model with TensorDictModule. Maybe what we can do is inverse the RL part and the transformers. That way, the people interested in RL will see the RL part before the transformers part

Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have a look at these comments

"tags": []
},
"source": [
"### `ProbabilisticTensorDictModule`"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add here:

  1. When you print the tensordict, put a few words in the print to say what it is (otherwise the print is not very readable)
  2. Add something about interaction mode: how do you switch from "random" to "mode"? What is the difference? What sampling mode exist?
  3. Add something about get_dist
  4. How do you return the parameters of the dist in the tensordict? And the log_prob? Give examples for each
  5. What distributions are available in torchrl?

This is ultra important for RL people! You must spend time on these classes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not getting a bit far from the tensordictmodule initial tutorial? Won't this need it's own tutorial on its own to dig deeper on RL modules?

"id": "dbd48bb2-b93b-4766-b7a7-19d500f17e2d",
"metadata": {},
"source": [
"### `ActorCriticOperator`"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Show the various ActorCritic and ActorValue operators, show what they do, what the differences are.
Also, ValueOperator is missing.

@vmoens
Copy link
Collaborator

vmoens commented Jul 17, 2022

We have added a tutorials/README.md, can you update it with this nb?

Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments

"id": "664adff3-1466-47c3-9a80-a0f26171addd",
"metadata": {},
"source": [
"We can see on this minimal example that the overhead introduced by `TensorDictModule` is minimal."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introduced by TensorDictModule is minimal.

to

introduced by TensorDictModule is marginal.

Also, I would remove the init time (where TensorDictModule clearly underperforms), it's irrelevant (of course it takes more time, but it's not an operation we will repeat often)

After this we're good to go

Copy link
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

@vmoens vmoens merged commit f07015d into pytorch:main Jul 22, 2022
@nicolas-dufour nicolas-dufour deleted the tensordictmodule_tutorial branch September 16, 2022 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants