
How to use custom loss? #58

Closed · duburcqa opened this issue May 28, 2020 · 8 comments

Labels: documentation, refactoring (No change to functionality)

duburcqa (Collaborator) commented May 28, 2020

I would like to add the following extra term to the loss function:

||y_pred - y_ref||_2^2

where y_pred is the action sampled from the distribution, and y_ref can be computed by the actor.
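
Concretely, what I have in mind looks like this in plain PyTorch (a toy sketch; reg_weight and the random y_ref are only for illustration):

```python
import torch
from torch.distributions import Normal

# Toy stand-ins: in practice mu comes from the policy network,
# y_ref from the actor, and the main loss from the PPO objective.
mu = torch.zeros(8, 2, requires_grad=True)
dist = Normal(mu, torch.ones(8, 2))
y_pred = dist.rsample()                       # reparameterized sample, so gradients flow
y_ref = torch.randn(8, 2)                     # reference action (random here, for the demo)
reg = ((y_pred - y_ref) ** 2).sum(-1).mean()  # ||y_pred - y_ref||_2^2
reg_weight = 0.1                              # illustrative coefficient
loss = reg_weight * reg                       # in practice: ppo_loss + reg_weight * reg
loss.backward()                               # autograd provides the analytical gradient
print(mu.grad.shape)                          # torch.Size([8, 2])
```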

What is the best way to do this using your framework? The point is to be able to take advantage of the analytical gradient computation.

The only way I can think of is to override the whole learn method of the policy (i.e. the PPO algorithm), but that feels inconvenient just to add an extra line of code...

Thank you in advance,

Best,
Alexis

Trinkle23897 (Collaborator) commented May 29, 2020

That's a good question. Currently, you can either inherit from the policy class (as you mentioned) or change the framework's code directly to meet your needs.

This can be discussed further. Some existing frameworks (like RLlib) modularize the loss function. But in my opinion, this could be inconvenient for further development: since the loss function is highly customizable, abstracting it would double the code complexity.
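
For example, the inheritance route would look roughly like this (a schematic sketch only; the exact learn signature and loss computation depend on the Tianshou version you use):

```python
from tianshou.policy import PPOPolicy

class RegularizedPPOPolicy(PPOPolicy):
    """Schematic sketch of the inheritance approach."""

    def learn(self, batch, *args, **kwargs):
        # Copy the body of PPOPolicy.learn from your installed version here,
        # then add the extra term before calling backward(), e.g.:
        #   loss = ppo_loss + reg_weight * ((y_pred - y_ref) ** 2).sum(-1).mean()
        raise NotImplementedError
```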

Trinkle23897 added the question label May 29, 2020
duburcqa (Collaborator, Author) commented May 29, 2020

OK, I get your point and I agree with you. But what about adding a loss_fn to the abstract base policy class that does nothing by default but can be overridden by the user? I really don't like the idea of having to override learn itself, since that is a major source of errors.

In this case it is not a custom loss strictly speaking, but rather an additional component of the original loss function (a regularization term) that may depend on the actor. So it only amounts to an extra function call before calling backward. I don't know whether doing so is common or not.
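
Roughly like this (a minimal sketch with made-up names; extra_loss and compute_ref are hypothetical):

```python
import torch

class BasePolicy:
    # Proposed hook: an extra loss term that is zero by default.
    # Inside learn(), just before backward(), the framework would do:
    #     loss = loss + self.extra_loss(batch, dist)
    def extra_loss(self, batch, dist) -> torch.Tensor:
        return torch.tensor(0.0)

class MyPPOPolicy(BasePolicy):
    # User override: the regularization term from my first message.
    def extra_loss(self, batch, dist) -> torch.Tensor:
        y_pred = dist.rsample()          # action sampled from the distribution
        y_ref = self.compute_ref(batch)  # hypothetical: reference computed by the actor
        return ((y_pred - y_ref) ** 2).sum(-1).mean()
```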

duburcqa (Collaborator, Author) commented

@Trinkle23897 Up!

Trinkle23897 (Collaborator) commented

> @Trinkle23897 Up!

I have no time after #106 before this Friday... many things to do.

duburcqa (Collaborator, Author) commented

No problem! I can do it! But what do you think about the idea?

Trinkle23897 (Collaborator) commented

I think adding a loss_fn is okay, but what would its input be?

oldcricket commented

@duburcqa It's a great idea to make customized losses easier. I was wondering if you have made any progress on that. Thanks!

MischaPanch (Collaborator) commented

The loss is an integral part of the algorithm, so inheriting and overriding may be better than allowing users to pass custom losses. It's a central design question. I don't see it as necessary for the 1.0.0 release, but I would keep this issue open.

MischaPanch added the documentation label and removed the question label Oct 14, 2023