
Alpha value less than one? #20

Closed

kayuksel opened this issue Jan 19, 2021 · 8 comments

Comments

@kayuksel

Can the alpha value be less than one?

I basically need it to produce sum-normalized sigmoids in that case (as opposed to softmax, which is the alpha = 1.0 case).

@bpopeters
Collaborator

bpopeters commented Jan 20, 2021

It might be possible mathematically, but as far as I know our bisection algorithm only works for alpha > 1.

@bpopeters
Collaborator

Could you explain what you mean about sum-normalized sigmoids? I'm not sure what the connection is to entmax.

@kayuksel
Author

By sum-normalized sigmoids, I mean just taking the sigmoids of the logits and dividing them by their sum. I believe it is referred to as sum normalization here (On Controllable Sparse Alternatives to Softmax):
https://papers.nips.cc/paper/2018/file/6a4d5952d4c018a1c1af9fa590a10dda-Paper.pdf
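For concreteness, this is the transformation I have in mind (a minimal PyTorch sketch, purely illustrative):

```python
import torch

logits = torch.randn(5)

# Sum-normalized sigmoids: element-wise sigmoid, then divide by the sum.
sig = torch.sigmoid(logits)
sum_normalized = sig / sig.sum()

# Softmax (the alpha = 1.0 case of entmax), for comparison.
soft = torch.softmax(logits, dim=-1)
```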

I basically want to learn the sparsity and temperature parameters of sparsemax at the same time. I thought entmax would be equivalent to that, if it were possible to use alpha < 1.0. I tried it but got NaNs.

@andre-martins
Contributor

Hi @kayuksel, it is possible to use alpha instead of a temperature parameter to control the propensity of entmax for sparsity, and gradients with respect to alpha are supported, so alpha can be learned (this was done here: https://arxiv.org/pdf/1909.00015.pdf). However, I believe the current code only supports alpha >= 1. It should not be very hard to extend bisection to alpha < 1, but the result will no longer be a sparse transformation. Is alpha < 1 crucial in your problem? I didn't quite get the connection with sum-normalized sigmoids; is the idea to consider sum-normalized "entmoids" with alpha < 1?
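As a rough sketch of what learning alpha looks like with the current code (assuming the functional entmax_bisect interface and that it accepts a tensor-valued alpha broadcastable over rows, as used in the paper above; exact details may differ):

```python
import torch
from entmax import entmax_bisect

logits = torch.randn(2, 10, requires_grad=True)

# One learnable alpha per row, kept above 1 since that is all the
# current bisection supports.
alpha = torch.nn.Parameter(torch.full((2, 1), 1.5))

probs = entmax_bisect(logits, alpha=alpha, dim=-1)

# Any downstream loss backpropagates into both the logits and alpha.
loss = (probs ** 2).sum()
loss.backward()
print(alpha.grad)
```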

@kayuksel
Author

@andre-martins Since I can use entmax to learn the alpha parameter, I would like to use it to learn both the optimal sparsity and the temperature at the same time with a single alpha parameter. Yes, it would make a great addition to what I am working on (financial portfolio optimization). I would be more than glad if you could extend it to alpha < 1.0 at your convenience.

@andre-martins
Contributor

Hi @kayuksel, I made a pull request (#22) that I think solves this problem: it should work with alpha < 1.0, and it is passing the tests. It would be great if you could try it and let us know if it works.

@bpopeters
Collaborator

It's merged on master now.
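A quick way to sanity-check it (a minimal sketch using the functional interface; argument names may differ slightly):

```python
import torch
from entmax import entmax_bisect

x = torch.randn(3, 8)
p = entmax_bisect(x, alpha=0.5, dim=-1)  # alpha < 1 should now be accepted

print(p.sum(dim=-1))  # each row sums to 1
print((p > 0).all())  # for alpha < 1 the output is dense, not sparse
```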

@kayuksel
Author

Hello @andre-martins and @bpopeters, sorry for the late response; things have been extremely busy on my end. I will surely let you know how it goes.
