parametrization of softmax-augmented #43
How is RMSE computed here? What is the reason behind the choice of p? I'm trying to work out a more principled choice of p.
In fact, the augmented simplex and the ILR are very, very similar. If I remove the Helmert matrix thing I get out this transform, except that it's parameterized nicely for HMC. Here's the code. I'll make a PR for both and we can discuss what we want to do. Since the ILR is just a linear scaling of the input vector, I don't see how it is any different. The Stan model below seems to work for all N > 1. The main thing is to set the base to 0 and update the log-abs-determinant accordingly.
data {
  int<lower=0> N;
  vector<lower=0>[N] alpha;
}
transformed data {
  real half_logN = 0.5 * log(N);
}
parameters {
  vector[N - 1] y;
}
transformed parameters {
  real<lower=0> logr = log_sum_exp(append_row(y, 0));
  simplex[N] x = exp(append_row(y, 0) - logr);
}
model {
  target += sum(y) - N * logr + half_logN;
  // target += target_density_lp(x, alpha);
}
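For what it's worth, here is a sketch of where that log-abs-determinant term comes from; this derivation is mine, not from the thread. Writing z = (y, 0), r = sum_i exp(z_i), and x_i = exp(z_i) / r, the Jacobian of (x_1, ..., x_{N-1}) with respect to y is diag(x_{1:N-1}) - x_{1:N-1} x_{1:N-1}^T, and the matrix determinant lemma gives

% Log-abs-det of the zero-base softmax map y -> x (derivation sketch)
\det J = \Big(\prod_{i=1}^{N-1} x_i\Big)\Big(1 - \sum_{i=1}^{N-1} x_i\Big) = \prod_{i=1}^{N} x_i,
\qquad
\log\lvert\det J\rvert = \sum_{i=1}^{N} (z_i - \log r) = \sum_{i=1}^{N-1} y_i - N \log r,

which is exactly the sum(y) - N * logr term in the model block. The half_logN term is constant in y, so it only shifts the normalization; my reading (an inference, not stated in the thread) is that it is the log sqrt(N) factor relating these coordinates to the orthonormal Helmert/ILR basis.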
Thanks! Makes sense. We could actually write it as "augmented ILR" or something like that, going by the convention of ALR, ILR, CLR, etc.
Actually, this is pretty funny. What I just did is the softmax parameterization, just with a more efficient log-abs-det calculation. It is equivalent to this. Let me close that PR and make a new one that updates the softmax code.
data {
  int<lower=0> N;
  vector<lower=0>[N] alpha;
}
transformed data {
  real half_logN = 0.5 * log(N);
}
parameters {
  vector[N - 1] y;
}
transformed parameters {
  simplex[N] x = softmax(append_row(y, 0));
}
model {
  target += sum(y) - N * log_sum_exp(append_row(y, 0)) + half_logN;
  // target += target_density_lp(x, alpha);
}
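Since target_density_lp is left as a placeholder in the thread, here is a minimal usage sketch with a concrete Dirichlet target swapped in (dirichlet_lpdf is my stand-in, not the thread's helper); sampling this should recover x ~ Dirichlet(alpha):
data {
  int<lower=0> N;
  vector<lower=0>[N] alpha;
}
transformed data {
  real half_logN = 0.5 * log(N);
}
parameters {
  vector[N - 1] y;
}
transformed parameters {
  simplex[N] x = softmax(append_row(y, 0));
}
model {
  // change-of-variables correction for the zero-base softmax
  target += sum(y) - N * log_sum_exp(append_row(y, 0)) + half_logN;
  // concrete example target; the thread leaves this as target_density_lp(x, alpha)
  target += dirichlet_lpdf(x | alpha);
}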
@sethaxen if I remember correctly, you suggested using p = 1/N for the augmented softmax. The RMSE plots for that version are near-straight lines or odd curves in some parametrizations and look fine in others; the error isn't especially high, but the shapes are off. They come out similar to the rest when I take p = 0.5 or so.
In contrast, this is what I get for p = 0.5. Do you have any thoughts about which values of p we should go for in the actual paper?