
parametrization of softmax-augmented #43

Open · mjhajharia opened this issue Jul 23, 2022 · 5 comments

@mjhajharia (Owner)

@sethaxen if I remember correctly, you suggested using $p = 1/N$ for the augmented softmax. With that choice, the RMSE plots come out as near-straight lines or odd curves in some parametrizations and look fine in others; the error isn't especially high, but the shapes are off. They come out similar to the rest when I take $p = 0.5$ or so.

[RMSE plots for $p = 1/N$]

In contrast, this is what I get for $p = 0.5$. Do you have any thoughts about which value of $p$ we should go for in the actual paper?

[RMSE plot for $p = 0.5$]

@sethaxen (Collaborator)

How is RMSE computed here?

The reason behind the choice of $p=1/N$ is that it empirically decorrelates the $y_i$ values. What I didn't look at, however, is the effect it has on the position and variance of the marginals. The choice of $p$ doesn't seem to impact the marginal variance, but it shifts the mean by a lot, which is probably making adaptation hard.
[figure: softmax_aug_pcomp_n100]
Given this, I'm not surprised it's failing for large $N$.
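For context (my reading of the setup, not something established above): the softmax is shift-invariant, $\operatorname{softmax}(y + c\mathbf{1}) = \operatorname{softmax}(y)$ for all $c$, so the augmented version carries one extra degree of freedom along $\mathbf{1}$, and $p$ presumably enters through the density placed on that direction; that would explain why it moves the marginal means of the $y_i$ without changing the implied simplex.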

The choice of $p=1$ seems to always center the draws around the origin regardless of $N$. In fact, increasing $N$ leaves the marginal distribution of $y_i$ completely unchanged:
[figure: softmax_aug_p1_ncomp]

I'm trying to work out a more principled choice of $p$ using some of the ideas in #9 (comment).

@sethaxen (Collaborator)

I also plan to look into @spinkney's observation in #37, which is interesting.

@spinkney (Collaborator)

In fact, the augmented simplex and the ILR are very, very similar. If I remove the Helmert matrix, I get exactly this transform, except that it's parameterized nicely for HMC.
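For reference (the standard compositional-data definition, not code from this repo), the ILR is a fixed orthonormal change of basis applied to the log coordinates:

$$\operatorname{ilr}(x) = V^\top \log x, \qquad V \in \mathbb{R}^{N \times (N-1)}, \quad V^\top V = I, \quad V^\top \mathbf{1} = 0,$$

with $V$ typically built from the Helmert matrix, so it differs from an anchored log-ratio transform only by that fixed linear map.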

Here's the code. I'll make a PR for both and we can discuss what we want to do. Since the ILR is just a linear scaling of the input vector, I don't see how it is any different.

The Stan model below seems to work for all $N > 1$. The main thing is to fix the base (last) element to 0 and update the log-abs-determinant accordingly.

data {
  int<lower=0> N;            // number of simplex components
  vector<lower=0>[N] alpha;  // target density parameters (e.g. a Dirichlet)
}
transformed data {
  real half_logN = 0.5 * log(N);  // constant in y, so it has no effect on sampling
}
parameters {
  vector[N - 1] y;  // unconstrained; the Nth coordinate is pinned to 0
}
transformed parameters {
  // anchor the last coordinate at 0, then normalize on the log scale
  real<lower=0> logr = log_sum_exp(append_row(y, 0));
  simplex[N] x = exp(append_row(y, 0) - logr);
}
model {
  // log-abs-determinant of the anchored softmax: sum(y) - N * logr == sum(log(x))
  target += sum(y) - N * logr + half_logN;
  // target += target_density_lp(x, alpha);
}
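As a sanity check on that target increment (my own sketch, using the standard Jacobian of the softmax with one coordinate pinned to zero): since $\log x_i = y_i - \log r$ for $i < N$ and $\log x_N = -\log r$, the log-abs-determinant is

$$\sum_{i=1}^{N} \log x_i = \sum_{i=1}^{N-1} y_i - N \log r,$$

which is exactly the `sum(y) - N * logr` term; `half_logN` is constant in $y$ and only shifts the normalization.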

@mjhajharia (Owner, Author)

mjhajharia commented Jul 25, 2022 via email

@spinkney (Collaborator)

spinkney commented Jul 25, 2022

Actually, this is pretty funny: what I just did is the softmax parameterization, only with a more efficient log-abs-det calculation.

It is equivalent to the model below. Let me close that PR and make a new one that updates the softmax code.
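Spelled out (just the standard identity, nothing new):

$$\operatorname{softmax}(z) = \exp\big(z - \operatorname{log\_sum\_exp}(z)\big),$$

so with $z = \mathrm{append\_row}(y, 0)$ the two transformed-parameters blocks define the same $x$, and the two target increments agree term by term.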

data {
  int<lower=0> N;            // number of simplex components
  vector<lower=0>[N] alpha;  // target density parameters (e.g. a Dirichlet)
}
transformed data {
  real half_logN = 0.5 * log(N);  // constant in y; no effect on sampling
}
parameters {
  vector[N - 1] y;  // unconstrained; the Nth coordinate is pinned to 0
}
transformed parameters {
  simplex[N] x = softmax(append_row(y, 0));
}
model {
  // same log-abs-determinant as above, with logr written out as log_sum_exp
  target += sum(y) - N * log_sum_exp(append_row(y, 0)) + half_logN;
  // target += target_density_lp(x, alpha);
}
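If the target is a Dirichlet with parameter alpha (my assumption; `target_density_lp` is presumably the repo's shared target helper), the commented-out line would reduce to something like:

  target += dirichlet_lpdf(x | alpha);  // hypothetical: assumes a Dirichlet(alpha) target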
