Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sobol index considering MvNormal distribution #364

Open
goghino opened this issue Nov 17, 2021 · 3 comments
Open

Sobol index considering MvNormal distribution #364

goghino opened this issue Nov 17, 2021 · 3 comments
Labels

Comments

@goghino
Copy link

goghino commented Nov 17, 2021

Describe your problem
Hello, I am trying to compute the sensitivity of my model to parameters which are correlated. I use cp.MvNormal() to model the distribution of the parameters, using the Cholesky decorrelation algorithm to construct the orthogonal polynomials. But with this approach I get stuck at calling cp.Sens_m(uhat, joint) which requires joint to be stochastically independent.

Initial implementation
Roughly, this is the code that I am using, failing at the last step calling cp.Sens_m()

mu = [0.05, 20]
cov = [[6.4e-05, 0.005], [0.005, 2.25]]
joint = cp.MvNormal(mu, cov)
expansion =  cp.expansion.cholesky(P, joint, normed=True)
nodes = joint.sample(N, rule="M")
data = model(nodes)
uhat = cp.fit_regression(expansion, nodes, data)
cp.Sens_m(uhat, joint)

Additional context
I've also tried the approach with Rosenblatt transformation, but the 1st order Sobol indices I get seem to be wrong. The sum across the parameters is less than 1, while the total order is greater than 1 (somehow mirror image of the first order sums).

mu = [0.05, 20]
cov = [[6.4e-05, 0.005], [0.005, 2.25]]
joint = cp.MvNormal(mu, cov)
joint_U = cp.J(cp.Normal(), cp.Normal())
expansion =  cp.expansion.stieltjes(P, joint_U, normed=True)
nodes_U = joint_U.sample(N, rule="M")
nodes = joint.inv(joint_U.fwd(nodes_U))
data = model(nodes)
uhat = cp.fit_regression(expansion, nodes_U, data)
cp.Sens_m(uhat, joint_U)

Sobol_sum zone_21

Do you have any advice/recommendations what else should I try? How should I interpret the Sobol indices I've got with the Rosenblatt approach?

@jonathf
Copy link
Owner

jonathf commented Nov 17, 2021

So there are a few of things to note.

When it comes to sum smaller or larger than one, your results seems to be valid. Sum of main indices can be smaller than 1 and Sum of total indices can be larger.
See wikipedia on Sobol indices.

Second, as a general rule of thumb, Sobol indices for dependent random variables is a hard problem.
The basic formula breaks down as conditioning for dependent variables can quickly become paradoxes.

Third, Rosenblatt tranformation technique does not work with Sobol indices as you are now measuring the wrong thing. Sobol is a measure between input and output, but with the transformation you effectively replacing your input with new more managable ones.

So what are you left with then?

I think your best bet is Saltellis method (also described in the wikipedia article).
In Chaospy these are available through cp.Sens_m_sample and cp.Sens_t_sample.

@goghino
Copy link
Author

goghino commented Nov 29, 2021

Hi Jonathan,

I see your point about the sums, it completely makes sense!

Regarding your 3rd point, I have run an experiment, comparing 1st order Sobol indices obtained by Quasi-Monte Carlo using Saltelli method and PCE with the Rosenblatt transformation approach described above. Both approaches seem to produce the same result, expect at t=0. The model I've used is the Coffee cup, where the dependency between kappa and T_env parameters is introduced using cp.MvNormal(). The correlation between the variables is quite large, with Pearson coeff. 0.83. Could you please extend your argument why this 'Rosenblatt transformation technique does not work with Sobol indices'? Could you maybe point me to some more mathematical reasoning behind this?

SAerror_t_env.pdf
SAerror_kappa.pdf

@jonathf
Copy link
Owner

jonathf commented Dec 7, 2021

Sorry the late reply. My kid's kindergarten has been hit by the omicron variant and we're stuck in quarentine.

Rosenblatt-transformation is in its essence a variable switch. Instead of looking at f(q) as a function of q, you are instead looking at f(T(r)) as a function of r. This works well in the PCE context because q = T(r), and r is typically a simpler (uncorrelated) variable compare to q.

Sobol indices is by its design an interaction between input and output. For f(q) this is straight forward, as it is obvious what is input and what is output. f(T(r)) however is not so simple. Yes, you can absolutly calculate this and get an answer, but the result is with respect to r, not q which is the original problem you want to solve for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants