-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Divided by zero in the Sobol's confidence level calculation #619
Comments
Thank you for the report. I am not sure about the check that you propose. This is a machine precision issue and so I would lean on updating the check to the following instead What do you think @ConnectedSystems? |
I agree with you @tupui As an aside, perhaps I don't follow the exact context (haven't had the time to run a few checks and the MWE) but I thought https://numpy.org/doc/stable/reference/generated/numpy.ptp.html Just puzzled as to how this specific case got past the check... Or is |
Thanks for follow up the issue. Then in this line of code What I'm proposing is not use the check that flatten the array, we should check So I think the What do you think? |
I would actually prefer if we worked toward using the new API we added to SciPy: https://github.com/scipy/scipy/blob/main/scipy/stats/_sensitivity_analysis.py Here we handle this issue by letting NaN propagate and then replace all NaN with 0. Simpler and vectorized operation. |
Hi Tupui, I agree with you that replace all NaN with 0 is more simpler and clean. |
Thanks for the links. Out of curiosity, did you check the result that SciPy gives? Looks like we might have to also fix that in SciPy then. We need to take into account machine precision there as we are not computing anything in arbitrary precision. |
Hi again, When i last looked into this, most, if not all, packages did not handle this issue well. I personally do not like the propagate and replace approach (meaning propagating the NaNs and replacing those with zeros). My thinking is that this issue arises when there is a no activity (no influence of Which is why I put in the use of Hope this thought process makes sense, and happy to be convinced otherwise! |
Hi Tupui I haven't had time to try scipy yet. My local environment is window 10 x64. My service on cloud is based on Debian 11 docker image. |
Hi @tupui I've tried the scipy sobol_indices. It's not easy to reproduce the issue use my own data, so I use another synthetic data with blow code: import numpy as np
from scipy.stats import sobol_indices, uniform
rng = np.random.default_rng()
def test_func(x):
f_eval = np.ones(x.shape[1])
f_eval[0] = -139.8
return f_eval
indices = sobol_indices(
func=test_func, n=128,
dists=[
uniform(loc=-np.pi, scale=2*np.pi),
uniform(loc=-np.pi, scale=2*np.pi),
uniform(loc=-np.pi, scale=2*np.pi),
uniform(loc=-np.pi, scale=2*np.pi),
uniform(loc=-np.pi, scale=2*np.pi)
],
random_state=rng
) This way, the I don't know the computation step here, but seems like the small variance value finally cause devided 0 in the So the issue will not showed in SciPy. It's just my investigation, hope you can give more insights about this. Thanks! |
Hi SALib team,
I have run the sobol analyze and get extremely large confidence interval like this:
![image](https://private-user-images.githubusercontent.com/26685122/330383779-a18c1ce5-955b-43b1-bf85-4df098f0ac23.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg3OTkzNDUsIm5iZiI6MTcxODc5OTA0NSwicGF0aCI6Ii8yNjY4NTEyMi8zMzAzODM3NzktYTE4YzFjZTUtOTU1Yi00M2IxLWJmODUtNGRmMDk4ZjBhYzIzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE5VDEyMTA0NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTk1MTJhYzI2YTQ1NmZiZjQyMTAyZDExYjcwZDNiMGY0Y2M2M2ZiZDUzYzUxYzFlNjNmNjQ5MTRiNGI2NzlkMWYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.dKo_lu9nnceD0FJF0mi0aWl3EXBFbFqlHk9xQQNGVNE)
This is the an MWE:
I found that this is due to the validation below can only cover if y is 1d array, but in confidence level calculation the y's shape is like (sample number, 100):
![image](https://private-user-images.githubusercontent.com/26685122/330385196-f7023d6d-ac1c-45f8-a0e8-4829f958ae0c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg3OTkzNDUsIm5iZiI6MTcxODc5OTA0NSwicGF0aCI6Ii8yNjY4NTEyMi8zMzAzODUxOTYtZjcwMjNkNmQtYWMxYy00NWY4LWEwZTgtNDgyOWY5NThhZTBjLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE5VDEyMTA0NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTJlNjYwMDRhZDg0YWQ1ZjA3MzZiZmQ5NDc4ODg5NGEyYjQyNjIxOWEzNzMzYTg0YTFmYzYyN2ZkODQ2NTJiYzcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.ko2FEHi_LYID6DQpnZpA-AQCmeo1Dq5uuAfCursZ20k)
One strange thing is that in some row the np.ptp(y, axis=0) equals 0, but the np.var(y, axis=0) is 9.75002034e-32 instead of 0, so the program never get exception. It takes me a lot of time to find this.
![image](https://private-user-images.githubusercontent.com/26685122/330397664-4a115ac4-9b0e-4b8d-a612-c33efb38fa49.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg3OTkzNDUsIm5iZiI6MTcxODc5OTA0NSwicGF0aCI6Ii8yNjY4NTEyMi8zMzAzOTc2NjQtNGExMTVhYzQtOWIwZS00YjhkLWE2MTItYzMzZWZiMzhmYTQ5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE5VDEyMTA0NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTg0ODJmOThjYmIxNjIzYjc2YTcwYjgyNzZjNTliMWRmMjg4Y2MxZGFkMDFiNTljZGE1MDMyOGFhM2YxODkyNWImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.PyapcdkiP5hOZ_p4q9uHlDs5wxNWjcReaNrIR1Wiq1A)
So I suggested to validate every row in y, and give 0 value for any row that y.ptp is 0:
This works perfectly for my case:
![image](https://private-user-images.githubusercontent.com/26685122/330398032-d21d9ced-6e69-4a96-be87-b0432fe5e593.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg3OTkzNDUsIm5iZiI6MTcxODc5OTA0NSwicGF0aCI6Ii8yNjY4NTEyMi8zMzAzOTgwMzItZDIxZDljZWQtNmU2OS00YTk2LWJlODctYjA0MzJmZTVlNTkzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE5VDEyMTA0NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWI5NDRiMzljNzA2YjExOWExMTM4ZWY4NGUwOTVjMjYxYjJlMzI2NDBmNGYxYmJmZWFhNDg5YWJkMzc5YjE1MTImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.Wb5ZmdxP4PMgsTqpRY01A3gONugaQo9HM8r5aHmUtDM)
![image](https://private-user-images.githubusercontent.com/26685122/330398201-451b77fd-43e2-47df-82e0-ce0776da66f0.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg3OTkzNDUsIm5iZiI6MTcxODc5OTA0NSwicGF0aCI6Ii8yNjY4NTEyMi8zMzAzOTgyMDEtNDUxYjc3ZmQtNDNlMi00N2RmLTgyZTAtY2UwNzc2ZGE2NmYwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTklMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE5VDEyMTA0NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWQzOGFkNjU0ODY3NGI3NzFmNmQ0ODUwOTQ1MDAyYzc0YmMyNzIyYTVmZmQxZmEwNmM4MGIyMzhkY2Y4YmVlOGUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.6GQPsEw7_xHxs66SD71VQv26s-1W_8hL5pru2FY68hs)
Result include second order:
If you are ok with this solution, I can submit a PR later.
Please let me know if you have any concern about this.
Thanks
The text was updated successfully, but these errors were encountered: