[Bug]: segmentation fault MC3 package after running light curves #628

Closed
2 tasks done
evamariaa opened this issue Feb 27, 2024 · 20 comments
Assignees
Labels
bug Something isn't working LC Fit

Comments

@evamariaa
Collaborator

evamariaa commented Feb 27, 2024

FAQ check

  • Yes, I checked the FAQ and my question has not been addressed.

Instrument

Light curve fitting (Stages 4-6)

What happened?

I get a segmentation fault after running light curve fits (with both dynesty and emcee, though not for every light curve), and I traced it to the MC3 package.
The crash comes from the function imported with from mc3.stats import time_avg. The previous Allan plot looks fine, so I'm not sure why MC3 crashes. Is this a Eureka! problem or an MC3 problem? Either way, I'm flagging it here.
I created a new conda environment with a fresh installation, and the segmentation fault still occurs.
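
For anyone trying to narrow down a crash like this, one option is to run the suspect statement in a separate interpreter, so that a segmentation fault in a compiled extension shows up as a return code instead of taking down the whole session. The following is only a minimal sketch using the Python standard library: the helper name probe is hypothetical and not part of Eureka! or MC3, and the signal-based classification assumes a POSIX system such as Linux or macOS.

```python
import signal
import subprocess
import sys


def probe(statement, timeout=120):
    """Run `statement` in a fresh interpreter and classify how it ended.

    Hypothetical helper for illustration; not part of Eureka! or MC3.
    """
    proc = subprocess.run(
        # -X faulthandler makes the child dump a Python traceback on a crash.
        [sys.executable, "-X", "faulthandler", "-c", statement],
        capture_output=True, text=True, timeout=timeout,
    )
    rc = proc.returncode
    if rc == 0:
        status = "ok"
    elif rc < 0:
        # On POSIX, a negative return code means the child died from a signal.
        try:
            status = "killed by " + signal.Signals(-rc).name
        except ValueError:
            status = f"killed by signal {-rc}"
    else:
        status = f"exited with code {rc}"
    return status, proc.stderr


if __name__ == "__main__":
    # The import alone; a real test would also call the function on the
    # residuals from the failing light curve.
    status, stderr = probe("from mc3.stats import time_avg")
    print(status)
    print(stderr)
```

A status of "killed by SIGSEGV" would point at the compiled extension, whereas "exited with code 1" would indicate an ordinary Python exception such as an ImportError.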

Error traceback output

When running with gdb:

EMCEE RESULTS:
rp: 0.09314383900504214 (+0.00023746832716464716, -0.00024390231091873937)
t0: 60336.16629344476 (+0.0001366072174278088, -0.00014108874165685847)
inc: 86.14041037419365 (+0.03564806637537288, -0.03959720206140105)
a: 7.5178380816512576 (+0.012216578350765062, -0.012060384627009313)
u2: 0.0596876616903001 (+0.01720078126260373, -0.016858445839115672)
c0: 1.0037328901325697 (+3.041406496073762e-05, -2.8784564057193762e-05)
c1: -0.00018125230315689576 (+0.00021138404946226995, -0.00022132731635158144)
scatter_mult: 4.604809762772209 (+0.1430173800715835, -0.12412840317554785); 489.802873463214 (+15.212425121351847, -13.203248708624503) ppm

Program received signal SIGSEGV, Segmentation fault.
0x00007ffe987d7c1e in invgamma ()
from /home/ahrer/.conda/envs/eureka-dev/lib/python3.9/site-packages/mc3/lib/_time_averaging.cpython-39-x86_64-linux-gnu.so

What operating system are you using?

Linux

What version of Python are you running?

Python 3.9.7

What Python packages do you have installed?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@evamariaa evamariaa added the bug Something isn't working label Feb 27, 2024
@taylorbell57
Collaborator

@evamariaa, can you see if Kevin's tweak in PR #609 for the file src/eureka/S5_lightcurve_fitting/plots_s5.py fixes this issue for you?

@taylorbell57 taylorbell57 added this to To do in Stage 5: Light Curve Fitting via automation Feb 27, 2024
@taylorbell57 taylorbell57 added this to To do in Road to v1.0 via automation Feb 27, 2024
@evamariaa
Collaborator Author

OK, I just installed from PR #609 and am running through Stage 5 now. So far I haven't hit a segmentation fault, so I'm cautiously optimistic and will let you know if it stays that way. Thanks!

@rluquer

rluquer commented Mar 4, 2024

Same problem, but using MacOS.

@taylorbell57
Collaborator

Can one or both of you please try the code in PR #609 and confirm whether the problem is resolved for you with that version of the code? We'll hopefully have this bug patched ASAP.

@evamariaa
Collaborator Author

Nothing has changed since my previous comment: it seems to run fine with the PR #609 you mentioned, with no segmentation faults so far. However, I haven't run Stage 5 much since then because of maintenance on our server. I should be able to run more by the end of the week and will let you know!

@kevin218
Owner

kevin218 commented Mar 5, 2024

I just merged #609 into the Main branch. Try updating your version of Eureka! and let me know if you still have problems.

@taylorbell57 taylorbell57 moved this from To do to In progress in Road to v1.0 Mar 5, 2024
@evamariaa
Collaborator Author

I reinstalled Eureka! from the main branch and now get an error when importing Eureka!:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/eureka/__init__.py", line 28, in <module>
    from . import S5_lightcurve_fitting
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/eureka/S5_lightcurve_fitting/__init__.py", line 12, in <module>
    from . import fitters
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/eureka/S5_lightcurve_fitting/fitters.py", line 21, in <module>
    from . import plots_s5 as plots
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/eureka/S5_lightcurve_fitting/plots_s5.py", line 5, in <module>
    from mc3.stats import time_avg
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/mc3/__init__.py", line 4, in <module>
    from .sampler_driver import *
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/mc3/sampler_driver.py", line 20, in <module>
    from .fit_driver import fit
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/mc3/fit_driver.py", line 11, in <module>
    from . import stats as ms
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/mc3/stats/__init__.py", line 5, in <module>
    from .stats import *
  File "/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/mc3/stats/stats.py", line 31, in <module>
    import _binarray as ba
ImportError: /usr/local/autofs.home/home/ahrer/.conda/envs/eureka/lib/python3.9/site-packages/mc3/lib/_binarray.cpython-39-x86_64-linux-gnu.so: undefined symbol: __pow_finite

The problem is that our cluster was under maintenance for a couple of days, with lots of updates, and there are still some bugs, so I'm not sure whether this is even a Eureka! issue. I will keep you posted.

@taylorbell57
Collaborator

Did your installation of mc3 succeed? It looks to me like you might have been missing a C compiler or something similar, which would mean mc3 didn't finish installing.

@taylorbell57
Collaborator

@evamariaa, can you confirm whether this issue is resolved now?

@evamariaa
Collaborator Author

I wish, but no, I can't confirm either way. Our cluster still has issues, particularly with the file and storage system, and is completely unusable right now. An external company has been brought in this week to fix it, so hopefully I can give you an update sometime next week.

@kevin218
Owner

kevin218 commented Apr 3, 2024

I just updated to the latest version of Eureka! on my Linux machine and I'm getting a similar error.

ImportError: /opt/anaconda3/envs/eureka/lib/python3.9/site-packages/mc3/lib/_binarray.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZGVbN2v_exp

I can confirm that I have a valid C compiler, so maybe this is a Linux issue.

@taylorbell57
Collaborator

Interesting. I'm running Ubuntu on Windows Subsystem for Linux, which should be similar enough that I would have hit this on my end if it were a Linux-specific issue. I can try reproducing this with a Docker build, though, to reduce the number of unknowns.

@jbrande
Collaborator

jbrande commented Apr 5, 2024

@taylorbell57 I'm also getting this now, on Ubuntu/WSL, identically to Kevin's error message.

ImportError: /home/jbrande/miniconda3/envs/eureka/lib/python3.9/site-packages/mc3/lib/_binarray.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZGVbN2v_exp

@kevin218
Owner

kevin218 commented Apr 6, 2024

I pushed a workaround to Main, so updating Eureka! should "fix" the problem until we find a more permanent solution.

@taylorbell57
Collaborator

@kevin218 and @jbrande, did you get those ImportError messages when importing Eureka! or only when it came to making the plot? I just need to know where it happens so I can tell whether I've reproduced the issue.

@kevin218
Owner

kevin218 commented May 2, 2024

The error is on import. If it fails, you'll get a warning message that states MC3 failed to import.
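
As a rough illustration of the kind of guarded import described here (a sketch only, not the actual code pushed to Main), an optional dependency can be wrapped so that a failed import produces a warning instead of stopping the whole package from loading. The helper name optional_import and the warning text are hypothetical.

```python
import importlib
import warnings


def optional_import(module_name, attr):
    """Return module_name.attr, or None (with a warning) if the import fails.

    Hypothetical helper for illustration; not Eureka!'s actual workaround.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError) as err:
        # An "undefined symbol" failure in a compiled extension is raised
        # as an ImportError, so it is caught here as well.
        warnings.warn(f"{module_name} failed to import ({err}); "
                      "steps that need it will be skipped.")
        return None


if __name__ == "__main__":
    time_avg = optional_import("mc3.stats", "time_avg")
    if time_avg is None:
        print("MC3 unavailable; skipping the plot that relies on it.")
```

A guard like this only helps with the ImportError case. A segmentation fault inside the compiled code cannot be caught with try/except.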

@kevin218
Owner

kevin218 commented May 8, 2024

I just created a new conda environment and performed a fresh install using Python 3.10 and had no issues importing MC3. Perhaps MC3 has compatibility issues with Python 3.9.

@taylorbell57
Collaborator

I know that jwst>=1.14.0 requires Python>=3.10, but I just made a test environment with Python=3.9.7 in which I manually installed only mc3 and ipython, and I was able to install and import mc3 just fine on Windows Subsystem for Linux. Trying to install the main branch of Eureka! with Python=3.9.7 gives the following error (as it should): ERROR: Package 'eureka' requires a different Python: 3.9.7 not in '>=3.10'

@jbrande and @evamariaa, could you please check whether this is still an issue for you when starting from a fresh install of the main branch (using pip install 'eureka[jwst]@git+https://github.com/kevin218/Eureka.git@main') in a fresh conda environment with Python=3.10? As a result of Kevin's recent emergency patch, I no longer seem to get any corner plots, even from environments where mc3 used to work fine. So instead of testing the issue with a Stage 5 run, for now can you just try running from mc3.stats import time_avg in an ipython or jupyter session to see whether you still get the ImportError?
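
When reporting back on a test like this, a short environment summary may make results easier to compare across machines. The sketch below uses only the standard library; the helper name env_report is hypothetical, and the distribution names "mc3", "eureka", and "numpy" are assumptions about what is installed. The libc version is included on the assumption that it could be relevant to the undefined-symbol errors above; it comes back empty on non-glibc platforms.

```python
import platform
import sys
from importlib import metadata


def env_report(packages=("mc3", "eureka", "numpy")):
    """Collect interpreter, libc, and package versions into a dict.

    Hypothetical helper for illustration; package names are assumptions.
    """
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
        # e.g. "glibc 2.31" on Linux; empty string where it cannot be determined.
        "libc": " ".join(platform.libc_ver()).strip(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    for key, value in env_report().items():
        print(f"{key}: {value}")
```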

@evamariaa
Collaborator Author

Sorry for the delay on this; it got a little lost in my to-do list.
I did what you suggested, Taylor: a fresh install of Eureka! in a new conda environment with Python 3.10. Running from mc3.stats import time_avg completed without an error!

@taylorbell57
Collaborator

Alright, since two of you who previously had this issue no longer do after creating a new conda environment, I'm going to close this issue, as it no longer seems to be reproducible. If anyone re-encounters it, feel free to leave a comment and we can re-open the issue.

@taylorbell57 taylorbell57 closed this as not planned Won't fix, can't repro, duplicate, stale May 24, 2024
Stage 5: Light Curve Fitting automation moved this from To do to Done May 24, 2024
Road to v1.0 automation moved this from In progress to Done May 24, 2024
6 participants