fix macos segmentation fault #3518

Merged
merged 19 commits into from
Mar 4, 2024

Conversation

CloseChoice
Collaborator

@CloseChoice CloseChoice commented Feb 23, 2024

Overview

Supports #3524

Description of the changes proposed in this pull request:

  • pin pytorch version on macos to avoid segmentation fault
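A platform-specific pin like this is commonly expressed with PEP 508 environment markers. The following is a minimal sketch of that idea, not the actual change made in this PR: the helper names `torch_requirements` and `selected_for` are hypothetical, and the pinned version 2.2.0 is taken from the last successful macOS run mentioned later in this conversation.

```python
import sys
from typing import List

# Assumption: 2.2.0 is the version reported in this thread as the last one
# that passed on macOS; adjust if the situation changes upstream.
PINNED_TORCH = "2.2.0"


def torch_requirements(pinned: str = PINNED_TORCH) -> List[str]:
    """Return PEP 508 requirement strings that pin torch only on macOS."""
    return [
        # Applies only where sys_platform evaluates to "darwin" (macOS).
        f'torch=={pinned}; sys_platform == "darwin"',
        # Every other platform keeps an unpinned torch.
        'torch; sys_platform != "darwin"',
    ]


def selected_for(platform_name: str = sys.platform) -> str:
    """Show which requirement would take effect on the given platform."""
    return f"torch=={PINNED_TORCH}" if platform_name == "darwin" else "torch"


if __name__ == "__main__":
    for requirement in torch_requirements():
        print(requirement)
    print("effective on this machine:", selected_for())
```

The two strings could be placed in a dependency list such as a test extra, so that only the macOS jobs are held back while Linux and Windows continue to test against the latest release.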

Checklist

  • All pre-commit checks pass.
  • [ ] Unit tests added (if fixing a bug or adding a new feature)

@connortann
Collaborator

Thanks for looking into this. Have you been able to make any headway in determining what has caused the macOS tests to suddenly start failing?

It looks like the failure on the latest commit has progressed to something related to lightgbm, which seems like progress. Might I suggest restricting the macOS tests to extras: "test-core", to see if at least the core shap tests pass?

@connortann connortann added the bug (Indicates an unexpected problem or unintended behaviour) and ci (Relating to Continuous Integration / GitHub Actions) labels Feb 26, 2024
@CloseChoice
Collaborator Author

> Thanks for looking into this. Have you been able to make any headway in determining what has caused the macOS tests to suddenly start failing?
>
> It looks like the failure on the latest commit has progressed to something related to lightgbm, which seems like progress. Might I suggest restricting the macOS tests to extras: "test-core", to see if at least the core shap tests pass?

My commits don't feel like progress, more like guessing. I first thought that a new Python version had caused the problem on macOS (the last successful run was executed with Python 3.11.7), and there was a historical parallel in Python 2.x. I am now reverting to macOS 11, but this shouldn't be a permanent solution. Good idea about adjusting the extras, I will try it.

@connortann
Collaborator

connortann commented Feb 26, 2024

Here's an issue for us to discuss / debug: #3524

I've proposed a temporary hotfix over on #3525 to fix the CI so that other PRs are unblocked. However, it would be good to get to the root cause of the issue. Thanks again for taking this one on, @CloseChoice!

@connortann
Collaborator

Could also try the macOS tests on 3.12?

@connortann connortann changed the title from "pin versions" to "WIP: fix macos segmentation fault" Mar 1, 2024
@CloseChoice
Collaborator Author

Just putting this here since it's a good reference.

Here are a couple of notes:
The last successful run of the macOS pipeline on master was this: https://github.com/shap/shap/actions/runs/7972563240/job/21765144274.
I debugged the macOS pipeline using https://github.com/mxschmitt/action-tmate and found that the segmentation faults happen in the pytorch tests, in the lines where the model is called on data, e.g. here. From the successful run I found that torch version 2.2.0 was used there instead of 2.2.1. I will check whether it works if I pin the version.
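To make the comparison between the passing and failing runs concrete, here is a minimal sketch of a check for the combination reported above (macOS together with torch 2.2.1). The helpers `is_affected` and `installed_torch_version` are hypothetical names and not part of shap or pytorch, and the check only encodes what this thread observed, not a confirmed list of affected versions.

```python
import platform
from importlib import metadata
from typing import Optional


def is_affected(system: str, torch_version: Optional[str]) -> bool:
    """Return True for the combination reported in this thread.

    Assumption: only macOS ("Darwin") with torch 2.2.1 is treated as
    affected, because that is the only failing combination mentioned here.
    """
    if system != "Darwin" or torch_version is None:
        return False
    # Drop a local build suffix such as "+cpu" before comparing.
    return torch_version.split("+")[0] == "2.2.1"


def installed_torch_version() -> Optional[str]:
    """Look up the installed torch version without importing torch."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_torch_version()
    print("system:", platform.system(), "torch:", version)
    print("matches the failing combination:", is_affected(platform.system(), version))
```

Reading the version through `importlib.metadata` avoids importing torch itself, so the check can run even in an environment where using torch would crash.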

@CloseChoice
Collaborator Author

I should probably file an issue in the torch repo, but will leave that for tomorrow.


codecov bot commented Mar 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.65%. Comparing base (4e70ccf) to head (7d71ebe).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #3518   +/-   ##
=======================================
  Coverage   60.65%   60.65%           
=======================================
  Files          90       90           
  Lines       12722    12722           
=======================================
  Hits         7717     7717           
  Misses       5005     5005           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@CloseChoice
Collaborator Author

Opened an issue on pytorch for this: pytorch/pytorch#121101

@connortann
Collaborator

Great job figuring this out! This looks good to me. There is "WIP" in the title - is this ready to merge?

After this is merged, I think we should probably keep #3524 open until things are working again with the latest release of pytorch.

Co-authored-by: connortann <71127464+connortann@users.noreply.github.com>
@CloseChoice CloseChoice changed the title from "WIP: fix macos segmentation fault" to "fix macos segmentation fault" Mar 4, 2024
@CloseChoice
Collaborator Author

> Great job figuring this out! This looks good to me. There is "WIP" in the title - is this ready to merge?
>
> After this is merged, I think we should probably keep #3524 open until things are working again with the latest release of pytorch.

This is ready to merge.

@connortann connortann merged commit a20f5cd into shap:master Mar 4, 2024
8 of 10 checks passed
@CloseChoice CloseChoice deleted the FIX-macos-segmentation-fault branch March 4, 2024 12:03
Labels
bug Indicates an unexpected problem or unintended behaviour ci Relating to Continuous Integration / GitHub Actions

2 participants