segmentation fault on autoML fit #571

Open
yairVanti opened this issue Oct 3, 2022 · 13 comments

@yairVanti

yairVanti commented Oct 3, 2022

Happens on a recent version of mljar (0.11.3).

Relevant details:

Translated Report (Full Report Below)

Process: python3.8 [36912]
Path: /Users/USER/*/python
Identifier: python3.8
Version: ???
Code Type: X86-64 (Native)
Parent Process: pycharm [32492]
Responsible: pycharm [32492]
User ID: 501

Date/Time: 2022-10-03 14:27:12.1979 +0300
OS Version: macOS 12.6 (21G115)
Report Version: 12
Bridge OS Version: 6.6 (19P6067)

Crashed Thread: 10

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000010
Exception Codes: 0x0000000000000001, 0x0000000000000010
Exception Note: EXC_CORPSE_NOTIFY

VM Region Info: 0x10 is not in any region. Bytes before following region: 140737486778352
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
VM_ALLOCATE 7fffffe7f000-7fffffe80000 [ 4K] r-x/r-x SM=ALI

Thread 10 crashed with X86 Thread State (64-bit):
rax: 0x0000000140063e80 rbx: 0x00007fd330c6b240 rcx: 0x0000000000000004 rdx: 0x0000000000000000
rdi: 0x0000000000000002 rsi: 0x000070000b878b90 rbp: 0x000070000b878b90 rsp: 0x000070000b878aa0
r8: 0x0000000000000001 r9: 0x00000000000000c0 r10: 0x0000000000000001 r11: 0x0000000000000246
r12: 0x0000000000000000 r13: 0x0000000000000000 r14: 0x0000000000000002 r15: 0x000070000b878b90
rip: 0x0000000140018a6c rfl: 0x0000000000010206 cr2: 0x0000000000000010

Logical CPU: 8
Error Code: 0x00000004 (no mapping for user data read)
Trap Number: 14

Thread 10 instruction stream:
70 48 8b 74 24 78 48 8b-bc 24 80 00 00 00 48 89 pH.t$xH..$....H.
03 33 c0 4d 8b 0b 4d 8b-53 08 4d 8b 63 10 48 89 .3.M..M.S.M.c.H.
53 08 48 89 4b 10 49 89-28 49 89 70 08 49 89 78 S.H.K.I.(I.p.I.x
10 4d 89 4d 00 4d 89 55-08 4d 89 65 10 e8 62 bb .M.M.M.U.M.e..b.
f9 ff 66 90 41 55 41 56-41 57 53 55 48 83 ec 40 ..f.AUAVAWSUH..@
48 89 f5 48 8d 05 1a b4-04 00 48 63 ff 48 8b 10 H..H......Hc.H..
[4c]8b 2c fa 4c 89 ef e8-f8 2a ff ff 4d 8d b5 c0 L.,.L....*..M... <==
05 00 00 4c 89 f7 e8 f3-1d 00 00 89 c3 85 db 0f ...L............
85 12 05 00 00 48 8b 55-20 48 85 d2 0f 84 e4 04 .....H.U H......
00 00 b8 01 00 00 00 86-02 48 8b 45 28 4c 8b 38 .........H.E(L.8
48 8d 1d 0d b5 04 00 49-89 ad 98 01 00 00 83 3b H......I.......;
00 0f 84 ae 04 00 00 83-7b 14 00 74 03 0f ae f0 ........{..t....

Packages from the requirements file:

aiohttp==3.7.4.post0
albumentations==1.2.0
alembic==1.6.5
antlr4-python3-runtime==4.9.3
aporia==1.0.79
appnope==0.1.2
astroid==2.4.2
async-timeout==3.0.1
atomicwrites==1.4.0
attrs==21.2.0
backcall==0.2.0
blis==0.7.8
boto3==1.17.88
botocore==1.20.112
bson==0.5.10
catalogue==2.0.7
catboost==1.0.6
category-encoders==2.3.0
certifi==2020.12.5
chardet==4.0.0
click==7.1.2
cliff==3.8.0
cloudpickle==2.2.0
cmaes==0.8.2
cmd2==2.1.2
colorama==0.4.4
colorlog==5.0.1
colour==0.1.5
commonmark==0.9.1
confluent-kafka==1.6.1
croniter==1.3.5
cycler==0.10.0
cymem==2.0.6
dacite==1.6.0
dask==2022.9.2
dataclasses==0.6
dataclasses-json==0.5.2
deap==1.3.1
debugpy==1.6.0
decorator==4.4.2
deepchecks==0.9.0
dill==0.3.5.1
distributed==2022.9.2
dnspython==2.2.1
docker==5.0.3
dtreeviz==1.3.3
entrypoints==0.4
et-xmlfile==1.0.1
evidently==0.1.47.dev1
fastai==2.5.6
fastcore==1.4.4
fastdownload==0.0.6
fastjsonschema==2.15.3
fastprogress==1.0.2
Flask==1.1.2
fonttools==4.28.3
fsspec==2022.5.0
future==0.18.2
graphviz==0.17
greenlet==1.1.1
gunicorn==20.1.0
HeapDict==1.0.1
hydra-core==1.2.0
idna==2.10
ImageHash==4.2.1
imbalanced-learn==0.8.0
imblearn==0.0
imgaug==0.4.0
importlib-resources==5.7.1
iniconfig==1.1.1
ipykernel==6.6.0
ipython==7.23.1
ipython-genutils==0.2.0
isort==5.7.0
itsdangerous==1.1.0
jdcal==1.4.1
jedi==0.18.0
Jinja2==3.0.0
jmespath==0.10.0
joblib==1.2.0
jsonschema==4.6.0
jupyter-client==7.2.0
jupyter-core==4.10.0
kafka-python==2.0.2
kiwisolver==1.3.1
langcodes==3.3.0
lazy-object-proxy==1.4.3
lightgbm==3.3.2
llvmlite==0.39.1
locket==1.0.0
Mako==1.1.4
Markdown==3.3.4
MarkupSafe==2.0.0rc2
marshmallow==3.10.0
marshmallow-enum==1.5.1
marshmallow-oneofschema==3.0.1
matplotlib==3.3.4
matplotlib-inline==0.1.3
mccabe==0.6.1
mljar-supervised==0.11.3
msgpack==1.0.4
mttkinter==0.6.1
multidict==5.2.0
murmurhash==1.0.7
mypy-extensions==0.4.3
nbformat==5.4.0
nest-asyncio==1.5.5
nonechucks==0.4.2
numba==0.56.2
numpy==1.23.3
omegaconf==2.2.2
opencv-python==4.5.4.60
openpyxl==3.0.6
optuna==2.9.1
orjson==3.6.4
packaging==21.0
pandas==1.5.0
parso==0.8.1
partd==1.2.0
pathy==0.6.1
patsy==0.5.1
pbr==5.6.0
pendulum==2.1.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.3.2
plotly==5.5.0
pluggy==0.13.1
prefect==1.2.4
preshed==3.0.6
prettytable==2.1.0
prompt-toolkit==3.0.16
psutil==5.9.1
ptyprocess==0.7.0
py==1.10.0
pydantic==1.8.2
pyDeprecate==0.3.2
Pygments==2.8.1
pylint==2.6.2
pymongo==4.1.0
pyparsing==2.4.7
pyperclip==1.8.2
pyreadline3==3.3
pyrsistent==0.18.1
pytest==6.2.4
pytest-mock==3.7.0
python-dateutil==2.8.2
python-editor==1.0.4
pytorch-ignite==0.4.9
pyts==0.12.0
python-box==6.0.2
python-dateutil==2.8.2
python-editor==1.0.4
python-slugify==6.1.2
pyts==0.12.0
pytz==2022.2.1
pytzdata==2020.1
PyYAML==5.4.1
pyzmq==23.1.0
requests==2.25.1
rich==12.4.4
rope==0.18.0
s3transfer==0.4.2
scikit-learn==1.1.2
scikit-plot==0.3.7
scipy==1.9.1
seaborn==0.11.1
setuptools-scm==6.3.2
simplejson==3.17.2
six==1.16.0
sktime==0.7.0
slicer==0.0.7
smart-open==5.2.1
sortedcontainers==2.4.0
spacy==3.4.1
spacy-legacy==3.0.9
spacy-loggers==1.0.2
split-folders==0.4.3
SQLAlchemy==1.4.22
srsly==2.4.3
statsmodels==0.12.2
stevedore==3.3.0
stopit==1.1.2
stringcase==1.2.0
tabulate==0.8.9
tblib==1.7.0
tenacity==6.3.1
text-unidecode==1.3
thinc==8.1.2
threadpoolctl==3.1.0
tk==0.1.0
toml==0.10.2
tomli==1.2.2
toolz==0.12.0
torch==1.10.1
torchmetrics==0.8.2
torchvision==0.11.2
tornado==6.1
tqdm==4.64.1
traitlets==5.1.0
tsai==0.3.1
typer==0.4.1
typing-inspect==0.8.0
typing_extensions==4.3.0
update-checker==0.18.0
urllib3==1.26.3
wasabi==0.9.1
wcwidth==0.2.5
webencodings==0.5.1
wordcloud==1.8.1
wrapt==1.12.1
yarl==1.7.0
zipp==3.8.1
websocket-client==1.3.3
Werkzeug==1.0.1
xgboost==1.6.2
zict==2.2.0
pyod==1.0.4
suod==0.0.8
importlib_metadata==4.12.0
shap==0.39.0

@pplonski
Contributor

pplonski commented Oct 3, 2022

@yairVanti thanks for reporting. Could you please provide the code+data to reproduce the issue? What processor type do you have?

@yairVanti
Author

The processor is a 2.6 GHz 6-Core Intel Core i7.
I think it's not an issue with the data; it happens on many kinds of data.
Another thing is that it happens only if n_jobs is greater than 1.
With one thread everything works.
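
A minimal sketch of the single-threaded workaround described in this comment, assuming the `AutoML` class in `supervised.automl` accepts an `n_jobs` constructor argument (the comment names `n_jobs` but shows no code). The helper names `build_automl_kwargs` and `fit_single_threaded` are hypothetical, and `X`, `y` stand for whatever training data triggers the crash.

```python
def build_automl_kwargs(n_jobs=1, results_path=None):
    """Collect constructor arguments for AutoML (hypothetical helper).

    n_jobs=1 mirrors the report that a single thread does not crash.
    """
    kwargs = {"n_jobs": n_jobs}
    if results_path is not None:
        kwargs["results_path"] = results_path
    return kwargs


def fit_single_threaded(X, y, results_path=None):
    """Fit AutoML with one worker; assumes mljar-supervised is installed."""
    # Imported lazily so the helper above works without the package.
    from supervised.automl import AutoML

    automl = AutoML(**build_automl_kwargs(n_jobs=1, results_path=results_path))
    automl.fit(X, y)
    return automl
```

This only sidesteps the crash at the cost of training time; it does not explain why the multi-threaded run fails.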

@pplonski
Contributor

pplonski commented Oct 3, 2022

Thank you. Are you able to track the reason?

@yairVanti
Author

No. The linear regression works, and then it crashes on the simple/default algorithms step.
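
One way to narrow down where the crash originates is Python's standard `faulthandler` module, which prints the Python-level stack of every thread when the interpreter receives SIGSEGV. The sketch below is only a suggestion, not something tried in this thread; the helper name `enable_crash_tracebacks` is hypothetical.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump Python tracebacks for all threads on fatal signals such as SIGSEGV.

    `stream` must be a file object with a real file descriptor and must stay
    open for as long as faulthandler is enabled. Defaults to sys.stderr.
    """
    if not faulthandler.is_enabled():
        target = stream if stream is not None else sys.stderr
        faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Call this before importing and fitting AutoML so the handler is active.
    enable_crash_tracebacks()
```

The same effect is available without code changes by running the script with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable. The traceback shows which Python call was active in the crashing thread, which may help confirm whether a particular algorithm is involved.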

@yairVanti
Author

@pplonski - I suspect it's an issue with the macOS Monterey version.
See microsoft/LightGBM#4229
and https://www.pythonfixing.com/2021/11/fixed-python-multithreading-didn-work.html
Maybe the import order matters?
If lightgbm is imported first, will it pass?

@pplonski
Contributor

pplonski commented Oct 6, 2022

Thanks @yairVanti for the investigation. Can you confirm that the import order matters? Can you try importing lightgbm in your script before the automl import?

@yairVanti
Author

Tried it; it didn't work.

@xinlnix

xinlnix commented Oct 15, 2022

I got the same error because memory usage was limited. The process is killed by the system, but mljar cannot quit automatically. The trained model is preserved, though.

@yairVanti
Author

When I ran all algorithms except LightGBM, everything worked.
So the problem is indeed narrowed down to the LightGBM algorithm in AutoML running on a Mac (currently my macOS version is 12.6.1).
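
The experiment in this comment could be sketched as follows, assuming `AutoML` accepts an `algorithms` list. The algorithm names in `CANDIDATE_ALGORITHMS` are an assumption and should be checked against the mljar-supervised documentation for the installed version; `algorithms_without` and `fit_without_lightgbm` are hypothetical helper names.

```python
# Assumed algorithm names; verify against the mljar-supervised docs.
CANDIDATE_ALGORITHMS = [
    "Baseline",
    "Linear",
    "Decision Tree",
    "Random Forest",
    "Extra Trees",
    "LightGBM",
    "Xgboost",
    "CatBoost",
    "Neural Network",
    "Nearest Neighbors",
]


def algorithms_without(excluded, candidates=CANDIDATE_ALGORITHMS):
    """Return the candidate list minus the excluded names (case-insensitive)."""
    excluded_lower = {name.lower() for name in excluded}
    return [name for name in candidates if name.lower() not in excluded_lower]


def fit_without_lightgbm(X, y, n_jobs=-1):
    """Fit AutoML on every candidate except LightGBM, keeping multi-threading."""
    # Imported lazily so the helper above works without the package.
    from supervised.automl import AutoML

    automl = AutoML(algorithms=algorithms_without(["LightGBM"]), n_jobs=n_jobs)
    automl.fit(X, y)
    return automl
```

Keeping `n_jobs` above 1 while dropping only LightGBM separates the two factors reported in the thread: thread count and the algorithm involved.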

@pplonski
Contributor

pplonski commented Dec 6, 2022

@yairVanti great catch! What processor do you have on Mac? M1/M2?

@yairVanti
Author

2.6 GHz 6-Core Intel Core i7

@pplonski
Contributor

pplonski commented Dec 6, 2022

Are you able to manually update only the lightgbm package and check whether the problem still occurs?

@yairVanti
Author

Updated lightgbm to 3.3.0 (latest version on PyPI), and the problem persists.
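
When comparing results across package updates, it can help to print the versions that the running interpreter actually sees rather than relying on the requirements file. A minimal sketch using the standard `importlib.metadata` module (available from Python 3.8); the helper name `installed_versions` and the package list are illustrative.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    to_check = ["lightgbm", "mljar-supervised", "numpy", "scikit-learn"]
    for name, version in installed_versions(to_check).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```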
