Failed to compile concrete-ml models on Ubuntu #485

Closed
weimingma opened this issue Feb 6, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@weimingma

Summary

Cannot compile Concrete ML models when running the example code.

Description

  • concrete-ml: 1.4.0
  • python version: 3.8
  • OS: Ubuntu 20.04 (virtual environment)
    or
  • concrete-ml: 1.4.0
  • python version: 3.10.12
  • OS: Ubuntu 22.04 (virtual environment)

In either environment above, the following minimal steps trigger the bug:

pip3 install concrete-ml

Run the code below:

import concrete.ml
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from concrete.ml.sklearn import LogisticRegression

# Build a small synthetic binary classification dataset.
x, y = make_classification(n_samples=100, class_sep=2, n_features=30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# Train an 8-bit quantized logistic regression, then compile it.
model = LogisticRegression(n_bits=8)
model.fit(X_train, y_train)
model.compile(X_train)  # the crash happens during this call

Compiling the model produces the following crash backtrace:

 #0 0x00007f13fccbb761 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x13a5761)
 #1 0x00007f13fccb9174 SignalHandler(int) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x13a3174)
 #2 0x00007f14a6ba8090 (/lib/x86_64-linux-gnu/libc.so.6+0x43090)
 #3 0x00007f1401063128 cxx::unwind::prevent_unwind::h027936808a60dbca (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x574d128)
 #4 0x00007f140105a7ba concrete_optimizer::dag::empty() (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x57447ba)
 #5 0x00007f140037f097 mlir::concretelang::optimizer::DagPass::runOnOperation() (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x4a69097)
 #6 0x00007f13fcbfe342 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x12e8342)
 #7 0x00007f13fcbfe919 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x12e8919)
 #8 0x00007f13fcbff923 mlir::detail::OpToOpPassAdaptor::runOnOperationImpl(bool) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x12e9923)
 #9 0x00007f13fcbfe036 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x12e8036)
#10 0x00007f13fcbfe919 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x12e8919)
#11 0x00007f13fcbff421 mlir::PassManager::run(mlir::Operation*) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x12e9421)
#12 0x00007f13fe8b1313 mlir::concretelang::pipeline::getFHEContextFromFHE[abi:cxx11](mlir::MLIRContext&, mlir::ModuleOp&, mlir::concretelang::optimizer::Config, std::function<bool (mlir::Pass*)>) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x2f9b313)
#13 0x00007f13fe88e58b mlir::concretelang::CompilerEngine::getConcreteOptimizerDescription(mlir::concretelang::CompilerEngine::CompilationResult&) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x2f7858b)
#14 0x00007f13fe8935bf mlir::concretelang::CompilerEngine::determineFHEParameters(mlir::concretelang::CompilerEngine::CompilationResult&) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x2f7d5bf)
#15 0x00007f13fe894653 mlir::concretelang::CompilerEngine::compile(mlir::ModuleOp, mlir::concretelang::CompilerEngine::Target, std::optional<std::shared_ptr<mlir::concretelang::CompilerEngine::Library> >) (.localalias) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x2f7e653)
#16 0x00007f13fe898c28 llvm::Expected<mlir::concretelang::CompilerEngine::Library> mlir::concretelang::compileModuleOrSource<mlir::ModuleOp>(mlir::concretelang::CompilerEngine*, mlir::ModuleOp, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, bool, bool) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x2f82c28)
#17 0x00007f13fe8991c4 mlir::concretelang::CompilerEngine::compile(mlir::ModuleOp, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, bool, bool) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x2f831c4)
#18 0x00007f13fca990fd mlir::concretelang::LibrarySupport::compile(mlir::ModuleOp&, std::shared_ptr<mlir::concretelang::CompilationContext>&, mlir::concretelang::CompilationOptions) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x11830fd)
#19 0x00007f13fcaa13f9 library_compile_module(LibrarySupport_Py, mlir::ModuleOp, mlir::concretelang::CompilationOptions, std::shared_ptr<mlir::concretelang::CompilationContext>) (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/libConcretelangBindingsPythonCAPI.so+0x118b3f9)
#20 0x00007f13f91b3d35 (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/_concretelang.cpython-38-x86_64-linux-gnu.so+0x70d35)
#21 0x00007f13f918a1db (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/_concretelang.cpython-38-x86_64-linux-gnu.so+0x471db)
#22 0x00007f13f91689c7 (/home/weimin/.local/lib/python3.8/site-packages/mlir/_mlir_libs/_concretelang.cpython-38-x86_64-linux-gnu.so+0x259c7)
#23 0x00000000005d5499 PyCFunction_Call (/usr/bin/python3.8+0x5d5499)
#24 0x00000000005d6066 _PyObject_MakeTpCall (/usr/bin/python3.8+0x5d6066)
#25 0x00000000004e22b3 (/usr/bin/python3.8+0x4e22b3)
#26 0x000000000054c8a9 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x54c8a9)
#27 0x000000000054552a _PyEval_EvalCodeWithName (/usr/bin/python3.8+0x54552a)
#28 0x00000000005d5a23 _PyFunction_Vectorcall (/usr/bin/python3.8+0x5d5a23)
#29 0x0000000000547447 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x547447)
#30 0x000000000054552a _PyEval_EvalCodeWithName (/usr/bin/python3.8+0x54552a)
#31 0x00000000005d5a23 _PyFunction_Vectorcall (/usr/bin/python3.8+0x5d5a23)
#32 0x00000000005483b6 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x5483b6)
#33 0x00000000005d5846 _PyFunction_Vectorcall (/usr/bin/python3.8+0x5d5846)
#34 0x0000000000547447 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x547447)
#35 0x000000000054552a _PyEval_EvalCodeWithName (/usr/bin/python3.8+0x54552a)
#36 0x00000000005d5a23 _PyFunction_Vectorcall (/usr/bin/python3.8+0x5d5a23)
#37 0x0000000000579c7d (/usr/bin/python3.8+0x579c7d)
#38 0x00000000005d5fcf _PyObject_MakeTpCall (/usr/bin/python3.8+0x5d5fcf)
#39 0x000000000054ca58 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x54ca58)
#40 0x000000000054552a _PyEval_EvalCodeWithName (/usr/bin/python3.8+0x54552a)
#41 0x00000000004e1bd0 (/usr/bin/python3.8+0x4e1bd0)
#42 0x00000000005483b6 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x5483b6)
#43 0x000000000054552a _PyEval_EvalCodeWithName (/usr/bin/python3.8+0x54552a)
#44 0x00000000004e1bd0 (/usr/bin/python3.8+0x4e1bd0)
#45 0x000000000054c8a9 _PyEval_EvalFrameDefault (/usr/bin/python3.8+0x54c8a9)
#46 0x000000000054552a _PyEval_EvalCodeWithName (/usr/bin/python3.8+0x54552a)
#47 0x0000000000684327 PyEval_EvalCode (/usr/bin/python3.8+0x684327)
#48 0x0000000000673a41 (/usr/bin/python3.8+0x673a41)
#49 0x0000000000673abb (/usr/bin/python3.8+0x673abb)
#50 0x0000000000488acc (/usr/bin/python3.8+0x488acc)
#51 0x000000000048947a PyRun_InteractiveLoopFlags (/usr/bin/python3.8+0x48947a)
#52 0x0000000000674a19 PyRun_AnyFileExFlags (/usr/bin/python3.8+0x674a19)
#53 0x00000000004c41f6 (/usr/bin/python3.8+0x4c41f6)
#54 0x00000000006b43fd Py_BytesMain (/usr/bin/python3.8+0x6b43fd)
#55 0x00007f14a6b89083 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24083)
#56 0x00000000005da67e _start (/usr/bin/python3.8+0x5da67e)

@weimingma weimingma added the bug Something isn't working label Feb 6, 2024
@fd0r
Collaborator

fd0r commented Feb 6, 2024

Hello @weimingma, sorry to hear you are having trouble using Concrete ML.
When you say:

Ubuntu 22.04 (virtual environment)

Do you mean a VM or a Docker container? Also, are you on an ARM system or x86?

@weimingma
Author

Hi @fd0r, I appreciate your prompt reply. It's a VM on x86. Let me know if you need more info. Thanks.

@fd0r
Collaborator

fd0r commented Feb 6, 2024

Hello again @weimingma , I discussed with my colleagues but we can't find an obvious explanation from the information we have now.

Could you please check your glibc version with ldd --version? Also, we might need the full traceback, if there is more information there, and the hardware specifications of the system you are using to run the VM (any detail about your configuration would help 🙏🏼 ).

We never tested Concrete in a VM afaik so that would allow us to replicate your issue and in the future add a test for it in our CI.
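
For anyone gathering the details requested above, here is a minimal sketch that collects them in one place using only the Python standard library. The function name `environment_report` is made up for illustration; `platform.libc_ver()` reports the libc the interpreter is linked against, which on glibc systems should normally agree with `ldd --version`, though that is an assumption worth double-checking.

```python
import platform
import sys


def environment_report() -> dict:
    """Collect basic details that are useful in a bug report."""
    # libc_ver() returns a (name, version) pair, e.g. ("glibc", "2.31");
    # both strings can be empty on platforms where detection fails.
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": sys.version.split()[0],
        "system": platform.system(),
        "machine": platform.machine(),  # e.g. "x86_64" or "aarch64"
        "libc": f"{libc_name} {libc_version}".strip(),
    }
```

Printing `environment_report()` and pasting the result into the issue answers the architecture and glibc questions in one go.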

@weimingma
Author

@fd0r The ldd version is: ldd (Ubuntu GLIBC 2.31-0ubuntu9.14) 2.31. I am trying to get the hardware info and the full traceback and will get back to you later. Thanks!

@fd0r
Collaborator

fd0r commented Feb 10, 2024

Alright, so that's probably not an issue with glibc. Could you link or send us the full traceback, please? 🙏🏼

@weimingma
Author

Hi @fd0r, I have tried several times. Unfortunately, no Python traceback is shown; only the LLVM backtrace from my initial post appears when the model is compiled. The following also appears at the end of the output:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Illegal instruction (core dumped)

Please let me know if you need any other logs.
FYI, the CPU info: 16 x AMD EPYC 7232P 8-Core Processor (1 Socket).
Thanks.
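
Because the interpreter is killed by a signal, no Python exception is ever raised, which is why there is no Python traceback to share. A rough sketch of how one could confirm which signal ended the process is to run the reproduction in a child interpreter and inspect its return code. The helper name `run_and_report` is hypothetical. The `-X faulthandler` option asks CPython to dump the Python-level stack on fatal signals such as SIGILL, but whether that dump still appears once a native library has installed its own signal handler (as the LLVM `SignalHandler` frame in the backtrace suggests) is not something this sketch guarantees.

```python
import signal
import subprocess
import sys


def run_and_report(code: str) -> str:
    """Run `code` in a child interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code -N means "killed by signal N".
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with code {proc.returncode}"
```

Passing the reproduction script as `code` would be expected to report `killed by SIGILL` on an affected machine, and anything faulthandler managed to write ends up in the child's stderr.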

@youben11
Member

@weimingma an "Illegal instruction" error could mean that the AES CPU instructions are not exposed to the VM. Could you run lscpu or cpuid on both the host and the VM and filter the output with grep -i aes? If the instructions are available on the host but not in the VM, you need to make them available through your hypervisor.
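
The same host-versus-VM comparison can be sketched in Python, assuming you have saved the text of `/proc/cpuinfo` (or the `lscpu` output) from both machines. The function names `cpu_flags` and `missing_in_guest` are made up for this example, and the parser only looks for a `flags` line (x86) or a `Features` line (ARM).

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU flags found in /proc/cpuinfo or lscpu output."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if sep and key.strip().lower() in ("flags", "features"):
            return set(value.split())
    return set()


def missing_in_guest(host_text: str, guest_text: str, wanted=("aes",)) -> list:
    """List the wanted flags that the host has but the guest lacks."""
    host = cpu_flags(host_text)
    guest = cpu_flags(guest_text)
    return sorted(flag for flag in wanted if flag in host and flag not in guest)
```

A non-empty result such as `["aes"]` points at the hypervisor configuration rather than the hardware itself.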

@weimingma
Author

@youben11 Following your suggestion, I confirmed that the compilation failure was caused by the missing aes flag on the VM's CPU. The issue was resolved after adding aes to my VM, so I am closing this issue. I really appreciate your help!
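
For readers who land here with the same crash, one possible way to fail with a readable message instead of a core dump is a small pre-flight check run before `model.compile`. This is only a sketch for Linux on x86: `check_required_flags` is a hypothetical helper, not part of Concrete ML, and the default of requiring only `aes` simply mirrors the cause found in this thread.

```python
def check_required_flags(required=("aes",), cpuinfo_path="/proc/cpuinfo"):
    """Raise RuntimeError if a required CPU flag is absent (Linux, x86)."""
    with open(cpuinfo_path, encoding="utf-8") as handle:
        flag_lines = [line for line in handle if line.lower().startswith("flags")]
    if not flag_lines:
        raise RuntimeError(f"no 'flags' line found in {cpuinfo_path}")
    # Every core normally reports the same flags, so the first line is enough.
    present = set(flag_lines[0].partition(":")[2].split())
    missing = [flag for flag in required if flag not in present]
    if missing:
        raise RuntimeError(
            "CPU flags not exposed to this machine: " + ", ".join(missing)
        )
```

Calling `check_required_flags()` at the top of the reproduction script would raise before the compiler is reached when the VM hides the flag.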
