illegal instruction (core dumped) #43

Closed
eneserdo opened this issue May 19, 2022 · 4 comments

eneserdo commented May 19, 2022

Thanks for the great work. I was trying to run predictions on YCBInEOAT by following your guide, and I ran this command:

python scripts/run_ycbineoat.py --data_dir ycb_dir/bleach0 --port 5555 --model_name 021_bleach_cleanser

It gave this error:

/home/airlab/enes/bundle/BundleTrack/scripts/../build/bundle_track_ycbineoat /tmp/config_ycb_dir.yml
illegal instruction (core dumped)

At first I suspected the TensorFlow version, because older CPUs do not support the AVX instructions that newer TensorFlow builds use. However, the lfnet container works without errors; the crash happens in the main container, which, as far as I know, does not contain TensorFlow.

Here is my lscpu | grep Flags output to compare with yours:

flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d

What could cause this, and how can I solve it?
Thanks in advance.
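
The AVX hypothesis above can be checked directly from the flag list. The following is a minimal sketch, not part of BundleTrack: the helper names `cpu_flags` and `simd_support` and the choice of extensions in `WANTED` are assumptions made for illustration, since the thread does not say which instruction actually faulted.

```python
from pathlib import Path

# Extensions to look for. This list is an assumption chosen for illustration;
# the thread does not identify the instruction that caused the crash.
WANTED = ("avx", "avx2", "fma")


def cpu_flags(text):
    """Return the flag set from /proc/cpuinfo or `lscpu` text (first 'flags' line)."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def simd_support(text):
    """Map each extension in WANTED to True/False depending on the CPU flags."""
    flags = cpu_flags(text)
    return {name: name in flags for name in WANTED}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only
    if cpuinfo.exists():
        print(simd_support(cpuinfo.read_text()))
```

Fed the flags line posted above, every entry comes back `False`, which is consistent with a CPU that predates AVX.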

wenbowen123 (Owner) commented May 21, 2022

I'm not sure what causes this. Most people do not seem to hit this problem; they were able to run the code, and no previous issue mentions it. Here are my machine's flags for reference anyway:

fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti retpoline intel_ppin tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts
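
For readers comparing the two lists, a set difference makes the gap explicit. This is a hedged sketch: the two strings are copied from the flag lists posted in this thread, and `missing_flags` is a hypothetical helper name. It shows only which flags differ, not which one the crashing binary needs.

```python
# Flag list posted by the reporter (the machine that crashes).
REPORTER = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc
arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni
pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1
sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority
ept vpid dtherm ida arat flush_l1d
"""

# Flag list posted by the maintainer (the machine where the code runs).
MAINTAINER = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma
cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer
aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti
retpoline intel_ppin tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust
bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida
arat pln pts
"""


def missing_flags(have, reference):
    """Flags listed in `reference` but absent from `have`, sorted by name."""
    return sorted(set(reference.split()) - set(have.split()))


if __name__ == "__main__":
    print(missing_flags(REPORTER, MAINTAINER))
```

The result includes `avx`, `avx2`, and `fma`, among others, so any dependency compiled with those extensions enabled could raise an illegal instruction on the reporter's CPU.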

You could first determine whether the error occurs in my code or in a third-party library, e.g. OpenCV or PCL.
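
One way to tell the two cases apart is to run the binary under gdb and look at the frame where it stops. The sketch below only builds and prints the command. It assumes gdb is installed in the container, which the thread does not confirm, and `gdb_command` is a hypothetical helper name. The binary and config paths are the ones from the error output above.

```python
import shlex


def gdb_command(binary, *args):
    """Build a non-interactive gdb invocation that runs `binary` and, once it
    stops on the crash, prints the backtrace and the instruction at $pc."""
    return [
        "gdb", "-batch",
        "-ex", "run",       # start the program
        "-ex", "bt",        # backtrace after the SIGILL stop
        "-ex", "x/i $pc",   # disassemble the faulting instruction
        "--args", binary, *args,
    ]


if __name__ == "__main__":
    cmd = gdb_command(
        "/home/airlab/enes/bundle/BundleTrack/scripts/../build/bundle_track_ycbineoat",
        "/tmp/config_ycb_dir.yml",
    )
    # Print the command so it can be pasted into a shell inside the container.
    print(shlex.join(cmd))
```

The top frames of the backtrace indicate whether the crash is inside the project's own binary or inside a shared library such as OpenCV or PCL, and the disassembled instruction shows which extension is involved.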

ChenxMa commented Feb 29, 2024

Hi, this may have been too long ago to remember, but I ran into the same problem. Did you ever find a solution?

eneserdo (Author) commented Feb 29, 2024

@ChenxMa I do not really remember; I may not have used it afterward. But it was probably due to the CPU. Did you try it on a different machine?

ChenxMa commented Mar 4, 2024

Yes, I switched to a different machine, and the code runs fine there.
