volk_8u_conv_k7_r2puppet_8u fails in Ubuntu 16.04 #117

Closed
dmiralles2009 opened this issue Aug 5, 2017 · 18 comments

@dmiralles2009
Contributor

I cloned the volk repo and performed the installation steps as described in the README file, with the caveat that I did not run the last step, $ sudo make install, since I want a local copy only. I am, however, running the volk_profile executable from within the build directory as ./apps/volk_profile. While checking the output I noticed the following:

RUN_VOLK_TESTS:volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 191.857ms
generic completed in 2393.31ms
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 1 in1: 1 in2: 0 tolerance was: 0
offset 2 in1: 1 in2: 0 tolerance was: 0
offset 3 in1: 1 in2: 0 tolerance was: 0
offset 4 in1: 1 in2: 0 tolerance was: 0
offset 5 in1: 1 in2: 0 tolerance was: 0
offset 6 in1: 1 in2: 0 tolerance was: 0
offset 7 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 9 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
Best aligned arch: generic
Best unaligned arch: generic

Notice that the proto-kernel fails and that some messages about the tolerance are printed, which I don't fully understand. For additional information, here is the architecture of the machine in question:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz
Stepping: 9
CPU MHz: 1217.304
CPU max MHz: 3200.0000
CPU min MHz: 1200.0000
BogoMIPS: 4389.84
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-7

Thanks in advance for the help
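
As a rough reading of the `offset N in1: A in2: B tolerance was: T` lines: they look like the output of an element-by-element comparison between the result buffers of two implementations, where `in1` and `in2` are the two values at a given index and a tolerance of 0 demands an exact match. The Python sketch below illustrates that idea only. `compare_outputs` and its `max_reports` cap are hypothetical names, which buffer is `in1` and which is `in2` is an assumption, and none of this is VOLK's actual QA code.

```python
def compare_outputs(in1, in2, tolerance=0, max_reports=10):
    """Compare two result buffers element by element.

    Returns (passed, report_lines). A tolerance of 0 requires exact
    equality. Only the first `max_reports` mismatches are reported
    (an assumed cap), but every mismatch counts toward failure.
    """
    report_lines = []
    mismatches = 0
    # zip() stops at the shorter buffer; equal lengths are assumed here.
    for offset, (a, b) in enumerate(zip(in1, in2)):
        if abs(a - b) > tolerance:
            mismatches += 1
            if len(report_lines) < max_reports:
                report_lines.append(
                    f"offset {offset} in1: {a} in2: {b} tolerance was: {tolerance}"
                )
    return mismatches == 0, report_lines


# Hypothetical decoded bits from two implementations that disagree.
first = [1, 1, 1, 0, 1]
second = [0, 0, 0, 0, 1]
passed, lines = compare_outputs(first, second)
print("pass" if passed else "fail")
for line in lines:
    print(line)
```

Under this reading, a run of `in1: 1 in2: 0` lines starting at offset 0 would mean the two implementations already disagree on the first output values, not just on an occasional one.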

@n-west
Member

n-west commented Aug 14, 2017

I've heard reports of this, but it seems to work for me. I think something about some of the randomly generated inputs must sometimes trigger a failure. I'm not entirely sure what to do about it. If you happen to have another machine, does it also fail there?

@dmiralles2009
Contributor Author

Hey @n-west, that is very interesting. For me it always fails, and I have tried this on 3 machines with similar architecture and OS.

@n-west
Member

n-west commented Sep 20, 2017

This is a tricky one. I just set up CI, since it seems like a good thing to do, and it verifies that this passes on at least that machine (https://travis-ci.org/n-west/volk). I think this code was machine-generated in the first place, and debugging it is well outside my wheelhouse and interests.

@mfogle

mfogle commented Nov 16, 2017

I see that this conversation happened on Aug 14, 2017, but thought I'd comment. I am running Ubuntu 14.04 LTS and see the exact same problem. The volk_8u_conv_k7_r2puppet_8u kernel test fails, which means that instead of the speed-optimized SPIRAL Viterbi implementation built on intrinsics (http://www.spiral.net/index.html), you get the generic implementation, which runs about 10x slower.

@n-west
Member

n-west commented Nov 16, 2017

I'm aware that this seems to fail for some people, but it passes on all of my machines, on independent CI machines, and on many other people's machines. I haven't received enough information from people reporting failures to identify any pattern in what is causing the failure. If you depend on this kernel, perhaps you can do some investigating: can you try a more modern Ubuntu (I'm certain this passes on my hardware with Ubuntu 16.04)? Do you have another machine that it passes on?

@dmiralles2009, are you also on Ubuntu 14.04?

I simply don't have the resources to debug this when I don't even observe the failure. Someone who depends on this code and can reliably observe the failure needs to own it.

@mfogle

mfogle commented Nov 16, 2017 via email

@n-west
Member

n-west commented Nov 16, 2017

volk_config should go into your $HOME/.volk by default.
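
To check what volk_profile recorded for this kernel, the file can be read directly. The sketch below assumes each non-comment line of volk_config has the form `<kernel> <aligned arch> <unaligned arch>` and that the file sits at `$HOME/.volk/volk_config`. `parse_volk_config` is a hypothetical helper, not part of VOLK.

```python
from pathlib import Path


def parse_volk_config(text):
    """Map kernel name -> (aligned arch, unaligned arch).

    Assumes whitespace-separated lines of three fields. Lines starting
    with '#' and lines with a different field count are skipped.
    """
    prefs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 3:
            kernel, aligned, unaligned = fields
            prefs[kernel] = (aligned, unaligned)
    return prefs


config_path = Path.home() / ".volk" / "volk_config"
if config_path.exists():
    prefs = parse_volk_config(config_path.read_text())
    print(prefs.get("volk_8u_conv_k7_r2puppet_8u", "kernel not listed"))
else:
    print(f"{config_path} not found; run volk_profile first")
```

If the kernel shows up with `generic` in both slots, the profiler did not select the spiral implementation on that machine.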

@mfogle

mfogle commented Nov 16, 2017 via email

@n-west
Member

n-west commented Nov 16, 2017

Attached images don't come through on GitHub issues.

However, here is the kernel passing QA with different compilers:
https://travis-ci.org/n-west/volk/builds/278233344

@mfogle

mfogle commented Nov 16, 2017 via email

@noc0lour
Member

Can confirm failing kernel.

RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 104.652ms
generic completed in 1808.89ms
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 1 in1: 1 in2: 0 tolerance was: 0
offset 2 in1: 1 in2: 0 tolerance was: 0
offset 3 in1: 1 in2: 0 tolerance was: 0
offset 4 in1: 1 in2: 0 tolerance was: 0
offset 6 in1: 1 in2: 0 tolerance was: 0
offset 7 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 10 in1: 1 in2: 0 tolerance was: 0
offset 12 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
Best aligned arch: generic
Best unaligned arch: generic
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 23
model		: 1
model name	: AMD Ryzen Threadripper 1900X 8-Core Processor
stepping	: 1
microcode	: 0x8001129
cpu MHz		: 3792.403
cache size	: 512 KB
physical id	: 0
siblings	: 16
core id		: 0
cpu cores	: 8
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload overflow_recov succor smca
bugs		: fxsave_leak sysret_ss_attrs null_seg
bogomips	: 7584.80
TLB size	: 2560 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]

Building with GCC-7 && ccache enabled.
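
When comparing reports like these across machines, it can help to reduce the volk_profile output to the failing architectures and the timings. The following is a small sketch that parses only the line shapes visible in this thread (`<arch> completed in <t>ms` and `<kernel>: fail on arch <arch>`). `summarize_profile_output` is a hypothetical name, and other output formats are not handled.

```python
import re

# Matches e.g. "spiral completed in 104.652ms" and "avx2 completed in 1604.29 ms".
TIMING_RE = re.compile(r"^(\w+) completed in ([\d.]+)\s*ms$")
# Matches e.g. "volk_8u_conv_k7_r2puppet_8u: fail on arch spiral".
FAIL_RE = re.compile(r"^(\w+): fail on arch (\w+)$")


def summarize_profile_output(text):
    """Return (timings, failures) extracted from pasted profiler output.

    timings maps arch -> milliseconds; failures is a list of
    (kernel, arch) pairs in the order they appear.
    """
    timings = {}
    failures = []
    for line in text.splitlines():
        line = line.strip()
        m = TIMING_RE.match(line)
        if m:
            timings[m.group(1)] = float(m.group(2))
            continue
        m = FAIL_RE.match(line)
        if m:
            failures.append((m.group(1), m.group(2)))
    return timings, failures


sample = """\
RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 104.652ms
generic completed in 1808.89ms
offset 0 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
"""
timings, failures = summarize_profile_output(sample)
print(failures)
print(f"generic/spiral time ratio: {timings['generic'] / timings['spiral']:.1f}")
```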

@n-west
Member

n-west commented Feb 12, 2018

@noc0lour just to clarify-- does this fail all the time on all of your machines?

My best guess is that the random input is violating some assumption made by the spiral code, but I just tried many calls to srand and varying vector sizes, and it always passes for me (currently on Mac with LLVM 9.0.0, but also on Debian with GCC 4.8 and 4.9). I'm also not really seeing any patterns in OS, CPU, or compiler.

@noc0lour
Member

Can't reproduce with GCC 7.3.0 now. I even checked out the version I probably used in November. Running it a couple of times did not fail this kernel...

@Ka-zam
Contributor

Ka-zam commented Aug 30, 2018

Still fails for me with a fresh git copy:
RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 151.29ms
generic completed in 3814.47ms
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 1 in1: 1 in2: 0 tolerance was: 0
offset 3 in1: 1 in2: 0 tolerance was: 0
offset 5 in1: 1 in2: 0 tolerance was: 0
offset 6 in1: 1 in2: 0 tolerance was: 0
offset 7 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 9 in1: 1 in2: 0 tolerance was: 0
offset 10 in1: 1 in2: 0 tolerance was: 0
offset 12 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
Best aligned arch: generic
Best unaligned arch: generic

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
NUMA node(s): 4
Vendor ID: AuthenticAMD
CPU family: 23
Model: 1
Model name: AMD EPYC 7351P 16-Core Processor
Stepping: 2
CPU MHz: 1198.800
CPU max MHz: 2400,0000
CPU min MHz: 1200,0000
BogoMIPS: 4799.72
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 64K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-3,16-19
NUMA node1 CPU(s): 4-7,20-23
NUMA node2 CPU(s): 8-11,24-27
NUMA node3 CPU(s): 12-15,28-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx hw_pstate sme ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

> g++ --version
g++ (Ubuntu 7.3.0-16ubuntu3) 7.3.0

Linux 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

@mfogle

mfogle commented Aug 30, 2018

I created a simple Python script as a short-term fix. (Remove the '.txt' extension to use.) Although there are comments and error messages in the script that assume your gnuradio install was done with PyBOMBS 2.3.2, as long as it can find the 'volk_profile' executable (i.e. in your path or prefix), it should work.

The issue is likely in the volk performance test that is run for the spiral architecture: after running this Python script, the CMU Spiral Viterbi algorithm is used and works without a problem on several PC and server platforms, both native Linux and VM. The final solution would likely be fixing the volk performance test.

Patch_volk_config.py.txt

@cfr34k

cfr34k commented Sep 27, 2019

I also can confirm this issue on the following CPUs:

  • AMD Ryzen 5 2600 Six-Core Processor
  • Intel(R) Core(TM) i5-5200U CPU

For me, a test error is also reported for the avx2 architecture (on both CPUs):

RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,1987)
spiral completed in 1132.56 ms
avx2 completed in 1604.29 ms
generic completed in 26607.5 ms
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 1 in1: 1 in2: 0 tolerance was: 0
offset 3 in1: 1 in2: 0 tolerance was: 0
offset 4 in1: 1 in2: 0 tolerance was: 0
offset 5 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 13 in1: 1 in2: 0 tolerance was: 0
offset 14 in1: 1 in2: 0 tolerance was: 0
offset 17 in1: 1 in2: 0 tolerance was: 0
offset 18 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 1 in1: 1 in2: 0 tolerance was: 0
offset 3 in1: 1 in2: 0 tolerance was: 0
offset 4 in1: 1 in2: 0 tolerance was: 0
offset 5 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 13 in1: 1 in2: 0 tolerance was: 0
offset 14 in1: 1 in2: 0 tolerance was: 0
offset 17 in1: 1 in2: 0 tolerance was: 0
offset 18 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch avx2
Best aligned arch: generic
Best unaligned arch: generic

When I switch to the "generic" architecture (via ~/.volk/volk_config), everything is ok in my GNU Radio flow graph.

As this still worked automatically a few days ago and I use openSUSE Tumbleweed (a rolling release Linux distribution), I suspect that this could be somehow related to the build environment of the package. I'll try to build the package locally to verify.
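
The override described above, pinning the kernel to "generic" in ~/.volk/volk_config, can be scripted. This is a sketch that assumes each line of the file has the form `<kernel> <aligned arch> <unaligned arch>`. `set_kernel_arch` is a hypothetical helper that works on the file's text and leaves writing it back to the caller, and it is not the Patch_volk_config.py script attached earlier in the thread.

```python
def set_kernel_arch(config_text, kernel, arch):
    """Return config_text with `kernel` pinned to `arch` in both the
    aligned and unaligned slots. A line is appended if the kernel is
    not listed yet; all other lines are kept unchanged."""
    new_line = f"{kernel} {arch} {arch}"
    out = []
    found = False
    for line in config_text.splitlines():
        fields = line.split()
        if fields and fields[0] == kernel:
            out.append(new_line)
            found = True
        else:
            out.append(line)
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Hypothetical file contents used only for illustration.
before = (
    "volk_8u_conv_k7_r2puppet_8u spiral spiral\n"
    "volk_32f_x2_add_32f generic generic\n"
)
after = set_kernel_arch(before, "volk_8u_conv_k7_r2puppet_8u", "generic")
print(after, end="")
```

Keeping a backup copy of the original file before overwriting it makes it easy to undo the override later.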

@jdemel
Contributor

jdemel commented Nov 1, 2019

Is this error still present after #294 got merged? I hope #294 fixed this issue and we can now close it.

@michaelld
Contributor

When I updated my local Volk (devel) install to the latest Git master commit, neither make test nor running volk_profile showed this error. So I'm guessing it was indeed fixed! I'm going to go ahead and close this issue as fixed; if anyone thinks otherwise, please reopen and provide info as to what's going on.
