volk_8u_conv_k7_r2puppet_8u fails in Ubuntu 16.04 #117
Comments
I've heard reports of this, but it seems to work for me. I think there must be something about some of the randomly generated inputs that sometimes triggers a failure. Not entirely sure what to do about it. If you happen to have another machine does it also fail there? |
Hey @n-west , that is very interesting. For me, it always fails and I have tried this on 3 machines with similar architecture and OS. |
This is a tricky one. I just set up CI since it seems like a good thing to do, and it verifies that this passes on at least that machine (https://travis-ci.org/n-west/volk). I think this code was machine-generated in the first place, and debugging it is well outside my wheelhouse and interests. |
I see that this conversation happened on Aug 14, 2017, but thought I'd comment. I am running Ubuntu 14.04 LTS and see the exact same problem. The volk_8u_conv_k7_r2puppet_8u kernel test fails, which means that instead of the speed-optimized SPIRAL Viterbi built on intrinsics (http://www.spiral.net/index.html), you get the generic implementation, which runs about 10x slower. |
I'm aware that this seems to fail for some people, but it passes on all of my machines, on independent CI machines, and on many other people's machines. I haven't received enough information from people reporting failures to identify any pattern of what is causing the failure. If you depend on this kernel perhaps you can do some investigating-- can you try on a more modern ubuntu (I'm certain this passes on my hardware with Ubuntu 16.04)? Do you have another machine that it passes on? @dmiralles2009, are you also on Ubuntu 14.04? I simply don't have the resources to debug this when I don't even observe the failure. Someone that depends on this code that is able to reliably observe the failure needs to own it. |
Thanks for your quick reply. I have multiple VMs (12.04, 14.04, 16.04, CentOS 7.2 even) and will run a dry-run using the profiler.
I am very familiar with the SPIRAL Viterbi - I wrote a puncturing wrapper for it some time ago (and know it works), but saw that it had been integrated into gnuradio/gr-fec, so I thought it best to use that one since it played nicely with VOLK (and mine didn’t).
Is there a simple way to force the architecture from “generic” to “spiral” only for this kernel? May help my debug efforts.
I see no “volk_config” file anywhere, although I know during gnuradio PyBOMBS (2.x btw) the profiler is executed. Maybe it was written to a json file instead? I looked everywhere for it, and did run “volk-config-info --prefix”…still couldn’t locate the file.
…_____________________
Michael Fogle
Signalscape, Inc.
mikef@signalscape.com
direct: (919)678-6581
_____________________
|
|
FYI: I ran “volk_profile” on Ubuntu 14.04, 16.04, and 17.10 (all Debian-based) and, for giggles, on CentOS 7.2 (Red Hat-based), all running gnuradio 3.7.11. I used 3 different hardware platforms, with 2 VMs, 1 Docker container, and 1 native install (the CentOS 7.2 machine).
|
Attached images don't come through on GitHub issues. However, here is the kernel passing QA with different compilers: https://travis-ci.org/n-west/volk/builds/278233344 |
Just to circle back ‘round, here’s what the image showed:
RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 291.602ms
generic completed in 3129.87ms
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 2 in1: 1 in2: 0 tolerance was: 0
offset 4 in1: 1 in2: 0 tolerance was: 0
offset 5 in1: 1 in2: 0 tolerance was: 0
offset 6 in1: 1 in2: 0 tolerance was: 0
offset 7 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 9 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
Best aligned arch: generic
Best unaligned arch: generic
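As a reading aid for the output above, here is a minimal Python sketch that pulls the per-architecture timings and the failing architecture out of such a log. The line patterns are inferred only from the output quoted in this thread, and the function name `summarize_profile_output` is hypothetical; other volk_profile versions may print something different.

```python
import re

# Patterns inferred from the log quoted in this thread (an assumption,
# not a documented volk_profile output format).
TIMING_RE = re.compile(r"^(\w+) completed in ([0-9.]+)\s*ms$")
FAIL_RE = re.compile(r"^(\w+): fail on arch (\w+)$")


def summarize_profile_output(text):
    """Return ({arch: milliseconds}, [(kernel, failed_arch), ...])."""
    timings = {}
    failures = []
    for raw in text.splitlines():
        line = raw.strip()
        m = TIMING_RE.match(line)
        if m:
            timings[m.group(1)] = float(m.group(2))
            continue
        m = FAIL_RE.match(line)
        if m:
            failures.append((m.group(1), m.group(2)))
    return timings, failures


# Excerpt of the log shown above.
SAMPLE = """\
RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 291.602ms
generic completed in 3129.87ms
offset 0 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
Best aligned arch: generic
"""

timings, failures = summarize_profile_output(SAMPLE)
speedup = timings["generic"] / timings["spiral"]
print(f"spiral is about {speedup:.1f}x faster than generic")
print("failures:", failures)
```

On these numbers the ratio comes out a little above 10, which matches the "10x slower" figure mentioned earlier in the thread for the generic fallback.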
|
Can confirm failing kernel.
Building with GCC-7 && ccache enabled. |
@noc0lour just to clarify-- does this fail all the time on all of your machines? My best guess is that the random input is violating some assumption made by the spiral code, but I just tried many calls to srand and varying vector sizes-- always passes for me (currently on Mac with LLVM 9.0.0, but also on Debian with GCC 4.8 and 4.9). I'm also not really seeing any patterns in OS/CPU/compilers. |
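One way to probe the "random input violates an assumption" guess is a differential sweep: run a trusted implementation and the one under test on the same seeded input, across many seeds and vector lengths, and record every case that disagrees. The sketch below is in Python with two stand-in functions. `reference_impl` and `impl_under_test` are hypothetical placeholders, not VOLK kernels, and the stand-in under test has a deliberately planted bug so the sweep has something to find.

```python
import random


def reference_impl(data):
    """Trusted stand-in: running XOR of the low bit of each byte."""
    out, acc = [], 0
    for b in data:
        acc ^= b & 1
        out.append(acc)
    return out


def impl_under_test(data):
    """Stand-in with a planted bug: it skips the update for byte value 255."""
    out, acc = [], 0
    for b in data:
        if b != 255:  # planted bug
            acc ^= b & 1
        out.append(acc)
    return out


def sweep(seeds, lengths):
    """Return (seed, length, first_bad_offset) for each failing case."""
    failures = []
    for seed in seeds:
        for n in lengths:
            rng = random.Random(seed)  # same seed gives the same input again
            data = [rng.randrange(256) for _ in range(n)]
            ref = reference_impl(data)
            got = impl_under_test(data)
            bad = next((i for i, (a, b) in enumerate(zip(ref, got)) if a != b), None)
            if bad is not None:
                failures.append((seed, n, bad))
    return failures


failures = sweep(seeds=range(20), lengths=[16, 1024])
print(f"{len(failures)} failing (seed, length) cases")
for seed, n, offset in failures[:5]:
    print(f"seed={seed} length={n} first mismatch at offset {offset}")
```

Because every failing case is tied to a seed and a length, the exact input can be regenerated and inspected, which is the kind of information a maintainer who cannot reproduce the failure would need.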
Can't reproduce with |
Still fails for me with a fresh git copy:
|
I created a simple Python script as a short-term fix. (Remove the '.txt' extension to use.) Although there are comments and error messages in the script that assume your gnuradio install was done with PyBOMBS 2.3.2, as long as it can find the 'volk_profile' executable (i.e. in your path or prefix), it should work. The issue is likely in the volk performance test that is run for the spiral architecture, since after running this Python script the CMU Spiral Viterbi algorithm is used and works without a problem on several PC and server platforms, either native Linux or VM. The final solution would likely be to fix the volk performance test. |
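The attached script is not reproduced in this thread, so the following is only a minimal sketch of what such a workaround could look like, not the author's code. It assumes the profiler results live in a plain-text file with one `<kernel> <aligned_arch> <unaligned_arch>` entry per line; that format, the file location, and the exact kernel name that has to be overridden are all assumptions to check against your own install. The helper name `force_arch` is hypothetical.

```python
import os
import tempfile


def force_arch(config_path, kernel, arch):
    """Rewrite (or append) the entry for `kernel` so both slots name `arch`.

    Assumed line format: "<kernel> <aligned_arch> <unaligned_arch>".
    Comment lines starting with '#' and all other entries are kept as-is.
    """
    lines = []
    if os.path.exists(config_path):
        with open(config_path) as f:
            lines = f.read().splitlines()

    new_entry = f"{kernel} {arch} {arch}"
    replaced = False
    for i, line in enumerate(lines):
        fields = line.split()
        if fields and not line.lstrip().startswith("#") and fields[0] == kernel:
            lines[i] = new_entry
            replaced = True
    if not replaced:
        lines.append(new_entry)

    with open(config_path, "w") as f:
        f.write("\n".join(lines) + "\n")


# Demonstration on a throwaway file, so no real configuration is touched.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "volk_config")
with open(demo_path, "w") as f:
    f.write("# hypothetical example entries\n")
    f.write("volk_8u_conv_k7_r2puppet_8u generic generic\n")
    f.write("some_other_kernel generic generic\n")

force_arch(demo_path, "volk_8u_conv_k7_r2puppet_8u", "spiral")
with open(demo_path) as f:
    result = f.read()
print(result)
```

Note that an override like this only bypasses the failing QA result; it does not explain why the spiral output differs from the generic one in the test.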
I also can confirm this issue on the following CPUs:
For me, a test error is also reported for the avx2 architecture (on both CPUs):
When I switch to the "generic" architecture (via […] As this still worked automatically a few days ago and I use openSUSE Tumbleweed (a rolling-release Linux distribution), I suspect that this could be related to the build environment of the package. I'll try to build the package locally to verify. |
When I updated my local Volk (devel) install to the latest GIT master commit, neither |
I cloned the volk repo and performed the installation steps described in the README file, with the caveat that I did not run the last step,
$ sudo make install
since I want a local copy only. I am, however, running the volk_profile executable from within the build directory as ./apps/volk_profile. While checking the output I noticed the following:
RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,198)
spiral completed in 191.857ms
generic completed in 2393.31ms
offset 0 in1: 1 in2: 0 tolerance was: 0
offset 1 in1: 1 in2: 0 tolerance was: 0
offset 2 in1: 1 in2: 0 tolerance was: 0
offset 3 in1: 1 in2: 0 tolerance was: 0
offset 4 in1: 1 in2: 0 tolerance was: 0
offset 5 in1: 1 in2: 0 tolerance was: 0
offset 6 in1: 1 in2: 0 tolerance was: 0
offset 7 in1: 1 in2: 0 tolerance was: 0
offset 8 in1: 1 in2: 0 tolerance was: 0
offset 9 in1: 1 in2: 0 tolerance was: 0
volk_8u_conv_k7_r2puppet_8u: fail on arch spiral
Best aligned arch: generic
Best unaligned arch: generic
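The "offset … in1: … in2: … tolerance was: …" lines read like the report of an element-wise comparison between two output buffers: each line names an index where the two values differ by more than the allowed tolerance. The Python sketch below illustrates that idea only. It is not VOLK's QA code, the function name `report_mismatches` is hypothetical, which of `in1`/`in2` is the reference is not established in this thread, and capping the report at ten lines is an assumption based on the ten lines shown above.

```python
def report_mismatches(in1, in2, tolerance=0, max_report=10):
    """Compare two equal-length integer sequences element by element.

    Prints up to `max_report` offending offsets and returns True when
    every pair differs by at most `tolerance`.
    """
    if len(in1) != len(in2):
        raise ValueError("sequences must have the same length")
    failures = 0
    for offset, (a, b) in enumerate(zip(in1, in2)):
        if abs(a - b) > tolerance:
            failures += 1
            if failures <= max_report:
                print(f"offset {offset} in1: {a} in2: {b} tolerance was: {tolerance}")
    return failures == 0


# With tolerance 0, any difference at all counts as a failure.
out_a = [1, 1, 0, 1]
out_b = [0, 1, 0, 0]
passed = report_mismatches(out_a, out_b, tolerance=0)
print("pass" if passed else "fail")
```

Read this way, "tolerance was: 0" means the two outputs are required to match exactly, and the log shows them differing from the very first offset onward.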
Notice that the spiral proto-kernel fails and that there are some messages regarding the tolerance, which I do not fully understand. For additional information, here is the architecture of the machine in question:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i7-3632QM CPU @ 2.20GHz
Stepping: 9
CPU MHz: 1217.304
CPU max MHz: 3200.0000
CPU min MHz: 1200.0000
BogoMIPS: 4389.84
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-7
Thanks in advance for the help