Freeze in Gqrx with hackRF 2021-03-1 driver version #883
Comments
If I use the official AppImage of Gqrx (latest version, same as the Arch Linux package: 2.14.4), there is no freeze and no bug with the hackRF 2021-03-1 driver. The bug occurs with the Arch Linux package of Gqrx. |
But the AppImage may also bundle the 2018-01 version of the hackRF lib, which would explain why there is no freeze. |
Have you tried your HackRF with other software on Arch? |
Yes, I tried 2 programs, GNU Radio and hackTV, and both work with hackRF 2021-03-01. Gqrx freezes randomly with hackRF 2021-03-01 when switching modes (AM, FM, LSB), and sometimes when I start/stop the hackRF with the menu "File -> Start DSP" and "File -> Stop DSP". It seems that the 2021-03 version of the driver may hang the device in some conditions. |
@straithe: do you have a test script or binary that can check whether all features and functions of the hackRF 2021-03-01 driver are OK? I use 2 commands to check transmission and reception, and they seem OK : But I feel these commands are not enough to test the hackRF driver. If Gqrx 2.14.4 manages to crash the 2021-03 driver (and not the 2018 version), it is worth investigating; perhaps there is a regression in some functions of the 2021-03 driver. To track down the commit that introduced the bug, one solution would be to use git bisect. GNU Radio Companion seems to work fine, but I don't know whether it was built against the 2021 or the 2018 version of libhackrf. |
I haven't had any luck reproducing this so far (running Ubuntu 20.04 & pybombs GR/GQRX). Could you run gqrx in a terminal, reproduce the issue, and paste the terminal output here? |
@miek: Don't forget that Ubuntu 20.04 is not a rolling release distro; some libraries and the Linux kernel may not be at the latest version. I ran gqrx in a terminal; when the freeze occurs, the output doesn't show any unusual information :
Perhaps there is a debug mode for gqrx that gives more output in the terminal? I don't know what pybombs is; I use the regular gqrx package from Arch Linux, gqrx doesn't use pybombs, and my Python version is 3.9.4.
My hackRF is a Chinese version bought very recently, with no problems in other SDR applications (GNU Radio Companion, hackTV). I feel that the bug occurs when gqrx tries to start/stop the hackRF device, which randomly causes a freeze. Perhaps the way you use the hackRF lib is not fully appropriate, and a more "conservative" approach that adds extra time (when calling hackRF functions?) might avoid the freeze? |
I ran a git bisect and found the faulty commit :
We should revert this faulty commit to fix the crash in the stop_tx/tx commands that can occur on recent Chinese HackRF One models when using the 2021-03-01 hackrf driver version. The git bisect log :
|
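For readers unfamiliar with bisecting, the search it performs can be sketched in a few lines. This is a hardware-free model, not git itself: the commit names are hypothetical placeholders, and `is_bad` stands in for "build this commit and see whether Gqrx freezes". It also assumes what git bisect assumes, namely that every commit after the first bad one is bad too.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


# Hypothetical history: placeholder names, not real hackrf commits.
history = [f"c{i:02d}" for i in range(16)]
culprit = "c11"
print(first_bad(history, lambda c: c >= culprit))
```

With 16 commits this needs only 4 tests, which is why bisecting is attractive, but the result is only as trustworthy as each individual good/bad verdict.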
Reverting this commit is not enough to solve the freezes in Gqrx, as recent commits in hackrf also change the start/stop rx and tx functions. |
Since we can't reproduce your issue, can you please upload a screen capture of you reproducing this issue? |
Here is the video in attachment. I used the latest version of gqrx (2.14.5) and libhackrf 2021.03 on Arch Linux. The bug doesn't occur if I use the AppImage version of gqrx (the official AppImage from gqrx's GitHub), but I think the AppImage may contain the 2018-01 version of the hackRF lib, which would explain why there is no freeze when using the AppImage. VID_20211028_223011_.mp4 |
Thank you for the video. That helps! Can you give me a screen shot of your right hand menu before and after you change from WFM to AM, then to WFM? I am updating my setup and trying to follow along with your steps exactly and I can't quite see what buttons you are pressing there. |
Hello, here is a new screen capture, this time the freeze occurs after the change WFM to AM. output.mp4 |
This issue may be related to #916. We are going to try to recreate that issue, address it, and hopefully provide a solution that you might be able to use as well. I'll update you as we have more information. |
@martinling: Can you provide a patch file compatible with the 2021-03 version of hackrf (latest release version)? Your current commit is not compatible with it: I tried to apply your modifications to the hackrf.c file and I get a lot of errors during compilation, about undeclared identifiers such as HACKRF_OPERACAKE_MAX_BOARDS and unknown types like hackrf_operacake_freq_range. |
@Potomac sure, here's a patch against 2021.03.01: bug-916-rebase-2021.03.01.patch.txt You can also get the same result by cherry-picking my commit |
@martinling: I tried your patch and I see no real difference: the bug is still here, and gqrx freezes randomly after changing settings such as the frequency mode (AM, FM, etc.) or activating/deactivating RDS. I suggest you test with gqrx (not your test source code) and try changing settings like the frequency mode and RDS. You have to test for a long period: the freeze doesn't always occur immediately, and sometimes you have to retry 20 times before it occurs. |
I'm afraid I haven't been able to reproduce this problem here. I'm running gqrx 2.14.4 on Debian. With firmware and libhackrf from 2021.03.01, I can switch back and forth between FM and AM over 100 times without it hanging. |
i am somewhat able to repro what is probably the same issue, even though i trigger it without any retuning. gqrx starts up fine, i can rx some fm.
wireshark usb capture of a gqrx session during the issue.
|
@Potomac could you perhaps try running the test program (hackrf_loop.c) from #916? If this is a hackrf bug rather than a gqrx one, it seems like it must involve some failure during a quick stop & start of the hackrf - which is exactly what that test program exercises, over and over again. You should be able to build it against your installed libhackrf with: Then if you run
If you see anything else (e.g. errors, hanging, or |
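The idea behind such a test, restarting the stream over and over and checking that samples arrive each time, can be sketched without hardware. This is not hackrf_loop.c and it makes no libhackrf calls: `FakeRadio` is a hypothetical stand-in that goes silent on scripted restarts, and in a real test the three callables would wrap the actual start, stop and "did data arrive before the timeout" steps.

```python
from typing import Callable, Iterable, List


def stress_restart(start: Callable[[], None],
                   stop: Callable[[], None],
                   wait_for_data: Callable[[], bool],
                   iterations: int) -> List[int]:
    """Restart the stream `iterations` times; return the rounds that got no data."""
    failures = []
    for i in range(iterations):
        start()
        if not wait_for_data():  # True once samples arrive, False on timeout
            failures.append(i)
        stop()
    return failures


class FakeRadio:
    """Hypothetical stand-in for a receiver; silent on scripted restarts."""

    def __init__(self, silent_starts: Iterable[int]):
        self.silent_starts = set(silent_starts)
        self.starts = 0
        self.streaming = False

    def start(self) -> None:
        self.streaming = True
        self.starts += 1

    def stop(self) -> None:
        self.streaming = False

    def wait_for_data(self) -> bool:
        # self.starts counts from 1, the round numbers count from 0.
        return self.streaming and (self.starts - 1) not in self.silent_starts


radio = FakeRadio(silent_starts=[3, 7])
print(stress_restart(radio.start, radio.stop, radio.wait_for_data, 10))
```

Recording which rounds failed, rather than stopping at the first one, makes it easier to tell a rare stall from a device that never recovers.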
so libhackrf with the #1029 changes, firmware from m0/buffer branch which "shouldnt" matter for this. |
Aha! I've reproduced it now - using the same commits as @gozu42. I wasn't able to get it to happen by switching AM/FM or the start/stop button, but using the keyboard shortcut (Ctrl-D) for start/stop and just holding it down, I triggered it after a few seconds. Terminal output shows:
I'll investigate further. |
@martinling: thanks for your commit, I will test it as soon as possible. |
@martinling: As last time, can you provide a patch file compatible with the 2021-03 version of hackrf (latest release version)? |
@Potomac sure, here you go - this should apply to the 2021.03.01 release and includes both fixes from #1029: |
@martinling: I tested your patch with Gqrx and I still have the bug, no change. Then I tried the test program "hackrf_loop" and I see the message "not started yet" after a few seconds : Finally started I tested on 2 different USB controllers (the internal USB 2.0 ports of my motherboard and a USB 3.0 PCI-E controller card) and the result is the same. The OS is Arch Linux, a rolling release distro (like Gentoo) where all libraries and the Linux kernel are at their latest release versions. This may matter, as bugs can show up more easily on rolling release distros than on classic distros like Ubuntu, where libraries are not always at their latest versions. |
Thanks for testing this @Potomac. In my case I'm no longer getting problems in gqrx, but I've seen that pattern happening in I don't yet understand what's happening in the case where it just gets stuck at I'm going to get the USB analyzer out next and check whether data is actually showing up on the wire in this case. |
@martinling: what is certain is that the previous hackrf driver version (2018) doesn't have the bug; hackrf_start_rx() and hackrf_stop_rx() run without problems with the 2018 version on my hackRF. So one solution would be to track down the differences in source code between the 2018 and 2021-03 versions. I tried with git bisect but the results were inconclusive; the first bad commit found by git is : but reverting 1442014 is not possible, as commits made after 1442014 depend on it. |
Unfortunately because this involves a race condition, the problem can come and go as a result of completely unrelated changes if they affect the timing of the code execution. We can't rely on testing and bisecting - we need to understand what the actual issue is to make sure it's solved properly.
|
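A rough calculation shows how easily an intermittent bug misleads a bisect. It assumes each trial fails independently with a fixed probability, which is a simplification: as noted above, the failure rate itself can shift with timing from one commit to the next. The 30% figure comes from the original report, and 5% is one reading of "retry sometimes 20 times"; both function names are illustrative.

```python
import math


def miss_probability(p_fail: float, runs: int) -> float:
    """Chance that a buggy build passes `runs` independent trials in a row."""
    return (1.0 - p_fail) ** runs


def runs_needed(p_fail: float, max_miss: float) -> int:
    """Smallest number of clean runs that keeps the miss chance <= max_miss."""
    return math.ceil(math.log(max_miss) / math.log(1.0 - p_fail))


for p in (0.30, 0.05):
    n = runs_needed(p, 0.01)
    print(f"p_fail={p:.2f}: {n} clean runs, miss chance {miss_probability(p, n):.4f}")
```

Under these assumptions a build needs 13 clean runs at a 30% failure rate, and 90 at 5%, before it can be called good with 99% confidence. A single wrong "good" verdict sends the rest of the bisect down the wrong half of the history.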
Perhaps a log could help in understanding the race condition, if hackrf (or one of its dependencies) can be modified to generate a debug log. You could also try the tests on an old PC (or a Raspberry Pi 4): since the code will execute more slowly there, it may trigger the race condition more easily. |
I don't see a great way to attack this with more logging, because the symptom is that something doesn't happen. As far as I've been able to tell so far, everything is being started up correctly but no data comes back. Setting an environment variable of And then we wait for the transfer callbacks, which just never come. We can see that the transfer thread hasn't hung, because every 500ms it times out and starts It looks like there may be a device-side issue involved, so I'll be looking at that side of things next. |
The attached log is from Windows 10 running |
I have confirmed that there is a firmware-side bug which can cause RX stop/start to appear to succeed, but stall after the first 16KB of data. I've started a separate issue for this in #1042. |
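On the host side, this symptom can be told apart from "no data at all" by counting bytes since the last start. The sketch below is a hypothetical watchdog, not part of libhackrf or gqrx: the class and method names are invented, timestamps are passed in so it can be exercised without hardware, the 0.5 s timeout borrows the 500 ms figure mentioned earlier in the thread, and "16KB" is assumed to mean 16384 bytes.

```python
from typing import Optional


class RxWatchdog:
    """Tracks received bytes and reports when a stream has gone quiet."""

    def __init__(self, timeout_s: float = 0.5):
        self.timeout_s = timeout_s
        self.started_at: Optional[float] = None
        self.last_data_at: Optional[float] = None
        self.total_bytes = 0

    def on_start(self, now: float) -> None:
        # Reset the counters on every (re)start of the stream.
        self.started_at = now
        self.last_data_at = None
        self.total_bytes = 0

    def on_data(self, nbytes: int, now: float) -> None:
        self.total_bytes += nbytes
        self.last_data_at = now

    def status(self, now: float) -> str:
        if self.started_at is None:
            return "idle"
        # Measure silence from the last data, or from the start if none came.
        reference = self.last_data_at if self.last_data_at is not None else self.started_at
        if now - reference < self.timeout_s:
            return "ok"
        if self.total_bytes == 0:
            return "no data"
        return f"stalled after {self.total_bytes} bytes"


wd = RxWatchdog()
wd.on_start(now=0.0)
wd.on_data(16384, now=0.01)  # assumed size of the first 16KB
print(wd.status(now=0.2))
print(wd.status(now=1.0))
```

Calling `on_data` from the receive callback and `status` from a timer would let an application log the stall instead of freezing silently.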
Thanks Martinling for continuing the investigations, |
I'm certain that it's a firmware bug, because I have captured the USB traffic directly, and we can see that the host has correctly told the device to stop & start, but the device then stops sending data after the first 16KB. The host is continuing to ask for more data and the device doesn't send any more. That puts the blame clearly on the firmware. At the moment I don't know how that bug happens on the firmware side, so I can't tell how far back it goes in the project history. And because it's intermittent and seems to be affected by timing, bisecting can't be relied on to find out. But it's interesting that you've had no problems with the 2018 version, so I'll look at the differences between there and 2021. My next step will be to get a JTAG/SWD debugger hooked up to the HackRF's MCU and look at the firmware state when this happens. That should hopefully reveal something about the cause. Let's keep further discussion about the firmware bug on #1042. In terms of the original topic of this issue - the freeze in gqrx - my conclusion at the moment is that the same symptom can happen for three different causes. Two of those causes are host-side bugs that I believe to be fixed in #1029 (but that needs a little more work, plus review and merge). The third cause is the firmware bug which I'm now tracking in #1042. |
Can you provide a patch file compatible with 2021-03 hackrf version ? |
Sure, here you go: bug-916-v3-rebase-2021.03.01.patch.txt To build and flash the firmware you can follow the instructions here. |
@martinling: I tried your patch and flashed the firmware: the bug is still here, and I can easily trigger it with Gqrx. I also tried the hackrf_loop program and I get the bug there too, with the message : Start Perhaps you should test with a configuration similar to mine :
If I test with the appimage version of Gqrx (2.14.5) then there is no bug. |
can you please post output from hackrf_info on your current configuration? i was able to repro the problem on both test program and gqrx before, but with the combination of the host+fw fixes the hangs are now gone here, on both high powered PC and low-powered raspi. |
Well, I think I made a mistake with the firmware file: I think I am still using the 2021.3 version of the firmware. I didn't manage to build the new firmware file (the compilation fails). Can you send me a compiled version of the new firmware file? |
|
@martinling @gozu42: I finally managed to flash the firmware with the correct version sent by gozu42, and the bug seems to be gone when I use the hackrf_loop program. Thank you very much Martin for this excellent job. The output of hackrf_info :
|
That's fantastic news @Potomac, looks like we've finally solved this one. It's been quite an adventure! |
the reported version there for the lib might actually be an artifact of the take-release-and-manually-patch process. my flow for the build was ...
the result is that all versions in hackrf_info output directly give the git commit used:
|
It would be interesting to have a nightly build channel for hackrf (like Firefox), in order to download and test beta/alpha versions of the hackRF binaries (including the firmware). Thanks again Martinling for resolving the bug; I hope these commits will soon be included in the master branch of hackrf. |
That's a good idea but I have some worries about it. We would need to provide quite a lot of different binaries - we have three supported boards for the firmware, and several supported operating systems for the host tools. Some users will always download the wrong thing without understanding, and that can create a lot of support headaches. A simple change we could make is to retain the build products of our existing Github Actions builds, which can be done. We run these builds every time commits are pushed to master or a PR, so if we kept the results you could just look at the latest CI run for any branch, and download prebuilt binaries for libhackrf, the host tools and the firmware. But it would depend on the user knowing how to correctly install these in the right places I expect. I'll discuss it with the team. Thanks also to you @Potomac for all your work in reporting this problem, experimenting with things, and testing fixes over the last few months. I don't think we would have tracked down all the different bugs involved without your persistent efforts. |
Steps to reproduce
Expected behaviour
Gqrx should not randomly freeze during operation
Actual behaviour
Randomly, when switching frequency mode (AM, FM, LSB, etc.), Gqrx can freeze and I hear no sound from the hackRF. It happens about 30% of the time when switching frequency mode.
If I use the 2018 hackrf driver version, there is no freeze and no bug; all is OK with Gqrx.
My hackRF is a 2014 PCB version; the firmware has been updated to 2021-03.
Version information
archlinux 64 bits
hackrf_info version: 2021.03.1
libhackrf version: 2021.03.1 (0.6)
Board ID Number: 2 (HackRF One)
Firmware Version: 2021.03.1 (API:1.04)
Gqrx 2.14.4