Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test failure on MIPS and other RISC archs: test_cabrillo.c:159: error: Failure! #210

Closed
df7cb opened this issue Jan 5, 2021 · 18 comments
Closed

Comments

@df7cb
Copy link
Contributor

df7cb commented Jan 5, 2021

tlf 1.4.1 fails on mips64el:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=979377
https://buildd.debian.org/status/fetch.php?pkg=tlf&arch=mips64el&ver=1.4.1-2&stamp=1609846479&raw=0

FAIL: run_cabrillo
==================

[==========] Running 12 test(s).
[ RUN      ] test_starts_with_succeed
[       OK ] test_starts_with_succeed
[ RUN      ] test_starts_with_fails
[       OK ] test_starts_with_fails
[ RUN      ] test_translateItem
[       OK ] test_translateItem
[ RUN      ] test_freeCabfmt
[       OK ] test_freeCabfmt
[ RUN      ] test_parseLine
[       OK ] test_parseLine
[ RUN      ] test_readCabrilloFormatUniversal
[       OK ] test_readCabrilloFormatUniversal
[ RUN      ] test_readCabrilloFormatWAE
[       OK ] test_readCabrilloFormatWAE
[ RUN      ] test_readCabrilloFileNotFound
[       OK ] test_readCabrilloFileNotFound
[ RUN      ] test_readCabrilloFormatNotFound
[       OK ] test_readCabrilloFormatNotFound
[ RUN      ] test_cabToTlf_ParseQSO
[  ERROR   ] --- 0 != 0x3
[   LINE   ] --- test_cabrillo.c:159: error: Failure!
[  FAILED  ] test_cabToTlf_ParseQSO
[ RUN      ] test_cabToTlf_ParseXQSO
[  ERROR   ] --- 0 != 0x3
[   LINE   ] --- test_cabrillo.c:179: error: Failure!
[  FAILED  ] test_cabToTlf_ParseXQSO
[ RUN      ] test_cabToTlf_ParseQTC
[       OK ] test_cabToTlf_ParseQTC
[==========] 12 test(s) run.
[  PASSED  ] 10 test(s).
[  FAILED  ] 2 test(s), listed below:
[  FAILED  ] test_cabToTlf_ParseQSO
[  FAILED  ] test_cabToTlf_ParseXQSO

 2 FAILED TEST(S)
FAIL run_cabrillo (exit status: 2)

... and similarly on other architectures: https://buildd.debian.org/status/logs.php?pkg=tlf&ver=1.4.1-2

@dl1jbe
Copy link
Member

dl1jbe commented Jan 6, 2021

Thanks for the report. We will look into it.

@zcsahok
Copy link
Member

zcsahok commented Jan 6, 2021

@df7cb to get the full picture: was all OK before? if yes, what has been changed? (gcc? hamlib? ...)
Thanks.

@airween
Copy link
Member

airween commented Jan 6, 2021

@zcsahok,

here are the full list of logs of buildd with Tlf 1.4.1-1 (before the hamlib4).

The Sid version of Debian is continuously upgraded, so it may be the cause of GCC - but I don't think that's a Hamlib issue.

@zcsahok
Copy link
Member

zcsahok commented Jan 6, 2021

FYI trying to reproduce this but it's not possible to install current [20201202] sid mips64el/mipsel using QEMU 1:3.1+dfsg-8+deb10u8 due to kernel panic. (previous installer [20200314] does boot, however)
Now installing buster and will try to upgrade.

@df7cb
Copy link
Contributor Author

df7cb commented Jan 6, 2021

1.4.1-1 built on 19 May 2020 with gcc-9_9.3.0-12 was fine, while
1.4.1-2 built on 05 Jan 2021 with gcc-10_10.2.1-3 was failing.
But that was surely not the only toolchain change...
Thanks for looking!

@df7cb
Copy link
Contributor Author

df7cb commented Jan 6, 2021

With #211 fixed I retried building git head on mipsel, and the tests are all passing now.

Could you tag a new release? That would be the easiest way forward I think.

@zcsahok
Copy link
Member

zcsahok commented Jan 6, 2021

Kind of magic, as the changes were just at unrelated places.

@df7cb
Copy link
Contributor Author

df7cb commented Jan 6, 2021

That might mean the bug is still there... :(
Still, a release would fix the current problem that tlf doesn't build on all Debian release architectures.

@airween
Copy link
Member

airween commented Jan 6, 2021

Or we can make a patch from the upstream for the 1.4.1. I'll take a look it soon.

@dl1jbe
Copy link
Member

dl1jbe commented Jan 6, 2021

@df7cb What is the difference between 1.4.1-1 and 1.4.1-2?

Thanks to the lot of changes in the last year a new release is really due. But does it make sense to have a new release with the unresolved problem of a potential failing test.

Best would be to compile on mipsel64 with -g and run the test (run_cabrillo) separate in a gdb session to narrow down the problem.

@zcsahok if you could manage to set up a system in qemu that would be great.

@df7cb
Copy link
Contributor Author

df7cb commented Jan 6, 2021

Code-wise there is no difference, but it's now built with hamlib 4, and the toolchain changed a lot.

I can do more debugging on the mipsel/mips64el machine tomorrow.

@AdrianBunk
Copy link

Building with -fsanitize=address on amd64 gives:

==12985==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55fc9b0c4afc at pc 0x55fc9b0b2ec5 bp 0x7ffc26e11a80 sp 0x7ffc26e11a78
WRITE of size 4 at 0x55fc9b0c4afc thread T0
    #0 0x55fc9b0b2ec4 in write_log_fm_cabr src/readcabrillo.c:91
    #1 0x55fc9b0b39ee in cab_qso_to_tlf src/readcabrillo.c:363
    #2 0x55fc9b0b14af in test_cabToTlf_ParseQSO test/test_cabrillo.c:157
    #3 0x7fc9cde8e238  (/lib/x86_64-linux-gnu/libcmocka.so.0+0x5238)
    #4 0x7fc9cde8ebd8 in _cmocka_run_group_tests (/lib/x86_64-linux-gnu/libcmocka.so.0+0x5bd8)
    #5 0x55fc9b0b0601 in main test/run_cabrillo.c:37
    #6 0x7fc9cdcead09 in __libc_start_main ../csu/libc-start.c:308
    #7 0x55fc9b0b0829 in _start (/tmp/tlf-1.4.1/test/run_cabrillo+0xf829)

0x55fc9b0c4afc is located 4 bytes to the left of global variable 'qsoflags_for_qtc' defined in 'test_cabrillo.c:15:5' (0x55fc9b0c4b00) of size 80000
0x55fc9b0c4afc is located 56 bytes to the right of global variable 'bandinx_spy' defined in 'test_cabrillo.c:26:5' (0x55fc9b0c4ac0) of size 4
SUMMARY: AddressSanitizer: global-buffer-overflow src/readcabrillo.c:91 in write_log_fm_cabr

The bug is in https://sources.debian.org/src/tlf/1.4.1-2/src/readcabrillo.c/#L91
When nr_qsos is 0 you are overwriting something outside the array, and on some architectures this might just be the value of bandinx_spy (on amd64 there seems to be some space between the variables so the write might be harmless there).
Random changes of the code might just change what gets wrongly overwritten.

I've confirmed that an if(nr_qsos) before that line fixes the test on MIPS, I do not know whether this is the correct fix.

@dl1jbe
Copy link
Member

dl1jbe commented Jan 7, 2021

Looks promising.

A quick check of the function in question shows that two lines up store_qso() is called. In that function nr_qsos gets incremented. So normally the variable is at least one.

But in the testcode (test_cabrillo.c) we stub out store_qso() with an empty definition. We should modify it as follow:

void store_qso() { nr_qsos++; }

That should fix it.

@AdrianBunk
Copy link

void store_qso() { nr_qsos++; }

That should fix it.

Thanks, works.

@df7cb
Copy link
Contributor Author

df7cb commented Jan 7, 2021

This fixes the test problem on mipsel (32) as well. Thanks!
I'll upload that as hotfix so it's no longer blocking the hamlib4 transition in Debian.

@dl1jbe
Copy link
Member

dl1jbe commented Jan 7, 2021

Fixed in PR #212. If no negative comment I will pick it up in next two days.

@AdrianBunk '-fsanitize=address' showed some more problems. Will be fixed by above mentioned commits. Thanks for pointing in that direction.

@zcsahok
Copy link
Member

zcsahok commented Jan 7, 2021

Fix e95db78 for Cabrillo test confirmed on QEMU mips64el system using

user@dmips64el:~/tlf$ gcc --version
gcc (Debian 10.2.1-3) 10.2.1 20201224
user@dmips64el:~/tlf$ dpkg -l | grep hamlib
ii  libhamlib-dev:mips64el                   4.0-3                           mips64el     Development library to control radio transceivers and receivers
ii  libhamlib4:mips64el                      4.0-3                           mips64el     Run-time library to control radio transceivers and receivers

It just takes ages to install party due to broken kernel (had to go the buster->bullseye->sid way). And a full Tlf build takes well over an hour.

@dl1jbe
Copy link
Member

dl1jbe commented Jan 8, 2021

Ok. Will merge PR #212 now.

dl1jbe added a commit that referenced this issue Jan 8, 2021
* Fix out of bounds error during test (issue #210)

* Fix memory leaks in test code

* Fix possible stack underflow during test

* Fix overlapping buffer warning which happens only if we call setcontest() with the already set contest name.

Closes issue #210
@dl1jbe dl1jbe closed this as completed Jan 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants