Test failures on ARM and i686 #197

Closed
mbakke opened this issue Aug 21, 2016 · 9 comments

@mbakke

mbakke commented Aug 21, 2016

Hi,

I've packaged dlib for GNU Guix and the CI tool reports test failures on ARM and i686 targets.

On ARM the failing tests are:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! TEST FAILED: test_learning_to_track !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Failure message from test: 

Error occurred at line 219.
Error occurred in file /tmp/nix-build-dlib-19.1.drv-0/dlib-19.1/dlib/test/learning_to_track.cpp.
Failing expression was test_val == 1.
0.3912

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! TEST FAILED: test_max_cost_assignment !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Failure message from test: 

Error occurred at line 124.
Error occurred in file /tmp/nix-build-dlib-19.1.drv-0/dlib-19.1/dlib/test/max_cost_assignment.cpp.
Failing expression was assignment_cost(cost,assign) == true_eval.

On i686, five tests fail in total. The full log is here: https://hydra.gnu.org/build/1443544/nixlog/3/raw

And here for the ARM build log: https://hydra.gnu.org/build/1443671/nixlog/2/raw

The errors look serious, so I wonder if we are doing something wrong? Are ARM and i686 supported at all? I cannot reproduce these failures on amd64, but there the Hydra build server is actually crashing on test_empirical_kernel_map: https://hydra.gnu.org/build/1443697/nixlog/2/raw

@davisking
Owner

Some systems have broken versions of BLAS. You could disable BLAS and see if that helps. But some of these errors have nothing to do with BLAS so I can't say. Certainly dlib will work on ARM and i686 machines, but I don't know why the tests are failing on your computer.
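
A minimal sketch of what disabling BLAS for the test build could look like. It assumes a dlib source checkout and that `DLIB_USE_BLAS` and `DLIB_USE_LAPACK` are the relevant CMake options; neither name appears in this thread, so check the options your dlib version accepts with `cmake -LH ..` first. The function name `build_tests_without_blas` is made up for illustration.

```shell
# Hypothetical helper: builds and runs the dlib test suite with BLAS/LAPACK
# turned off. Pass the path to the dlib checkout (defaults to the current dir).
# The body runs in a subshell, so the caller's working directory is unchanged.
build_tests_without_blas() (
    src_dir="${1:-.}"
    mkdir -p "$src_dir/dlib/test/build" &&
    cd "$src_dir/dlib/test/build" &&
    # Assumed option names; confirm them with `cmake -LH ..`.
    cmake .. -DDLIB_USE_BLAS=OFF -DDLIB_USE_LAPACK=OFF &&
    cmake --build . --config Release &&
    ./dtest --runall
)
```

Calling `build_tests_without_blas /path/to/dlib` and comparing the list of failing tests against a BLAS-enabled build would show which failures depend on the BLAS library.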

@mbakke
Author

mbakke commented Aug 24, 2016

I can reproduce the test failures on Ubuntu 16.04 in a 32-bit chroot, so I don't think these failures are Guix-related.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! TEST FAILED: test_dnn !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Failure message from test:

Error occurred at line 1218.
Error occurred in file /home/ubuntu/dlib/dlib/test/dnn.cpp.
Failing expression was res.
Gradient error in data variable #143.  Relative error: -0.0576071
expected derivative: 1.37646
output derivative:   1.45575
iteration:           2
--------------------------------------------------------------------------------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! TEST FAILED: test_fhog !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Failure message from test:

Error occurred at line 77.
Error occurred in file /home/ubuntu/dlib/dlib/test/fhog.cpp.
Failing expression was std::abs(hog[o][r][c] - ref_hog[r][c](o)) < 1e-6.
2.93553e-06
--------------------------------------------------------------------------------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! TEST FAILED: test_optimization !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Failure message from test:

Error occurred at line 1154.
Error occurred in file /home/ubuntu/dlib/dlib/test/optimization.cpp.
Failing expression was rs.mean() < 1e-5.
--------------------------------------------------------------------------------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!! TEST FAILED: test_sockets !!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Failure message from test:

Error occurred at line 221.
Error occurred in file /home/ubuntu/dlib/dlib/test/sockets.cpp.
Failing expression was !srv.error_occurred.



Testing Finished
Total number of individual testing statements executed: 471047857
Number of failed tests: 4
Number of passed tests: 130

The fifth test failure was indeed solved by updating OpenBLAS.
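
For reference, the relative error printed by test_dnn above is consistent with (expected - output) / expected. The sketch below redoes that arithmetic from the rounded values in the log. The formula is an assumption inferred from the printed numbers, not taken from the dlib source.

```shell
# Values copied from the test_dnn failure message above.
expected=1.37646
output=1.45575

# Assumed formula: (expected - output) / expected.
rel=$(awk -v e="$expected" -v o="$output" 'BEGIN { printf "%.4f", (e - o) / e }')
echo "relative error ~ $rel"
```

Rounded to four decimals this gives -0.0576, which agrees with the logged -0.0576071 to the precision of the printed inputs.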

@davisking
Owner

davisking commented Aug 24, 2016 via email

@mbakke
Author

mbakke commented Aug 24, 2016

I roughly followed the steps here: https://wiki.ubuntu.com/DebootstrapChroot

...with some modifications. I believe this was the full procedure:

# apt install debootstrap schroot
# mkdir -p /chroot/xenial_i386
# debootstrap --variant=buildd --arch=i386 xenial /chroot/xenial_i386 http://archive.ubuntu.com/ubuntu/
# cat <<EOF > /etc/schroot/chroot.d/xenial_i386.conf
[xenial_i386]
description=Ubuntu 16.04 i386
directory=/chroot/xenial_i386
personality=linux32
root-users=ubuntu
type=directory
users=ubuntu
EOF
ubuntu@build-host $ schroot -c xenial_i386 -u root
(chroot) # sed -i 's/xenial main/xenial main universe/g' /etc/apt/sources.list
(chroot) # apt install build-essential cmake pkg-config libx11-dev libjpeg-dev libpng-dev libgif-dev libopenblas-dev liblapack-dev

...and then build and run the tests as usual. Tested with latest git master.

@davisking
Owner

davisking commented Aug 24, 2016 via email

@davisking
Owner

davisking commented Aug 24, 2016 via email

@mbakke
Author

mbakke commented Aug 25, 2016

Thanks, that was quick! I can confirm that, on Ubuntu, only the fhog test still fails.

On Guix, in addition to fhog, I still see a test_sparse_vector failure on i686:

Error occurred at line 88.
Error occurred in file /tmp/guix-build-dlib-19.1.drv-0/source/dlib/test/sparse_vector.cpp.
Failing expression was max(abs(r1-r2)) < 1e-15.

Do you have any idea what may cause this?

I can't easily check whether the ARM failures are resolved, but I can open another issue if they still occur after the next release (I'll simply disable these tests for 19.1).

@davisking
Owner

davisking commented Aug 25, 2016 via email

@mbakke
Author

mbakke commented Aug 26, 2016

Upgrading gcc from 4.9.3 to 5.3.0 made the sparse vector test pass. FWIW the expression value under GCC 4.9.3 was max(abs(r1-r2)) = 1.77636e-15.

So I think this is resolved for now, save for the fhog test which you are aware of. I will disable these tests for 19.1 and open a new issue if we still have problems in the next release. Thank you!
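
For scale, the value reported under GCC 4.9.3 matches 8 machine epsilons of a double (2^-49), while the test's tolerance of 1e-15 is only about 4.5 epsilons. The sketch below checks that arithmetic. It assumes the test operates on IEEE 754 doubles, which the thread does not state, and it only illustrates how narrow the margin is; it does not identify the cause of the difference between compiler versions.

```shell
# Machine epsilon of an IEEE 754 double, i.e. 2^-52.
eps=2.220446049250313e-16

# 8 * eps, printed with the same six significant digits as the test output.
reported=$(awk -v e="$eps" 'BEGIN { printf "%.6g", 8 * e }')
echo "8 * eps     = $reported"

# How many epsilons the absolute tolerance 1e-15 allows.
ratio=$(awk -v e="$eps" 'BEGIN { printf "%.2f", 1e-15 / e }')
echo "1e-15 / eps = $ratio"
```

The first line prints 1.77636e-15, the same figure quoted above, and the second prints 4.50.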

mbakke closed this as completed Aug 26, 2016