Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FAIL: distribution_tests #10

Closed
cicku opened this issue Aug 21, 2014 · 26 comments
Closed

FAIL: distribution_tests #10

cicku opened this issue Aug 21, 2014 · 26 comments

Comments

@cicku
Copy link
Contributor

cicku commented Aug 21, 2014

Hi,

The testsuite is nearly perfect except one below:

Bernoulli distribution: ...
PASS. Bernoulli distribution: ......
PASS. Beta distribution: ...
PASS. Beta distribution: gsl: simplex.c:265: ERROR: non-finite function value encountered
Default GSL error handler invoked.

gsl version: 1.16

@jgmbenoit
Copy link
Collaborator

Hello,
on my Debian Wheezy box and on a Sid virtual box (pbuilder), the testsuite passes.
Jerome

@cicku
Copy link
Contributor Author

cicku commented Sep 10, 2014

So?

@b-k
Copy link
Owner

b-k commented Sep 10, 2014

I'm also unable to replicate on an up-to-date Fedora box with gsl 1.15.

@cicku, any hints as to how we can replicate the failure? OS, number of threads, using the latest version of the library? If you set .verbose='y' in tests/distribution_tests.c:208, the test should log the parameters tried; what was the last set at failure? If you're running make check then such output will be diverted to distribution_tests.log. Any info about the state of the parameters from the debugger? Thx.

@cicku
Copy link
Contributor Author

cicku commented Sep 13, 2014

I tested it on Fedora 22 32 bits.

I will re check this soon.

@jgmbenoit
Copy link
Collaborator

It appears that the tests failed on Debian i386 boxes (for the least at the time of writing):
https://buildd.debian.org/status/package.php?p=apophenia
Unfortunately, I cannot reproduce the failures on a virtual i386 box on my amd64 box, I have no access to any i386 box, I cannot grab the trace of the tests either.

@b-k
Copy link
Owner

b-k commented Nov 3, 2014

I recall having trouble with the GSL's beta-related functions when alpha or beta are near zero. I don't have access to any 32-bit machines right now, so I can't replicate the problem, but there is evidence here and there of arbitrary limits to prevent breaks. For example, here's the slightly mis-documented constraint function for the apop_beta model:
static long double beta_constraint(apop_data *data, apop_model *v){
//constraint is 0 < beta_1 and 0 < beta_2
return apop_linear_constraint(v->parameters->vector, .margin= 1e-4);
}
Does expanding the margin (e.g., to 1e-3) help to prevent NaNs on the 32-bit version?

Many of the tests are based on using random draws to arrive at a statistic, and comparing the statistic to its expected value to within some tolerance. There are two types of error:

  • The calculation is just wrong, producing a value far from the expected value
  • The random draws lead to a value that is close to the expected value, but outside of the tolerance in the test.

When a test fails, it's worth investigating which type of error it is, and if it's the second type, it may be reasonable to just adjust the tolerance of the test. Raising the number of draws is also an option, depending on how long we are willing to wait for a test to finish.

@jgmbenoit
Copy link
Collaborator

Finally I could grab the test-suite.log file on a i386 box: I got the same error.

@cicku
Copy link
Contributor Author

cicku commented Sep 3, 2015

Yes, I still get the same error with 0.999a.

@jgmbenoit
Copy link
Collaborator

The error was fix in the last debian packages, and the fix was brought to GitHub a couple of days ago.

@LocutusOfBorg
Copy link

Hi @jgmbenoit and all, the bug was fixed, but the 1.0 version introduced the issue again (I checked only i386) I checked, the patches were indeed applied, so I looked at the other commits, and the bisect found the commit responsible for the regression:
344fab5

reverting this commit on top of 1.0 makes the testsuite succeed on i386

make  check-TESTS
make[3]: Entering directory '/«BUILDDIR»/apophenia-1.0+ds/_build/tests'
make[4]: Entering directory '/«BUILDDIR»/apophenia-1.0+ds/_build/tests'
PASS: factors
PASS: error_test
PASS: nist_tests
PASS: utilities_test
PASS: ../eg/apop_map_row
PASS: sort_example
PASS: db_tests
PASS: ../eg/binning
PASS: ../eg/data_fill
PASS: ../eg/dot_products
PASS: ../eg/entropy_vector
PASS: ../eg/draw_to_db
PASS: ../eg/iv
PASS: ../eg/normalization_demo
PASS: ../eg/ols_oneliner
PASS: ../eg/ols
PASS: ../eg/ks_tests
PASS: ../eg/simple_subsets
PASS: ../eg/parameterization
PASS: ../eg/t_test_by_rows
PASS: ../eg/test_fisher
PASS: ../eg/test_distances
PASS: ../eg/test_harmonic
PASS: ../eg/test_pruning
PASS: ../eg/test_regex
PASS: rake_test
PASS: ../eg/fake_logit
PASS: test_kernel_ll
PASS: ../eg/db_fns
PASS: ../eg/dconstrain
PASS: ../eg/boot_clt
PASS: ../eg/f_test
PASS: ../eg/entropy_model
PASS: ../eg/hills2
PASS: ../eg/fix_params
PASS: ../eg/jacobian
PASS: ../eg/jack
PASS: ../eg/ml_imputation
PASS: ../eg/pmf_test
PASS: ../eg/test_kl_divergence
PASS: ../eg/some_cdfs
PASS: ../eg/test_ranks
PASS: ../eg/cross_models
PASS: ../eg/logit
../../build-aux/test-driver: line 107: 20479 Aborted                 "$@" > $log_file 2>&1
FAIL: test_apop
PASS: distribution_tests
PASS: ../eg/transform
PASS: update_via_rng
PASS: ../eg/test_updating
PASS: lognormal_test
============================================================================
Testsuite summary for apophenia 1.0
============================================================================
# TOTAL: 50
# PASS:  49
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

I'm not sure about what needs to be fixed, but right now the package is stuck in unstable, and won't migrate without a fix.

cheers!

Gianfranco

@LocutusOfBorg
Copy link

BTW this seems to be a regression on 32bit architecture, because at a first look 64 bit builds are fine.

@jgmbenoit
Copy link
Collaborator

Thanks for your reports. I will come back to this bug latter, once I can concentrate: it is very low level. I have my idea, but I have to test it on different architectures.

@LocutusOfBorg
Copy link

actually the FTBFS is RC for Debian policy, and the package build failure is preventing the old libgsl from being removed from unstable/testing (cruft report).
https://ftp-master.debian.org/cruft-report-daily.txt

so, if you can, please upload the revert of that commit (I can do it if you don't have upload privileges), and then take your time to properly fix the issue
(Assuming the revert is "safe" for users)

@jgmbenoit
Copy link
Collaborator

I will not assume that the revert is safe. I was not aware of the GSL issue. Is there a list of packages that block the GSL package ? I will have a look and try to fix the issue this week-end.

@LocutusOfBorg
Copy link

we are working on the decruft of the old library, there is no hurry, but please note that the next ubuntu LTS should have 1.0, and it is currently in proposed because of this bug.
So, please try to fix if possible (a timeline < 1 month would be awesome)

@jgmbenoit
Copy link
Collaborator

Fortunately enough, this timeline is greater than my timeline.

@b-k
Copy link
Owner

b-k commented Feb 1, 2016

@LocutusOfBorg, would you please post or send me test_apop.log so we can see which function is failing?

@LocutusOfBorg
Copy link

Sundry tests for Apophenia.
Running relatively faster tests.  To run slower tests (primarily simulated annealing), use -s.
For quieter output, use -q. To change thread count (default=2), use -t1, -t2, -t3, ...
vtables:
PASS.  test listwise delete:
PASS.  rownames:
PASS.  apop_dot:
PASS.  apop_jackknife:
PASS.  test multivariate_normal:
PASS.  log and exponent:
PASS.  split and stack test:
PASS.  test probit and logit:
PASS.  test probit and logit again:
PASS.  test data compressing:
PASS.  weighted regression:
PASS.  offset OLS:variadic_apop_data_pack: The input data set has 0 elements, but the output vector you want to fill has size 1. Please make these sizes equal.
variadic_apop_data_pack: The input data set has 0 elements, but the output vector you want to fill has size 1. Please make these sizes equal.
lt-test_apop: ../../tests/test_apop.c:1194: test_ols_offset: Assertion `variadic_apop_vector_distance((variadic_type_apop_vector_distance) {zcov, wcov}) < 1e-4' failed.
FAIL test_apop (exit status: 134)

I hope this helps!

cheers!

@b-k
Copy link
Owner

b-k commented Feb 1, 2016

Thanks, that's very useful. I'll have a closer look in a few hours and try to get this resolved today.

b-k added a commit that referenced this issue Feb 1, 2016
@b-k
Copy link
Owner

b-k commented Feb 1, 2016

Commit 4bb0399 addresses the bug. Please test on your side and let me know how it goes.

What happened: line 801 of apop_conversions.c had been:
apop_data_pack(in->more, &vout);
but should be:
apop_data_pack(in->more, &vout, more_pages, use_info_pages);
This is in the part of apop_data_pack that recursively handles subsequent pages in a chain of apop_data sets. The recursive call needs to use the same options as the initial call, but the recursive call had been using the defaults, which leads to the confusion expressed in the warning.

The commit @LocutusOfBorg pointed out added a page to the parameter set and thus caused this bug to be active.

@jgmbenoit
Copy link
Collaborator

I am ready to buy it :-) When do you plan to make a new release ?

@LocutusOfBorg
Copy link

@b-k @jgmbenoit thanks to you both!
Please let me know if I have to sponsor anything for you
this is the patch I applied:

--- a/apop_conversions.c
+++ b/apop_conversions.c
@@ -848,7 +848,7 @@ APOP_VAR_ENDHEAD
             in = in->more;
         if (in->more){
             vout = gsl_vector_subvector((gsl_vector *)out, offset, out->size - offset).vector;
-            apop_data_pack(in->more, &vout);
+            apop_data_pack(in->more, &vout, more_pages, use_info_pages);
         }
     }
     return out;

@jgmbenoit
Copy link
Collaborator

I have just updated the debian package material and ask for sponsorship to the usual sponsor. Thanks for your help.

@LocutusOfBorg
Copy link

thanks!

@b-k
Copy link
Owner

b-k commented Feb 12, 2016

OK, have updated the pkg branch with the above changes (and two others regarding Kullback-Leibler divergence and mixture distributions).

@b-k b-k closed this as completed Feb 12, 2016
@cicku
Copy link
Contributor Author

cicku commented Feb 13, 2016

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants