Stochastic unit test failures with python 3.5 #82

Closed
tleonardi opened this issue Apr 10, 2019 · 2 comments


@tleonardi
Owner

The `test_txComp_GMM_anova` test in `tests/test_TxComp.py` fails intermittently with Python 3.5 because of a p-value discrepancy:

============================= test session starts ==============================
platform linux -- Python 3.5.6, pytest-4.4.0, py-1.7.0, pluggy-0.9.0
rootdir: /home/travis/build/tleonardi/nanocompore
collected 45 items                                                             
tests/test_Integration.py .sssssssssssssssssss.                          [ 46%]
tests/test_SampCompDB.py ....                                            [ 55%]
tests/test_TxComp.py ...........F....                                    [ 91%]
tests/test_Whitelist.py ....                                             [100%]
=================================== FAILURES ===================================
____________________________ test_txComp_GMM_anova _____________________________
test_ref_pos_list = ([{'data': {'KD': {'KD1': {'coverage': 100, 'dwell': array([121.70757477, 129.70100881, 111.67411923, 131.1979314 ,
  ...328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, ...]})
    def test_txComp_GMM_anova(test_ref_pos_list):
        ml = mock.Mock()
        if sys.version_info < (3, 6):
            tol = 0.0002
        else:
            tol=0.00000001
        res = txCompare(test_ref_pos_list[0], methods=['GMM'], logit=False, sequence_context=2, min_coverage=3, logger=ml, allow_warnings=False, random_state=np.random.RandomState(seed=42))
        GMM_pvalues = [pos['txComp']['GMM_anova_pvalue'] for pos in res ]
>       assert GMM_pvalues == [pytest.approx(i, abs=tol, nan_ok=True) for i in test_ref_pos_list[1]['GMM_anova']]
E       assert [0.0008574768...76768562, ...] == [0.00085747684... 2.0e-04, ...]
E         At index 3 diff: 0.0017335646468135102 != 0.0010906844025473576 ± 2.0e-04
E         Use -v to get the full diff
tests/test_TxComp.py:104: AssertionError
============== 1 failed, 25 passed, 19 skipped in 187.78 seconds ===============
The command "pytest" exited with 1.
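
To make the size of the mismatch concrete, here is a minimal sketch that repeats the comparison for index 3 using the two p-values and the tolerance from the failure message above. `math.isclose` with `rel_tol=0.0` is used as a stand-in for `pytest.approx(expected, abs=tol)`; that substitution is an assumption made so the snippet runs without pytest, and it is not the test's actual code.

```python
import math

# Values copied from the failure message above (index 3 of the p-value list).
expected = 0.0010906844025473576   # reference p-value
observed = 0.0017335646468135102   # p-value produced by the failing run
tol = 0.0002                       # absolute tolerance used for Python < 3.6

# Stand-in for pytest.approx(expected, abs=tol): absolute tolerance only.
diff = abs(observed - expected)
within_tolerance = math.isclose(observed, expected, rel_tol=0.0, abs_tol=tol)

print("difference: {:.2e}".format(diff))
print("within tolerance: {}".format(within_tolerance))
```

The difference is roughly 6.4e-04, about three times the 2.0e-04 tolerance, so the assertion cannot pass for this run.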

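After inspecting the results of the GMM fitting, it looks like when the test fails there is a small discrepancy in the cluster counts. In the two runs below, position 3 reports `WT1:6/94` when the test passes and `WT1:5/95` when it fails, and the samples are listed in a different order: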

Passing

root@3f23411890db:/nanocompore/tests# pytest test_TxComp.py -vs                                                                                                                                                                                                                             
=================================================================================================================================== test session starts ====================================================================================================================================
platform linux -- Python 3.5.7, pytest-4.4.0, py-1.8.0, pluggy-0.9.0 -- /usr/local/bin/python                                                                                                                                                                                               
cachedir: .pytest_cache                                                                                                                                                                                                                                                                     
rootdir: /nanocompore                                                                                                                                                                                                                                                                       
collected 16 items                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                            
test_TxComp.py::test_combine_pvalues_hou[pvalues0] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_hou[pvalues1] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_hou[pvalues2] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_hou[pvalues3] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues0] PASSED                                                                                                                                                                                                 
test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues1] PASSED                                                                                                                                                                                                 
test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues2] PASSED                                                                                                                                                                                                 
test_TxComp.py::test_nonparametric_test[v10-v20-expected0] PASSED                                                                                                                                                                                                                           
test_TxComp.py::test_sum_of_squares[x0-1243] PASSED                                                                                                                                                                                                                                         
test_TxComp.py::test_sum_of_squares[2-4] PASSED                                                                                                                                                                                                                                             
test_TxComp.py::test_sum_of_squares[x2-201] PASSED                                                                                                                                                                                                                                          
test_TxComp.py::test_txComp_GMM_anova {'GMM_anova_model': {'delta_logit': -5.154130973499999, 'table': F_onewayResult(statistic=915.3557790180453, pvalue=0.0010906844025473576), 'pvalue': 0.0010906844025473576, 'log_ratios': array([ 2.77258872, -2.60796674, -2.46385324,  2.46385324])}, 'shift_stats': OrderedDict([('c1_mean_intensity', 101.25371018948026), ('c2_mean_intensity', 120.1690563433108), ('c1_median_intensity', 101.27532194257512), ('c2_median_intensity', 120.70768400991767), ('c1_sd_intensity', 9.976570066976405), ('c2_sd_intensity', 9.138834592954336), ('c1_mean_dwell', 99.53689876631836), ('c2_mean_dwell', 121.33193242927064), ('c1_median_dwell', 99.33218082115548), ('c2_median_dwell', 121.12071127421946), ('c1_sd_dwell', 10.25957982066471), ('c2_sd_dwell', 9.296922575493252)]), 'GMM_model': {'cluster_counts': 'KD1:95/5__WT1:6/94__WT2:7/93__KD2:93/7', 'model': GaussianMixture(covariance_type='full', init_params='kmeans', max_iter=1000,
        means_init=None, n_components=2, n_init=1, precisions_init=None,                                                                                                                                                                                                                    
        random_state=<mtrand.RandomState object at 0x7f47adf82438>,                                                                                                                                                                                                                         
        reg_covar=1e-06, tol=0.001, verbose=0, verbose_interval=10,                                                                                                                                                                                                                         
        warm_start=False, weights_init=None)}, 'GMM_anova_pvalue_context_2': 4.706324386384477e-14, 'GMM_anova_pvalue': 0.0010906844025473576}                                                                                                                                              
PASSED                                                                                                                                                                                                                                                                                      
test_TxComp.py::test_txComp_GMM_logit [1.2742453287653416e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 6.01195032712085e-40, nan, 7.06503867181238e-40, 1.8392720921153275e-40, 9.162002495725215e-32, 5.92288489163853e-34, 3.1972432623453856e-40]                                
[1.274245328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, 1.839272092115274e-40, 9.162002495725215e-32, 5.922884891638699e-34, 3.1972432623454785e-40]                                                                      
['KD1:94/6__WT1:7/93__WT2:9/91__KD2:95/5', 'KD1:92/8__WT1:12/88__WT2:7/93__KD2:93/7', 'KD1:82/18__WT1:5/95__WT2:10/90__KD2:85/15', 'KD1:95/5__WT1:6/94__WT2:7/93__KD2:93/7', 'NC', 'KD1:9/91__WT1:96/4__WT2:92/8__KD2:8/92', 'KD1:8/92__WT1:95/5__WT2:90/10__KD2:8/92', 'KD1:15/85__WT1:96/4__WT2:98/2__KD2:14/86', 'KD1:3/97__WT1:86/14__WT2:84/16__KD2:5/95', 'KD1:7/93__WT1:94/6__WT2:93/7__KD2:7/93']
PASSED                                                                                                                                                                                                                                                                                      
test_TxComp.py::test_txComp_GMM_anova_0_var PASSED                                                                                                                                                                                                                                          
test_TxComp.py::test_txComp_GMM_dup_lab PASSED                                                                                                                                                                                                                                              
test_TxComp.py::test_txComp_lowCov PASSED                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                                            
================================================================================================================================ 16 passed in 1.13 seconds ================================================================================================================================$

Failing

root@3f23411890db:/nanocompore/tests# pytest test_TxComp.py -vs                                                                                                                                                                                                                             
=================================================================================================================================== test session starts ====================================================================================================================================
platform linux -- Python 3.5.7, pytest-4.4.0, py-1.8.0, pluggy-0.9.0 -- /usr/local/bin/python                                                                                                                                                                                               
cachedir: .pytest_cache                                                                                                                                                                                                                                                                     
rootdir: /nanocompore                                                                                                                                                                                                                                                                       
collected 16 items                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                            
test_TxComp.py::test_combine_pvalues_hou[pvalues0] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_hou[pvalues1] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_hou[pvalues2] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_hou[pvalues3] PASSED                                                                                                                                                                                                                                   
test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues0] PASSED                                                                                                                                                                                                 
test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues1] PASSED                                                                                                                                                                                                 
test_TxComp.py::test_combine_pvalues_raises_exception_with_invalid_pvalues[pvalues2] PASSED                                                                                                                                                                                                 
test_TxComp.py::test_nonparametric_test[v10-v20-expected0] PASSED                                                                                                                                                                                                                           
test_TxComp.py::test_sum_of_squares[x0-1243] PASSED                                                                                                                                                                                                                                         
test_TxComp.py::test_sum_of_squares[2-4] PASSED                                                                                                                                                                                                                                             
test_TxComp.py::test_sum_of_squares[x2-201] PASSED                                                                                                                                                                                                                                          
test_TxComp.py::test_txComp_GMM_anova {'GMM_anova_pvalue': 0.0017335646468135102, 'GMM_anova_model': {'log_ratios': array([ 2.77258872, -2.46385324,  2.46385324, -2.77258872]), 'table': F_onewayResult(statistic=575.3465305297385, pvalue=0.0017335646468135102), 'pvalue': 0.0017335646468135102, 'delta_logit': -5.236441963}, 'GMM_model': {'cluster_counts': 'KD1:95/5__WT2:7/93__KD2:93/7__WT1:5/95', 'model': GaussianMixture(covariance_type='full', init_params='kmeans', max_iter=1000,
        means_init=None, n_components=2, n_init=1, precisions_init=None,                                                                                                                                                                                                                    
        random_state=<mtrand.RandomState object at 0x7fd59152e708>,                                                                                                                                                                                                                         
        reg_covar=1e-06, tol=0.001, verbose=0, verbose_interval=10,                                                                                                                                                                                                                         
        warm_start=False, weights_init=None)}, 'GMM_anova_pvalue_context_2': 1.2760955364845727e-13, 'shift_stats': OrderedDict([('c1_mean_intensity', 101.25371018948026), ('c2_mean_intensity', 120.16905634331079), ('c1_median_intensity', 101.27532194257512), ('c2_median_intensity', 120.70768400991767), ('c1_sd_intensity', 9.976570066976405), ('c2_sd_intensity', 9.138834592954336), ('c1_mean_dwell', 99.53689876631836), ('c2_mean_dwell', 121.33193242927064), ('c1_median_dwell', 99.33218082115548), ('c2_median_dwell', 121.12071127421946), ('c1_sd_dwell', 10.25957982066471), ('c2_sd_dwell', 9.296922575493252)])}
FAILED
test_TxComp.py::test_txComp_GMM_logit [1.2742453287653416e-39, 3.396865393821225e-40, 1.9321679678623975e-36, 8.482777798354296e-40, nan, 7.0650386718125795e-40, 1.8392720921153275e-40, 4.6826664356268694e-32, 5.922884891638699e-34, 3.197243262345706e-40]
[1.274245328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, 1.839272092115274e-40, 9.162002495725215e-32, 5.922884891638699e-34, 3.1972432623454785e-40]
['KD1:94/6__WT2:9/91__KD2:95/5__WT1:7/93', 'KD1:92/8__WT2:7/93__KD2:93/7__WT1:12/88', 'KD1:82/18__WT2:10/90__KD2:85/15__WT1:5/95', 'KD1:95/5__WT2:7/93__KD2:93/7__WT1:5/95', 'NC', 'KD1:9/91__WT2:92/8__KD2:8/92__WT1:96/4', 'KD1:8/92__WT2:90/10__KD2:8/92__WT1:95/5', 'KD1:14/86__WT2:98/2__KD2:14/86__WT1:96/4', 'KD1:3/97__WT2:84/16__KD2:5/95__WT1:86/14', 'KD1:7/93__WT2:93/7__KD2:7/93__WT1:94/6']
PASSED
test_TxComp.py::test_txComp_GMM_anova_0_var PASSED
test_TxComp.py::test_txComp_GMM_dup_lab PASSED
test_TxComp.py::test_txComp_lowCov PASSED

========================================================================================================================================= FAILURES =========================================================================================================================================
__________________________________________________________________________________________________________________________________ test_txComp_GMM_anova ___________________________________________________________________________________________________________________________________

test_ref_pos_list = ([{'data': {'KD': {'KD1': {'coverage': 100, 'dwell': array([121.70757477, 129.70100881, 111.67411923, 131.1979314 ,
  ...328765287e-39, 3.3968653938213694e-40, 1.9321679678623975e-36, 8.482777798353687e-40, nan, 7.06503867181238e-40, ...]})

    def test_txComp_GMM_anova(test_ref_pos_list):
        ml = mock.Mock()
        if sys.version_info < (3, 6):
            tol = 0.0002
        else:
            tol=0.00000001
        res = txCompare(test_ref_pos_list[0], methods=['GMM'], logit=False, sequence_context=2, min_coverage=3, logger=ml, allow_warnings=False, random_state=np.random.RandomState(seed=42))
        GMM_pvalues = [pos['txComp']['GMM_anova_pvalue'] for pos in res ]
        print(res[3]['txComp'])
>       assert GMM_pvalues == [pytest.approx(i, abs=tol, nan_ok=True) for i in test_ref_pos_list[1]['GMM_anova']]
E       AssertionError: assert [0.0008574768...76768562, ...] == [0.00085747684... 2.0e-04, ...]
E         At index 3 diff: 0.0017335646468135102 != 0.0010906844025473576 ± 2.0e-04
E         Full diff:
E         - [0.0008574768473501677,
E         + [0.0008574768473501677 ± 2.0e-04,
E         ?                       ++++++++++
E         -  0.0036329291397528157,
E         +  0.0036329291397528157 ± 2.0e-04,...
E         
E         ...Full output truncated (24 lines hidden), use '-vv' to show

test_TxComp.py:105: AssertionError
=========================================================================================================================== 1 failed, 15 passed in 1.34 seconds ============================================================================================================================
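
The two `cluster_counts` strings for position 3 are hard to compare by eye because the samples appear in a different order in each run. The sketch below parses both strings (copied from the passing and failing output above) and reports only the samples whose counts differ. `parse_cluster_counts` is a hypothetical helper written for this illustration and is not part of nanocompore; it handles only the `sample:a/b` form, not the `'NC'` placeholder.

```python
def parse_cluster_counts(counts):
    """Turn 'KD1:95/5__WT1:6/94' into {'KD1': (95, 5), 'WT1': (6, 94)}."""
    parsed = {}
    for entry in counts.split("__"):
        sample, pair = entry.split(":")
        first, second = pair.split("/")
        parsed[sample] = (int(first), int(second))
    return parsed


# cluster_counts for position 3, copied from the two runs above.
passing = parse_cluster_counts("KD1:95/5__WT1:6/94__WT2:7/93__KD2:93/7")
failing = parse_cluster_counts("KD1:95/5__WT2:7/93__KD2:93/7__WT1:5/95")

# Compare per sample so the differing sample order does not matter.
differences = {
    sample: (passing[sample], failing[sample])
    for sample in sorted(passing)
    if passing[sample] != failing[sample]
}
print(differences)
```

Only `WT1` differs between the two runs (6/94 versus 5/95); the other three samples have identical counts.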

@tleonardi
Owner Author

As a temporary workaround I'm relaxing the tolerance (`tol`) for Python 3.5.
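
For reference, the version-dependent tolerance already visible in the test above can be sketched as a small helper. `pvalue_tolerance` is a hypothetical name used only for this illustration, and the two values are the ones shown in the traceback (0.0002 for Python < 3.6, 0.00000001 otherwise); the adjusted value used in the workaround commit does not appear in this thread and is not reproduced here.

```python
import sys


def pvalue_tolerance(version_info=sys.version_info):
    """Return the absolute p-value tolerance for an interpreter version.

    Hypothetical helper: the values mirror the ones shown in the test.
    """
    if tuple(version_info[:2]) < (3, 6):
        return 0.0002       # looser tolerance for Python 3.5
    return 0.00000001       # strict tolerance for Python 3.6+


print(pvalue_tolerance((3, 5, 7)))
print(pvalue_tolerance((3, 6, 0)))
```

Keeping the branch in one place means a further adjustment for Python 3.5 touches a single value.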

tleonardi added a commit that referenced this issue Apr 10, 2019
tleonardi added a commit that referenced this issue Apr 10, 2019
* Added empty changelog

* Small doc changes

* Typo and reformat doc

* Fixed issue #68

* Added travis file

* Added branches safelist

* Removed python 3.2

* Added devel branch to travis safelist

* Updated pytest version

* Replaced tmpdir_factory with tmp_path_factory

* Fixed python version compatibility

* Fixed syntax error

* Casting pathlib objects to string for compatibility with python3.5

* Fixed python 3.5 compatibility error

* Fixed issue #68

* Removed trailing whitespace

* Added travis badge

* Skip many integration tests if running inside Travis

* Added conditional workflow for 3.5 version

* Lowered p-value tolerance for python 3.5

* Fixed typos in checking python version

* nbconvert is now installed before mknotebooks

* Further tolerance adjustment for python3.5

* Updated Changelog

* Clean up of travis yml file

* Added Slack notifications and automatic gh-pages deployment (#78)

* Updated release data and bumped version number

* Add Changelog to doc (#81)

* Workaround for issue #82
@a-slide
Collaborator

a-slide commented Apr 10, 2019

For now, it is advisable to use Python 3.6+ instead.
