Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

python-swifter build fails on 1-core VM #102

Closed
bmwiedemann opened this issue Mar 23, 2020 · 5 comments
Closed

python-swifter build fails on 1-core VM #102

bmwiedemann opened this issue Mar 23, 2020 · 5 comments

Comments

@bmwiedemann
Copy link

Originally filed at https://bugzilla.opensuse.org/show_bug.cgi?id=1158578

While working on reproducible builds for openSUSE, I found that
our python-swifter-0.301 package consistently fails some test when building in a 1-core VM

actual results:

[snip]
 ======================================================================
 FAIL: test_nonvectorized_text_apply_on_large_dataframe (swifter.swifter_tests.TestSwifter)
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "/home/abuild/rpmbuild/BUILD/swifter-0.296/swifter/swifter_tests.py", line 219, in test_nonvectorized_text_apply_on_large_dataframe
     self.assertLess(swifter_time, pd_time)
 AssertionError: 90.86491084098816 not less than 80.60593891143799

 ======================================================================
 FAIL: test_vectorized_math_apply_on_large_rolling_dataframe (swifter.swifter_tests.TestSwifter)
 ---------------------------------------------------------------------- 
 Traceback (most recent call last):
   File "/home/abuild/rpmbuild/BUILD/swifter-0.296/swifter/swifter_tests.py", line 237, in test_vectorized_math_apply_on_large_rolling_dataframe    
     self.assertLess(swifter_time, pd_time)
 AssertionError: 8.690391063690186 not less than 8.230899572372437
 
 ----------------------------------------------------------------------
 Ran 13 tests in 1189.797s
 
 FAILED (failures=5)
@jmcarpenter2
Copy link
Owner

Hi @bmwiedemann ,

Thanks for notifying me, this totally makes sense.

In building swifter, I set some high standards for tests to pass. One of those standards requires that swifter is at least as fast as a regular pandas apply. However, this standard simply won't be met on a machine with 1-core because then the parallelism can't be leveraged.

If there's anyway to leverage even 2-cores for your test machine, then these should pass.

Thanks,
Jason

@bmwiedemann
Copy link
Author

Couldnt you detect the number of CPUs and either skip this kind of test on 1-core machines or lower your requirements? E.g. expect it to be at least half als fast as pandas there.

Our build VMs can have variations in performance depending on other builds that happen in other VMs on the same host.

@jmcarpenter2
Copy link
Owner

jmcarpenter2 commented Mar 28, 2020

Hey @bmwiedemann,

Great idea. I'm currently working on PR #104 to make this change. I will follow-up here when the PR is merged.

Thanks for the suggestion,
Jason

@jmcarpenter2
Copy link
Owner

Follow-up: PR #104 is merged.

@bmwiedemann , does openSUSE clone the repo to run the tests or does it require a new release of the package to run them? I want to make sure that your problem is resolved before closing this issue.

Thanks,
Jason

@bmwiedemann
Copy link
Author

bmwiedemann commented Mar 29, 2020

For openSUSE we usually pull updates in manually and usually use release tarballs.
Thus, I manually tested the PR patch ontop of our existing package and tests now pass on 1-core as well.

As a coder my main concern would be the large amount of code duplication introduced. But then it is your project and long-term maintenance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants