Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multiple bin for float binsize bug. #48

Merged
merged 6 commits into from
Nov 18, 2020
Merged

Conversation

CWE0
Copy link
Contributor

@CWE0 CWE0 commented Nov 16, 2020

Assures that each float rng is computed only once and no comparison between two float rng is required.
In addition unused variables i,x are removed.

Addresses issue #46.

Remove unused variables i, x
Fix duplicate bin bug for float binsize due to round-off
Bugfix: n is float
@iamlikeme
Copy link
Owner

The fix introduces a regression - test test_count_cycles_binsize is now failing (see output from Travis or run the tests locally).

Also, in order to merge this PR it must provide a new test case which demonstrates the bug you are fixing, i.e. the test should fail without your fix and should pass after the fix. The example in #46 is a good starting point for the new test.

Bugfix zero range for option binsize.
Unique treatent of options binsize and nbins.
Additional test case for float binsize bug.
@CWE0
Copy link
Contributor Author

CWE0 commented Nov 17, 2020

My last commit solves the regression and adds the required test case.

@iamlikeme
Copy link
Owner

I have slightly refactored/simplified the code.
I have also refactored unit tests so that the test case you contributed can be used in more tests.

@CWE0
Copy link
Contributor Author

CWE0 commented Nov 18, 2020 via email

@iamlikeme
Copy link
Owner

That is true, but we are talking about extremely cheap operations. Like, nanoseconds cheap. Take setdefault for example:

>>> d = {}
>>> %timeit d.setdefault(5, 0.0)
# 81.6 ns ± 0.362 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

So the performance hit of doing some extra iterations in lines 158-159 is negligible. And it is probably comparable to the cost of doing a set difference (line 157 before the refactor). But by swapping the ibin set to nmax we get simpler and more readable code which accomplishes the exact same thing.

@gsokoll
Copy link
Contributor

gsokoll commented Nov 18, 2020

There are many micro-optimisations you could do, eg drop the nmax comparison in line 156 (which is done for every point in the entire cycles list) and replace it with a search for max n in counts afterwards.

The benefit is still likely to be small, but since I have used the rainflow routine on many GB of data, all improvements are nice...

@iamlikeme iamlikeme merged commit c5318df into iamlikeme:master Nov 18, 2020
@iamlikeme
Copy link
Owner

Thank you for your contribution @CWE0

@iamlikeme
Copy link
Owner

@gsokoll, feel free to submit a pull request with performance optimizations.

@iamlikeme
Copy link
Owner

This PR is included in v3.0.1 on PyPI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants