Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cost zero division #20

Merged
merged 13 commits into from
May 18, 2020
Merged

Conversation

accurrently
Copy link
Contributor

@accurrently accurrently commented Nov 2, 2019

Fix for bug #19.

Updated cost() to include a kwarg called zvalue which defaults to 0.5. Changes should have little effect on existing calls.

It effectively prevents zero values for nbytes from creating a ZeroDivisionError by providing a nonzero value less than 1. The function still raises ZeroDivisionError if zvalue==0.

New return value:

time / (nbytes if nbytes else zvalue) / 1e9

Copy link
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @accurrently, I appreciate you fixing this! I wonder if instead of adding an additional zvalue keyword, could we just use 1 if nbytes is 0. I.e. something along the lines of your original comment (#19 (comment)):

def cost(nbytes, time):
    return float(time) / max(nbytes, 1) / 1e9

What do you think about this approach?

@accurrently
Copy link
Contributor Author

We can. Though bytes should never be negative, part of used the if/else statement to protect against a zero value instead of having to worry about negative values.

The reason I chose .5 is to differentiate between an object of 1 byte and a value that clearly indicates a zero/null. If the difference in outcome isn't significant, 1 should be fine.

If nbytes can be guaranteed to be >=0, then using max() should work just fine, in my opinion.

And frankly, max() is much nicer to read than the if/else. If the value of one works, that should be fine as well.

Agree on elimination of the kwarg.

cachey/cache.py Outdated Show resolved Hide resolved
cachey/cache.py Outdated Show resolved Hide resolved
@@ -1,8 +1,17 @@
import sys
from time import sleep

import pytest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we don't need this import.

def test_cost():

assert cost(200, 500) == 2.5e-09
assert cost(0, 500) == 5e-07
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not assert that they have to be these values exactly. I think that it would be fine if things just worked, and that the cost was relatively low, like maybe

Suggested change
assert cost(0, 500) == 5e-07
assert 0 <= cost(0, 500) <= 100

from cachey import Cache, Scorer, nbytes

from cachey.cache import cost

def test_cost():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_cost():
def test_cost_with_zero_bytes():

@martindurant
Copy link
Member

@accurrently , were you planning to finish this off?

@martindurant
Copy link
Member

Thanks for the original work here, @accurrently

@martindurant martindurant merged commit d338a53 into dask:master May 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants