Skip to content

VR-3974: don't upload artifact in log_training_data()#576

Merged
convoliution merged 28 commits into
masterfrom
client/ml/histogram
Apr 29, 2020
Merged

VR-3974: don't upload artifact in log_training_data()#576
convoliution merged 28 commits into
masterfrom
client/ml/histogram

Conversation

@convoliution
Copy link
Copy Markdown
Contributor

@convoliution convoliution commented Apr 27, 2020

No description provided.

@convoliution convoliution marked this pull request as ready for review April 28, 2020 01:07
Comment thread client/verta/verta/_internal_utils/_histogram_utils.py
Comment thread client/verta/verta/_internal_utils/_histogram_utils.py
Comment thread client/verta/verta/_internal_utils/_histogram_utils.py Outdated
Comment thread client/verta/verta/_internal_utils/_histogram_utils.py
Comment thread client/verta/tests/test_deployment.py Outdated
'count': [5, 10, 15]}}, 'type': 'discrete'},
'continuous col': {'histogram': {'float': {
'bucket_limits': [0, 2.9, 5.8, 8.7, 11.6, 14.5, 17.4, 20.3, 23.2, 26.099999999999998, 29],
'count': [3, 3, 3, 3, 3, 3, 3, 3, 3, 2]}}, 'type': 'float'},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 is really telling me that this code is wrong, since there were 30 values sent in. Don't just write tests to pass the code you wrote. Write tests to implement the functionality you want. If the code doesn't pass the tests, then it has a bug.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh I hadn't even noticed the 2. That's super careless of me. I've figured out better tests, now.

Comment thread client/verta/tests/test_deployment.py Outdated
Comment thread client/verta/tests/test_deployment.py
Comment thread client/verta/tests/test_deployment.py Outdated
Comment on lines +626 to +634
# data within buckets
assert all(buckets[0] <= series)
assert all(series <= buckets[-1])

# counts correct
bin_windows = list(zip(buckets[:-1], buckets[1:]))
for i, (l, r) in enumerate(bin_windows[:-1]):
assert sum((l <= series) & (series < r)) == counts[i]
assert sum(buckets[-2] <= series) == counts[-1]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will pass if buckets[-1] is always 1e9 as long as all data is smaller than that. Same for buckets[0]. Let's assert we found the right boundaries.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added checks for the leftmost and rightmost boundaries.

Would it make sense to check the other buckets as well? I would just be copy&pasting the logic from the util, which I think defeats the purpose of testing functionality.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to check the other ones. Just the cases that are easy to get wrong.

On a side note, you can always verify the intent of the functionality instead of copying the code. For example in this case, we could check 1) are the bucket sizes about the same? 2) do we get the number of buckets we expected? These two are the properties we want. The code in utils is just a way to get to it.

Comment thread client/verta/tests/test_deployment.py
Copy link
Copy Markdown
Contributor

@conradoverta conradoverta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clicked the wrong button.

@convoliution convoliution merged commit a777bba into master Apr 29, 2020
@convoliution convoliution deleted the client/ml/histogram branch April 29, 2020 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants