Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

forbid float weights for int storage for #289 #346

Closed
wants to merge 1 commit into from
Closed

forbid float weights for int storage for #289 #346

wants to merge 1 commit into from

Conversation

LovelyBuggies
Copy link
Collaborator

No description provided.

@LovelyBuggies
Copy link
Collaborator Author

LovelyBuggies commented Mar 28, 2020

@henryiii I have fixed the bugs in #289 . I saw your work #290 and found you were focusing on the pybind11 part. However, I focused on the Python part -- hist.py. Basically, I found the self._hist._storage_type cannot match Python data type like int, so I have to add the PyType = int() to allow them to communicate. I think I can better this work with your instruction, e.g. how to make the two types <class 'boost_histogram._core.storage.int64'> == <class 'int'> equal.

Now, the target has been achieved:

import boost_histogram as bh
import numpy as np

h = bh.Histogram(bh.axis.Regular(10, 0, 1), storage=bh.storage.Int64())
h.fill([.2, .3], weight=[1, 2])   # this will work
h.fill([.2, .3], weight=[1., 2])  # this will not work -- TypeError: "Weight type must match storage type."
h.fill([.2, .3], weight=[1, 2.])  # this will not work -- TypeError: "Weight type must match storage type."
h.fill([.2, .3], weight=[1., 2.]) # this will not work -- TypeError: "Weight type must match storage type."

w, data = h.to_numpy()

This works in my notebooks.

@LovelyBuggies
Copy link
Collaborator Author

By the way, CI failed again. I wonder why this always happens. 😢

Copy link
Member

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should fix the underlying problem (as done in the old PR), rather than trying to add a check here. However, maybe a check like you have here might help that PR.

@@ -98,11 +98,11 @@ def __init__(self, *axes, **kwargs):

# Keyword only trick (change when Python2 is dropped)
with KWArgs(kwargs) as k:
storage = k.optional("storage", Double())
self._storage = k.optional("storage", Double())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can access the storage type from the C++ object, which is better than trying to hang on to it in Python.

# should not allow float weight for int storage
if "weight" in kwargs:
for w in kwargs["weight"]:
if not isinstance(self._storage._PyType, type(w)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an incorrect check; weight could be a single value or an array of values, this is only checking for one int rather than an array (I think).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, it does work on arrays of ints? Something like this might fix the other PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might misunderstand. As you can see, for w in kwargs["weight"]: will iterate every value in weight list. And you can see the demo above in my comments. Every possible broken case is considered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay. I see the loop. Does this work when this is not an array? And this will be horribly slow, since it's a Python loop.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You get the point! Only weight=[1] is supported now. weight=1 is not supported. But it would not be a hard job if you really want!

And this will be horribly slow, since it's a Python loop

Yep, I have seen the importance to deep into the underlying C++ parts.

@@ -17,6 +17,8 @@ def __repr__(self):
@set_family(MAIN_FAMILY)
@set_module("boost_histogram.storage")
class Int64(store.int64, Storage):
def __init__(self):
self._PyType = int()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is causing the segfault in the CI, as Int64 is not properly initializing store.int64 (but could be wrong). The fact the tests are segfaulting is slightly bothering me.

@henryiii
Copy link
Member

This PR doesn't fix the Unlimited storage, since internally they still are always converted to floats, while the the other PR does fix that.

@LovelyBuggies
Copy link
Collaborator Author

LovelyBuggies commented Mar 28, 2020

This PR doesn't fix the Unlimited storage, since internally they still are always converted to floats, while the the other PR does fix that.

@henryiii Do you expect a build-in exception rather than raiseError? As I said, it covers the demanded functionality, i.e. to forbid float weights for Int64 storage, but might not in a good way. (The C-Py interface.hpp is so hard to read so I had to use this trick -- but I will try to overcome it if you really expect me to do so.)

I agree that we should modify the underlying parts. The bad thing is, in storage.hpp, a histogram even stores data in double naturally, that's why I think your modification, using C++ templates, is a good direction.

So, do you have any plans? Close this PR and stick to your draft PR, or make some enhancements to this PR's work (e.g. use some tricks to remove the loop... maybe possible)? I think do the former will be a more difficult process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants