-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure posterior predictive samples have the proper type #3683
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3683 +/- ##
==========================================
+ Coverage 89.82% 89.91% +0.09%
==========================================
Files 134 134
Lines 20133 20194 +61
==========================================
+ Hits 18084 18157 +73
+ Misses 2049 2037 -12
|
This LGTM. I can add a test sometime this week if you'd like. |
pymc3/sampling.py
Outdated
else: | ||
value_shape = () | ||
|
||
# initialize if necessary | ||
if k not in self.trace_dict: | ||
array_shape = (self._len,) + value_shape | ||
self.trace_dict[k] = np.full(array_shape, np.nan) | ||
self.trace_dict[k] = np.zeros(array_shape, dtype=np.array(v).dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we keep nan
as the default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't the arrays in trace_dict
supposed to be completely filled with the posterior predictive samples? Then the filler value would have no relevance at all. In this case, I think the situation would be np.empty
ideal usecase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is failing because nan
are floats so if trace_dict
is filled with integers they will be upcasted to floats.
The default value for empty
is also float so we should specify the dtype as with zeros
. It seems to me that using empty
of zeros
is basically the same, unless using empty
is preferred because it conveys the idea that the initial values are irrelevant and they will be completetelly replaced. @OriolAbril Is that the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes to everything, the dtype must be specified and empty
only allocates the memory that will be occupied by the array so the user can fill the array afterwards without problems (the values of the array may or may not actually be zeros, it depends on what was stored there before).
Moreover, just for the record because this array allocation is far from being a bottleneck, empty
also leaves the actual memory untouched, thus empty
is much faster than zeros
or full
.
this is the actual line I changed https://github.com/pymc-devs/pymc3/pull/3683/files#diff-7eb6c4a83cfe45b9fc0eac76b57e2175R1171, the rest is black. I can revert black if you want.