Add dual averaging algorithm #35
Conversation
Force-pushed from 68a6ad8 to 00d5f63
@brandonwillard As you can see in …, this also happens if I set …
Force-pushed from 00d5f63 to fa68fa5
Codecov Report

```
@@            Coverage Diff            @@
##              main       #35   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            9        10    +1
  Lines          345       361   +16
  Branches        14        14
=========================================
+ Hits           345       361   +16
```

Continue to review full report at Codecov.
We can definitely address this; however, I first need to understand the entire context of the upcasting. In general, there are some casting configuration options (e.g. …).
tests/test_algorithms.py (Outdated)

```python
gradient = aesara.grad(value, x)
return update(gradient, step, x, x_avg, gradient_avg)

x_init = at.as_tensor(0.0, dtype="float64")
```
Does this fail if you use aesara.config.floatX?
I'm guessing that the (CI and local) tests are running with aesara.config.floatX set to "float64", so that's why an explicit "float64" is fine, but setting this to aesara.config.floatX and changing that config option to "float32" (before loading Aesara) should not cause a problem.
If that's the upcasting issue you described, then, yes, that sounds like an Aesara issue. Just don't forget to set that option before loading Aesara; otherwise, it won't be active and you will get confusing casting issues. In other words, that option can't be properly changed after loading Aesara, because sometimes that value is used during class/type/object creation.
If I set aesara.config.floatX = "float32" right after the imports I still get an error if I don't specify x_init's type or set it to float32.
But I’m not sure what pytest does with the code so I'll try in a stand-alone script.
Try it with a conftest.py setup like Aesara's, but with floatX set, of course, or try the same with the AESARA_FLAGS env variable. That will make sure the setting is available before the imports.
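For reference, a minimal sketch of what such a conftest.py could look like (assuming pytest imports it before any test module, and that nothing imports Aesara earlier):

```python
# conftest.py -- sketch only: export the flag before any test module imports
# Aesara, so that floatX is already "float32" when Aesara's types are created.
import os

os.environ["AESARA_FLAGS"] = "floatX=float32"
```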
I added the conftest.py file with floatX set to float32; I checked that print(aesara.config.floatX) in the test_dual_averaging function returns float32. However, the test still fails with the following error:

```
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (`outputs_info` in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 1) has dtype float32, while the result of the inner function (`fn`) has dtype float64. This can happen if the inner function of scan results in an upcast or downcast.
```
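For illustration only (not the graph built by the test), the mismatch this error describes can arise when the initial state passed through outputs_info is float32 while the inner function's result is upcast to float64:

```python
import aesara
import aesara.tensor as at

# Sketch: float32 initial state, but the inner function returns float64.
# Constructing this scan should fail with the same kind of ValueError as above.
x0 = at.as_tensor(0.0, dtype="float32")

outputs, updates = aesara.scan(
    fn=lambda x: x + at.as_tensor(1.0, dtype="float64"),  # result upcast to float64
    outputs_info=[x0],
    n_steps=5,
)
```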
I found this related question with a simpler example on SO.
I just provided an answer to that question.
Force-pushed from 4a866f6 to 3ac9e1b
The type error has nothing to do with … The following code

```python
import aesara.tensor as at

t0 = 10
step = at.as_tensor(0, dtype="int32")
eta = 1.0 / (step + t0)
print(eta.dtype)
```

will return "float64". On the other hand, the following code

```python
t0 = 10
step = at.as_tensor(0, dtype="int32")
eta = (1.0 / (step + t0)).astype("floatX")
print(eta.dtype)
```

will return whatever floatX is set to. Another thing that I find very confusing is that

```python
x_init = at.as_tensor(1.0)
print(x_init.dtype)
```

will return "float64" even though floatX is set to "float32".
I ended up adding …
Force-pushed from eb475b6 to 7e1635e
Force-pushed from fa43113 to ce59d26
In this instance, NumPy has the same behavior:

```python
import numpy as np

np.dtype(np.array(1.0, dtype=np.float32) / np.array(10, dtype=np.int32))
# dtype('float64')
```

I'm not seeing that locally:

```python
import os

os.environ["AESARA_FLAGS"] = "floatX=float32"

import aesara
import aesara.tensor as at

assert aesara.config.floatX == "float32"

x_init = at.as_tensor(1.0)
x_init.dtype
# 'float32'
```
In general, we need to go along with the configured Aesara upcasting/promotion rules, and this often necessitates the use of dtypes from user-provided graphs. I'll try running this again locally to see where that could be done.
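As a purely hypothetical illustration (not this PR's API), reusing the dtype of the user-provided initial value instead of hard-coding one could look like this:

```python
import aesara.tensor as at

def step_size_schedule(step, x_init, t0=10):
    # Hypothetical helper: `x_init` comes from the user's graph, so casting the
    # schedule to its dtype avoids up- or downcasting the state carried by scan.
    eta = 1.0 / (step + t0)
    return eta.astype(x_init.dtype)
```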
Ok, that makes a lot more sense. I'll try to see how we could use the dtypes from the graphs the user provides. As for the previous example, I get the same thing as you do on my machine. The following is confusing though:

```python
import os

os.environ["AESARA_FLAGS"] = "floatX=float64"

import aesara
import aesara.tensor as at

assert aesara.config.floatX == "float64"

x_init = at.as_tensor(1.0)
print(x_init.dtype)
# 'float32'
```
Adds the dual averaging algorithm that is commonly used to adapt the step size of HMC algorithms. Closes #34.
This one is completely independent from the rest of the code so I'll be working on it while we solve the current issues with HMC and NUTS.
Reference: http://webdoc.sub.gwdg.de/ebook/serien/e/CORE/dp2005_67.pdf
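For context, a rough NumPy sketch of the dual averaging update (following Nesterov's scheme as used for HMC step-size adaptation by Hoffman & Gelman, 2014); the argument order mirrors the update(gradient, step, x, x_avg, gradient_avg) call in the tests, but the constants mu, gamma, t0 and kappa are illustrative and not necessarily those used in this PR:

```python
import numpy as np

def update(gradient, step, x, x_avg, gradient_avg,
           mu=0.5, gamma=0.05, t0=10, kappa=0.75):
    """One dual averaging step (sketch, not the PR's implementation)."""
    t = step + 1
    # Running average of the noisy "gradients" (for HMC adaptation, typically
    # the difference between the target and observed acceptance probabilities).
    gradient_avg = (1 - 1 / (t + t0)) * gradient_avg + gradient / (t + t0)
    # Primal iterate: shrink towards mu in proportion to the averaged gradient.
    x = mu - (np.sqrt(t) / gamma) * gradient_avg
    # Polynomially weighted average of the iterates, which is the value kept
    # at the end of the adaptation window.
    weight = t ** (-kappa)
    x_avg = weight * x + (1 - weight) * x_avg
    return x, x_avg, gradient_avg
```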