Hamiltonian Monte Carlo with Dual Averaging #728

Open

emilemathieu wants to merge 9 commits into master

Conversation

emilemathieu
Contributor

Hello to all!

This PR partially solves issue #541, which is about implementing NUTS. As proposed in that thread, starting with dual averaging is a good first step.

The implemented inference method is in edward/inferences/hmcda.py and the corresponding test in tests/test-inferences/test_hmcda.py.

I hope you'll appreciate the PR! :)

@emilemathieu mentioned this pull request Aug 3, 2017
@emilemathieu
Contributor Author

@dustinvtran all checks have passed, could this PR be merged?

Member

@dustinvtran left a comment

My sincere apologies for the delay. I finally had some time to re-read the NUTS paper and go through your implementation. Great job! (especially on the dynamic leapfrog implementation)

I only have minor suggestions below.

step_size = self.find_good_eps()
sess = get_session()
init_op = tf.global_variables_initializer()
sess.run(init_op)
Member

The initialize op shouldn't be needed inside inference.initialize(). It's called within the run() method, or alternatively it must be called manually after you call inference.initialize() on the algorithm.
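Roughly, the usual pattern looks like this (an illustrative sketch, not this PR's code; `inference` stands for an already-constructed inference object such as the HMCDA one here):

    import tensorflow as tf
    import edward as ed

    # initialize() only builds the graph; no variables are run here.
    inference.initialize()

    # Variables are initialized afterwards, either implicitly inside
    # inference.run() or explicitly like this:
    sess = ed.get_session()
    sess.run(tf.global_variables_initializer())

    for _ in range(inference.n_iter):
      info_dict = inference.update()
      inference.print_progress(info_dict)

    inference.finalize()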

Contributor Author

Without this, I get: Attempting to use uninitialized value Variable.
I can't make it work without it :/

The updates assume each Empirical random variable is directly
parameterized by ``tf.Variable``s.
"""

Member

Remove the empty line.

"""Simulate Hamiltonian dynamics using a numerical integrator.
Correct for the integrator's discretization error using an
acceptance ratio. The initial value of espilon is heuristically chosen
with Algorithm 4
Member

epsilon, Algorithm 4.

"""
self.scope_iter = 0 # a convenient counter for log joint calculations

# Find intial epsilon
Member

initial

Parameters
----------
n_adapt : float
Number of samples with adaption for epsilon
Member

adaptation

# Accept or reject sample.
u = Uniform().sample()
alpha = tf.minimum(1.0, tf.exp(ratio))
accept = u < alpha
Member

Comparing with tf.log(u) < ratio should be more numerically stable than checking on the PDF scale.
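For example, a drop-in sketch of that check (illustrative only; `ratio` is the log acceptance ratio from the surrounding code):

    import tensorflow as tf
    from edward.models import Uniform

    # Accept or reject sample, comparing on the log scale; this avoids
    # exponentiating a possibly large-magnitude log acceptance ratio.
    u = Uniform().sample()
    accept = tf.log(u) < ratio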

assign_ops.append(self.n_accept.assign_add(tf.where(accept, 1, 0)))
return tf.group(*assign_ops)

def do_not_adapt_step_size(self, alpha):
Member

For methods intended for internal implementation, prepend the name with _.

def do_not_adapt_step_size(self, alpha):
# Do not adapt step size but assign last running averaged epsilon to epsilon
assign_ops = []
assign_ops.append(self.H_B.assign_add(0.0).op)
Member

Is there a reason you add 0 to these variables? What happens if we don't use assign ops for them?

Contributor Author

The tf.cond arguments true_fn and false_fn must have the same type of outputs, which is a list of ops in our case.
We could also do assign_ops.append(tf.assign(self.H_B, self.H_B).op).
I would be happy to hear any better idea.
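A minimal standalone sketch of that tf.cond constraint (toy variables, not the PR's code):

    import tensorflow as tf

    # true_fn and false_fn must return outputs of the same structure and type,
    # so the "do nothing" branch still has to produce a matching assign op.
    x = tf.Variable(1.0)
    adapting = tf.placeholder(tf.bool)

    updated = tf.cond(adapting,
                      lambda: x.assign_add(0.5),   # adaptation branch
                      lambda: x.assign_add(0.0))   # no-op branch, same output type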

@emilemathieu
Contributor Author

Thanks for your feedback @dustinvtran!

I've addressed all of your suggestions except the initialization issue; I don't know how to fix that one.

Latent variable keys to samples.
"""
self.scope_iter += 1
scope = 'inference_' + str(id(self)) + '/' + str(self.scope_iter)
Member

Note we no longer use the scope_iter and str(id(self)) implementation for scopes. See the latest versions of hmc.py and sghmc.py, where we use scope = tf.get_default_graph().unique_name("inference").
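A rough sketch of that pattern (illustrative; how the returned scope string is consumed depends on the rest of the file):

    import tensorflow as tf

    z = tf.constant(0.0)  # stand-in for a latent-variable tensor

    # unique_name() returns a fresh name on every call, so repeated graph builds
    # don't collide without keeping a scope_iter counter or using id(self).
    scope = tf.get_default_graph().unique_name("inference")
    with tf.name_scope(scope):
      z_new = tf.identity(z, name="z_new")  # hypothetical op built under the scope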

assign_ops.append(self.n_accept.assign_add(tf.where(accept, 1, 0)))
return tf.group(*assign_ops)

def _do_not__adapt_step_size(self, alpha):
Member

_do_not_adapt_step_size

@dustinvtran
Member

Without this, I get: Attempting to use uninitialized value Variable.
I can't make it work without it :/

Do you know which tf.Variable causes this to break? For example, is there a reason you wrote epsilon = tf.Variable(1.0, trainable=False) instead of epsilon = tf.constant(1.0)?
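For context, the difference matters for initialization because a tf.constant has no initializer at all (toy sketch):

    import tensorflow as tf

    eps_var = tf.Variable(1.0, trainable=False)  # must be initialized before reading
    eps_const = tf.constant(1.0)                 # no initializer; readable immediately

    sess = tf.Session()
    print(sess.run(eps_const))     # fine
    sess.run(eps_var.initializer)  # without this, reading eps_var raises
    print(sess.run(eps_var))       # "Attempting to use uninitialized value Variable"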

@emilemathieu
Contributor Author

I have updated to epsilon = tf.constant(1.0) but it still breaks without the initialization.

Could it come from the empirical variables self.latent_vars, which are needed in find_good_eps?

@dustinvtran
Member

Upon further investigation, the issue is data-dependent initialization. The tf.Variable epsilon depends on tf.Variables in the model and approximating families. This means that the variables have to go through separate session calls to the init ops so that the model / approximating families are initialized first. Related: tensorflow/tensorflow#4920

Not sure how to fix this just yet.
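A toy illustration of the problem outside Edward (hypothetical variables):

    import tensorflow as tf

    model_var = tf.Variable(2.0)            # e.g. a model / approximating-family variable
    epsilon = tf.Variable(model_var * 0.1,  # initial value reads model_var
                          trainable=False)

    sess = tf.Session()
    # sess.run(tf.global_variables_initializer()) can fail here: the initializers
    # run in no particular order, and epsilon's initializer reads model_var.
    # Running the init ops in dependency order avoids the problem:
    sess.run(model_var.initializer)
    sess.run(epsilon.initializer)
    print(sess.run(epsilon))  # 0.2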

# analytic solution: N(loc=0.0, scale=\sqrt{1/51}=0.140)
inference = ed.HMCDA({mu: qmu}, data={x: x_data})
inference.run(n_adapt=1000)
print(qmu.mean().eval())
Member

remove print statements in test

k = tf.constant(0)

def while_condition(k, v_z_new, v_r_new, grad_log_joint):
# Stop when k < n_steps
Member

Always use two-space indents, including inside internal functions.

@emilemathieu
Contributor Author

Prints and indents fixed.

Nicely spotted on the initialization issue! What about the workaround proposed by yaroslavvb (2nd comment)?

@dustinvtran
Member

dustinvtran commented Aug 14, 2017

I hesitate because it (1) requires users to use a custom initialization scheme; and (2) depends on a tf.contrib library which can be unstable in its API / internal implementation. That said, I think it's worth trying as it may be the only viable solution; we can tweak/relax it later.

When replacing the init op inside inference.run() with the following, I wasn't able to get it to work. You're welcome to tweak it so it does work.

    if variables is None:
      # Force variables to be initialized after any variables they depend on.
      from tensorflow.contrib import graph_editor as ge
      def make_safe_initializer(var):
        """Returns initializer op that only runs for uninitialized ops."""
        return tf.cond(tf.is_variable_initialized(var),
                       tf.no_op,
                       lambda: tf.assign(var, var.initial_value).op,
                       name="safe_init_" + var.op.name).op

      safe_initializers = {}
      for v in tf.global_variables():
        safe_initializers[v.op.name] = make_safe_initializer(v)

      g = tf.get_default_graph()
      for v in tf.global_variables():
        var_name = v.op.name
        var_cache = g.get_operation_by_name(var_name + "/read")
        ge.reroute.add_control_inputs(var_cache, [safe_initializers[var_name]])

      init = tf.group(*safe_initializers.values())
    else:
      init = tf.variables_initializer(variables)
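As I read the snippet, the idea is that each variable gets a "safe" initializer that only runs while the variable is still uninitialized, and that initializer is attached as a control input to the variable's /read op. Any read of a variable then forces it, and transitively the variables its initial value reads, to be initialized first, which is exactly the ordering the data-dependent epsilon needs.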
