WIP: examples of simple time series models #480

Open
wants to merge 1 commit into
base: master
from

Conversation

nfoti
Contributor

@nfoti nfoti commented Feb 27, 2017

I have written a few examples of using edward to fit some simple time series models. They are in Jupyter notebook form right now, but I'm happy to convert them to scripts if that's easier (though I think having them as notebooks with some annotations can also be useful for learning edward). Any suggestions for improvement are welcome. I am planning on adding at least two more examples, one that uses a VAR(1) model for multiple replicates of a series and another example demonstrating a VAR(p) process (p > 1).

One point to notice is that constructing the Inference objects takes a very long time. I haven't dug into why this is happening, but I'm guessing that there is a lot of copying going on when I connect the training data to the corresponding nodes in the computation graph. Now that ICML is over I should have some time to dig into what's actually happening during construction and ways that we could potentially speed it up.

@dustinvtran
Member

dustinvtran commented Feb 27, 2017

Cool! Good idea on the notebooks. I think this is the more suitable platform longer term as well. I'm looking into porting all tutorials and selected examples into notebooks.

re: Inference. Graph construction time scales roughly with the number of nodes in the model's computational graph. You could probably vectorize operations such as

beta = [Normal(mu=0., sigma=2.) for i in xrange(p)]

to be a vector of normals, and then apply tf.gather(beta, j) to index it. (Of course, with only a few parameters, it's the time series that's the bigger problem and not the beta.) We should definitely look into optimizing the graph construction; I think the copying done under the hood can be done faster and fewer times.
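
As a rough illustration of the vectorization idea, here is a NumPy analogue (not Edward code). It compares an AR(p) mean computed from p separate scalar coefficients with the same mean computed from one length-p coefficient vector. The function names, the seed, and the values of p and T are made up for the example; in Edward the corresponding change would be a single Normal with vector-valued parameters, indexed with tf.gather as described above.

```python
import numpy as np


def ar_mean_listed(beta_list, x, t):
    """AR(p) mean at time t, using one scalar coefficient object per lag."""
    return sum(beta_list[j] * x[t - 1 - j] for j in range(len(beta_list)))


def ar_mean_vectorized(beta, x, t):
    """Same mean, using a single coefficient vector and one dot product."""
    p = beta.shape[0]
    lags = x[t - p:t][::-1]  # x[t-1], x[t-2], ..., x[t-p]
    return float(beta @ lags)


rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
p, T = 3, 10                     # hypothetical model order and series length
beta = rng.normal(0.0, 2.0, size=p)  # one array of shape (p,), mirroring the Normal(0, 2) priors
x = rng.normal(size=T)

# Both formulations agree for every t that has p previous values.
for t in range(p, T):
    assert abs(ar_mean_listed(list(beta), x, t) - ar_mean_vectorized(beta, x, t)) < 1e-12
```

The vectorized form keeps the coefficients in one object regardless of p, which is the property that would keep the number of graph nodes down in a graph-based framework.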

All else looks solid.

@nfoti
Contributor Author

nfoti commented Feb 28, 2017

Do you think it's worth trying to specify the models with tf.while_loop? I tried to do this first and ran into some trouble, but now that I know more about tf and edward I may be successful.

@dustinvtran
Member

dustinvtran commented Feb 28, 2017

That would definitely speed up construction time, although at the cost of a big drag at run time. (It's also a bit harder to specify variational approximations of random variables inside body functions.)

Since you're using reparameterization gradients and/or MAP, the graph should only need to copy the likelihood once in order to swap dependence to the approximate posterior instead of the prior. You can also try diagnosing the graph from TensorBoard by specifying the logdir argument in inference.

@nfoti
Contributor Author

nfoti commented Feb 28, 2017

sounds good. i'll try monitoring with tensorboard and also digging around and doing some profiling. does edward have a reasonable way to save models and inference objects to disk? or were you just thinking pickle? thanks.

@dustinvtran
Member

dustinvtran commented Feb 28, 2017

We use the recommended approach in TensorFlow, which is to export/import MetaGraphs if you want to save the full graph to disk (i.e., both model and inference so everything up to inference.initialize()).

If you want to save just the model parameters for either the variational approximation or probability model, you can look into tf.train.Saver. This enables checkpointing and reusing models.

@chmp
Contributor

chmp commented Mar 2, 2017

Currently, I am also trying to use edward for time-series modeling and this PR aligns perfectly with my experiments. Thanks :).

So far I only tried the example_ar-p notebook and hope it's ok to offer some feedback:

  • since xrange is Python 2 only, it may be a good idea to replace it with range. As far as I can tell there shouldn't be any serious performance implications
  • the MAP cell is missing a newline before inference

@chmp
Contributor

chmp commented Mar 3, 2017

I also experimented with using tf.scan to build the AR model. However, I ran into multiple problems, since scan generates recursive ops: get_variables needs to be modified to skip nodes already visited. Once this issue is fixed, copy dies with an infinite recursion (#306). Here, I have no fix yet.
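
To make the "skip nodes already visited" point concrete, here is a minimal sketch of a cycle-safe graph traversal in plain Python. It is not Edward's get_variables; the dict-of-lists graph, the node names, and the function name collect_reachable are all hypothetical and only stand in for a graph that contains a loop.

```python
def collect_reachable(graph, start):
    """Return every node reachable from start, visiting each node once.

    graph maps a node name to the list of nodes it depends on. The visited
    set is what prevents endless looping when the graph contains a cycle.
    """
    visited = set()
    stack = [start]  # explicit stack, so deep graphs do not hit the recursion limit
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # already handled; skipping it breaks the cycle
        visited.add(node)
        stack.extend(graph.get(node, ()))
    return visited


# Hypothetical graph with a loop: body -> body_prev -> body.
loop_graph = {
    "x": ["body"],
    "body": ["beta", "body_prev"],
    "body_prev": ["body"],
    "beta": [],
}

reachable = collect_reachable(loop_graph, "x")
```

Without the visited check, the traversal would bounce between body and body_prev forever, which is the same kind of failure described above.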

For reference, this is how I generate the x variable for p = 2:

def build_ar_graph(state, epsilon):
    _prev_sample, _prev_mean, x0_sample, x0_mean, x1_sample, x1_mean = state
    mean = tf.gather(beta.value(), 1) * x0_sample + tf.gather(beta.value(), 0) * x1_sample
    sample = mean + epsilon
    
    return (
        x0_sample, x0_mean,
        x1_sample, x1_mean,
        sample, mean,
    )

initial = (
    tf.constant(0.0), tf.constant(0.0),  # < dummy sample, mean to ensure correct shape
    mu + tf.random_normal([], stddev=10.0), mu,
    mu + tf.random_normal([], stddev=10.0), mu,
)
epsilon = tf.random_normal([T], stddev=0.1)
x_sample, x_mean, _1, _2, _3, _4 = tf.scan(build_ar_graph, epsilon, initial)

x = Normal(
    mu=x_mean, 
    sigma=tf.constant([10.0] * p + [0.1] * (T - p)), 
    value=x_sample,
)

(edit: just figured out the recursion is wrong, will update it later this afternoon. It won't change the fact, however, that ed.copy will fail.)
(edit 2: updated the scan function to pass through both the mean and the sample. Also fixed the order of beta.)
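
For readers unfamiliar with scan, the state-passing pattern above can be sketched in plain Python. This is only an analogue under simplifying assumptions: the scan helper is a hand-written stand-in for tf.scan, the coefficients in beta are fixed hypothetical numbers rather than random variables, the noise is set to zero, and only the samples (not the means) are threaded through the state.

```python
def scan(fn, elems, initializer):
    """Hand-written stand-in for tf.scan: thread a state through elems
    and collect the state produced at every step."""
    state, outputs = initializer, []
    for elem in elems:
        state = fn(state, elem)
        outputs.append(state)
    return outputs


beta = (0.5, 0.2)  # hypothetical values; beta[0] multiplies the most recent sample


def ar2_step(state, eps):
    """One AR(2) step. The state is (emitted, older, newer): the first slot
    is what gets read out, so the series lags the computation by two steps."""
    _emitted, older, newer = state
    sample = beta[0] * newer + beta[1] * older + eps
    return (older, newer, sample)


T = 5
initial = (0.0, 1.0, 2.0)  # dummy slot, then the two starting values
epsilon = [0.0] * T        # zero noise, so the output is easy to check by hand
states = scan(ar2_step, epsilon, initial)
series = [s[0] for s in states]  # starting values first, then the AR(2) samples
```

Because the first slot of the state is a delayed copy, the collected series begins with the two starting values, which mirrors the role of the dummy entries in the initial tuple above.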

@dustinvtran
Member

dustinvtran commented Mar 4, 2017

Cool. Thanks for the update!

re:copy. I think we'll have to make it handle dynamic ops such as scan and while_loop as special cases.

@chmp
Contributor

chmp commented Mar 4, 2017

wrt copy: at first glance, it does seem as if tf breaks down the while / scan ops into more fundamental control flow ops. I fear a more general solution is needed.


3 participants