FunctionalDyBM Demo
===================
- Author: Hiroshi Kajino
- Date: Sep 27, 2016.
- (C) Copyright IBM Corp. 2016

FunctionalDyBM (F-DyBM) models the dynamics of a function $f^{[t]}(x)$, where $x\in\mathcal{X}$ is a feature vector in a feature space, and $t\in\mathbb{N}$ is a discrete time step.
For example, $f^{[t]}(x)$ can represent the temperature at location $x$ and time step $t$.

Usage
-----
At each time step $t$, the model

1. receives a finite number of observations of a pattern, $[f^{[t]}(x_1^{[t]}),\dots,f^{[t]}(x_{N^{[t]}}^{[t]})]$,

1. learns the model parameters using the `learn_one_step(pattern, loc)` method, where `loc` corresponds to the observation points $[x_1^{[t]},\dots,x_{N^{[t]}}^{[t]}]$ and `pattern` corresponds to the observed values $[f^{[t]}(x_1^{[t]}),\dots,f^{[t]}(x_{N^{[t]}}^{[t]})]$, and

1. predicts the next pattern $f^{[t+1]}(x)$ at any location $x\in\mathcal{X}$ using the `predict_next(loc)` method, where `loc` is the set of locations at which the next pattern is evaluated.

Model descriptions and initialization
-------------------------------------
F-DyBM has the following two memory units and uses them to predict the next pattern $f^{[t+1]}(x)$.

- The queue stores raw patterns.

- The eligibility traces store summary statistics of all previous patterns. With `insert_to_etrace="w_delay"`, a pattern is inserted into the eligibility traces when it is popped from the queue (and is thus delayed by `delay`); with `insert_to_etrace="wo_delay"`, a pattern is inserted into the eligibility traces at the same time it is enqueued.
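
The difference between the two `insert_to_etrace` settings can be illustrated with a toy, scalar-valued sketch. This is not the library's implementation: the helper `run_memory` is hypothetical, the trace update `e <- rate * e + inserted` is an assumed (common) form of an eligibility trace, and the exact step at which a pattern leaves the queue is also an assumption. Only the queue length of `delay - 1` and the two option names come from this document.

```python
from collections import deque


def run_memory(patterns, delay, decay_rates, insert_to_etrace="w_delay"):
    """Toy scalar model of the queue and the eligibility traces (hypothetical)."""
    queue = deque()                     # raw patterns, at most delay - 1 entries
    etraces = [0.0] * len(decay_rates)  # one trace per decay rate
    for x in patterns:
        queue.append(x)
        # pop the oldest pattern once the queue exceeds its length delay - 1
        popped = queue.popleft() if len(queue) > delay - 1 else None
        if insert_to_etrace == "w_delay":
            # only patterns that have left the queue reach the traces
            inserted = 0.0 if popped is None else popped
        else:
            # "wo_delay": the newly enqueued pattern reaches the traces at once
            inserted = x
        # assumed update rule: exponential decay plus the inserted pattern
        etraces = [rate * e + inserted for rate, e in zip(decay_rates, etraces)]
    return list(queue), etraces


patterns = [1.0, 2.0, 3.0, 4.0]
queue_w, etraces_w = run_memory(patterns, delay=3, decay_rates=[0.5],
                                insert_to_etrace="w_delay")
queue_wo, etraces_wo = run_memory(patterns, delay=3, decay_rates=[0.5],
                                  insert_to_etrace="wo_delay")
```

In both runs the queue ends up holding the two most recent patterns. Under this sketch, the `"w_delay"` trace has only seen the patterns `1.0` and `2.0`, whereas the `"wo_delay"` trace has seen all four.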

The weight parameters used to predict the next pattern from the memory units are stored in `self.variables` and are learned on the fly by the `learn_one_step` method.

F-DyBM can be initialized using the following hyperparameters:

- `dim` is the dimension of the feature space $\mathcal{X}$.

- `anc_points` is an array of shape `(n_anc, dim)` holding the anchor points, which are used to define the basis functions.

- `delay` determines the length of the queue (which will be `delay-1`).

- `decay_rates` is a list of decay rates, one per eligibility trace.

- `noise_var` determines the noise variance on the pattern.

- `ker_paras` sets the kernel function $K(x, x^{\prime})$, used for function approximation.

- `insert_to_etrace` determines the definition of eligibility traces, as discussed above.

- `learning_rate` sets the learning rate of SGD.
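
To make the roles of `anc_points` and `ker_paras` more concrete, the following sketch evaluates a Gaussian (RBF) kernel between observation points and anchor points. It assumes the common form $K(x, x^{\prime}) = \exp(-\gamma \|x - x^{\prime}\|^2)$; whether the library's `"rbf"` kernel uses exactly this parametrization of `gamma`, and how it combines the kernel values internally, is not stated here. The helpers `rbf_kernel` and `kernel_features` are hypothetical names, and plain Python lists are used to keep the sketch dependency-free.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Assumed RBF form: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_features(loc, anc_points, gamma=1.0):
    """Return an n_obs x n_anc nested list of kernel values (hypothetical helper)."""
    return [[rbf_kernel(x, a, gamma) for a in anc_points] for x in loc]


anc_points = [(0.0, 0.0), (1.0, 1.0)]       # n_anc = 2, dim = 2
loc = [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0)]  # n_obs = 3, dim = 2
features = kernel_features(loc, anc_points, gamma=1.0)
```

Each row of `features` describes one observation point by its similarity to every anchor point, which is one plausible reading of "anchor points define the basis functions".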

Example
-------
In the following example, F-DyBM learns `test_func`. At each time step, we compute the RMSE between the actual pattern given by `test_func` and the prediction given by F-DyBM, and print the score every 100 steps. At the end, we show the actual pattern and the predicted one.
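
For reference, the score is assumed to be the usual root-mean-square error, $\sqrt{\frac{1}{N}\sum_{i=1}^{N}(f_i - \hat{f}_i)^2}$, over the $N$ observation points of a time step. The standalone helper `rmse` below is a hypothetical illustration of that formula only; it does not reproduce how `compute_RMSE` obtains the model's prediction.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences (hypothetical helper)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / float(len(actual)))


actual = [0.0, 1.0, 2.0]
predicted = [0.0, 1.0, 4.0]
score = rmse(actual, predicted)  # only the last point is off, by 2.0
```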

In [1]:
from pydybm.time_series.functional_dybm import FunctionalDyBM
from six.moves import xrange
import numpy as np

MAX_ITER = 1000 # number of time steps
dim = 2 # dimension of feature space
n_obs = 100 # number of observations at each time step
n_anc = 10 # number of anchor points, which are used for defining basis functions
delay = 3 # a pattern will be fed to eligibility traces with this delay
decay_rates = [0.2, 0.9] # parameters of eligibility traces. in this case, we have two eligibility traces

def test_func(loc,t):
    """
    loc : array, shape (n_obs, dim)
        each row corresponds to a coordinate of each observation point.
    t : int
        time step
    """
    return np.sin(loc.sum(axis=1) + t/10.0) + 0.00001 * np.random.randn(loc.shape[0])


# initialize anchor points and models
anc = np.random.uniform(low=0.0, high=1.0, size=(n_anc, dim))
model = FunctionalDyBM(dim=dim, anc_points=anc, delay=delay, decay_rates=decay_rates,
                      noise_var=1.0, ker_paras={"ker_type":"rbf", "gamma":1.0},
                      insert_to_etrace="w_delay", learning_rate=0.0001)

# for each time step t,
# (i) we randomly generate observation points, `loc`, and observations of a functional pattern at the points, `pattern`,
# (ii) we compute RMSE between the actual pattern and the prediction by the model (reported every 100 steps), and
# (iii) we update the parameters as well as the internal states of F-DyBM using learn_one_step().
for t in xrange(MAX_ITER):
    loc = np.random.uniform(0.0, 1.0, (n_obs, dim))
    pattern = test_func(loc, t)
    if t%100 == 0:
        print("step {}:\t RMSE = {}".format(t, model.compute_RMSE(pattern, loc)))
    model.learn_one_step(pattern, loc)

# the pattern at time step MAX_ITER and the predicted pattern by the model will be presented.
loc = np.random.uniform(0.0, 1.0, (n_obs, dim))
pattern = test_func(loc, MAX_ITER)
print("\npattern = {}".format(pattern))
print("pred = {}".format(model.predict_next(loc)))


step 0:	 RMSE = 0.804297057773843
step 100:	 RMSE = 0.19065979348680756
step 200:	 RMSE = 0.08098709018367681
step 300:	 RMSE = 0.04750983481968885
step 400:	 RMSE = 0.03489134381583352
step 500:	 RMSE = 0.03669342287636237
step 600:	 RMSE = 0.03691558073285102
step 700:	 RMSE = 0.01656613738495161
step 800:	 RMSE = 0.039154285932145
step 900:	 RMSE = 0.030231793172077692

pattern = [ 0.22816552  0.46928338  0.40465998  0.82039651  0.69958197  0.72838946
  0.46086176  0.12740303  0.37587426  0.09493412  0.64909052  0.51408783
  0.83356479  0.51357757  0.44394982  0.66815925  0.49099194  0.6314766
  0.51187462  0.39545874  0.79438388  0.9288989   0.05273392  0.95175743
 -0.35581896  0.80602032  0.69375366  0.24147853  0.79609136  0.5606667
  0.39235362 -0.15255081  0.77692577  0.21062653 -0.32438     0.85193157
  0.33286599  0.85778826  0.23943005  0.83246809  0.96467168 -0.22380548
 -0.03368148  0.30272629  0.8218343   0.25702916  0.50423556  0.39985679
 -0.36830519 -0.04733531  0.7066