
[WIP] GP functionality #2322

Closed
wants to merge 32 commits into base: master

5 participants
@bwengals
Member

bwengals commented Jun 16, 2017

  • adds the FITC approximation
  • I used the GPML and GPflow implementations as reference
  • new class is called GPfitc at the moment; it has the same API as GP, but with the additional required arg inducing_points
  • no tests yet since the GP module is WIP
  • see this gist for some usage examples for the time being
  • no automatic optimization routine for inducing point locations; they need to be specified by the user
  • need to make covariance function methods that return just the diagonal, see note on lines 156, 171 of gp.py
  • probably will need to change the structure of all this as more GP stuff is added
@fonnesbeck

Member

fonnesbeck commented Jun 18, 2017

Great work.

Instead of having a separate GPfitc interface, what about unifying under GP and instantiating the approximation if inducing points are passed, and a regular GP otherwise?
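A minimal sketch of the kind of dispatch being suggested here, assuming a single entry point that checks for inducing points; GPFull and GPFITC below are placeholder names for illustration, not classes from this PR:

# Hypothetical unified entry point: one GP() that picks the implementation
# based on whether inducing points were supplied. Class names are placeholders.
class GPFull(object):
    def __init__(self, name, X, mean_func=None, cov_func=None):
        self.name, self.X = name, X
        self.mean_func, self.cov_func = mean_func, cov_func

class GPFITC(GPFull):
    def __init__(self, name, X, inducing_points, **kwargs):
        super(GPFITC, self).__init__(name, X, **kwargs)
        self.inducing_points = inducing_points

def GP(name, X, mean_func=None, cov_func=None, inducing_points=None):
    if inducing_points is None:
        # no inducing points: exact (full) GP
        return GPFull(name, X, mean_func=mean_func, cov_func=cov_func)
    # inducing points given: sparse approximation (e.g. FITC)
    return GPFITC(name, X, inducing_points,
                  mean_func=mean_func, cov_func=cov_func)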

mean_func : Mean
    Mean function of Gaussian process
cov_func : Covariance
    Covariance function of Gaussian process
X : array
    Grid of points to evaluate Gaussian process over. Only required if the
    GP is not an observed variable.
sigma : scalar or array

@fonnesbeck

fonnesbeck Jun 24, 2017

Member

Docstring does not match argument name

@bwengals bwengals changed the title [WIP] FITC approximation for GPs [WIP] GP functionality Jul 5, 2017

@bwengals

Member Author

bwengals commented Jul 5, 2017

This might be a good time to see if people are interested in looking this over with regard to the API, functionality, things like that. It's not ready for a detailed code review yet. Here is a gist demoing the broader functionality. There is still more to be done, though! Here's my necessary/maybe list of things to add at this point; I'm still looking into the maybes, and of course wouldn't mind input or suggestions.

Necessary:

  • tests
  • docs
  • examples

Maybe?:

  • Inducing point location optimization
  • multiple dependent vars, Grid, Kronecker methods
  • Cov functions with compact support / sparse Cholesky
  • An inducing point method for the non-conjugate case
  • a state space, time series model
return np.array(samples)
y = [v for v in model.observed_RVs if v.name==gp.name][0]
samples = [gp.distribution.random(point=trace[idx], X_values=X_values, y=y, obs_noise=obs_noise) for idx in indices]
return np.array(samples)

@fonnesbeck

fonnesbeck Jul 6, 2017

Member

Add new line at end of file

mean = tt.dot(tt.transpose(As), solve_upper(tt.transpose(L_B), c))
C = solve_lower(L_B, As)
if obs_noise:
    cov = self.K(Xs, Xs) - tt.dot(tt.transpose(As), As)\

@fonnesbeck

fonnesbeck Jul 6, 2017

Member

Avoid the continuation character throughout. Enclose the expression in parentheses instead.

@bwengals

bwengals Jul 6, 2017

Author Member

Will do. Curious though, is it better for readability, or something else?

@fonnesbeck

fonnesbeck Jul 16, 2017

Member

Partly readability, partly convention.
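To make the style point concrete with a small, self-contained example (the arrays here are dummies, not the PR's variables):

import numpy as np

K_ss = np.eye(3)
A = np.ones((3, 3))
sigma2 = 0.1

# Backslash continuation, as in the snippet above:
cov_backslash = K_ss - np.dot(A.T, A) \
    + sigma2 * np.eye(3)

# Parenthesized version, as suggested: the enclosing parentheses make the
# continuation explicit without a trailing backslash.
cov_parens = (K_ss - np.dot(A.T, A)
              + sigma2 * np.eye(3))

assert np.allclose(cov_backslash, cov_parens)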

@junpenglao

Member

junpenglao commented Jul 6, 2017

This looks really nice, but I am not sure I understand what the approximations do inside of the GP class. Do they operate on the logp to turn it into something easier to approximate/sample from? It is not an estimator, right (i.e., I cannot use it to get estimates of parameters)?

    Mean function of Gaussian process
cov_func : Covariance
    Covariance function of Gaussian process
sigma2 : scalar or array

@fonnesbeck

fonnesbeck Jul 6, 2017

Member

Was wondering about the decision to use sigma2 versus sigma. Would we not be putting a prior on the latter?

@bwengals

bwengals Jul 6, 2017

Author Member

I hadn't really thought about it. Thinking about it more, yeah, I think you're right; it makes more sense to parameterize in terms of sigma, not sigma2.
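A quick sketch of what parameterizing by sigma could look like in a model; the HalfCauchy prior here is only an illustrative choice, not something this PR prescribes:

import pymc3 as pm

with pm.Model() as model:
    # put the prior on the noise standard deviation sigma ...
    sigma = pm.HalfCauchy("sigma", beta=5)
    # ... and square it wherever the noise variance is actually needed,
    # e.g. when adding observation noise to the covariance diagonal.
    sigma2 = sigma ** 2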



def GP(name, X, mean_func=None, cov_func=None,

@fonnesbeck

fonnesbeck Jul 6, 2017

Member

I like the refactoring

@fonnesbeck

Member

fonnesbeck commented Jul 6, 2017

@junpenglao it's the sparse approximation method used when inducing points are passed.
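Complementing the dispatch sketch earlier, a usage-level illustration of that point. This is a rough sketch only: GP here refers to the function under review in this PR (as in the def GP(...) snippet above), and the inducing_points/approx/sigma keywords follow the discussion in this thread rather than a finalized API.

import numpy as np
import pymc3 as pm
# GP is the function from this PR branch; not importable from released pymc3.

X = np.linspace(0, 10, 200)[:, None]
y = np.sin(X).ravel() + 0.1 * np.random.randn(200)
Xu = np.linspace(0, 10, 20)[:, None]   # inducing point locations

with pm.Model():
    ls = pm.HalfCauchy("ls", beta=3)
    cov = pm.gp.cov.ExpQuad(1, ls)
    sigma = pm.HalfCauchy("sigma", beta=3)
    # With inducing_points given, the sparse (FITC/VFE) approximation is used;
    # without them, the full GP is used.
    f = GP("f", X, cov_func=cov, sigma=sigma, observed=y,
           inducing_points=Xu, approx="FITC")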


super(GP, self).__init__(*args, **kwargs)
if inducing_points is None:

@fonnesbeck

fonnesbeck Jul 6, 2017

Member

Can you just nest this condition in the one below?

approx = approx.upper()

##### CONJUGATE, APPROXIMATION
if approx not in ["VFE", "FITC"]:

@fonnesbeck

fonnesbeck Jul 6, 2017

Member

What happens if inducing points are passed but approx is None? Should you perhaps just set one as the default?

@bwengals

bwengals Jul 6, 2017

Author Member

good catch, definitely
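One way such a default could look; treating "FITC" as the fallback is an assumption for illustration, not necessarily what the PR settled on:

def resolve_approx(inducing_points, approx, default="FITC"):
    # If inducing points are given but no approximation was named,
    # fall back to a default instead of failing later.
    if inducing_points is not None and approx is None:
        return default
    return approx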

"""
def __init__(self, mean_func=None, cov_func=None, X=None, sigma=0, *args, **kwargs):
global CHOL_CONST

@AustinRochford

AustinRochford Jul 10, 2017

Member

global is usually a code smell to me. Can this be passed as a keyword argument to functions that need it instead?

@bwengals

bwengals Jul 11, 2017

Author Member

It could; I thought the global would be cleaner in this case. Only one function, stabilize, uses this keyword, and stabilize is used by most everything. IMHO CHOL_CONST doesn't need to be accessed by the user.

@ferrine

ferrine Jul 11, 2017

Member

There is no need for global, since CHOL_CONST is in the outer scope, or am I missing something?

@ferrine

ferrine Jul 11, 2017

Member

I really did miss some lines further down, sorry. But why do you change global state? Is it really essential? It worries me too.

@bwengals

bwengals Jul 12, 2017

Author Member

yeah fair enough, it is a bit smelly. I'll take this out
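A minimal sketch of the keyword-argument alternative, with a stabilize helper that takes the jitter explicitly instead of reading a module-level global; the default of 1e-6 is an assumed value for illustration:

import theano.tensor as tt

def stabilize(K, chol_const=1e-6):
    # Add a small constant to the diagonal of a covariance matrix so that
    # its Cholesky factorization stays numerically stable.
    return K + chol_const * tt.eye(K.shape[0])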

raise ValueError('cov_func must be a subclass of Covariance')

# NONCONJUGATE
if observed is None:

@AustinRochford

AustinRochford Jul 10, 2017

Member

Unobserved isn't necessarily non-conjugate, or am I missing something?

@bwengals

bwengals Jul 11, 2017

Author Member

Hmm... if the GP + noise values are observed, and the noise is Gaussian, it's best to use conjugacy to integrate out the actual GP so it doesn't need to be sampled. But thinking about it more, that's only true if nothing happens later in the model that depends on the observed GP + noise data. That would be kind of weird... but possible, I guess.

@AustinRochford

AustinRochford Jul 13, 2017

Member

Good point. I don't think it's necessary to marginalize here given the rarity of the use case, but perhaps alter the naming?
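For reference, the conjugacy being discussed: with a GP prior $f \sim \mathcal{GP}(m, k)$ and i.i.d. Gaussian observation noise, the latent $f$ can be integrated out analytically, so the observed data have the Gaussian marginal likelihood

$y \mid X \sim \mathcal{N}\big(m(X),\; K(X, X) + \sigma^2 I\big),$

which is why, in the observed Gaussian-noise case, the GP values themselves never need to be sampled.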

if samples is None:
    samples = len(trace)

indices = np.random.choice(np.arange(len(trace)), samples, replace=False)
if progressbar:

@ferrine

ferrine Jul 10, 2017

Member

tqdm has the kwarg disable=True for that.

@bwengals

bwengals Jul 11, 2017

Author Member

Oh nice, I'll use that.
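For reference, tqdm's disable keyword works like this (a standalone sketch, not the PR's code):

from tqdm import tqdm

indices = range(100)
progressbar = False

# disable=True silences the bar entirely, so no separate if/else branch
# around the loop is needed.
for idx in tqdm(indices, disable=not progressbar):
    pass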

bwengals added some commits Jul 11, 2017


approx = approx.upper()
if approx not in ["VFE", "FITC"]:
    raise ValueError(("'FITC' or 'VFE' are the implemented "

@ferrine

ferrine Jul 15, 2017

Member

A better message should say that 'FITC' and 'VFE' are the only approximations that are implemented, and show what the user provided, e.g. "'FITC' or 'VFE' are the only supported approximations for GP, got %s".

@bwengals

bwengals Jul 17, 2017

Author Member

good call, changed
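The kind of message being suggested, as a small self-contained sketch of the validation:

def validate_approx(approx):
    approx = approx.upper()
    if approx not in ["VFE", "FITC"]:
        raise ValueError("'FITC' or 'VFE' are the only supported "
                         "approximations for GP, got %s" % approx)
    return approx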

# initialize inducing points with K-means
from scipy.cluster.vq import kmeans
# first whiten X
if not isinstance(X, np.ndarray):

@ferrine

ferrine Jul 15, 2017

Member

I would rather use np.asarray(X) here instead of X.value

@ferrine

ferrine Jul 15, 2017

Member

Additionally, you could check whether X is a tensor or not; that may not be a rare case.

@bwengals

bwengals Jul 17, 2017

Author Member

thanks, the check should be a bit more robust now
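A sketch of the k-means initialization with a somewhat more defensive input check along these lines; the whitening-by-standard-deviation and the get_value() handling are illustrative assumptions, not necessarily what the PR ended up doing:

import numpy as np
from scipy.cluster.vq import kmeans

def init_inducing_points(X, n_inducing):
    # Accept shared/symbolic variables that expose get_value(); otherwise
    # coerce whatever was passed with np.asarray.
    if hasattr(X, "get_value"):
        X = X.get_value()
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # kmeans works best on whitened data, so scale each column by its
    # standard deviation before clustering, then undo the scaling.
    scale = X.std(axis=0)
    scale[scale == 0] = 1.0
    centers, _ = kmeans(X / scale, n_inducing)
    return centers * scale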

@bwengals

Member Author

bwengals commented Jul 17, 2017

@fonnesbeck @AustinRochford @otherinterestedfolks I think this PR is in a state where it's ready for review! I'll start on the docs next. I'll submit doc changes to this PR since they will probably help with the review.

@@ -35,26 +37,29 @@ def __init__(self, input_dim, active_dims=None):
if len(active_dims) != input_dim:
    raise ValueError("Length of active_dims must match input_dim")

def __call__(self, X, Z=None, diag=False):
def __call__(self, X, Xs=None, diag=False):

@fonnesbeck

fonnesbeck Jul 19, 2017

Member

Z seemed simpler and more distinct. Why the change?

@bwengals

bwengals Jul 20, 2017

Author Member

Z seemed to be the symbol most commonly used to represent inducing point locations in the literature. I changed it to Xs to look more like $X_*$.

continue

# if factor is a numpy array
if isinstance(factor, np.ndarray):

@ferrine

ferrine Jul 20, 2017

Member

I think elif is more suitable here

@@ -207,6 +205,21 @@ def full(self, X, Xs=None):
raise NotImplementedError


class Periodic(Stationary):

@fonnesbeck

fonnesbeck Jul 25, 2017

Member

Does this obviate the need for Cosine?

@bwengals

bwengals Jul 25, 2017

Author Member

I think they can both be here. Functions drawn from a GP with a Cosine prior are exactly sinusoidal. Periodic, with its lengthscale parameter, gives functions that are periodic but not perfectly sinusoidal.
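To make the distinction concrete, the standard textbook forms of the two kernels (parameter names may differ from this PR): the Cosine covariance $k_{\mathrm{Cos}}(x, x') = \cos\big(2\pi (x - x') / T\big)$ yields draws that are pure sinusoids with period $T$, while the Periodic (exp-sine-squared) covariance $k_{\mathrm{Per}}(x, x') = \exp\big(-2 \sin^2(\pi (x - x') / T) / \ell^2\big)$ yields draws that repeat with period $T$ but, because of the lengthscale $\ell$, need not be sinusoidal.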

@@ -411,10 +410,6 @@ def sample_gp(trace, gp, X_new, n_samples=None, obs_noise=True,
obs_noise : bool
    Flag for including observation noise in sample. Does not apply to GPs
    used with non-conjugate likelihoods. Defaults to False.
cov_func : Covariance

@fonnesbeck

fonnesbeck Jul 25, 2017

Member

Why are the docstrings removed when the arguments are still available?

@bwengals

bwengals Jul 25, 2017

Author Member

Ah, the arguments should also be gone. This is where I ran into problems trying to support additive GPs. I'm submitting a new PR with a pretty big refactor soon...

@bwengals bwengals referenced this pull request Jul 26, 2017

Merged

GP functionality 2 #2444

@bwengals bwengals closed this Jul 26, 2017
