Random Features for Expected Gaussian Kernels

Implementation of the random Fourier features for expected Gaussian kernels with Fastfood and low rank approximation as described in the paper "Classification of Sparse and Irregularly Sampled Time Series with Mixtures of Expected Gaussian Kernels and Random Features".

Requirements

Python 2.7
numpy
scipy
scikit-learn
fht (Python module for fast Hadamard transform)

Example

The following example (demo.py) demonstrates how to classify irregularly sampled time series data using random features for the expected Gaussian kernels. The data file data/ECG200-50.pkl used in the example is created based on the ECG200 data set from the UCR time series classification archive with 50% sampling density. The data file is in Python's pickle format that stores the data set in the form of ts_train, ts_test, l_train, l_test, where

ts_train is the training time series as a list of tuples. Each tuple contains two numpy ndarrays (t, y) where t stores the time stamps and y stores the corresponding observed values.
l_train stores the labels associated with the training time series ts_train as a numpy vector (integer).
ts_test and l_test are the time series and corresponding labels of the test set.

import numpy as np
import cPickle as pickle
from sklearn.svm import LinearSVC
import gp
from full_marginal import compute_means_covs
from fastfood import FastfoodEGK

np.random.seed(111)
with open('data/ECG200-50.pkl', 'rb') as f:
    ts_train, ts_test, l_train, l_test = pickle.load(f)

# Estimate GP hyperparameters and the noise parameter by maximizing
# the marginal likelihood.
gp_parms = gp.learn_hyperparms(ts_train)

# All time series are defined over a common time interval [0, 1].
# We use 300 evenly-spaced reference time points between [0, 1]
# to represent each time series.
t_ref = np.linspace(0, 1, 300)

# Compute the marginal posterior mean and covariance matrix for
# both training and test time series
train_means, train_covs = compute_means_covs(ts_train, t_ref, gp_parms)
test_means, test_covs = compute_means_covs(ts_test, t_ref, gp_parms)

# We use 500 random features with low-rank approximation, rank 10 in this
# case, and normalize the random feature vector to have unit length.
# By dropping the rank argument or set rank to 0 turns off the low rank
# approximation.
# The parameters gamma and C can be chosen using cross validation.
rp = FastfoodEGK(gamma=20, n_sample=500, rank=10,
                 normalize=True)
clf = LinearSVC(C=100)
X_train = rp.fit_transform(train_means, train_covs)
clf.fit(X_train, l_train)
X_test = rp.transform(test_means, test_covs)
l_predict = clf.predict(X_test)
accuracy = np.mean(l_predict == l_test)
print accuracy

Support

The development of this code was supported by the National Science Foundation through award # IIS-1350522.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
LICENSE		LICENSE
README.md		README.md
demo.py		demo.py
fastfood.py		fastfood.py
full_marginal.py		full_marginal.py
gp.py		gp.py
random_mvegk.py		random_mvegk.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Random Features for Expected Gaussian Kernels

Requirements

Example

Support

License

About

Releases

Packages

Contributors 2

Languages

License

mlds-lab/egk

Folders and files

Latest commit

History

Repository files navigation

Random Features for Expected Gaussian Kernels

Requirements

Example

Support

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages