Add transformer base class and adapt preprocessor / converter to fit + apply #4285

Merged

61 commits
c4d77f9
Add transformer base class
vinx13 May 14, 2018
bf42a2b
Make converter and preprocessor subclasses of transformer
vinx13 May 14, 2018
5f838b6
Cleanup dense preprocessors and rename init to fit
vinx13 May 14, 2018
76eb698
Cleanup string preprocessors and rename init to fit
vinx13 May 14, 2018
5ff86f3
Cleanup pca preprocessors and rename init to fit
vinx13 May 14, 2018
c898878
Cleanup and rename api in DependenceMaximization
vinx13 May 15, 2018
73bc7ec
Update meta examples using 'fit' api
vinx13 May 15, 2018
3168500
Update FisherLDA api
vinx13 May 15, 2018
140c8c2
Update doc of FisherLDA
vinx13 May 15, 2018
8c0cb93
Cleanup preprocessors
vinx13 May 15, 2018
5b48af8
Cleanup preprocessors
vinx13 May 15, 2018
03b6061
Add transformer to swig
vinx13 May 15, 2018
521d12c
Add transformer::apply(features, inplace) api
vinx13 May 16, 2018
5cdd269
Adapt python examples to new transformer api
vinx13 May 16, 2018
cd738f9
Fix indent in python examples
vinx13 May 16, 2018
4c57338
Deprecate densepreproc old api add apply_to_matrix
vinx13 May 17, 2018
7c4ace0
Cleanup and implement apply_to_matrix in dense preproc subclasses
vinx13 May 17, 2018
62dc621
Cleanup and refactor PCA and FisherLDA
vinx13 May 18, 2018
ff8fd5f
Remove DimensionReductionPreprocessor
vinx13 May 18, 2018
a423241
Implement apply in kernel pca
vinx13 May 19, 2018
04c0884
Fix swig
vinx13 May 20, 2018
49e9f25
Split template implementation of string preproc
vinx13 May 21, 2018
5ecf291
Implement apply api in string preproc, deprecate apply_to_string_feat…
vinx13 May 21, 2018
8cd24e3
Implement apply in sparse preproc
vinx13 May 21, 2018
715bdb7
Use fwd declaration
vinx13 May 22, 2018
3d72162
Drop DimensionReductionPreproc in python example
vinx13 May 22, 2018
0072350
Unref input features in transformer::apply
vinx13 May 22, 2018
74117e7
Fix wrong index in LogPlusOne
vinx13 May 22, 2018
2aa1a47
Fix string preprocessor
vinx13 May 23, 2018
c11da19
Fix refcount bug
vinx13 May 23, 2018
0913587
Update meta example using transformer::apply
vinx13 May 24, 2018
b921804
Register params in PCA / KernelPCA
vinx13 May 24, 2018
e5850f0
Fix ref count in DensePreproc
vinx13 May 24, 2018
bfac72e
Fix wrong index in RandomFourierGaussPreproc
vinx13 May 24, 2018
05b98ea
Update tests using transformer::apply
vinx13 May 24, 2018
ae55c2a
Fix ref count
vinx13 May 24, 2018
9f94402
Throw error in out-of-place mode in sparse preproc
vinx13 May 24, 2018
f18664e
Update doc of transformers
vinx13 May 25, 2018
4f486a6
Refactor ICA converters into fit + apply
vinx13 May 25, 2018
cf2904c
Some-ize and use fit / apply api in tests and meta examples of ica
vinx13 May 25, 2018
2c4b067
Use transformer::apply in unit tests
vinx13 May 25, 2018
34a7958
Some-ize PCA test
vinx13 May 27, 2018
7f9ca1a
Don't create new features instance in StringPreproc
vinx13 May 27, 2018
069df5d
Some-ize unittests
vinx13 May 27, 2018
c30e7e6
Apply formatter
vinx13 May 27, 2018
a83ee08
Fix ica converters
vinx13 May 28, 2018
3dc409e
Don't ref the result features in preproc
vinx13 May 28, 2018
2bfc4b5
Some-ize lars unittests to fix ref count
vinx13 May 28, 2018
e7ed858
Fix indent
vinx13 May 29, 2018
e52712e
Use std min/max instead of cmath
vinx13 May 29, 2018
2e7a488
Use better var name in meta example
vinx13 May 29, 2018
560cc99
Minor codestyle improvement
vinx13 May 29, 2018
88aa5a5
Fix PruneVarSubMean
vinx13 May 29, 2018
4793cf4
Convert to dense feats in ica base class
vinx13 May 29, 2018
aaeea9b
Use 'override' in preproc
vinx13 May 29, 2018
fc49cd2
Use std::copy
vinx13 May 29, 2018
f0a29c1
Remove 'override' as inconsistent override cause many warnings
vinx13 May 29, 2018
b927fc1
Fix style
vinx13 May 29, 2018
26e5224
Rename transformer::apply -> transform
vinx13 May 29, 2018
2937f88
Fix transformer rename in swig
vinx13 May 30, 2018
8493511
Add %newobject to transformer in swig
vinx13 May 30, 2018
@@ -13,6 +13,7 @@ ica.set_tol(0.00001)
 #![set_parameters]
 
 #![apply_convert]
+ica.fit(feats)
 Features converted = ica.apply(feats)
 #![apply_convert]
@@ -13,6 +13,7 @@ ica.set_tol(0.00001)
 #![set_parameters]
 
 #![apply_convert]
+ica.fit(feats)
 Features converted = ica.apply(feats)
 #![apply_convert]
@@ -13,6 +13,7 @@ ica.set_tol(0.00001)
 #![set_parameters]
 
 #![apply_convert]
+ica.fit(feats)
 Features converted = ica.apply(feats)
 #![apply_convert]

Review comment on `ica.fit(feats)` (Member): I like the new API here!
@@ -13,6 +13,7 @@ ica.set_tol(0.00001)
 #![set_parameters]
 
 #![apply_convert]
+ica.fit(feats)
 Features converted = ica.apply(feats)
 #![apply_convert]
@@ -13,6 +13,7 @@ ica.set_tol(0.00001)
 #![set_parameters]
 
 #![apply_convert]
+ica.fit(feats)
 Features converted = ica.apply(feats)
 #![apply_convert]
16 changes: 8 additions & 8 deletions examples/meta/src/regression/least_angle_regression.sg
@@ -13,12 +13,12 @@ Labels labels_test = labels(f_labels_test)
 #![preprocess_features]
 PruneVarSubMean SubMean()
 NormOne Normalize()
-SubMean.init(features_train)
-SubMean.apply_to_feature_matrix(features_train)
-SubMean.apply_to_feature_matrix(features_test)
-Normalize.init(features_train)
-Normalize.apply_to_feature_matrix(features_train)
-Normalize.apply_to_feature_matrix(features_test)
+SubMean.fit(features_train)
+Features pruned_features_train = SubMean.apply(features_train)
+Features pruned_features_test = SubMean.apply(features_test)
+Normalize.fit(features_train)
+Features normalized_features_train = Normalize.apply(pruned_features_train)
+Features normalized_features_test = Normalize.apply(pruned_features_test)
 #![preprocess_features]

@@ -27,8 +27,8 @@ Machine lars = machine("LeastAngleRegression", labels=labels_train, lasso=False,
 #![create_instance]
 
 #![train_and_apply]
-lars.train(features_train)
-Labels labels_predict = lars.apply(features_test)
+lars.train(normalized_features_train)
+Labels labels_predict = lars.apply(normalized_features_test)
 
 #![extract_w]
 RealVector weights = lars.get_real_vector("w")
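The least_angle_regression change chains two transformers: each is fit on the training features only, then applied to both splits, and the transformed features flow onward by name. A minimal NumPy sketch of that chaining pattern — `NormOne` here is an illustrative stand-in that L2-normalizes each column (sample), not Shogun's class:

```python
import numpy as np

class NormOne:
    """Illustrative stand-in: scales each column (sample) to unit L2 norm."""
    def fit(self, features):
        # Nothing to learn for this transformer, but keeping fit()
        # preserves the uniform API so transformers chain interchangeably.
        return self

    def apply(self, features):
        norms = np.linalg.norm(features, axis=0, keepdims=True)
        return features / norms

features_train = np.array([[3.0, 0.0], [4.0, 1.0]])
features_test = np.array([[6.0], [8.0]])

normalize = NormOne().fit(features_train)
normalized_train = normalize.apply(features_train)  # fed to training
normalized_test = normalize.apply(features_test)    # fed to prediction
```

Binding each `apply` result to a fresh name (`normalized_features_train` in the diff) is what lets the later `lars.train(...)` / `lars.apply(...)` calls consume the preprocessed features explicitly instead of relying on in-place mutation.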
37 changes: 18 additions & 19 deletions examples/undocumented/python/converter_locallylinearembedding.py
@@ -1,28 +1,27 @@
 #!/usr/bin/env python
-data = '../data/fm_train_real.dat'
-parameter_list = [[data,20],[data,30]]
+from tools.load import LoadMatrix
 
-def converter_locallylinearembedding (data_fname,k):
-    from shogun import RealFeatures, CSVFile
-    try:
-        from shogun import LocallyLinearEmbedding
-    except ImportError:
-        print("LocallyLinearEmbedding not available")
-        exit(0)
+lm=LoadMatrix()
+data = lm.load_numbers('../data/fm_train_real.dat')
 
-    features = RealFeatures(CSVFile(data_fname))
+parameter_list = [[data, 20], [data, 30]]
 
-    converter = LocallyLinearEmbedding()
-    converter.set_target_dim(1)
-    converter.set_k(k)
-    converter.apply(features)
+def converter_locallylinearembedding (data, k):
+    try:
+        from shogun import RealFeatures
+        from shogun import LocallyLinearEmbedding
 
-    return features
+        features = RealFeatures(data)
+
+        converter = LocallyLinearEmbedding()
+        converter.set_k(k)
+
+        converter.fit(features)
+        features = converter.apply(features)
+
+        return features
+    except ImportError:
+        print('No Eigen3 available')
 
 if __name__=='__main__':
     print('LocallyLinearEmbedding')
     converter_locallylinearembedding(*parameter_list[0])

10 changes: 4 additions & 6 deletions examples/undocumented/python/distance_canberraword.py
@@ -17,17 +17,15 @@ def distance_canberraword (fm_train_dna=traindna,fm_test_dna=testdna,order=3,gap
     charfeat.set_features(fm_train_dna)
     feats_train=StringWordFeatures(charfeat.get_alphabet())
     feats_train.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    preproc=SortWordString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
+    preproc = SortWordString()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
 
     charfeat=StringCharFeatures(DNA)
     charfeat.set_features(fm_test_dna)
     feats_test=StringWordFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    feats_test = preproc.apply(feats_test)
 
     distance=CanberraWordDistance(feats_train, feats_train)
8 changes: 3 additions & 5 deletions examples/undocumented/python/distance_hammingword.py
@@ -21,16 +21,14 @@ def distance_hammingword (fm_train_dna=traindna,fm_test_dna=testdna,
     feats_train=StringWordFeatures(charfeat.get_alphabet())
     feats_train.obtain_from_char(charfeat, order-1, order, gap, reverse)
     preproc=SortWordString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
 
     charfeat=StringCharFeatures(DNA)
     charfeat.set_features(fm_test_dna)
     feats_test=StringWordFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    feats_test = preproc.apply(feats_test)
 
     distance=HammingWordDistance(feats_train, feats_train, use_sign)
10 changes: 4 additions & 6 deletions examples/undocumented/python/distance_manhattenword.py
@@ -11,16 +11,14 @@ def distance_manhattenword (train_fname=traindna,test_fname=testdna,order=3,gap=
     charfeat=StringCharFeatures(CSVFile(train_fname), DNA)
     feats_train=StringWordFeatures(charfeat.get_alphabet())
     feats_train.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    preproc=SortWordString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
+    preproc = SortWordString()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
 
     charfeat=StringCharFeatures(CSVFile(test_fname), DNA)
     feats_test=StringWordFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    feats_test = preproc.apply(feats_test)
 
     distance=ManhattanWordDistance(feats_train, feats_train)
11 changes: 4 additions & 7 deletions examples/undocumented/python/kernel_comm_ulong_string.py
@@ -16,18 +16,15 @@ def kernel_comm_ulong_string (fm_train_dna=traindat,fm_test_dna=testdat, order=3
     charfeat.set_features(fm_train_dna)
     feats_train=StringUlongFeatures(charfeat.get_alphabet())
     feats_train.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    preproc=SortUlongString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
-
+    preproc = SortUlongString()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
 
     charfeat=StringCharFeatures(DNA)
     charfeat.set_features(fm_test_dna)
     feats_test=StringUlongFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    feats_test = preproc.apply(feats_test)
 
     use_sign=False
10 changes: 4 additions & 6 deletions examples/undocumented/python/kernel_comm_word_string.py
@@ -16,17 +16,15 @@ def kernel_comm_word_string (fm_train_dna=traindat, fm_test_dna=testdat, order=3
     charfeat.set_features(fm_train_dna)
    feats_train=StringWordFeatures(charfeat.get_alphabet())
     feats_train.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    preproc=SortWordString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
+    preproc = SortWordString()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
 
     charfeat=StringCharFeatures(DNA)
     charfeat.set_features(fm_test_dna)
     feats_test=StringWordFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    feats_test = preproc.apply(feats_test)
 
     kernel=CommWordStringKernel(feats_train, feats_train, use_sign)
10 changes: 4 additions & 6 deletions examples/undocumented/python/kernel_weighted_comm_word_string.py
@@ -14,16 +14,14 @@ def kernel_weighted_comm_word_string (fm_train_dna=traindat,fm_test_dna=testdat,
     charfeat=StringCharFeatures(fm_train_dna, DNA)
     feats_train=StringWordFeatures(charfeat.get_alphabet())
     feats_train.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    preproc=SortWordString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
+    preproc = SortWordString()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
 
     charfeat=StringCharFeatures(fm_test_dna, DNA)
     feats_test=StringWordFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    feats_test = preproc.apply(feats_test)
 
     use_sign=False
     kernel=WeightedCommWordStringKernel(feats_train, feats_train, use_sign)

This file was deleted.

6 changes: 3 additions & 3 deletions examples/undocumented/python/preprocessor_fisherlda.py
@@ -17,9 +17,9 @@ def preprocessor_fisherlda (data, labels, method):
     sg_features = RealFeatures(data)
     sg_labels = MulticlassLabels(labels)
 
-    preprocessor=FisherLda(method)
-    preprocessor.fit(sg_features, sg_labels, 1)
-    yn=preprocessor.apply_to_feature_matrix(sg_features)
+    preprocessor=FisherLda(1, method)
+    preprocessor.fit(sg_features, sg_labels)
+    yn = preprocessor.apply(sg_features).get_real_matrix('feature_matrix')
 
     return yn
4 changes: 2 additions & 2 deletions examples/undocumented/python/preprocessor_kernelpca.py
@@ -16,9 +16,9 @@ def preprocessor_kernelpca (data, threshold, width):
     kernel = GaussianKernel(features,features,width)
 
     preprocessor = KernelPCA(kernel)
-    preprocessor.init(features)
+    preprocessor.fit(features)
     preprocessor.set_target_dim(2)
-    preprocessor.apply_to_feature_matrix(features)
+    features = preprocessor.apply(features)
 
     return features
11 changes: 4 additions & 7 deletions examples/undocumented/python/preprocessor_logplusone.py
@@ -16,13 +16,10 @@ def preprocessor_logplusone (fm_train_real=traindat,fm_test_real=testdat,width=1
     feats_train=RealFeatures(fm_train_real)
     feats_test=RealFeatures(fm_test_real)
 
-    preproc=LogPlusOne()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
-
+    preproc = LogPlusOne()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
+    feats_test = preproc.apply(feats_test)
 
     kernel=Chi2Kernel(feats_train, feats_train, width, size_cache)
8 changes: 3 additions & 5 deletions examples/undocumented/python/preprocessor_normone.py
@@ -17,11 +17,9 @@ def preprocessor_normone (fm_train_real=traindat,fm_test_real=testdat,width=1.4,
     feats_test=RealFeatures(fm_test_real)
 
     preprocessor=NormOne()
-    preprocessor.init(feats_train)
-    feats_train.add_preprocessor(preprocessor)
-    feats_train.apply_preprocessor()
-    feats_test.add_preprocessor(preprocessor)
-    feats_test.apply_preprocessor()
+    preprocessor.fit(feats_train)
+    feats_train = preprocessor.apply(feats_train)
+    feats_test = preprocessor.apply(feats_test)
 
     kernel=Chi2Kernel(feats_train, feats_train, width, size_cache)
4 changes: 2 additions & 2 deletions examples/undocumented/python/preprocessor_pca.py
@@ -13,8 +13,8 @@ def preprocessor_pca (data):
     features = RealFeatures(data)
 
     preprocessor = PCA()
-    preprocessor.init(features)
-    preprocessor.apply_to_feature_matrix(features)
+    preprocessor.fit(features)
+    features = preprocessor.apply(features)
 
     return features
8 changes: 3 additions & 5 deletions examples/undocumented/python/preprocessor_prunevarsubmean.py
@@ -16,11 +16,9 @@ def preprocessor_prunevarsubmean (fm_train_real=traindat,fm_test_real=testdat,wi
     feats_test=RealFeatures(fm_test_real)
 
     preproc=PruneVarSubMean()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
+    feats_test = preproc.apply(feats_test)
 
     kernel=Chi2Kernel(feats_train, feats_train, width, size_cache)
@@ -19,11 +19,9 @@ def preprocessor_randomfouriergausspreproc (fm_train_real=traindat,fm_test_real=
     feats_test=RealFeatures(fm_test_real)
 
     preproc=RandomFourierGaussPreproc()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
+    feats_test = preproc.apply(feats_test)
 
     kernel=Chi2Kernel(feats_train, feats_train, width, size_cache)
10 changes: 4 additions & 6 deletions examples/undocumented/python/preprocessor_sortulongstring.py
@@ -24,12 +24,10 @@ def preprocessor_sortulongstring (fm_train_dna=traindna,fm_test_dna=testdna,orde
     feats_test=StringUlongFeatures(charfeat.get_alphabet())
     feats_test.obtain_from_char(charfeat, order-1, order, gap, reverse)
 
-    preproc=SortUlongString()
-    preproc.init(feats_train)
-    feats_train.add_preprocessor(preproc)
-    feats_train.apply_preprocessor()
-    feats_test.add_preprocessor(preproc)
-    feats_test.apply_preprocessor()
+    preproc = SortUlongString()
+    preproc.fit(feats_train)
+    feats_train = preproc.apply(feats_train)
+    feats_test = preproc.apply(feats_test)
 
     kernel=CommUlongStringKernel(feats_train, feats_train, use_sign)