using pca in pipeline #116
Comments
Hey,

At this time, in the master branch, PrincipalComponentAnalysis can't be used directly as a feature extraction module. To use it for feature extraction, you need to run it outside of the pipeline and then pass its output to the pipeline as input.

To help improve this, I've added a new PCA module to the toolkit, which allows the PrincipalComponentAnalysis algorithm to be used directly within a pipeline as a feature extraction module.

You can find this new PCA module in the dev branch: https://github.com/nickgillian/grt/tree/dev/GRT/FeatureExtractionModules/PCA

You can find an example of how to use it here: https://github.com/nickgillian/grt/blob/dev/examples/FeatureExtractionModules/PCAPipelineExample/PCAPipelineExample.cpp

I still need to test this fully, so there may be bugs/issues (which is why it is still in the dev branch and not merged with master).

One note: there is currently a hack in how you need to use the PCA module. You need to train the PCA module before you can use it, so you have to add the PCA module to the pipeline, then access a pointer to the PCA module from the pipeline, then train the PCA model with your dataset. You can see this hack in the example above. This is bad for two reasons:

1. It means you can't have any module before the PCA module in the pipeline (because the data will not be passed through that module when training the PCA module).
2. The coding flow is rather ugly (as you need to add the module, then get a pointer to it, then manually train it).
I'm working on improving this so that you can add multiple modules to the pipeline before PCA, add a classifier after PCA, and then, when you call pipeline.train(data), the pipeline will automatically iterate through all the modules, pipe the data recursively through each stage, train the feature modules (like PCA), and finally train the classifier at the end of the pipeline. For now, you will need to do this manually.
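To make the planned behaviour concrete, here is a minimal, self-contained sketch of stage-by-stage training. It is not GRT code: `Stage`, `MeanCenter`, `Scale`, `Pipeline` and `runExample` are hypothetical names invented for illustration, `MeanCenter` is only a stand-in for a trainable feature module such as PCA, and the loop is iterative rather than recursive.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Sample = std::vector<double>;
using Dataset = std::vector<Sample>;

// Hypothetical stage interface; this is NOT the GRT FeatureExtraction class.
struct Stage {
    virtual ~Stage() = default;
    virtual void train(const Dataset& data) = 0;      // learn parameters from data
    virtual Sample apply(const Sample& x) const = 0;  // transform one sample
};

// Stand-in for a trainable module such as PCA: it only learns the
// per-dimension mean and subtracts it (the centering step PCA begins with).
struct MeanCenter : Stage {
    Sample mean;
    void train(const Dataset& data) override {
        mean.assign(data.empty() ? 0 : data[0].size(), 0.0);
        for (const Sample& s : data)
            for (std::size_t i = 0; i < mean.size(); ++i) mean[i] += s[i];
        for (double& m : mean) m /= static_cast<double>(data.size());
    }
    Sample apply(const Sample& x) const override {
        Sample out(x);
        for (std::size_t i = 0; i < out.size() && i < mean.size(); ++i) out[i] -= mean[i];
        return out;
    }
};

// Stateless stage: multiplies every value by a fixed factor.
struct Scale : Stage {
    double factor;
    explicit Scale(double f) : factor(f) {}
    void train(const Dataset&) override {}  // nothing to learn
    Sample apply(const Sample& x) const override {
        Sample out(x);
        for (double& v : out) v *= factor;
        return out;
    }
};

struct Pipeline {
    std::vector<std::unique_ptr<Stage>> stages;

    // Train each stage on the output of the stages before it, then push the
    // data through that stage so the next one sees transformed data.
    Dataset train(Dataset data) {
        for (auto& stage : stages) {
            stage->train(data);
            for (Sample& s : data) s = stage->apply(s);
        }
        return data;  // what a classifier at the end would be trained on
    }
};

// Scale sits before MeanCenter, so the learned mean reflects the scaled data.
inline Dataset runExample() {
    Pipeline p;
    p.stages.push_back(std::unique_ptr<Stage>(new Scale(2.0)));
    p.stages.push_back(std::unique_ptr<Stage>(new MeanCenter()));
    return p.train({{1.0, 2.0}, {3.0, 6.0}});
}
```

The point of the sketch is the ordering inside `Pipeline::train`: because each stage is trained on already-transformed data, a trainable module no longer has to be the first module in the pipeline.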
Nice, thanks! It makes sense to train it beforehand.
I will try it soon.
On Sun, Mar 26, 2017 at 8:39 PM, Nicholas Gillian wrote:
Hey,
At this time in the master branch, the PrincipalComponentAnalysis can't be
used directly as a feature extraction module. To use it for feature
extraction, you need to run the feature extraction outside of the pipeline,
and then input the output of the PrincipalComponentAnalysis module as input
to the pipeline.
To help improve this, I've added a new PCA module to the toolkit which
allows the PrincipalComponentAnalysis algorithm to be used directly within
a pipeline as a feature extraction module.
You can find this new PCA module in the dev branch:
https://github.com/nickgillian/grt/tree/dev/GRT/FeatureExtractionModules/PCA
You can find an example of how to use this here:
https://github.com/nickgillian/grt/blob/dev/examples/FeatureExtractionModules/PCAPipelineExample/PCAPipelineExample.cpp
I still need to test this fully, so there may be bugs/issues (which is why
it is still in the dev branch and not merged with master).
One note, there is currently a hack with how you need to use PCA module.
This is because you need to train the PCA module before you can use it, so
this requires you to add the PCA module to the pipeline, then access a
pointer to the PCA module from the pipeline, then train the PCA model with
your dataset. You can see this hack in the example above. This is bad for
two reasons:
1. It means you can't have any module before the PCA module in the
pipeline (because the data will not be pumped through this module for
training the PCA module)
2. The coding flow is rather ugly (as you need to add the module, then
get a pointer to it, then manually train it).
I'm working on improving this to enable you to add multiple modules to the
pipeline before PCA, add a classifier after PCA, and then when you call
pipeline.train(data) the pipeline will automatically iterate through all
the modules, pipe the data recursively through each stage, train the
feature modules (like PCA) and then finally train the classifier at the end
of the pipeline. For now, you will need to do this manually.
Hi,
Can PCA be used in a pipeline as a feature extractor?
It is nominally in the Feature Extraction examples, but when I try to add it to the pipeline via pipeline.addFeatureExtractionModule(pca); I get this error: no known conversion for argument 1 from ‘GRT::PrincipalComponentAnalysis’ to ‘const GRT::FeatureExtraction&’.
Also, can PCA be trained on TimeSeriesClassificationData, or only on matrix-type data?
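For readers hitting the same compiler message: an error of this shape generally means the argument's class has no conversion to the parameter's base class, typically because it does not derive from it. The sketch below reproduces that situation with hypothetical stand-in types (`Extractor`, `StandaloneProjector`, `ProjectorModule`, `runModule`); none of them are GRT classes, and the adapter is only one possible way to bridge the gap, not necessarily how the toolkit does it.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins; none of these are GRT classes.

// Plays the role of the base class that a pipeline accepts.
struct Extractor {
    virtual ~Extractor() = default;
    virtual std::vector<double> compute(const std::vector<double>& x) const = 0;
};

// An algorithm class that does NOT derive from Extractor.
struct StandaloneProjector {
    double scale = 0.5;
    std::vector<double> project(const std::vector<double>& x) const {
        std::vector<double> out(x);
        for (double& v : out) v *= scale;
        return out;
    }
};

// Thin adapter that DOES derive from Extractor and forwards to the algorithm.
struct ProjectorModule : Extractor {
    StandaloneProjector impl;
    std::vector<double> compute(const std::vector<double>& x) const override {
        return impl.project(x);
    }
};

// Accepts only arguments that convert to const Extractor&.
inline std::vector<double> runModule(const Extractor& m, const std::vector<double>& x) {
    return m.compute(x);
}

// Passing a StandaloneProjector to runModule would fail to compile with a
// "no known conversion" style error; the adapter converts without trouble.
static_assert(!std::is_convertible<StandaloneProjector, const Extractor&>::value,
              "unrelated class must not convert to the base reference");
static_assert(std::is_convertible<ProjectorModule, const Extractor&>::value,
              "derived adapter converts to the base reference");
```

Calling `runModule(ProjectorModule(), {2.0, 4.0})` compiles and halves each value, whereas the same call with a `StandaloneProjector` is rejected at compile time.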