New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pipeline cookbook #4323
Add pipeline cookbook #4323
Conversation
vinx13
commented
Jun 2, 2018
•
edited
edited
- add pipeline meta example and cookbook
- add transformer factory
Pipeline | ||
======== | ||
|
||
Pipeline is a machine that chains multiple transformers and machines. It consists of a sequence of transformers as intermediate stages and a machine as the final stage. Features are transformed by transformers and fed into the next stage sequentially. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add sgclass
to link to the API docs
Pipeline | ||
======== | ||
|
||
Pipeline is a machine that chains multiple transformers and machines. It consists of a sequence of transformers as intermediate stages and a machine as the final stage. Features are transformed by transformers and fed into the next stage sequentially. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you start every sentence in a new line? I.e. newline after each ".", this makes subsequent diffs way easier to review
Pipeline | ||
======== | ||
|
||
Pipeline is a machine that chains multiple transformers and machines. It consists of a sequence of transformers as intermediate stages and a machine as the final stage. Features are transformed by transformers and fed into the next stage sequentially. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"stage" of what? Maybe add "stage of training"
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
:glob: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
|
||
# additional integration variables | ||
#![extract_centers_and_radius] | ||
RealMatrix c = kmeans.get_cluster_centers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be done with the generic get
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, this is not a parameter :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
Ok, one of the next things I want to do is to register callback functions with get
so that we can lazily evaluate things, like in this case
src/shogun/machine/Pipeline.cpp
Outdated
REQUIRE( | ||
m_stages.empty() || | ||
holds_alternative<CTransformer*>(m_stages.back()), | ||
"Transformers can not be placed after machines.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you maybe make this "Last element in pipeline is %s", and then put the machine name there?
src/shogun/machine/Pipeline.cpp
Outdated
REQUIRE( | ||
m_stages.empty() || | ||
holds_alternative<CTransformer*>(m_stages.back()), | ||
"Multiple machines are added to pipeline.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same story, pls print the name of the last added machine so the user knows what she did wrong
src/shogun/machine/Pipeline.cpp
Outdated
{ | ||
REQUIRE( | ||
!m_stages.empty() && holds_alternative<CMachine*>(m_stages.back()), | ||
"Machine has not been added.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you be more elaborate here as well
"Pipline cannot be trained without an added machine. Last element is %s"
src/shogun/machine/Pipeline.cpp
Outdated
CLabels* CPipeline::apply(CFeatures* data) | ||
{ | ||
REQUIRE( | ||
!m_stages.empty() && holds_alternative<CMachine*>(m_stages.back()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above. user facing error messages need to be more explicit
src/shogun/machine/Pipeline.cpp
Outdated
{ | ||
REQUIRE( | ||
!m_stages.empty() && index < m_stages.size() - 1, | ||
"Index out of bound. There are only %d transformers.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print the index as well
"Requested index (%d) out of bounds " ...
src/shogun/machine/Pipeline.h
Outdated
|
||
namespace shogun | ||
{ | ||
class CPipeline : public CMachine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you write some minimal class docs?
|
||
EXPECT_EQ(pipeline->get_transformer(1), transformer2.get()); | ||
EXPECT_EQ(pipeline->get_machine(), machine.get()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great set of tests!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
I made some comments on user facing parts of this
b7d2ff9
to
f211b65
Compare
f211b65
to
0fa4faa
Compare
|
||
#![create_pipeline] | ||
Pipeline pipeline() | ||
pipeline.with(subMean) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs fixing, because of reserved keyword
#![create_pipeline] | ||
Pipeline pipeline() | ||
pipeline.with(subMean) | ||
pipeline.with(pca) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change it
|
and in java
|
* Add transformer factory * Add pipeline cookbook * Fix swig include
* Add transformer factory * Add pipeline cookbook * Fix swig include
* Add transformer factory * Add pipeline cookbook * Fix swig include
* Add transformer factory * Add pipeline cookbook * Fix swig include
* Add transformer factory * Add pipeline cookbook * Fix swig include