Add pipeline #4317

Merged
merged 16 commits into from Jun 13, 2018

Conversation

@vinx13
Member

@vinx13 vinx13 commented May 30, 2018

No description provided.

Member

@vigsterkr vigsterkr left a comment

minor stuff

{
if (holds_alternative<CTransformer*>(stage))
{
CTransformer* transformer = shogun::get<CTransformer*>(stage);

@vigsterkr

vigsterkr May 30, 2018
Member

oooootoooo, auto :)

if (holds_alternative<CTransformer*>(stage))
{
CTransformer* transformer = shogun::get<CTransformer*>(stage);
if (transformer->train_require_labels())

@vigsterkr

vigsterkr May 30, 2018
Member

this could be a simple ternary expression:

transformer->train_require_labels() ? transformer->fit(data, m_labels) : transformer->fit(data);
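
A minimal sketch of the two suggestions above (auto for the extracted pointer, a ternary for picking the fit overload). It is written against std::variant from C++17 rather than Shogun's own variant, and Features, Labels, Transformer, Machine, Stage and run_stage are hypothetical stand-ins, not the real Shogun classes:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-ins for CFeatures, CLabels, CTransformer and CMachine.
struct Features {};
struct Labels {};

struct Transformer
{
    bool needs_labels = false;
    std::string last_call;

    bool train_require_labels() const { return needs_labels; }
    void fit(Features*) { last_call = "fit(data)"; }
    void fit(Features*, Labels*) { last_call = "fit(data, labels)"; }
};

struct Machine
{
    bool trained = false;
    void train(Features*) { trained = true; }
};

using Stage = std::variant<Transformer*, Machine*>;

// Fit or train one stage; the ternary selects the fit overload.
inline void run_stage(const Stage& stage, Features* data, Labels* labels)
{
    assert(data != nullptr);
    if (std::holds_alternative<Transformer*>(stage))
    {
        auto transformer = std::get<Transformer*>(stage);
        transformer->train_require_labels()
            ? transformer->fit(data, labels)
            : transformer->fit(data);
    }
    else
    {
        auto machine = std::get<Machine*>(stage);
        machine->train(data);
    }
}
```

Both branches of the ternary have type void, which C++ allows, so the expression is used purely as a statement.
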
}
}

return NULL; // unreachable

@vigsterkr

vigsterkr May 30, 2018
Member

nullptr

{
if (holds_alternative<CTransformer*>(stage))
{
CTransformer* transformer = shogun::get<CTransformer*>(stage);

}
else
{
CMachine* machine = shogun::get<CMachine*>(stage);

protected:
virtual bool train_machine(CFeatures* data = NULL) override;

std::vector<variant<CTransformer*, CMachine*>> m_stages;

@vigsterkr

vigsterkr May 30, 2018
Member

sadly we won't be able to register this for a while :)
i.e. no serialization :D

}
else
{
CMachine* machine = shogun::get<CMachine*>(stage);

@vinx13 vinx13 force-pushed the vinx13:feature/pipeline branch from f71e7a3 to 62ab551 May 31, 2018
@vigsterkr
Member

@vigsterkr vigsterkr commented May 31, 2018

@vinx13 would be great to add a simple xval cookbook where we use a CPruneVarSubMean transformer before doing PCA and applying k-means on toy data :D

@vigsterkr
Member

@vigsterkr vigsterkr commented May 31, 2018

  • adding a unit test would be great
@vigsterkr
Member

@vigsterkr vigsterkr commented May 31, 2018

aaaah and another thing: getter methods for the pipeline elements would be desirable. i.e. get the trained transformer etc.

@vinx13 vinx13 force-pushed the vinx13:feature/pipeline branch 2 times, most recently from e260a20 to 860a9ed Jun 1, 2018
@vinx13 vinx13 mentioned this pull request Jun 2, 2018
return require_labels;
}

void CPipeline::list_stages() const

@vigsterkr

vigsterkr Jun 4, 2018
Member

how about renaming this to

std::string to_string() const;

and basically return a string instead of directly writing SG_INFO.... that of course means that SGObject::to_string needs to be virtual

@karlnapf

karlnapf Jun 4, 2018
Member

I like this more as well! users can then print the strings themselves
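
A rough sketch of that idea, assuming a virtual to_string() on the base class. Object stands in for SGObject and add_stage is a hypothetical helper; the real classes have a much larger interface:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for SGObject with a virtual to_string().
class Object
{
public:
    virtual ~Object() = default;
    virtual std::string to_string() const { return "Object"; }
};

class Pipeline : public Object
{
public:
    // Hypothetical helper: only the stage names are stored in this sketch.
    void add_stage(std::string name)
    {
        assert(!name.empty());
        m_stage_names.push_back(std::move(name));
    }

    // Build the description and return it; the caller decides where it goes.
    std::string to_string() const override
    {
        std::ostringstream out;
        out << "Pipeline(";
        for (std::size_t i = 0; i < m_stage_names.size(); ++i)
            out << (i ? ", " : "") << m_stage_names[i];
        out << ")";
        return out.str();
    }

private:
    std::vector<std::string> m_stage_names;
};
```

Because the method returns a string instead of logging, callers can print it, log it, or compare it in a test.
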

}
}

CTransformer* CPipeline::get_transformer(size_t index) const

@vigsterkr

vigsterkr Jun 4, 2018
Member

maybe instead of accessing the pipeline elements like this it'd be good to have a way to access them by name? see for example http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html#evaluation-of-the-performance-on-the-test-set
of course this would mean being able to add elements as a tuple, say

Pipeline().with(("customName", ZeroMean)).with....

but we could support both, meaning either having

Pipeline().with(("customName", ZeroMean)).with....

and

Pipeline().with(ZeroMean).with....

and in the 2nd case we use the get_name() string...
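
One way the two forms could coexist, sketched with two overloads that take the name and the stage as separate arguments rather than as a tuple. Transformer and Pipeline here are hypothetical stand-ins, and throwing std::invalid_argument on a missing name is an assumption of this sketch:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Shogun transformer.
struct Transformer
{
    std::string name;
    const std::string& get_name() const { return name; }
};

class Pipeline
{
public:
    // Register a stage under an explicit, user-chosen name.
    Pipeline& with(std::string name, Transformer* stage)
    {
        assert(stage != nullptr);
        m_stages.emplace_back(std::move(name), stage);
        return *this;
    }

    // Register a stage under its own get_name() string.
    Pipeline& with(Transformer* stage)
    {
        assert(stage != nullptr);
        return with(stage->get_name(), stage);
    }

    // Look a stage up by name; throws if no stage has that name.
    Transformer* get_transformer(const std::string& name) const
    {
        for (const auto& stage : m_stages)
            if (stage.first == name)
                return stage.second;
        throw std::invalid_argument(
            "Transformer with name " + name + " not found.");
    }

private:
    std::vector<std::pair<std::string, Transformer*>> m_stages;
};
```

With this, with("customName", &zero_mean) and with(&pca) can be chained, and get_transformer("PCA") finds the second stage through its default name.
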

@vigsterkr vigsterkr force-pushed the shogun-toolbox:feature/transformers branch from 45162b6 to 8ac8703 Jun 4, 2018
@vinx13 vinx13 force-pushed the vinx13:feature/pipeline branch from 23de4fd to 30fcfdd Jun 4, 2018
void CPipeline::check_pipeline()
{
REQUIRE(!m_stages.empty(), "Pipeline is empty");
REQUIRE(

@vigsterkr

vigsterkr Jun 4, 2018
Member

i wonder if we should start using typed exceptions, because for now EVERYTHING is ShogunException. things like MachineNotTrainedException etc....
@lisitsyn @karlnapf mm?

@vigsterkr

vigsterkr Jun 4, 2018
Member

note, that for now the only way to infer the reason of exception is actually parsing the exception message... that's not really 'nice' from an automation/developer's perspective

@karlnapf

karlnapf Jun 6, 2018
Member

Actually, this was part of the "user experience" GSoC project. So yes makes total sense imo

return shogun::get<CTransformer*>(stage.second);
}

SG_ERROR("Transformer with name %s not found.\n", name.c_str());

@vigsterkr

vigsterkr Jun 4, 2018
Member

imo std::invalid_argument would be more appropriate here...
but it's the same in a lot of the REQUIRE() cases... namely, a more specific type of exception would be much more useful for a shogun user, especially if it's getting integrated in a larger framework.

pipeline->with(transformer1);

auto features = some<NiceMock<MockCFeatures>>();
EXPECT_THROW(pipeline->train(features), ShogunException);

@vigsterkr

vigsterkr Jun 4, 2018
Member

the need for a more specific type of exception came up in my head here atm, because expecting there ShogunException is too broad.... it could actually come from a transformer or the machine in the pipeline... right? @karlnapf @lisitsyn @vinx13

@vigsterkr vigsterkr force-pushed the shogun-toolbox:feature/transformers branch from b7d2ff9 to f211b65 Jun 8, 2018
@vinx13 vinx13 force-pushed the vinx13:feature/pipeline branch from 8e04386 to 394083d Jun 8, 2018
* whenever an object in Shogun in invalid state is used. For example, a machine
* is used before training.
*/
class InvalidStateException : public ShogunException

@vigsterkr

vigsterkr Jun 8, 2018
Member

i would define these in separate headers and cpp files :)

@vigsterkr

vigsterkr Jun 8, 2018
Member

and in the case of "a machine is used before training" i would call that exception MachineNotTrainedException as invalid state could be a lot of things and the not trained case i believe should be explicitly spelled out for the user :)
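
A small sketch of how such a hierarchy could look and why it helps callers. BaseException stands in for ShogunException, and Machine and classify_apply are hypothetical; only the two derived exception names come from this thread:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ShogunException.
class BaseException : public std::runtime_error
{
public:
    explicit BaseException(const std::string& msg) : std::runtime_error(msg) {}
};

class InvalidStateException : public BaseException
{
public:
    explicit InvalidStateException(const std::string& msg) : BaseException(msg) {}
};

// The "used before training" case gets its own, more specific type.
class MachineNotTrainedException : public InvalidStateException
{
public:
    explicit MachineNotTrainedException(const std::string& msg)
        : InvalidStateException(msg) {}
};

struct Machine
{
    bool trained = false;
    void train() { trained = true; }
    int apply() const
    {
        if (!trained)
            throw MachineNotTrainedException("Machine has not been trained.");
        return 0;
    }
};

// Classify a failure by exception type instead of parsing its message.
inline std::string classify_apply(const Machine& machine)
{
    try
    {
        machine.apply();
        return "ok";
    }
    catch (const MachineNotTrainedException&) { return "not trained"; }
    catch (const InvalidStateException&) { return "invalid state"; }
    catch (const BaseException&) { return "other"; }
}

// Usage: an untrained machine is reported by type, a trained one succeeds.
inline void example()
{
    Machine machine;
    assert(classify_apply(machine) == "not trained");
    machine.train();
    assert(classify_apply(machine) == "ok");
}
```

Code that only cares about the broad category can still catch InvalidStateException or the base type, since the specific exception derives from both.
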

@vigsterkr
Member

@vigsterkr vigsterkr commented Jun 11, 2018

@lisitsyn @karlnapf while trying to add the pipeline to the SWIG interface, it turned out that with is a reserved keyword in Python. see the design from 2017 ws: https://github.com/shogun-toolbox/shogun/wiki/Hackathon-2017-base-api#pipeline

how about changing Pipeline.with(Transformer) to Pipeline.over(Transformer)? add and as are kind of taken by SGObject; imo as could still fly, but add would be matched. i'm wondering whether then would be a reserved keyword in any of the languages we support?

and another idea here: imo we should have a Pipeline that is inherited from CMachine, but the builder should be a separate base class, PipelineBuilder.

@vinx13 i would still like to have a builder that supports adding a list of elements. something like: add_stages([transf1, transf1, machine])
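
A sketch of what a separate builder could look like, using over() as the chaining method and add_stages() for a list of elements. Stage, Pipeline, PipelineBuilder::build and the accessor names are hypothetical; in the proposal the pipeline itself would inherit from CMachine, which is left out here:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pipeline element (transformer or machine).
struct Stage
{
    std::string name;
};

// The finished pipeline only holds its stages in this sketch.
class Pipeline
{
public:
    explicit Pipeline(std::vector<Stage*> stages) : m_stages(std::move(stages)) {}
    std::size_t num_stages() const { return m_stages.size(); }
    const Stage* stage(std::size_t i) const { return m_stages.at(i); }

private:
    std::vector<Stage*> m_stages;
};

class PipelineBuilder
{
public:
    // "over" instead of "with", which is a reserved keyword in Python.
    PipelineBuilder& over(Stage* stage)
    {
        assert(stage != nullptr);
        m_stages.push_back(stage);
        return *this;
    }

    // Add several stages at once, preserving their order.
    PipelineBuilder& add_stages(std::initializer_list<Stage*> stages)
    {
        for (Stage* stage : stages)
            over(stage);
        return *this;
    }

    // Hand the collected stages over to a new pipeline.
    Pipeline build() const { return Pipeline(m_stages); }

private:
    std::vector<Stage*> m_stages;
};
```

Keeping the builder separate means the pipeline object never exposes the chaining methods, so only the builder's method names have to avoid keywords in the target languages.
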

@vigsterkr vigsterkr force-pushed the shogun-toolbox:feature/transformers branch from f211b65 to 0fa4faa Jun 11, 2018
@vinx13 vinx13 force-pushed the vinx13:feature/pipeline branch from a869313 to 7d480f5 Jun 12, 2018
@vigsterkr
Member

@vigsterkr vigsterkr commented Jun 12, 2018

@karlnapf we are getting travis problems here as well..... this branch is rebased over HEAD of develop, but octave is throwing this error

error: in method 'Evaluation_evaluate', argument 2 of type 'shogun::CLabels *' (SWIG_TypeError)

error: called from

    /opt/shogun/build/examples/meta/octave/multiclass/cartree.m at line 39 column 10
error: in method 'Evaluation_evaluate', argument 2 of type 'shogun::CLabels *' (SWIG_TypeError)

error: called from

    /opt/shogun/build/examples/meta/octave/binary/linear_support_vector_machine.m at line 42 column 10
@vinx13
Member Author

@vinx13 vinx13 commented Jun 12, 2018

the error also happens on python

  4/297 Test #238: generated_python-binary-linear_support_vector_machine ...................................***Failed    0.24 sec

Traceback (most recent call last):

  File "/opt/shogun/build/examples/meta/python/binary/linear_support_vector_machine.py", line 50, in <module>

    accuracy = eval.evaluate(labels_predict, labels_test)

TypeError: in method 'Evaluation_evaluate', argument 2 of type 'shogun::CLabels *'

swig/python detected a memory leak of type 'CBinaryLabels *', no destructor found.
@vigsterkr vigsterkr merged commit ad75959 into shogun-toolbox:feature/transformers Jun 13, 2018
0 of 2 checks passed
continuous-integration/appveyor/pr AppVeyor build failed
continuous-integration/travis-ci/pr The Travis CI build is in progress
vigsterkr added a commit that referenced this pull request Jun 28, 2018
Add initial pipeline implementation
Add InvalidStateException and use it in pipeline
Move exception to shogun/lib/exception
Add MachineNotTrainedException
vigsterkr added a commit that referenced this pull request Jul 10, 2018
Add initial pipeline implementation
Add InvalidStateException and use it in pipeline
Move exception to shogun/lib/exception
Add MachineNotTrainedException
vigsterkr added a commit that referenced this pull request Jul 12, 2018
Add initial pipeline implementation
Add InvalidStateException and use it in pipeline
Move exception to shogun/lib/exception
Add MachineNotTrainedException
ktiefe added a commit to ktiefe/shogun that referenced this pull request Jul 30, 2019
Add initial pipeline implementation
Add InvalidStateException and use it in pipeline
Move exception to shogun/lib/exception
Add MachineNotTrainedException

4 participants