
Refactor GridSearch and parameter tree #4598

Closed
wants to merge 15 commits

Conversation

@gf712 (Member) commented Mar 31, 2019

Proposed API (in python):

OPTION 1: change nested object parameter values with a string

import numpy as np   
import shogun as sg   
kernel = sg.kernel("GaussianKernel")   
svm = sg.machine("LibSVM", kernel=kernel)   
ps = sg.GridParameters(svm) 
ps.attach("kernel::log_width", np.array([1., 2.])) 
ps.attach("kernel::combined_kernel_weight", np.array([1., 2.])) 
ps.attach("C1", np.array([1., 2.])) 
ps.attach("C2", np.array([1., 2.])) 
for x in range(16):
    print(ps.next_combination()) 
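For reference, the 16 iterations in option 1 correspond to the cartesian product of the four attached grids (2 × 2 × 2 × 2). A minimal standalone sketch of that counting, using only the standard library (no shogun; parameter names mirror the example above):

```python
# Sketch: option 1's combinations are the cartesian product of the
# four attached parameter grids, so 2 * 2 * 2 * 2 = 16 in total.
from itertools import product

grid = {
    "kernel::log_width": [1.0, 2.0],
    "kernel::combined_kernel_weight": [1.0, 2.0],
    "C1": [1.0, 2.0],
    "C2": [1.0, 2.0],
}

# one dict per combination, in next_combination() style order
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]
assert len(combinations) == 16
```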

OPTION 2: change nested object parameters by manipulating the nested node directly, before adding it to the tree

import numpy as np   
import shogun as sg   
kernel = sg.kernel("GaussianKernel")   
svm = sg.machine("LibSVM")
ps = sg.GridParameters(svm) 
kernel1 = sg.kernel("PolyKernel") 
node1 = sg.GridParameters(kernel1) 
node1.attach("degree", np.array([1.,2,3])) 
ps.attach("kernel", node1) 
ps.attach("C1", np.array([1., 2.])) 
ps.attach("C2", np.array([1., 2.]))
for x in range(12):
    print(ps.next_combination())

OPTIONS 1 and 2 combined: both options can be mixed, but option 1 becomes unavailable when several nodes represent a single object, i.e. kernel=[GaussianKernel, PolyKernel]

import numpy as np 
import shogun as sg 

kernel1 = sg.kernel("GaussianKernel")  
kernel2 = sg.kernel("PolyKernel") 
svm = sg.machine("LibSVM", kernel=kernel1)

node = sg.GridParameters(svm)
node.attach("kernel::log_width", np.array([1.]))
node.attach("C1", np.array([1., 2., 3.])) 
node.attach("C2", np.array([1., 2., 3.])) 


node2=sg.GridParameters(kernel2)
node2.attach("degree", np.array([1.,2.]))

node.attach("kernel", node2)

for x in range(20):
    print(node.next_combination()) 
  • implement option 1
  • implement option 2
  • implement GridSearch engine
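One way to count the combinations in the combined example, assuming the intended semantics are "alternative sub-nodes add, independent parameters multiply" (an assumption about the design above, not shogun code; all names are illustrative):

```python
# Sketch: counting combinations in a parameter tree where one
# parameter slot ("kernel") can hold several alternative sub-nodes.
# Assumed semantics: alternatives add, independent parameters multiply.
def count_combinations(node):
    total = 1
    for values in node.get("params", {}).values():
        total *= len(values)
    for alternatives in node.get("children", {}).values():
        total *= sum(count_combinations(alt) for alt in alternatives)
    return total

tree = {
    "params": {"C1": [1.0, 2.0, 3.0], "C2": [1.0, 2.0, 3.0]},
    "children": {
        "kernel": [
            {"params": {"log_width": [1.0]}},    # GaussianKernel node
            {"params": {"degree": [1.0, 2.0]}},  # PolyKernel node
        ]
    },
}
print(count_combinations(tree))  # 3 * 3 * (1 + 2) = 27
```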

@karlnapf (Member) commented Apr 1, 2019

Would it be possible to merge GridParameters and GridSearch?

@gf712 (Member, Author) commented Apr 1, 2019

Working on that bit! But yes, that would work; I called it ModelSelectionTree for now

@gf712 gf712 force-pushed the parameter_search_refactor branch from 3495ea8 to a1bbf1d Compare April 1, 2019 14:13

private:
std::stringstream* m_stream;
};
gf712 (Member Author):

Reminder to self, once #4594 is done can put this back into place and use SGVector to_string method

Member:

merged :)

* @param ss
* @param visitor
*/
void to_string_helper(
gf712 (Member Author):

and when #4594 is done also refactor this

@karlnapf (Member) commented Apr 1, 2019

> Working on that bit! But yes that would work, I called it ModelSelectionTree for now

Actually I think GridSearch is a pretty good name :) No need to make "tree" part of it imo

@karlnapf (Member) commented Apr 1, 2019

I think you forgot to update the gpl submodule ...

@gf712 (Member, Author) commented Apr 1, 2019

@karlnapf this is how I see it working for now:

import numpy as np 
import shogun as sg 
kernel = sg.kernel("GaussianKernel") 
svm = sg.machine("LibSVM", kernel=kernel) 
ps = sg.ModelSelectionTree(svm) 
ps.attach("GaussianKernel::log_width", np.array([1., 2., 3.])) 
ps.attach("C1", np.array([1., 2., 3.])) 
ps.next_combination()
ps.next_combination()
...

And next_combination is called within GridSearch, so we would have:

import numpy as np 
import shogun as sg 
kernel = sg.kernel("GaussianKernel") 
svm = sg.machine("LibSVM", kernel=kernel) 
gs = sg.GridSearch(svm) 
gs.attach("GaussianKernel::log_width", np.array([1., 2., 3.])) 
gs.attach("C1", np.array([1., 2., 3.]))
gs.train(...)

Working on the second part now, where multiple kernels can be added.
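The flow sketched above (attach grids, then let train drive next_combination internally) could look roughly like this. This is a hypothetical standalone sketch of the idea, not the actual shogun API; all class and callback names are illustrative:

```python
# Hypothetical sketch of what a GridSearch.train loop could do:
# iterate every combination, build/configure a machine for each,
# evaluate it, and keep the best-scoring parameter set.
from itertools import product

class GridSearchSketch:
    def __init__(self, machine_factory, evaluate):
        self._grid = {}
        self._machine_factory = machine_factory  # builds a configured machine
        self._evaluate = evaluate                # returns a score, higher = better

    def attach(self, name, values):
        self._grid[name] = list(values)

    def train(self):
        best_score, best_params = float("-inf"), None
        names = list(self._grid)
        for values in product(*self._grid.values()):
            params = dict(zip(names, values))
            machine = self._machine_factory(params)
            score = self._evaluate(machine)
            if score > best_score:
                best_score, best_params = score, params
        return best_params, best_score

# toy usage: the "machine" is just the params dict, and the
# evaluation prefers C1 close to 2.0
gs = GridSearchSketch(lambda p: p, lambda m: -abs(m["C1"] - 2.0))
gs.attach("C1", [1.0, 2.0, 3.0])
best, score = gs.train()
assert best == {"C1": 2.0}
```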

@@ -10,6 +10,8 @@
%newobject CParameterCombination::leaf_sets_multiplication();
%newobject CModelSelectionParameters::get_combinations();
%newobject CModelSelectionParameters::get_single_combination();
%newobject ParameterNode::attach();
Member:

I get that we need attach, but why would we need to call next_combination from swig?

gf712 (Member Author):

that's just for debugging right now

@@ -300,5 +300,17 @@ PUT_ADD(CTokenizer)
%template(kernel) kernel<float64_t, float64_t>;
%template(features) features<float64_t>;

%template(attach) ParameterNode::attach<int32_t>;
Member:

best way to test this is to add a meta example

struct is_sg_matrix<T> : public std::true_type \
{ \
};
#define SG_ADD_TYPE(T, type_) \
Member:

maybe do this in a separate PR?

m_current_node = m_nodes.begin();
}

void ModelSelectionTree::reset()
Member:

what is that needed for?

gf712 (Member Author):

resets the internal state of a node, so that it starts again from the beginning when called from the parent node.
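That reset behaviour is the classic odometer pattern for nested iteration: when a child exhausts its values it resets, and the parent advances. A small standalone sketch of the pattern (not the actual ModelSelectionTree code; names are illustrative):

```python
# Sketch of the odometer pattern behind reset(): when a node's
# values are exhausted, it resets to its first state so the parent
# can advance its own value and restart the child.
class NodeSketch:
    def __init__(self, values):
        self._values = values
        self._i = 0

    def current(self):
        return self._values[self._i]

    def advance(self):
        """Move to the next value; on wrap-around, reset and return False."""
        self._i += 1
        if self._i == len(self._values):
            self.reset()
            return False
        return True

    def reset(self):
        self._i = 0

def combinations(nodes):
    while True:
        yield tuple(n.current() for n in nodes)
        # advance like an odometer: rightmost node first
        for node in reversed(nodes):
            if node.advance():
                break
        else:
            return  # every node wrapped: all combinations emitted

out = list(combinations([NodeSketch([1, 2]), NodeSketch(["a", "b", "c"])]))
assert len(out) == 6
```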


tree->attach(
param, *any_cast<decltype(val.begin())>(m_param_iter[param]));
SG_PRINT("CURRENT: %s, PARAM: %s\n", m_current_param.c_str(), param.c_str())
Member:

you can make those prints DEBUG and then leave them in actually, might be useful for future debugging?

gf712 (Member Author):

yup, I'll do that at the end; I just don't want to set it to debug level yet because then it becomes too verbose

* @param name
* @param node
*/
ParameterNode* attach(
Member:

I would hide most things from swig in here, e.g. this one

gf712 (Member Author):

This will be needed to add another GridSearch instance

ParameterNode* attach(const std::string& param, T value)
{
if (!set_param_helper(param, make_any(value)))
SG_SERROR("Could not attach %s", param.c_str());
Member:

if this is user facing, maybe put some more info in there? So if users chain multiple attachments, they know from the error message which one failed and can see what has been built so far?

namespace shogun
{

class ParameterNode : public CSGObject
Member:

I wonder ... does this need to be a subclass of SGObject? Would it ever need to be serialized? Would it ever need to be cloned/equal? Probably not, right?

namespace shogun
{

class ParameterNode : public CSGObject
Member:

It would be good to have some kind of high-level explanation of the parts involved here and how they play together, in a doxygen comment, for future development. Not now, but definitely later.

@karlnapf (Member) left a comment:

Say we wanted to do distributed grid searches. Would this design be more or less compatible? What about multi-core?

@karlnapf (Member) left a comment:

I think an example would be quite nice :)

Really cool that you pulled this off so quickly!

@gf712 (Member, Author) commented Apr 1, 2019

I will have to go through the reference to check how smart pointers and STL containers handle race conditions, but I think it should be fine

@karlnapf (Member) commented Apr 1, 2019

Also in terms of instantiating the combinations.... I would imagine that a central generator creates combinations which are then sent to workers to solve, and in particular multiple combinations need to be instantiated before it is clear which ones came back yet.... but ok, let's think about that later :)

@@ -613,6 +617,9 @@ class CSGObject
*/
#ifndef SWIG // SWIG should skip this part
std::map<std::string, std::shared_ptr<const AnyParameter>> get_params() const;

std::map<std::string, std::shared_ptr<const AnyParameter>> get_params(ParameterProperties) const;
Member:

what about having a default value that selects all.. i have this locally:
std::map<std::string, std::shared_ptr<const AnyParameter>> get_params(const ParameterProperties& p = ParameterProperties::ALL) const;

gf712 (Member Author):

yup, once that is merged I'll rebase. This will probably still take a while until it's done..

@gf712 (Member, Author) commented Apr 1, 2019

> Also in terms of instantiating the combinations.... I would imagine that a central generator creates combinations which are then sent to workers to solve, and in particular multiple combinations need to be instantiated before it is clear which ones came back yet.... but ok, let's think about that later :)

I guess we could keep a buffer that feeds the workers? So would always have a pool of at least N jobs left, if possible, where N >= number_of_workers.
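The buffered-generator idea could be sketched with a bounded queue that a central producer keeps filled while workers drain it. A hypothetical standard-library-only sketch, not shogun code (all names are illustrative):

```python
# Sketch of the buffered-generator idea: a producer keeps a bounded
# queue of pending parameter combinations that worker threads pull
# from, so there is always a pool of jobs ready.
import queue
import threading
from itertools import product

def run_grid(grids, evaluate, n_workers=4, buffer_size=8):
    jobs = queue.Queue(maxsize=buffer_size)  # the bounded job pool
    results = []
    lock = threading.Lock()

    def producer():
        names = list(grids)
        for values in product(*grids.values()):
            jobs.put(dict(zip(names, values)))
        for _ in range(n_workers):
            jobs.put(None)  # one poison pill per worker

    def worker():
        while (params := jobs.get()) is not None:
            score = evaluate(params)
            with lock:
                results.append((params, score))

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max(results, key=lambda r: r[1])  # best (params, score) pair

# toy usage: the evaluation prefers C1 close to 2.0
best = run_grid({"C1": [1.0, 2.0, 3.0]}, lambda p: -abs(p["C1"] - 2.0))
assert best[0]["C1"] == 2.0
```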

@gf712 gf712 force-pushed the parameter_search_refactor branch 7 times, most recently from 2d49dfb to 4d2c9df Compare April 2, 2019 12:15
eval_machine->get<CMachine*>("machine")->to_string().c_str())
}

// note that this may implicitly lock and unlock the machine
Member:

I think we should totally remove this concept of locking machines for now tbh. We could add that back later. It is also not something that would be sitting inside GridSearch, but rather in the cross-validation codes, since that is where the issue of precomputing stuff when re-training multiple times using the same parameters happens

SG_UNREF(result);
}

if (verbose)
Member:

Would remove this verbose option. We can soon just make it observable ...

@@ -312,9 +295,7 @@ namespace shogun
*/
CPipelineBuilder* pipeline()
{
auto result = new CPipelineBuilder();
Member:

this is also probably material for another PR, not this one :)

gf712 (Member Author):

yup, I just needed to fix this because it was causing "false positive" memory leaks

@karlnapf (Member) left a comment:

I made a whole bunch of comments. Most things are minor, but there is the thing with cloning the objects that makes me worry a bit.

@gf712 gf712 force-pushed the parameter_search_refactor branch from b354917 to a3d2d80 Compare April 17, 2019 12:10
c2_param[0] = 0.1
c2_param[1] = 1
c2_param[2] = 10
RealVector c1_param([0.1, 1.0, 10.0])
Member:

🚀

stale bot commented Feb 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 26, 2020
@gf712 (Member, Author) commented Feb 26, 2020

keeping it alive for another 180 days, might pick up during GSoC

@stale stale bot removed the stale label Feb 26, 2020
@gf712 (Member, Author) commented May 20, 2020

for reference, here is a much simpler cartesian product implementation where we can just use visitor patterns (std::string becomes shogun::Any):

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>

std::vector<std::vector<std::string>> cartesian_product(
    const std::vector<std::vector<std::string>>& inputs) {
    std::vector<std::vector<std::string>> result;
    if (std::all_of(inputs.begin(), inputs.end(), [](const auto& el) {return el.empty();})) {
        return result;
    }

    for (auto& el: inputs.front()) {
        result.push_back({el});
    }

    auto size = std::accumulate(inputs.begin(), inputs.end(), 1,
        [](const auto& lhs, const auto& rhs){
            return lhs * rhs.size();
    });

    std::vector<std::vector<std::string>> temp;
    temp.reserve(size);

    for (std::size_t i = 1; i < inputs.size(); ++i) {
        temp.clear();
        for (const auto& e: result) {
            for (const auto& f: inputs[i]) {
                temp.emplace_back(e).push_back(f);
            }
        }
        result = std::move(temp);
    }
    return result;
}

int main()
{
    std::vector<std::string> a {"a1","a2","a3"};
    std::vector<std::string> b {"b1","b2","b3"};
    std::vector<std::string> c {"c1","c2"};

    auto result = cartesian_product({a, b, c});

    std::cout << "Result: \n";
    for (const auto& vec: result) {
        for (const auto& el: vec)
            std::cout << el << ',';
        std::cout << '\n';
    }
}

The generator-style approach is much nicer, but it is currently flawed and would need more fixing.
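For comparison, the generator-style approach mentioned above can be written lazily in a few lines of Python; combinations are produced one at a time instead of being materialised up front. A standalone sketch, not tied to shogun (strings stand in for shogun::Any here too):

```python
# Sketch of the lazy, generator-style cartesian product: yields one
# combination at a time rather than building the full result list.
def cartesian_product(inputs):
    if not inputs:
        return
    head, *tail = inputs
    for el in head:
        if tail:
            for rest in cartesian_product(tail):
                yield [el] + rest
        else:
            yield [el]

a = ["a1", "a2", "a3"]
b = ["b1", "b2", "b3"]
c = ["c1", "c2"]
result = list(cartesian_product([a, b, c]))
assert len(result) == 18  # 3 * 3 * 2
```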

stale bot commented Nov 16, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 16, 2020
stale bot commented Nov 23, 2020

This issue is now being closed due to a lack of activity. Feel free to reopen it.

@stale stale bot closed this Nov 23, 2020
3 participants