Updation of scikit libraries #60
Hi @Iron-Stark, thanks for the PR. I think that we may want to consider other ways of providing all of these options to the config.yaml file. Right now we might write something like
but if the goal is to allow the user to specify any setting in the config.yaml file, we may want to restructure the input to look more like
otherwise I think we will run out of option names. :) So instead of abstracting to parameters specified similarly to mlpack, we would end up abstracting to some other format. (I am not sure my idea is the best, so please feel free to disagree or improve upon it.)
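To make the contrast concrete, here is a rough Python sketch of the two styles, assuming the first is a flat mlpack-style option string and the second is a mapping of named parameters. The function names and the option names (`k`, `seed`, `n_clusters`, `max_iter`, `tol`) are hypothetical and are not taken from the PR or from the benchmark scripts.

```python
import shlex


def parse_flat_options(options: str) -> dict:
    """Parse a flat option string such as '-k 3 --seed 42 -v' (hypothetical format)."""
    tokens = shlex.split(options)
    parsed = {}
    i = 0
    while i < len(tokens):
        key = tokens[i].lstrip("-")
        # A flag followed by a non-flag token takes that token as its value;
        # otherwise it is treated as a boolean switch.
        # Limitation of this sketch: negative numbers look like flags.
        if i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            parsed[key] = tokens[i + 1]
            i += 2
        else:
            parsed[key] = True
            i += 1
    return parsed


def build_kwargs(structured: dict, allowed: set) -> dict:
    """Pass named parameters through, rejecting any the wrapper does not accept."""
    unknown = set(structured) - allowed
    if unknown:
        raise ValueError(f"unsupported options: {sorted(unknown)}")
    return dict(structured)


# Flat style: every setting needs its own short flag, and values arrive as strings.
flat = parse_flat_options("-k 3 --seed 42 -v")

# Structured style: settings are keyed by full parameter names with typed values.
kwargs = build_kwargs(
    {"n_clusters": 3, "max_iter": 100},
    allowed={"n_clusters", "max_iter", "tol"},
)
```

With named parameters there is no shared pool of short flags to exhaust, and each library's script can state explicitly which names it accepts.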
A thing to keep in mind is that, in order to produce benchmarks that are actually useful to someone looking at them, we sometimes have to reduce the scope a little. Instead of, e.g., benchmarking every possible configuration of LSH, we benchmark "the default configuration of each library's LSH implementation, given that we can tune the number of tables and projections"; or, e.g., we benchmark "each k-means algorithm that is implemented for each library, with k and the initial centroids set specifically". But if the scope is too large (e.g. "benchmark every k-NN algorithm"), there end up being so many different parameter combinations that it's impossible to do that reasonably.
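The growth in parameter combinations can be illustrated with a small Python sketch. The grids below are made-up values chosen only to show the arithmetic; the parameter names (`tables`, `projections`, `algorithm`, `leaf_size`, `metric`) are placeholders, not the options of any particular library.

```python
from itertools import product


def count_configurations(grid: dict) -> int:
    """Number of distinct runs implied by a parameter grid."""
    total = 1
    for values in grid.values():
        total *= len(values)
    return total


def enumerate_configurations(grid: dict):
    """Yield every combination of the grid as a dict of parameter -> value."""
    keys = list(grid)
    for combo in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))


# Narrow scope: only two tunable parameters.
narrow = {"tables": [5, 10, 20], "projections": [10, 20]}

# Broad scope: the same grid plus three more hypothetical parameters.
broad = dict(
    narrow,
    algorithm=["a", "b", "c", "d"],
    leaf_size=[10, 30, 50],
    metric=["m1", "m2", "m3"],
)

narrow_runs = count_configurations(narrow)  # 3 * 2
broad_runs = count_configurations(broad)    # 3 * 2 * 4 * 3 * 3
```

Each extra parameter multiplies the number of runs, and that count is then multiplied again by the number of libraries and datasets being benchmarked.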
So I don't think it's a problem necessarily to provide this many options to be accessed through the configuration (other than the running out of option names issue), but I do think we should keep in mind that we may not want to use all of these options in the
Yes, I see your point. Certain configurations of LSH are available in sklearn but not in, say, shogun. So introducing those parameters is not useful, since benchmarks cannot be generated for them when the options are not available in the other library.
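One way to act on this is to restrict a benchmark to the options that every participating library supports. The Python sketch below assumes each library's supported options are listed by hand; the library names and option names are placeholders and do not describe what sklearn or shogun actually provide.

```python
def common_options(supported: dict) -> set:
    """Return the options available in every library (empty if none are given)."""
    option_sets = [set(options) for options in supported.values()]
    if not option_sets:
        return set()
    return set.intersection(*option_sets)


# Hypothetical per-library option lists, for illustration only.
supported = {
    "library_a": ["tables", "projections", "radius", "seed"],
    "library_b": ["tables", "projections", "seed"],
    "library_c": ["tables", "projections"],
}

shared = common_options(supported)
```

Options outside the shared set could still be accepted in the configuration, but they would only be meaningful for single-library runs rather than cross-library comparisons.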
Also, I agree that the config.yaml file can be restructured the way you have suggested, so that the parameters are abstracted to some other format.