Format of Machine Learning API #186

Open · Kelvin-Ng opened this issue Dec 19, 2016 · 7 comments

Kelvin-Ng (Contributor) commented Dec 19, 2016

The content is put to https://github.com/husky-team/husky/wiki/TODO:-Format-of-Machine-Learning-API

I have thought of several options, but I don't have a clear sense of which one is strictly better. They are currently listed in descending order of my preference. Discussion may help in deciding on the final choice. The current machine learning library is in quite a mess, so this is quite important.

Kelvin-Ng (Contributor, Author) commented:

A section on performance issues has been added.

ddmbr (Member) commented Dec 20, 2016

Somehow I feel this design is limited to gradient-based methods only. I think all first-order gradient-based methods fit into this class. Is this your intention?

If so, we can actually create different type traits for different classes of problems.
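To make that concrete, here is a minimal sketch of what such type traits could look like; every name in it (GradientBasedTag, problem_traits, GradientDescent, and the placeholder problem) is hypothetical and not part of the current husky lib:

```cpp
// Hypothetical type-traits sketch: problems declare which class they belong
// to, and optimizers are only instantiable for matching problem classes.
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// Tag types for different classes of problems.
struct GradientBasedTag {};
struct EvolutionaryTag {};

// Trait: each Problem type advertises its category.
template <typename Problem>
struct problem_traits {
    using category = typename Problem::category;
};

// Example first-order problem (placeholder gradient, for illustration only).
struct LogisticRegressionProblem {
    using category = GradientBasedTag;
    Vec gradient(const Vec& params) const {
        return Vec(params.size(), 0.0);  // a real problem computes the loss gradient here
    }
};

// This optimizer only compiles for gradient-based problems.
template <typename Problem,
          typename = std::enable_if_t<std::is_same<
              typename problem_traits<Problem>::category,
              GradientBasedTag>::value>>
class GradientDescent {
 public:
    explicit GradientDescent(Problem p) : problem_(std::move(p)) {}
    void step(Vec& params, double lr) {
        Vec g = problem_.gradient(params);
        for (std::size_t i = 0; i < params.size(); ++i) params[i] -= lr * g[i];
    }

 private:
    Problem problem_;
};
```

This way, instantiating GradientDescent with a problem tagged EvolutionaryTag fails at compile time rather than at run time.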

Kelvin-Ng (Contributor, Author) commented:

As I said, that design is only an example.

For a genetic algorithm, again as an example, the Problem class should contain public methods such as mutation(), crossover(), etc. I only meant to illustrate what kind of methods should be included in the classes, not exactly which methods.

If we implement a genetic algorithm later, we will need to add mutation(), crossover(), etc. to the Problem classes already in the husky lib, so that users can choose between gradient-based methods and the genetic algorithm (see the sketch below).
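To illustrate, here is a hedged sketch of a Problem class exposing both interfaces. Only the method names mutation() and crossover() come from the discussion above; the class name, signatures, and bodies are hypothetical:

```cpp
// Hypothetical sketch: one Problem class offering both a gradient-based
// interface and an evolutionary interface, so the user can pick either
// family of optimizers.
#include <cstddef>
#include <random>
#include <vector>

using Individual = std::vector<double>;

class LinearRegressionProblem {
 public:
    // Gradient-based interface (placeholder body, for illustration only).
    Individual gradient(const Individual& params) const {
        return Individual(params.size(), 0.0);
    }

    // Evolutionary interface: perturb one candidate solution.
    Individual mutation(const Individual& parent, std::mt19937& rng) const {
        std::normal_distribution<double> noise(0.0, 0.1);
        Individual child = parent;
        for (auto& x : child) x += noise(rng);
        return child;
    }

    // Evolutionary interface: mix two candidate solutions coordinate-wise.
    Individual crossover(const Individual& a, const Individual& b,
                         std::mt19937& rng) const {
        std::uniform_int_distribution<int> coin(0, 1);
        Individual child(a.size());
        for (std::size_t i = 0; i < a.size(); ++i)
            child[i] = coin(rng) ? a[i] : b[i];
        return child;
    }
};
```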

TatianaJin (Member) commented:

Your idea is really inspiring. I am very interested in how to generalize the design to non-gradient-based methods. For example, to me the optimization problem for k-means-like algorithms has quite specific logic for finding the optimum.

My current work (yes, it is messy, especially the SVM part) represents the optimization problem by the gradient function in the models, lets the optimizers handle which samples to use and the update sequence, and has each model assign its problem to an optimizer. I do not know how to extend this to other optimization methods... Could you explain more about mutation() and crossover() (forgive my ignorance)?

Also, I think the optimization problem is only part of a model. For some machine learning models the inference part is the same while the optimization part can be swapped; I consider those the same model with different formulations.

Also, considering neural networks (I am not sure whether you think this belongs to the same topic), a model may contain many 'submodels' if each neuron is considered a model according to its activation function, and each activation function corresponds to a type of optimization problem. It appears to me that the whole machine learning algorithm takes this form: a model assigns the optimization problems to different types of optimizers and uses the solutions to do inference (see the sketch below).
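If I read that form correctly, it might be sketched roughly as follows; all names are hypothetical and only meant to pin down the structure:

```cpp
// Hypothetical sketch of "model assigns problems to optimizers, then uses
// the solution for inference".
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

struct Problem {
    virtual ~Problem() = default;
    virtual Vec gradient(const Vec& params) const = 0;  // or another interface
};

struct Optimizer {
    virtual ~Optimizer() = default;
    // The optimizer decides which samples to use and the update sequence.
    virtual Vec solve(const Problem& problem, Vec init) const = 0;
};

class Model {
 public:
    // The model assigns its optimization problem to an optimizer...
    void train(const Problem& problem, const Optimizer& optimizer) {
        params_ = optimizer.solve(problem, Vec(10, 0.0));
    }

    // ...and inference is the same regardless of how params_ were obtained.
    double predict(const Vec& x) const {
        double s = 0.0;
        for (std::size_t i = 0; i < x.size() && i < params_.size(); ++i)
            s += x[i] * params_[i];
        return s;
    }

 private:
    Vec params_;
};
```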

lijinf2 (Collaborator) commented Dec 20, 2016

@Kelvin-Ng I agree with you. Yuzhen and I once developed a general framework for genetic algorithms for an AI course, and it works. For each class of machine learning algorithms, I believe there should be a general, efficient, and user-friendly framework, which can be built on top of Husky. But there is no single framework that is general, efficient, and user-friendly for all machine learning algorithms. I think any of the three methods you mentioned is reasonable, and none of them is always better than the other two. Why don't we first determine which classes of ML algorithms we want to support, and then consider which of the three methods is best?

Kelvin-Ng (Contributor, Author) commented Dec 20, 2016

@TatianaJin

Your last sentence precisely and concisely summarizes what machine learning is basically doing.

What you say is correct. Sometimes one machine learning model can be optimized in different ways: by modelling it as different optimization problems (e.g. L1- vs. L2-regularized linear regression), or by using different optimization algorithms on the same model and the same optimization problem (e.g. L2-regularized linear regression using FGD, SGD, SVRG, or a genetic algorithm). The best design should separate all of these so that models, optimization problems, and optimization algorithms can be combined flexibly (sketched below).

(An optimization problem is tightly tied to a model, but a model can be formulated as different optimization problems.)
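Here is a minimal, self-contained sketch of that separation. Everything in it is hypothetical (the gradients are placeholders that include only the regularization term), but it shows how the same optimization algorithm can be paired with either problem formulation:

```cpp
// Hypothetical sketch: two problem formulations, one optimization algorithm,
// combined freely at the call site.
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// L2-regularized formulation (placeholder gradient: 2 * lambda * w only).
struct L2RegularizedLS {
    double lambda;
    Vec gradient(const Vec& w) const {
        Vec g(w.size());
        for (std::size_t i = 0; i < w.size(); ++i) g[i] = 2 * lambda * w[i];
        return g;
    }
};

// L1-regularized formulation (placeholder subgradient of lambda * |w|).
struct L1RegularizedLS {
    double lambda;
    Vec gradient(const Vec& w) const {
        Vec g(w.size());
        for (std::size_t i = 0; i < w.size(); ++i)
            g[i] = lambda * (w[i] > 0 ? 1.0 : -1.0);
        return g;
    }
};

// One optimization algorithm (full gradient descent), usable with either
// formulation; SGD, SVRG, etc. would be further interchangeable functions.
template <typename Problem>
Vec full_gradient_descent(const Problem& p, Vec w, double lr, int iters) {
    for (int t = 0; t < iters; ++t) {
        Vec g = p.gradient(w);
        for (std::size_t i = 0; i < w.size(); ++i) w[i] -= lr * g[i];
    }
    return w;
}

int main() {
    Vec w(5, 1.0);
    // Swap the formulation without touching the algorithm, and vice versa.
    Vec w_l2 = full_gradient_descent(L2RegularizedLS{0.1}, w, 0.01, 100);
    Vec w_l1 = full_gradient_descent(L1RegularizedLS{0.1}, w, 0.01, 100);
    (void)w_l2;
    (void)w_l1;
}
```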

Kelvin-Ng (Contributor, Author) commented:

The wiki page has been updated with a new section. Thanks @TatianaJin for the inspiration.
