API Design #1

Closed · Evizero opened this issue Sep 11, 2015 · 24 comments

@Evizero (Owner) commented Sep 11, 2015

I could use your opinion on this package, @lindahua. Basically, so far I have implemented a linear SVM based on Regression.jl. I think SVMs deserve their own package, because there are a lot of specialized algorithms, such as dual coordinate descent (which is the one I have implemented thus far).

I implemented it in such a way that you can also use the standard Regression.jl solver for the LinearSVM case:

# two linearly separable clusters of 10 points each
a = hcat(rand(10), rand(10) + 2)
b = hcat(rand(10) + 2, rand(10))
X = vcat(a, b)'                  # 2x20 matrix, one observation per column
Y = vcat(ones(10), -ones(10))    # labels: +1 for cluster a, -1 for cluster b
train = randbool(size(X, 2))     # random boolean mask for the training split
ret = Regression.solve(hingereg(X[:, train], Y[train]; bias = 0.0),
                       reg = SqrL2Reg(1.0e-2),
                       solver = DualCoordinateDescent())
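
For the record, the core update of the algorithm I implemented (dual coordinate descent, following the paper by Hsieh et al., 2008) looks roughly like this. This is a compressed, illustrative sketch, not the actual package code; dcd_epoch! is just a made-up name.

using LinearAlgebra: dot

# One pass of dual coordinate descent for the L1-SVM (no bias, no shrinking).
# X holds one observation per column, alpha holds the dual variables, and
# w = sum_j alpha[j] * y[j] * X[:, j] is maintained incrementally.
function dcd_epoch!(alpha, w, X, y, C)
    for i in 1:size(X, 2)
        xi = view(X, :, i)
        G = y[i] * dot(w, xi) - 1                        # gradient of the dual objective
        a_old = alpha[i]
        alpha[i] = clamp(a_old - G / dot(xi, xi), 0, C)  # projected Newton step
        w .+= (alpha[i] - a_old) * y[i] .* xi            # keep w consistent with alpha
    end
    return w
end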

I am not yet sure how to fit kernel SVMs into this design, but I will think about it. Probably via the prediction function.
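
Something along these lines is what I have in mind (a plain sketch with made-up names, not a settled API):

# Kernel-based prediction: f(x) = sum_i alpha[i] * y[i] * k(sv_i, x) + b,
# where only the support vectors (the columns of svs) enter the sum.
rbf(x, y; gamma = 1.0) = exp(-gamma * sum((x - y).^2))

function predict_kernel(alpha, svs, labels, bias, x; kernel = rbf)
    f = bias
    for i in 1:length(alpha)
        f += alpha[i] * labels[i] * kernel(svs[:, i], x)
    end
    return sign(f)
end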

Any opinion on this?

@Evizero (Owner) commented Sep 11, 2015

Note that I didn't want to touch SVM.jl, simply because I'd have to delete all of its code anyway. Besides inefficient implementations, there are a few strange things going on in that package, such as convergence being defined as iter >= maxiter, checking if a float is exactly 0.0 without tolerance, or storing a scalar as a vector of identical scalars.

@lindahua commented:

This would become a great package. Your efforts will be appreciated by the community.

In terms of incorporating kernels, I think it may be a good idea to just write kernel-based prediction functions. I am not completely sure whether it will eventually work out, but I think it is good to try.

When this package matures, we should consider merging this and SVM.jl in some way -- people tend to search for SVM.jl first instead of KSVM.jl. It seems that SVM.jl has not been actively updated for a while (except for changes to make it work with Julia 0.4).

cc @johnmyleswhite

@Evizero (Owner) commented Sep 11, 2015

To really make it clean, I will have to make some additions to EmpiricalRisks.

For example, I currently abuse the L1/L2 regularizer to specify whether I use L1-SVM or L2-SVM. However, in the latter case L1/L2 actually refers to the (squared) hinge loss and not the penalty. So, given the same parameters, my new solver does a very different thing than the others. Also, the C in the SVM formulation is not the same as the usual lambda, but 1 / lambda.

Bottom line: I will implement the squared hinge loss.
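
To make the distinction concrete, a plain sketch of the two losses (not the actual EmpiricalRisks definitions):

# For a label y in {-1, +1} and a prediction f = w'x + b:
hinge(y, f)    = max(0, 1 - y * f)      # L1-SVM loss
sqrhinge(y, f) = max(0, 1 - y * f)^2    # L2-SVM loss

# The SVM cost parameter C corresponds to 1 / lambda in the usual
# regularized-risk formulation.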

I'll submit a PR as soon as I get to it. Do you have any objections right away?

@johnmyleswhite commented:

I think it would be good to let the SVM.jl name go vacant for a month or two to make people realize the package is dead. After that, it would be pretty easy to move this new package over to that name.

@Evizero (Owner) commented Sep 11, 2015

I agree with that. There is no rush to claim that name. This package needs a lot of work before I am willing to tag it.

Still needed are:

  • A version that fits the bias (efficiently) as well
  • A special implementation for sparse matrices
  • ~~At least the stochastic version of dual coordinate descent for the LinearSVM~~ (This will have to wait until I come up with a good general solution in SupervisedLearning.jl)
  • A one-vs-all version for dual coordinate descent
  • At least one algorithm for linear support-vector regression
  • Hard tests that really prove this package does what it claims to do
  • An implementation of Platt's method for computing probabilities (see the sketch after this list)
  • A clean way for the user to access the support vectors
  • Kernels and SMO for kernel-based problems
  • Benchmarks against LibLinear and LibSVM, respectively
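
Regarding the Platt item above, the rough idea is to fit a sigmoid to held-out decision values. A simplified sketch with made-up names (plain gradient descent; Platt's actual method uses a Newton-type iteration and smoothed targets):

# Fit P(y = +1 | f) = 1 / (1 + exp(A*f + B)) to decision values f and
# labels y in {-1, +1} by gradient descent on the negative log-likelihood.
function platt_scale(f::Vector{Float64}, y::Vector{Float64};
                     iters = 10_000, lr = 1e-3)
    A, B = 0.0, 0.0
    t = [yi > 0 ? 1.0 : 0.0 for yi in y]    # 0/1 targets
    for _ in 1:iters
        p = 1 ./ (1 .+ exp.(A .* f .+ B))   # current probability estimates
        g = t .- p                          # gradient w.r.t. z = A*f + B
        A -= lr * sum(g .* f)
        B -= lr * sum(g)
    end
    return A, B
end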

Can anyone think of anything important I should add to that list? If you know people who have a stake or a strong opinion on SVMs, please add them to the conversation. (Maybe @tbreloff?)

@johnmyleswhite commented:

If you're looking for work, some performance comparisons against LibSVM would be very compelling.

@Evizero (Owner) commented Sep 11, 2015

That's a good idea; added.

I did look at that code. It is quite impressive. I'm hopeful we'll get within a factor of 10.

Evizero added this to the Version 0.0.1 milestone on Sep 13, 2015
@Evizero (Owner) commented Sep 13, 2015

I added sparse matrix support to the list (at least for the linear SVM). I think that is the single most important use case for linear SVMs anyhow, so I will tackle it soon.

Does anyone have a specific suggestion for what package(s) to use for the sparse stuff, or is Base OK? The associated issues are kind of confusing regarding the current best practice.

@johnmyleswhite commented:

The only meaningful sparse support I know of is in Base and it's only for CSC format. Eventually that will be moved out to a package, but the timeline on that is very unclear.

@Evizero (Owner) commented Sep 16, 2015

Thanks. It turns out the inner structure of the SparseMatrixCSC type is perfect for dual coordinate descent, and I was able to exploit it nicely.
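
Concretely, with one observation per column, the nonzeros of a single observation are directly addressable through the colptr/rowval/nzval fields, which is exactly the access pattern a coordinate update needs. A minimal sketch (coldot is a made-up helper, not package code):

using SparseArrays

# Dot product of a dense weight vector with column i of a CSC matrix,
# touching only the stored (nonzero) entries of that column.
function coldot(w::Vector{Float64}, X::SparseMatrixCSC{Float64,Int}, i::Int)
    s = 0.0
    for k in X.colptr[i]:(X.colptr[i+1] - 1)
        s += w[X.rowval[k]] * X.nzval[k]
    end
    return s
end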

@Evizero (Owner) commented Sep 16, 2015

Well, the first informal benchmarks (simply an adaptation of the ones here) on a 2000x20000 matrix hint that, for this version of the dual CD algorithm, my implementation is at least somewhat on par with Python's scikit-learn. The algorithm I implemented is pretty much the same one LibLinear (and thus scikit-learn) uses; however, it is neither online nor does it use anything like shrinking at this point. That being said, in the tests I compare against scikit-learn and get the same accuracy to at least three decimal places on those datasets.

For this very informal benchmark I use scikit-learn's LinearSVC, which is just a LibLinear wrapper, and I set the same settings for iterations and tolerance:

python: 45.3 sec
julia - dense version: 22.5 sec
julia - sparse version: 28.5 sec

Note that the generated training data is not sparse at all but very dense (the SVMLight file is 1 GB), so I am actually surprised that the sparse implementation is competitive for this particular example.

I will implement a formal benchmark and check for significance, but that will have to wait a little until the other tasks are done.

Edit: There is a good chance, though, that these numbers are overly optimistic, because I am not yet sure if I do the convergence check in the same manner as LibLinear. For this dataset I got 0.93 accuracy and scikit-learn 0.94, which makes me suspect that my implementation simply exited earlier.

@johnmyleswhite commented:

Very impressive work, even if the numbers need further revision.

@Evizero (Owner) commented Sep 18, 2015

About half an hour ago I read a discussion on licensing and derivative work, and it got me a little paranoid.

First of all, I did look at the SVM.jl code, even though I didn't take it as a reference (I implemented the algorithm according to the paper). That being said, it's fair to say that the package did influence me. But that shouldn't be a problem, as everything should be OK if I give credit for inspiration here, which I will gladly do.

But there might be an issue. I aim to make this package available under MIT. However, I just remembered that I did browse over the LibLinear code, mainly to get a feel for how complex it is to implement something like this in C. Nonetheless, I technically looked at it for a minute or so. Since I am implementing the same algorithm, which should give the same result, I guess one could argue that I am in a bit of a grey area here from a licensing standpoint.

What am I to do now? Can I still make this package MIT-licensed, or do I have to remove dual coordinate descent and implement another algorithm, while being really careful not to stumble on any code that implements it?

Edit: I didn't realize you were the author of SVM.jl. Shows how ignorant I can be sometimes. Sorry if I stepped on your toes. I am pretty new to the open source movement and did not give the licensing issues as much attention as I probably should have.

@johnmyleswhite commented:

What is the LibLinear license?

A good rule of thumb is that you can never read GPL code unless you're writing a GPL library. Reading BSD code is harmless, although you still need to give attribution.

@Evizero (Owner) commented Sep 18, 2015

Modified BSD. So I guess I am not screwed? Man, I should have known better. I didn't even get anything out of looking at it anyway.

Thank you, that's a good thing to know. I'll simply never look at any related code again. Thankfully, I haven't encountered any frustrating implementation issues thus far; I might have ended up looking in the wrong place for answers.

@johnmyleswhite commented:

Modified BSD == 2-clause or 3-clause? Either is fine.

The licensing situation is very complicated, and we're being a little paranoid just to avoid any dangers for corporate users. But public domain, MIT, and BSD code are all fine (except for 4-clause BSD). APL is OK, although you probably have to release your code as APL if you base anything on an APL library. But that's arguably better for some corporate users anyway.

@Evizero (Owner) commented Sep 18, 2015

It's 3-clause. I really should have spent time on this issue beforehand. I didn't think simply browsing over it could cause me issues. I was planning to look at the code of a GPL-licensed package, but luckily I didn't get around to it yet (I am sure my IP traces and such can prove that). Dodged a bullet there. I'll avoid anything code-related like the plague from now on.

You don't happen to know how I can properly give attribution for looking at the BSD code? I can't seem to find anything that describes this situation.

Also (you probably missed it, since I edited my post above), I came to realize that you are the author of SVM.jl, which I did look at very closely. How can I properly give you attribution? This is my naive attempt to do so.

@johnmyleswhite commented:

Usually I do things like this: https://github.com/JuliaLang/julia/blob/e97588db65f590d473e7fbbb127f30c01ea94995/src/support/asprintf.c

Since SVM.jl is MIT, I'm not worried about attribution, but you could do the same thing as in the BSD example above.

@Evizero (Owner) commented Sep 18, 2015

Thank you, I will happily do that.

Thank you for your very quick and helpful advice; I got really paranoid and frustrated there for a second. I am just glad that I can continue this project with a clean conscience.

@johnmyleswhite commented:

Welcome to the world that the GPL has created. :)

@Evizero (Owner) commented Sep 18, 2015

So, I hope this is appropriate: link

@johnmyleswhite commented:

That's more than enough.

@Evizero (Owner) commented Sep 21, 2015

This package will soon be at a usable stage, so I have started sketching out a very rough draft of the README. Since you both provide pretty good examples in Regression.jl and SVM.jl, respectively, I have been thinking of adapting those.

Are you both okay with how I assign you credit for the examples in the README?

@johnmyleswhite commented:

Works for me

Evizero closed this as completed on Nov 16, 2015.