API Design #1
Note, I didn't want to touch SVM.jl, simply because I'd have to delete the whole code anyway. Besides inefficient implementations, there are a few strange things going on in that package, such as convergence being defined as iter >= maxiter, checking whether a float is exactly 0.0 without tolerance, or storing a scalar as a vector of identical scalars.
This would become a great package. Your efforts will be appreciated by the community. In terms of incorporating kernels, I think it may be a good idea to just write kernel-based prediction functions. Not completely sure whether it may eventually work out, but I think it is good to try. When this package matures, we should consider merging this and …
To really make it clean, I will have to make some additions to EmpiricalRisks. For example, I currently abuse the L1 / L2 regularizer to specify whether I use L1-SVM or L2-SVM. However, in the latter case that L1/L2 actually refers to the (squared) hinge loss and not the penalty. So given the same parameters, my new solver does a very different thing than the others. Also, the C in SVM is not the same as the lambda in the usual case, but 1 / lambda. Bottom line: I will implement the squared hinge loss. I'll submit a PR as soon as I get to it. Do you have any objections right away?
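To make the L1/L2 distinction and the C-vs-lambda relation concrete, here is a small sketch in Python (the thread's packages are Julia; the function names here are hypothetical illustrations, not the EmpiricalRisks API). It shows hinge vs. squared hinge loss, and that the SVM-style objective with parameter C has the same minimizer as the regularized-risk objective with parameter lambda once C is rescaled; whether C = 1/lambda exactly, or 1/(n*lambda), depends on whether the losses are summed or averaged.

```python
import numpy as np

def hinge_loss(w, X, y):
    """L1-SVM loss: sum of max(0, 1 - y_i * w.x_i)."""
    margins = 1.0 - y * (X @ w)
    return np.maximum(0.0, margins).sum()

def squared_hinge_loss(w, X, y):
    """L2-SVM loss: sum of max(0, 1 - y_i * w.x_i)^2."""
    margins = 1.0 - y * (X @ w)
    return (np.maximum(0.0, margins) ** 2).sum()

def svm_objective(w, X, y, C, loss=hinge_loss):
    # SVM convention: (1/2)||w||^2 + C * sum of per-sample losses.
    return 0.5 * (w @ w) + C * loss(w, X, y)

def regression_objective(w, X, y, lam, loss=hinge_loss):
    # Regularized-risk convention: mean loss + (lam/2)||w||^2.
    n = len(y)
    return loss(w, X, y) / n + 0.5 * lam * (w @ w)

# With C = 1/(n * lam), the two objectives agree up to the factor lam,
# so they share the same minimizer.
```

Since the two objectives differ only by the positive factor lam, a solver driven by either convention finds the same w, which is why silently reusing lambda as C (or vice versa) gives a "very different thing" unless the conversion is applied.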
I think it would be good to let the SVM.jl name go vacant for a month or two to make people realize the package is dead. After that, it would be pretty easy to move this new package over to that name.
I agree with that. There is no rush to claim that name. This package needs a lot of work before I am willing to tag it. There still needs to be ...
Can anyone think of anything important I should add to that list? If you know people that have a stake or strong opinion on SVMs, please add them to the conversation. (Maybe @tbreloff ?)
If you're looking for work, some performance comparisons against LibSVM would be very compelling.
That's a good idea; added. I did look at that code. It is quite impressive. I'm hopeful we'll get within a factor of 10.
I added sparse matrix support to the list (at least for linear SVM). I think that is the single most important use case for linear SVMs anyhow, so I will tackle that soon. Does anyone have a specific suggestion for what package(s) to use for the sparse stuff, or is Base OK? The associated issues are kind of confusing regarding the current best practice.
The only meaningful sparse support I know of is in Base and it's only for CSC format. Eventually that will be moved out to a package, but the timeline on that is very unclear. |
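For reference, the CSC layout mentioned here is the same compressed-sparse-column scheme Julia's Base uses (`SparseMatrixCSC` with `colptr`/`rowval`/`nzval`, 1-based). A quick illustration using SciPy's equivalent, which exposes the same three arrays 0-based:

```python
import numpy as np
from scipy.sparse import csc_matrix

# CSC stores nonzeros column by column: for each column j, its nonzero
# row indices and values sit contiguously, and indptr marks where each
# column's run begins.
A = csc_matrix(np.array([[1.0, 0.0, 2.0],
                         [0.0, 0.0, 3.0],
                         [4.0, 5.0, 0.0]]))
print(A.indptr)   # column start offsets (Julia: colptr, 1-based)
print(A.indices)  # row indices of nonzeros, column-major (Julia: rowval)
print(A.data)     # nonzero values, column-major (Julia: nzval)
```

One practical consequence: slicing out a single column is cheap, while slicing a row scans the whole matrix, so storing training examples as columns is the natural fit for algorithms that sweep over one example at a time.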
Thanks. Turns out the inner structure of the …
Well, the first informal benchmarks (simply an adaptation of here) on a 2000x20000 matrix hint that, for this version of the dual CD algorithm, my implementation is at least somewhat on par with Python's scikit-learn. The algorithm I implemented is pretty much the same as the one LibLinear (and thus scikit-learn) uses; however, it is neither online nor does it use anything like shrinking at this point. That being said, in the tests I compare against scikit-learn and get the same accuracy to at least three decimal places on those datasets.

For this very informal benchmark, scikit-learn's Python takes 45.3 sec. Note that the generated training data is not sparse at all but very dense (the SVMLight file is 1 GB), so I am actually surprised that the sparse implementation is competitive for this particular example. I will implement a formal benchmark and check for significance, but that will have to wait a little until the other tasks are done.

Edit: There is a good chance, though, that these numbers are overly optimistic, because I am not yet sure whether I do the convergence check in the same manner as LibLinear. For this dataset I got .93 accuracy and scikit-learn .94, which makes me suspect that my implementation simply exited earlier.
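For readers unfamiliar with the algorithm being benchmarked, here is a minimal sketch of dual coordinate descent for the L1-loss linear SVM (the LibLinear-style algorithm, without shrinking or online updates, just as described above). This is an illustrative Python version, not the package's actual implementation:

```python
import numpy as np

def dual_cd_linear_svm(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent for the L1-loss linear SVM (no bias term).

    Maintains w = sum_i alpha_i * y_i * x_i so each coordinate update
    costs O(nnz(x_i)); each alpha_i is clipped to the box [0, C].
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    Qii = np.einsum('ij,ij->i', X, X)  # diagonal of Q_ij = y_i y_j x_i.x_j
    for _ in range(epochs):
        for i in rng.permutation(n):
            if Qii[i] == 0.0:
                continue
            G = y[i] * (w @ X[i]) - 1.0  # partial derivative of the dual
            a_new = min(max(alpha[i] - G / Qii[i], 0.0), C)
            if a_new != alpha[i]:
                w += (a_new - alpha[i]) * y[i] * X[i]
                alpha[i] = a_new
    return w

# Tiny linearly separable example
X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -1.0], [-1.0, -2.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = dual_cd_linear_svm(X, y)
pred = np.sign(X @ w)
```

The incremental update of w is what makes the method fast on sparse data: each inner step only touches the nonzero features of one example, which is why the comparison against scikit-learn above is even plausible.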
Very impressive work, even if the numbers need further revision.
About half an hour ago I read a discussion on licensing and derivative work, and that got me a little paranoid. First of all, I did look at the SVM.jl code, even though I didn't take it as a reference (I implemented the algorithm purely from the paper). That being said, it's fair to say that the package influenced me. That shouldn't be a problem, as everything should be OK if I give credit for inspiration here, which I will gladly do.

But there might be an issue. I aim to make this package available under MIT. However, I just remembered that I did browse over the liblinear code, mainly to get a feel for how complex it is to implement something like this in C. So I technically looked at it for a minute or so. Since I am implementing the same algorithm, which should give the same result, I guess one could argue that I am in a bit of a grey area from a licensing standpoint. What am I to do now? Can I still make this package MIT licensed, or do I have to remove dual coordinate descent and implement another algorithm while being really careful not to stumble on any code that implements it?

Edit: I didn't realize you were the author of SVM.jl. Shows how ignorant I can be sometimes. Sorry if I stepped on your toes. I am pretty new to the open source movement and did not give the licensing issues as much attention as I probably should have.
What is the liblinear license? A good rule of thumb is that you can never read GPL code unless you're writing a GPL library. Reading BSD code is harmless, although you still need to give attribution. -- John
Modified BSD. So I guess I am not screwed? Man, I should have known better. I didn't even get anything out of looking at it anyway. Thank you, that's a good thing to know. I'll simply never look at any related code again. Thankfully I haven't encountered any frustrating implementation issue thus far; I might have ended up looking in the wrong place for answers.
Modified BSD == 2-clause or 3-clause? Either is fine. The licensing situation is very complicated and we're being a little paranoid just to avoid any dangers for corporate users. But public domain, MIT, and BSD code are all fine (except for 4-clause BSD). APL is OK, although you probably have to release your code as APL if you base anything on an APL library. But that's arguably better for some corporate users anyway.
It's 3-clause. I really should have spent time on this issue beforehand; I didn't think simply browsing over it could cause me issues. I was planning to look at the code of a GPL-licensed package, but luckily I didn't get around to it yet (I am sure my IP traces and such can prove that). Dodged a bullet there. I'll avoid anything code-related like the plague from now on. You don't happen to know how I can properly give attribution for looking at the BSD code? I can't seem to find anything that describes this situation. Also (you probably missed it, since I edited my post before), I came to realize that you are the author of SVM.jl, which I did look at very closely. How can I properly give you attribution? This is my naive try to do so
Usually I do things like this: https://github.com/JuliaLang/julia/blob/e97588db65f590d473e7fbbb127f30c01ea94995/src/support/asprintf.c Since SVM is MIT, I'm not worried about attribution, but you could do the same thing as the BSD example above.
Thank you, I will happily do that. And thank you for your very quick and helpful advice; I got really paranoid and frustrated there for a second. I am just glad that I can continue this project with a clean conscience.
Welcome to the world that the GPL has created. :)
So, I hope this is appropriate: link
That's more than enough
This package will soon be at a usable stage, so I have started sketching out a very rough draft of the README. Since you both provide pretty good examples in Regression.jl and SVM.jl respectively, I have been thinking of adapting those. Are you both okay with how I assign you credit for the examples in the README?
Works for me
I could use your opinion on this package @lindahua . Basically, so far I have implemented a linear SVM based on Regression.jl. I think SVMs deserve their own package, because there are a lot of specialized algorithms, such as dual coordinate descent (which is the one I implemented thus far).
I implemented it in a way that you can also use the standard Regression.jl solvers for the linear SVM case.
I am not yet sure how to fit kernel SVMs into this design, but I will think about it. Probably via the prediction function.
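The "via the prediction function" idea can be sketched roughly as follows: only the decision function needs the kernel, so a linear SVM API could in principle be extended by swapping in a kernelized predictor. This is an illustrative Python sketch of the standard kernel SVM decision function, not a proposed API for this package:

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def kernel_predict(x, support_vectors, alpha, y_sv, b=0.0, kernel=rbf_kernel):
    """Kernel SVM decision: sign( sum_i alpha_i y_i K(x_i, x) + b ).

    Only the support vectors (alpha_i > 0) contribute, so training can
    stay solver-specific while prediction carries the kernel.
    """
    f = sum(a * yi * kernel(sv, x)
            for a, yi, sv in zip(alpha, y_sv, support_vectors))
    return np.sign(f + b)

# Two support vectors on opposite sides of the origin
sv = np.array([[1.0, 0.0], [-1.0, 0.0]])
alpha = np.array([1.0, 1.0])
y_sv = np.array([1.0, -1.0])
label = kernel_predict(np.array([2.0, 0.0]), sv, alpha, y_sv)
```

The design consequence is that a kernel model must keep its support vectors around at prediction time, whereas the linear model only needs w, which is presumably why fitting kernels into a weights-based design is the open question here.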
Any opinion on this?