Add simple cross-validation #1044
The code, and definitely the tests, look good so far, but I haven't reviewed it fully in depth. I have some high-level questions that I think should probably be addressed first, to make sure that we are understanding everything from the same perspective.
Thank you for your hard work with this! I am excited to work through any issues and be able to integrate this into mlpack.
I decided to move the discussion about the API out here, since Github gives a little bit of a bigger text box for it... This is a big and difficult discussion, because cross-validation and the hyper-parameter tuner mean that a lot of different methods in mlpack must agree on the same basic API and structure, and that is a difficult problem. Basically, we are forced to find a way to unify everything, and as we get further in this project, I think we keep finding more things that are not unified... That unfortunately means that the scope of my comment here is quite large, not just the CV code.
I would like to see the "standard" mlpack algorithm look like this:
So, for instance, take
So we could use this like this:
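As a rough, self-contained sketch of that "standard" pattern (using a hypothetical `MeanThresholdClassifier` rather than a real mlpack class, and plain `std::vector` instead of Armadillo matrices), the shape would be: training data and hyperparameters go to the constructor, which trains immediately, and prediction is a separate member function:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an mlpack learner; only the *shape* of the
// API matters here, not the (toy) learning rule.
class MeanThresholdClassifier
{
 public:
  // Train immediately on construction: data, labels, then hyperparameters.
  MeanThresholdClassifier(const std::vector<double>& data,
                          const std::vector<int>& labels,
                          const double offset /* hyperparameter */) :
      threshold(0.0), offset(offset)
  {
    // "Training": remember the mean of the positive-class points.
    double sum = 0.0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < data.size(); ++i)
    {
      if (labels[i] == 1) { sum += data[i]; ++count; }
    }
    if (count > 0)
      threshold = sum / count;
  }

  // Prediction happens after construction, on new points.
  int Classify(const double point) const
  {
    return (point >= threshold + offset) ? 1 : 0;
  }

 private:
  double threshold;
  double offset;
};
```

Usage would then be, e.g., `MeanThresholdClassifier c(data, labels, 0.0);` followed by `c.Classify(point);`.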
I know that there are a lot of mlpack methods that do not follow this style exactly, and this needs to be fixed. I will open an issue for that.
Because of this, I'd also like to be able to use the cross-validation code in the same way. But there are some extra complexities. I see that there are three types of runtime parameters we must worry about for
But this is different than the idea for "standard" mlpack algorithms I proposed above, where the type 3 parameters go in the constructor of the algorithm. Therefore, this leads me to suggest something else:
In this situation we have actually avoided the confusion of asking the user to pass algorithm hyperparameters and CV hyperparameters at the same time, by having the user pass the type 3 parameters to the constructor of
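A minimal sketch of that split, with illustrative names only (`SimpleCVSketch` and `ThresholdLearner` are hypothetical, not the final mlpack API): the type 3 (CV-specific) parameters and the data go to the constructor, while the algorithm's hyperparameters go to `Evaluate()`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for a learner with one hyperparameter (a threshold).
struct ThresholdLearner
{
  double t;
  ThresholdLearner(const std::vector<double>& /* data */,
                   const std::vector<int>& /* labels */,
                   const double t) : t(t) { }
  int Classify(const double x) const { return (x >= t) ? 1 : 0; }
};

template<typename Learner>
class SimpleCVSketch
{
 public:
  // Type 3 (CV) parameters and the data go to the constructor.
  SimpleCVSketch(const double validationSize,
                 const std::vector<double>& data,
                 const std::vector<int>& labels) :
      validationSize(validationSize), data(data), labels(labels) { }

  // The learner's hyperparameters go here; returns validation accuracy.
  double Evaluate(const double hyperparam) const
  {
    const std::size_t split =
        data.size() - std::size_t(validationSize * data.size());
    const std::vector<double> trainData(data.begin(), data.begin() + split);
    const std::vector<int> trainLabels(labels.begin(),
                                       labels.begin() + split);

    Learner learner(trainData, trainLabels, hyperparam);
    std::size_t correct = 0;
    for (std::size_t i = split; i < data.size(); ++i)
      if (learner.Classify(data[i]) == labels[i]) ++correct;
    return double(correct) / double(data.size() - split);
  }

 private:
  double validationSize;
  std::vector<double> data;
  std::vector<int> labels;
};
```

One consequence of this split is that the object can be constructed once and then `Evaluate()` can be called repeatedly with different hyperparameter values, without re-preparing the dataset each time.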
This allows us to not require users to pass all hyperparameters at once, which I think is really important for keeping code readable:
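One way this could look, assuming mlpack's existing reference-accessor convention (`SketchLearner` and its parameter names are hypothetical): each hyperparameter gets a named accessor, so the user only touches the parameters they want to change from the defaults, instead of passing every one positionally:

```cpp
#include <cassert>

class SketchLearner
{
 public:
  // All hyperparameters start at sensible defaults.
  SketchLearner() : lambda(0.0), tolerance(1e-5) { }

  // Named accessors in the style mlpack already uses elsewhere: the
  // non-const overload returns a modifiable reference.
  double& Lambda() { return lambda; }
  double Lambda() const { return lambda; }
  double& Tolerance() { return tolerance; }
  double Tolerance() const { return tolerance; }

 private:
  double lambda;
  double tolerance;
};
```

So a user who only cares about the tolerance writes `SketchLearner l; l.Tolerance() = 1e-3;` and never has to spell out (or even know the order of) the other hyperparameters.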
So, I think, at the end of the day, what I am saying is that I think this interface I have proposed for
However, I know that this interface would be a lot of refactoring work for you, so I want to ensure that my discussions here delay your project only minimally. Therefore, if you agree that this interface is a good change, then I can open an issue highlighting the API changes we'll need to make to have mlpack algorithms match the "standard" interface, and then I will go implement those changes quickly (within a week), and in the longer term (before the 3.0.0 release) I will prepare some documents detailing what the "standard" mlpack API and design is. If you don't agree that it is a good change, then let's keep discussing and see if we can find a better solution. :)
It's a little bit sad to hear all of that only in the middle of July. The interface of SimpleCV in the PR shouldn't look like something new to you - it is the same as in my proposal, as well as in the pretty long discussion in #929. The same goes for the hyper-parameter tuning interface, which we discussed quite in depth in the mail thread "Cross-validation and hyper-parameter tuning infrastructure" (like here). But anyway, I would like to know what your current vision of the interface for hyper-parameter tuning is. It looks like the strategy "pass hyper-parameters first, data later" is not going to be applicable here.
I think it's quite normal to discuss something at length that in the end turns out to be suboptimal, and it looks like something like that happened here. Don't get me wrong, I totally get your point: you both put a lot of time into the class design, and realizing that we have to put more time into it is frustrating. But I think in the end it is worth it, because if we can come up with something that is easier to use than the current idea, more people are going to use it.
That said, I think the API proposed by @rcurtin makes sense in the context of usability and readability, and if we all agree on that, we can go and implement those changes quickly.
I see, now that I review previous discussions, that my responses probably seem incomprehensible and schizophrenic. It is worth keeping in mind that I look through so much code every day that it is hard to keep it all straight, and as a result I can give inconsistent answers. In this case, I owe you an apology since what I wrote here directly contradicts what I wrote just a few months ago! So I am sorry about that. I know how stressful it is to be on the receiving end of this and so I will work to make sure it does not happen again for you.
In addition, I want to point out that there are two factors that make this project extra difficult to stay on top of. First, since this project sits "on top of" all of the other mlpack methods, some design work has to be done to standardize all of those. Second, reviewing heavy template metaprogramming code requires a huge amount of effort since the code is typically not very straightforward. Both of these contribute to the difficulty of keeping everything I have previously said in mind.
I see that there is also a misunderstanding on my side:
When I review the proposal now, I see that
As for the API change I suggested, what we have now that we did not in April is a formal idea of how all mlpack methods will look:
The primary benefit here is the second point: a user is not required to set all hyperparameters to set the last one; instead they can just call the function corresponding to the last one's name (see the
This specific point is one I did not realize in our discussion of #929. So, what I would like to understand from your side, is: is this possible, and how much extra effort would that be? If it is not easily possible, or it takes a huge amount of effort, then we can leave it as proposed in #929; after all, I even wrote:
I have to add, the API idea I am proposing does not make any sense for the hyperparameter tuner, only for cross-validation. In the hyperparameter tuner it only makes sense for the user to pass all of the hyperparameters to the
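A toy sketch of that tuner call pattern (both `Evaluate` and `Optimize` here are illustrative stand-ins, not the actual tuner API): since every candidate value has to be tried, the whole candidate set is handed to a single `Optimize()` call, and named per-parameter accessors would buy nothing:

```cpp
#include <cassert>
#include <vector>

// Stand-in objective: pretend this is cross-validated accuracy as a
// function of a single hyperparameter, peaking at 0.5.
double Evaluate(const double lambda)
{
  return 1.0 - (lambda - 0.5) * (lambda - 0.5);
}

// Hypothetical tuner shape: all candidate hyperparameter values are
// passed to one call, which returns the best one found.
double Optimize(const std::vector<double>& lambdas)
{
  double best = lambdas.front();
  for (const double l : lambdas)
    if (Evaluate(l) > Evaluate(best)) best = l;
  return best;
}
```

With more than one hyperparameter the idea is the same: the full set of candidates for each parameter is passed to the one `Optimize()` call.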
@zoq, my main fear was that a lot of the work that has already been done (primarily the metaprogramming tools and the way SimpleCV uses them) was wasted effort. That is why I said "it's a little bit sad" - I guess such cases can fairly be considered a little bit sad. Anyway, I'm sorry if I let myself be too negative, but I hope you guys got my point.
The current interface for cross-validation (proposed in the PR) was designed with the intention that it be used by hyper-parameter tuning: there we want to prepare a dataset only once and then use it to assess different sets of hyper-parameters. So, from this point of view, I think it makes sense to have the interface proposed in this PR.
I'm now finishing work on the planned part of the GSoC project (the plan that I described in my proposal, with the remarks from the end of this letter). I think that all of the planned code should be ready for review by the end of the second evaluation period. That means that after the second evaluation period I should have time to implement the cross-validation interface that you suggest (in addition to the interface proposed in this PR). What if we go this way? If you agree, I will go ahead and add the remaining minor changes that you have suggested in this PR.
This one's on me, don't worry about it. :)
Ok, that seems reasonable to me---but could we make the documentation for the class clearer about exactly what it does and how it would be used? This would be in addition to the tutorial you are planning to make (because a user may not find that tutorial first, they may find the class documentation first).
Yes, definitely, I can agree that the
Yes, this seems reasonable to me; we can decide at that time which interface makes the most sense (including possibly both), and see what is possible in the remaining time you have this summer.
To add to that, I think this is ready to merge if you can add some more documentation to the
Looks great to me, I think that this is ready to merge. But the Windows build is failing:
If you can handle that, then I think it's time to merge.
Given https://appveyor.statuspage.io/incidents/m2vdvw39kdk8, I think we should wait a couple more hours and restart the build.
Ah, ok, thanks. That online compiler is a nice tool to know about! I think the workaround you have committed is just fine.
I am happy with this PR and I think it's ready to merge. I'll wait 5 days for the merge since this PR is complex, in case anyone has any more comments.
We can revert the changes to