Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timegap optimize simplify #204

Merged
merged 17 commits into from Oct 24, 2019

Conversation

stephanecollot
Copy link
Contributor

@stephanecollot stephanecollot commented Sep 20, 2019

  • replace two parameters df, date_col by one date_serie
  • insure no mutation with a copy
  • check if index match and if index is unique
  • optimise the iloc/loc/get_loc with a join and index
  • add plotting function with unit test
  • add summary function with unit test
  • update notebook doc

@stephanecollot
Copy link
Contributor Author

Hi @koaning could you have a look at this PR? A bit bigger
Cheers

Copy link
Collaborator

@MBrouns MBrouns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

We should note that this is a backwards-incompatible change and while I don't think we have a problem with those yet, we should at some point come up with a bit of a deprecation strategy

sklego/model_selection.py Show resolved Hide resolved
sklego/model_selection.py Outdated Show resolved Hide resolved
sklego/model_selection.py Outdated Show resolved Hide resolved
sklego/model_selection.py Show resolved Hide resolved
@koaning
Copy link
Owner

koaning commented Sep 22, 2019

Mhm. In the future it would be best to first discuss a user interfacing change on github issues before working on it.

That said @stephanecollot could you perhaps opt into the github actions beta? We've switched to github actions to simplify the CI but unfortunately it does mean that you need to sign up for the bta if we want the CI to run automatically. This issue should be fixed once actions is no longer in beta.

sklego/model_selection.py Outdated Show resolved Hide resolved
@stephanecollot
Copy link
Contributor Author

stephanecollot commented Sep 23, 2019

@koaning Ok I just opt into github actions

@stephanecollot
Copy link
Contributor Author

Hi @MBrouns thanks for the review, I replied and fixed your comments.
@koaning some how it is still not running the CI steps I think.

@stephanecollot
Copy link
Contributor Author

up

@koaning
Copy link
Owner

koaning commented Oct 4, 2019

Strange. Are you going to pydata meetup in two weeks? Meeting in person might make sense.

@stephanecollot stephanecollot changed the title Timegap optimize simplify Timegap optimize simplify Oct 4, 2019
@stephanecollot
Copy link
Contributor Author

stephanecollot commented Oct 4, 2019 via email

@koaning
Copy link
Owner

koaning commented Oct 18, 2019

@stephanecollot today @MBrouns and myself made a final review. we think we can merge this but only if you can remove plot and summary methods. This is competely fine in a notebook but since it deviates from the sklearn standard we prefer not to have it part of the library.

Do you agree?

@stephanecollot
Copy link
Contributor Author

Hi guys, thanks a lot of the review.
I strongly think that having the plot and summary functions is a valuable feature.
For example you can quickly check the number of samples per fold and the number of days, it could be non homogenous and therefore you would either change the parameters or figure out that something is wrong with your data.
Personally I would always check those. Copy/pasting every-time from the example, with the possibility to make a mistake, doesn't sound good to me.
That is true that those functions don't exist in sklearn but that is because they are not needed in the standard cross validators, and adding two function doesn't make it less compatible.
What do you think?

@koaning
Copy link
Owner

koaning commented Oct 23, 2019

Technically, they do exist in sklearn, but they merely exist in the documentation:

https://scikit-learn.org/stable/auto_examples/model_selection/plot_cv_indices.html

Although we cannot be 100% scikit-learn compatible for all our lego bricks. We really want to be careful with adding features that do not adhere to their standard. We definitely think the plotting and summary is awesome for the documentation but adding it to our library is something we prefer not to consider now that the project is getting traction. @MBrouns and myself prefer to use scikit-learn utils for testing as much as possible so features that fall outside of the sklearn scope are bound to be hard to maintain.

If you can make a really strong case that this feature is hard to use without it, or later if you can demonstrate that there is indeed a large community need for it, then we can consider it more easily. By adding this feature we'd get a harder dependency on matplotlib (which we prefer not to have).

@stephanecollot
Copy link
Contributor Author

What about putting the plot in the documentation and leaving the describe in the class?
Therefore we don't have a dependency on matplotlib and we split the difference ?

@koaning
Copy link
Owner

koaning commented Oct 23, 2019

We shouldn't see these conversations as a negotiation. But sure, that's fair.

@stephanecollot
Copy link
Contributor Author

ok done

@stephanecollot
Copy link
Contributor Author

Do you think that the build github Actions are not running because it is a pull request from my fork?
If you give me write permission I can push to this repo and see if it is running.

@koaning
Copy link
Owner

koaning commented Oct 23, 2019

I can check it out myself from the browser and confirm if the tests run. I'll do that now.

Gotta say ... Github-Actions sure disappointed me a bit here.

@stephanecollot
Copy link
Contributor Author

All green now, what did you do?

@MBrouns
Copy link
Collaborator

MBrouns commented Oct 24, 2019

I think @koaning pushed your branch to a branch on this repo, rather than the fork. That causes the actions to actually do run.

@MBrouns MBrouns merged commit 302b3b3 into koaning:master Oct 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants