Add ConvSCCS model + refactoring of SCCS-related stuff (preprocessing, simulations, etc.) #158

Merged

MaryanMorel merged 2 commits into master from TICK-361

Mar 22, 2018

Member

MaryanMorel commented Jan 31, 2018

Add ConvSCCS model + refactoring of SCCS-related stuff (preprocessing, simulations, etc.) according to the paper

Feature products are not shipped as a learner option for now, but they can be used independently.

Most objects are not picklable for now; this should be fixed in another task to allow parallel CV and bootstrap.

MaryanMorel requested a review from stephanegaiffas

January 31, 2018 14:20

stephanegaiffas changed the title ~~Tick 361~~ Add ConvSCCS model + refactoring of SCCS-related stuff (preprocessing, simulations, etc.)

Collaborator

stephanegaiffas commented Jan 31, 2018

@MaryanMorel : good job on this PR. I've done a first quick review, and there seems to be no example in the documentation. Please provide one (on simulations at least, later on a real dataset) so that we can understand how to use the preprocessing tools and the ConvSCCS learner.

Member Author

MaryanMorel commented Jan 31, 2018

Indeed, I hadn't thought of adding an example. I'll add the one corresponding to the 'example notebook', though it takes some time to run (~2 min). Is that ok for the doc, or should I speed things up?

Collaborator

stephanegaiffas commented Jan 31, 2018

We don't care if it takes an extra 2 minutes for the doc build. The example must appear in the examples of the tick doc, not in the unit tests of course (see the tick/examples folder), and maybe be referred to in the documentation.

Contributor

Mbompr commented Jan 31, 2018

If it is possible to speed it up, it might be worth it though! When working on the doc it's more convenient if things go fast. But yes, if it cannot be sped up, it's better to have a slow example than no example at all!

stephanegaiffas requested changes


doc/modules/survival.rst Outdated

                  ModelCoxRegPartialLik
                  ModelSCCS
+                 ConvSCCS

Collaborator

stephanegaiffas Jan 31, 2018

ConvSCCS is a learner, right, not a model? This should go in the 1. Learner section

Member Author

MaryanMorel Feb 7, 2018 (edited)

Done! But GitHub does not seem to understand the change has been done. Do I need to perform a specific action?

MaryanMorel force-pushed the TICK-361 branch from 61cea33 to 8497968 Compare

February 6, 2018 16:38

Collaborator

stephanegaiffas commented Feb 7, 2018

@MaryanMorel : Where are we with this ?

MaryanMorel force-pushed the TICK-361 branch from 808ae46 to 0aa28a4 Compare

February 7, 2018 15:24

Member Author

MaryanMorel commented Feb 7, 2018

Oops, I broke one test; it should be fixed now. I got an example running quite "quickly" (considering that we need a bootstrap to compute confidence intervals).

MaryanMorel force-pushed the TICK-361 branch 2 times, most recently from 594b316 to 2444f5f Compare

February 7, 2018 16:27

Member Author

MaryanMorel commented Feb 8, 2018

@stephanegaiffas the build is passing, do you still need to review some code ?

stephanegaiffas requested changes


Collaborator

stephanegaiffas left a comment

Mostly things about :

Attribute and option names
tests in setters
an attribute to get the intensity and the CI for the intensity...
Some better explanations in docstring about the attributes. We need to explain precisely how coeffs is organized... I'll help if you want for this and the doc

tick/survival/convolutional_sccs.py Outdated

+                  def __init__(self, n_lags: np.array,
+                               penalized_features: np.array,
+                               strength_tv=None, strength_group_l1=None,

Collaborator

stephanegaiffas Feb 9, 2018

I'm annoyed by passing strengths to solvers... we should use C_tv (which corresponds to 1 / strength) and C_group_l1 instead, in a more scikit-learn-like manner... (this is what we do in other learners, for linear models for instance...)

tick/survival/convolutional_sccs.py Outdated

+                          `n_lags` time intervals. `n_lags` values must be between 0 and
+                          `n_intervals` - 1.
+                      penalized_features : `numpy.ndarray`, shape=(n_features,), dtype="bool"

Collaborator

stephanegaiffas Feb 9, 2018

Why don't we penalize all features by default? Can't we set a default for this?

Member Author

MaryanMorel Feb 11, 2018

Default is not straightforward, as we only get the number of features during the _prefit
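One way such a default could be handled is to accept `None` in the constructor and resolve it only once the number of features is known. The sketch below is illustrative only: `resolve_penalized_features` is a hypothetical helper, not part of tick, and plain lists stand in for NumPy arrays to keep it dependency-free.

```python
def resolve_penalized_features(penalized_features, n_features):
    """Return a list of n_features booleans.

    Assumption for illustration: `None` means "penalize every feature".
    This is not the behaviour of the merged code.
    """
    if penalized_features is None:
        return [True] * n_features
    mask = [bool(flag) for flag in penalized_features]
    if len(mask) != n_features:
        raise ValueError(
            "penalized_features must have length %d, got %d"
            % (n_features, len(mask)))
    return mask
```

Called from a prefit step, this would keep the constructor signature free of any dependency on the data shape.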

tick/survival/convolutional_sccs.py Outdated

+                      verbose : `bool`, default=False
+                          If `True`, solver verboses history, otherwise nothing is
+                          displayed.

Collaborator

stephanegaiffas Feb 9, 2018

print_every and record_every are not documented (use the docstring from solvers :))

Member Author

MaryanMorel Feb 11, 2018

ok

tick/survival/convolutional_sccs.py Outdated

+                      n_coeffs : `int` (read-only)
+                          Total number of coefficients of the model
+                      coeffs : `numpy.ndarray`, shape=(n_coeffs,), dtype="float64" (read-only)

Collaborator

stephanegaiffas Feb 9, 2018

We should explain how the coeffs are organised, or output in this learner a 2D matrix (n_features, n_lags) or something like this... We need to explain how the coeffs can be used :)
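To make the question concrete, here is a minimal sketch of one possible layout, assuming the flat vector concatenates one block per feature and that feature i owns `n_lags[i] + 1` consecutive entries (lag 0 up to lag `n_lags[i]`). Both the layout and the helper name `split_coeffs` are assumptions for illustration; the PR does not confirm them.

```python
from itertools import accumulate


def split_coeffs(coeffs, n_lags):
    """Split a flat coefficient vector into one block per feature.

    Hypothetical layout: feature i owns n_lags[i] + 1 consecutive entries.
    """
    sizes = [lag + 1 for lag in n_lags]
    # Cumulative offsets marking where each feature's block starts and stops.
    offsets = [0] + list(accumulate(sizes))
    if offsets[-1] != len(coeffs):
        raise ValueError(
            "expected %d coefficients, got %d" % (offsets[-1], len(coeffs)))
    return [list(coeffs[start:stop])
            for start, stop in zip(offsets[:-1], offsets[1:])]
```

Because features may have different numbers of lags, a list of per-feature blocks is used here rather than a rectangular 2D matrix.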

tick/survival/convolutional_sccs.py Outdated

+                      bootstrap_coeffs : `Bootstrap_CI` (read-only)
+                          Bootstrap coefficients and confidence intervals of the model.
+                      """

Collaborator

stephanegaiffas Feb 9, 2018

Can we add an attribute (using a property) that computes the intensity of each feature or something like this ?

tick/survival/convolutional_sccs.py Outdated

+                  def _construct_solver_obj(step, max_iter, tol, print_every,
+                                            record_every, verbose, seed):
+                      # seed cannot be None in SVRG
+                      solver_obj = SVRG(step=step, max_iter=max_iter, tol=tol,

Collaborator

stephanegaiffas Feb 9, 2018

can you add a comment : # TODO: we might want to use SAGA also later... (might be faster here)

Member Author

MaryanMorel Feb 11, 2018

ok

tick/survival/convolutional_sccs.py Outdated

+                  @step.setter
+                  def step(self, value):
+                      self._set('_step_size', value)

Collaborator

stephanegaiffas Feb 9, 2018

Check that it's > 0

Member Author

MaryanMorel Feb 11, 2018

ok
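A minimal sketch of the requested check, using a plain attribute instead of tick's `self._set(...)` so that it runs on its own. `StepHolder` is a hypothetical stand-in for the learner, and handling of a `None` step is left out.

```python
class StepHolder:
    """Hypothetical stand-in for the learner; only `step` is sketched."""

    def __init__(self, step):
        self.step = step  # goes through the setter, so it is validated too

    @property
    def step(self):
        return self._step_size

    @step.setter
    def step(self, value):
        # Reject non-positive step sizes before storing the value.
        if value <= 0:
            raise ValueError("step must be > 0, got %r" % (value,))
        self._step_size = value
```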

tick/survival/convolutional_sccs.py Outdated

+                      self._solver_obj.step = value
+                  @property
+                  def _strengths(self):

Collaborator

stephanegaiffas Feb 9, 2018

I'd rather have separate C_tv and C_group_l1 with getters and setters, where we check that they are > 0, and where we update the private _strength_tv and _strength_group_l1 (if zero we put None). This is what we do in linear learners

Member Author

MaryanMorel Feb 11, 2018

This property is meant to be private, i.e. used only internally to avoid having many .set() calls everywhere. C_tv and C_group_l1 have public getters and setters of their own. I can remove it if you find it ugly
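The design discussed in this thread can be sketched as follows, assuming that `C = 1 / strength` as stated earlier in the review and that a `C` of `None` disables the corresponding penalty. The class name `PenaltyStrengths` and the `None` convention are assumptions for illustration, not the merged implementation.

```python
class PenaltyStrengths:
    """Hypothetical holder: public C_* properties, private strengths."""

    def __init__(self, C_tv=None, C_group_l1=None):
        self.C_tv = C_tv
        self.C_group_l1 = C_group_l1

    @staticmethod
    def _to_strength(name, C):
        # Assumption: None means "penalty disabled" and maps to strength None.
        if C is None:
            return None
        if C <= 0:
            raise ValueError("%s must be > 0, got %r" % (name, C))
        return 1.0 / C

    @property
    def C_tv(self):
        return self._C_tv

    @C_tv.setter
    def C_tv(self, value):
        self._strength_tv = self._to_strength("C_tv", value)
        self._C_tv = value

    @property
    def C_group_l1(self):
        return self._C_group_l1

    @C_group_l1.setter
    def C_group_l1(self, value):
        self._strength_group_l1 = self._to_strength("C_group_l1", value)
        self._C_group_l1 = value

    @property
    def _strengths(self):
        # Private convenience accessor, as described in the reply above.
        return self._strength_tv, self._strength_group_l1
```

With this shape, validation lives in the public setters while internal code reads both strengths in one place.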

tick/survival/convolutional_sccs.py

+                  @n_lags.setter
+                  def n_lags(self, value):
+                      offsets = [0]

Collaborator

stephanegaiffas Feb 9, 2018

We must check here that it's >= 0 and <= n_intervals - 1

Member Author

MaryanMorel Feb 11, 2018

Ok for >= 0, but we only get to know n_intervals during _prefit. This model does not fit very well with the sklearn fit / predict framework

Member Author

MaryanMorel Feb 19, 2018

I finally put the check on n_lags <= n_intervals - 1 in _prefit rather than the setter
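The split described here, with the lower bound checked when `n_lags` is set and the upper bound checked once `n_intervals` is known, could look roughly like this. Both function names are hypothetical, and plain lists stand in for NumPy arrays.

```python
def check_n_lags_non_negative(n_lags):
    """Setter-time check: needs no knowledge of the data."""
    if any(lag < 0 for lag in n_lags):
        raise ValueError("n_lags values must be >= 0")
    return list(n_lags)


def check_n_lags_upper_bound(n_lags, n_intervals):
    """Prefit-time check: n_intervals is only known once data is seen."""
    if any(lag > n_intervals - 1 for lag in n_lags):
        raise ValueError(
            "n_lags values must be <= n_intervals - 1 = %d"
            % (n_intervals - 1))
    return list(n_lags)
```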

tick/survival/convolutional_sccs.py Outdated

+                          Coefficients of the model.
+                      bootstrap_coeffs : `Bootstrap_CI` (read-only)
+                          Bootstrap coefficients and confidence intervals of the model.

Collaborator

stephanegaiffas Feb 9, 2018

Add a Reference section with the paper describing the method

Mbompr mentioned this pull request

template survival module #176

Closed

MaryanMorel force-pushed the TICK-361 branch 3 times, most recently from 6f06f43 to a3d2c98 Compare

February 22, 2018 15:51

Member Author

MaryanMorel commented Mar 12, 2018 (edited)

@stephanegaiffas Are you ok to merge? (after rebasing)

MaryanMorel added 2 commits

March 20, 2018 16:15


          Add ConvSCCS learner

44925fd


          Bugfix Too many open files exception in Hawkes simulations

1f9c926

MaryanMorel force-pushed the TICK-361 branch from a3d2c98 to 1f9c926 Compare

March 20, 2018 16:38

Member Author

MaryanMorel commented Mar 21, 2018

@stephanegaiffas the branch has been rebased, ok to merge?

stephanegaiffas approved these changes


Collaborator

stephanegaiffas commented Mar 21, 2018

Done !

Member Author

MaryanMorel commented Mar 22, 2018

Thanks !

MaryanMorel merged commit 088393b into master

Mbompr deleted the TICK-361 branch

September 28, 2018 12:17
