Add incremental training option to DAGMM model #65
Conversation
Thanks for the contribution. I'm hesitant to merge this as is because it's a pretty ad-hoc way of handling multi-series settings (a feature requested by #52). The main downsides are: …

If this is a direction you're interested in pursuing, I think a better approach would be to design a base class / mixin which implements common features of multi-series training (incremental training would be a good default option). Then, any model which supports such behavior can simply inherit from this base class. If a model has specific batch training which handles all time series together, it can override the default incremental multi-series training. I'm open to further discussion on this topic though.

As an aside: the docs job is currently failing because the docstring for …
Thank you for the detailed comment!

Agree :) That is why it is still a draft created for further discussion. What you propose definitely makes sense. I guess changing … Even though I am not a fan of multiple inheritance, this seems like a good use case for it.

```python
from abc import abstractmethod
from typing import Iterable, Union
# (DetectorBase and TimeSeries are the existing Merlion classes)

# base.py
class IncrementalTrainingMixin:
    @abstractmethod
    def train(self, train_data: Union[TimeSeries, Iterable[TimeSeries]], **kwargs):
        # the rest of the arguments are the same as in `DetectorBase`
        pass


# dagmm.py
class DAGMM(IncrementalTrainingMixin, DetectorBase):
    # listing the mixin first means its `train` method overrides the one from `DetectorBase`
    def train(self, train_data: Union[TimeSeries, Iterable[TimeSeries]], **kwargs):
        if isinstance(train_data, TimeSeries):
            ...
            # run the existing training method
            self._train(X=train_data)
        elif isinstance(train_data, Iterable):
            # incrementally train the model as proposed in the PR
            ...
            for data in train_data:
                self._train(X=data, incremental=True)
```

What do you think?
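To double-check the inheritance-order claim in the snippet above, here is a minimal, self-contained illustration of Python's method resolution order; the classes below are placeholders, not Merlion's actual `DetectorBase`/`DAGMM`:

```python
# Placeholder classes, only to show which `train` wins under multiple inheritance.
class DetectorBase:
    def train(self, train_data):
        return "DetectorBase.train"

class IncrementalTrainingMixin:
    def train(self, train_data):
        return "IncrementalTrainingMixin.train"

class DAGMM(IncrementalTrainingMixin, DetectorBase):
    pass

print(DAGMM().train(None))
# IncrementalTrainingMixin.train
print([cls.__name__ for cls in DAGMM.__mro__])
# ['DAGMM', 'IncrementalTrainingMixin', 'DetectorBase', 'object']
```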
This direction is more or less what I had in mind. I think it makes most sense to add a new …

For data shuffling, I think this can be quite model-specific. But some common behaviors (across both anomaly detection and forecasting) include …
Of course, some of these may be complicated if we are using a lazy iterator of time series. So maybe we could start by assuming we have a … I suspect that the cleanest implementation will be to include params …

What do you think?
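To make the shuffling point concrete, here is a rough sketch of what a default incremental multi-series loop could look like, assuming an in-memory list of series rather than a lazy iterator; the `train_one_series` hook and parameter names are illustrative assumptions, not existing Merlion API:

```python
import random
from typing import Callable, List, Sequence

def incremental_multi_series_train(
    train_one_series: Callable[[object], None],  # hypothetical per-series incremental update
    train_data: Sequence,                        # assumed to be an in-memory list of time series
    n_epochs: int = 1,
    shuffle: bool = True,
) -> None:
    order: List[int] = list(range(len(train_data)))
    for _ in range(n_epochs):
        if shuffle:
            random.shuffle(order)  # present the series in a fresh order each epoch
        for i in order:
            train_one_series(train_data[i])
```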
Thanks @aadyotb for the comment! This makes sense, let me draft something tomorrow.
Added the initial (draft) implementation of multiple series training.
PS: not sure why the docs build fails...
You should call …
Thanks for the contribution! This looks mostly good to me, barring a couple of small issues I noted inline. Happy to approve once these are fixed. Also, see my comment above on how to diagnose why the docs build is failing.
merlion/models/anomaly/dagmm.py (Outdated)

```python
else:
    anomaly_labels = [None] * len(train_data)
n_epochs = train_config.pop("n_epochs", 1)
shuffle = train_config.pop("shuffle", False)
```
Maybe you can make this `shuffle = train_config.pop("shuffle", n_epochs > 1)`?
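For clarity, a tiny standalone illustration of how that default expression behaves (plain Python, no Merlion specifics):

```python
train_config = {"n_epochs": 5}
n_epochs = train_config.pop("n_epochs", 1)           # -> 5
shuffle = train_config.pop("shuffle", n_epochs > 1)  # key absent -> falls back to True here

# With n_epochs == 1 the computed fallback is False, and an explicit
# train_config["shuffle"] value always takes precedence over the fallback.
print(n_epochs, shuffle)  # 5 True
```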
Makes sense. I will also note in the docstring that shuffling is turned on by default for `n_epochs > 1`.
```python
        train_config=train_config, post_rule_train_config=post_rule_train_config
    )
)
return train_scores_list
```
There is a potential issue here, where the post-rule (calibrator and threshold) is trained individually on each time series. I think this is fine for the time being, but can you add a `# FIXME` comment here saying that the post-rule needs to be re-trained on the `train_scores` from all the models?
Yes, I was not sure how this incremental training would affect the `post_rule`...

> on the `train_scores` from all the models

Do you mean from all the epochs?
Yes, from all the epochs.
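A hedged sketch of that accumulate-then-retrain idea, with hypothetical callables standing in for Merlion's actual per-series training and post-rule training routines:

```python
from typing import Callable, List, Sequence

def train_with_shared_post_rule(
    train_one_series: Callable[[object], List[float]],  # hypothetical: trains incrementally, returns scores
    train_post_rule: Callable[[List[float]], None],     # hypothetical: fits the calibrator and threshold
    train_data: Sequence,
    n_epochs: int = 1,
) -> List[float]:
    all_scores: List[float] = []
    for _ in range(n_epochs):
        for series in train_data:
            all_scores.extend(train_one_series(series))
    # Addressing the FIXME: fit the post-rule once, on the scores from
    # every series and every epoch, rather than once per series.
    train_post_rule(all_scores)
    return all_scores
```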
Thank you for the feedback! I will address the remarks and add a unit test for the method as well to make it complete.
merlion/models/anomaly/dagmm.py (Outdated)

```diff
-        self, train_data: List[TimeSeries], anomaly_labels: List[TimeSeries] = None,
-        train_config=None, post_rule_train_config=None
+        self, multiple_train_data: List[TimeSeries], anomaly_labels: List[TimeSeries] = None,
+        train_config=dict(), post_rule_train_config=None
```
Can you change the default value of `train_config` back to `None`? See here for why `train_config=dict()` can be a problem. The preferred pattern would be `train_config = {} if train_config is None else train_config`. Looks good to me once this is addressed.
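For readers unfamiliar with the mutable-default-argument pitfall being referenced, a minimal standalone demonstration (not Merlion code):

```python
def bad(train_config=dict()):  # the default dict is created once, at function definition time
    train_config.setdefault("n_epochs", 1)
    train_config["n_epochs"] += 1
    return train_config

print(bad())  # {'n_epochs': 2}
print(bad())  # {'n_epochs': 3} -- state leaks between calls

def good(train_config=None):
    train_config = {} if train_config is None else train_config  # fresh dict on every call
    train_config.setdefault("n_epochs", 1)
    train_config["n_epochs"] += 1
    return train_config

print(good())  # {'n_epochs': 2}
print(good())  # {'n_epochs': 2} -- no shared state
```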
😅 Spending time with other languages makes me forget some of the Python pitfalls. I agree that it is safer to avoid this altogether, even if a mutation doesn't happen. Returned the `None` default, with an additional `if`.
I can push out v1.1.2 once this is merged, along with another PR of mine that's in the works.
@isenilov v1.1.2 is now out with this feature.
Currently, the DAGMM model object gets created every time one calls the `train` method (see `merlion/models/anomaly/dagmm.py`, line 132 at e21f7be). However, this makes it impossible to: …

The proposed change adds an option to perform incremental training by passing a corresponding dictionary to the `train` method, which disables the model re-creation. The change does not affect the existing API, so the behavior stays the same if no `train_config` is passed. The same change can also be applied to some of the other models.
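For illustration only, usage under the proposed option might look roughly like this; the `DAGMMConfig` constructor follows Merlion's usual config pattern, but the `"incremental"` key and the exact shape of `train_config` are assumptions about the proposal rather than the merged API:

```python
from merlion.models.anomaly.dagmm import DAGMM, DAGMMConfig

model = DAGMM(DAGMMConfig())

# Existing behavior: each call re-creates the underlying network before training.
model.train(train_data=series_a)  # series_a / series_b are placeholder TimeSeries objects

# Proposed option: pass a config dict so the existing weights are kept
# and training continues on the new series (illustrative key name).
model.train(train_data=series_b, train_config={"incremental": True})
```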
@aadyotb would love to hear your opinion.