Adapt Donut to Missing Values #82 (linked issues: #43, #75, #81)
Please add a plot of the results without use_zero. Can all algorithms handle NaN values, or does it break any of them?
I adapted the two that could not handle NaNs until now, so they behave as if I had not removed the use_zero flag.
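For readers following along, here is a minimal sketch of what zero-filling inside a detector could look like, assuming the adapted detectors simply replace missing values with 0 before training and prediction. The helper name fill_missing_with_zero is hypothetical and is not taken from this PR; the actual change may differ.

```python
import numpy as np


def fill_missing_with_zero(X):
    """Return a float array in which every NaN entry of X is replaced by 0.0.

    Hypothetical helper: it only illustrates the "behave as if use_zero
    were still set" idea and is not the code from this PR.
    """
    X = np.asarray(X, dtype=float)
    # np.where builds a new array, so the caller's data is left untouched.
    return np.where(np.isnan(X), 0.0, X)


# Toy input with missing values in both columns.
X = np.array([[1.0, np.nan], [np.nan, 4.0]])
X_filled = fill_missing_with_zero(X)
```

A detector that cannot handle NaNs could call such a helper at the start of its fit and predict methods, while detectors that cope with missing values natively would receive the data unchanged.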
main.py
Outdated
@@ -36,7 +38,7 @@ def run_pipeline():
 SyntheticDataGenerator.extreme_1_polluted(0.5),
 SyntheticDataGenerator.extreme_1_polluted(1)
 ]
-detectors = [RecurrentEBM(num_epochs=15), LSTMAD(), Donut(), DAGMM(), LSTM_Enc_Dec(epochs=200)]
+detectors = [RecurrentEBM(num_epochs=15), LSTMAD(), Donut(), DAGMM(), LSTM_Enc_Dec(num_epochs=200)]
While you're at it, can you change num_epochs for LSTM_Enc_Dec to something low, like 5-15? It makes the benchmark run forever, and I just can't imagine that many epochs bringing any benefit.
src/evaluation/experiments.py
Outdated
@@ -13,7 +13,7 @@ def run_pollution_experiment(outlier_type='extreme_1', output_dir=None, steps=5)
 datasets = [
 SyntheticDataGenerator.get(f'{outlier_type}_polluted', pollution) for pollution in np.linspace(0, 1, steps)
 ]
-detectors = [LSTM_Enc_Dec(epochs=200), DAGMM(), Donut(), RecurrentEBM(), LSTMAD()]
+detectors = [LSTM_Enc_Dec(num_epochs=200), DAGMM(), Donut(), RecurrentEBM(), LSTMAD()]
Same here: num_epochs
This addresses #43 (Training Donut raises "Tensor had NaN values"), #75 (Adapt Donut to Missing Data), and #81 (Training Donut on Missing Values Outlier (100%) throws ValueError).