Add new multivariate dataset to pipeline and test in notebook #72
# Conflicts: # src/algorithms/lstm_ad.py
…rce/MP-2018 into fix/lstmad_threshold
from .dataset import Dataset


"""
TODO:
Maybe add an issue about this (with some more details)?
Deprecated comment removed
"""


def get_random(x, strength=1):
Maybe "add_scaled_random" or something would be a more suitable method name?
LGTM
Nice! Do you mind adding the datasets to main.py?
self.mean_curve_length = mean_curve_length
self.mean_curve_amplitude = mean_curve_amplitude
self.global_noise = 0.1  # Noise added to all dimensions over the whole timeseries
self.dim2 = dim2
Why "dim2"?
Might be better to call it "anomaly_function". At the moment there are only 2D anomalies, another PR will solve that
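A minimal sketch of what the suggested rename could look like, assuming the attribute holds a callable that produces the second dimension. `CurveGenerator`, `second_dimension` and `identity_dim2` are hypothetical names used only for illustration and do not appear in the PR; only `global_noise` and the three-argument call shape are taken from the diff.

```python
class CurveGenerator:
    """Hypothetical stand-in for the dataset class touched by this PR."""

    def __init__(self, anomaly_function, global_noise=0.1):
        # Naming the attribute after its role makes it clear that a callable is stored.
        self.anomaly_function = anomaly_function
        self.global_noise = global_noise

    def second_dimension(self, curve_values, anomalous, interval_length):
        # Delegate to whichever anomaly function was injected.
        return self.anomaly_function(curve_values, anomalous, interval_length)


def identity_dim2(curve_values, anomalous, interval_length):
    # Illustrative anomaly function: leaves the curve untouched and returns
    # -1 for both indices, mirroring the non-anomalous return in the diff.
    return curve_values, -1, -1


gen = CurveGenerator(anomaly_function=identity_dim2)
values, start, end = gen.second_dimension([0.0, 1.0], False, 2)
```

With this shape, supporting anomalies in more than two dimensions would mostly mean passing a different callable.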
# The last two values are ignored for generation of not anomalous data


def doubled_dim2(curve_values, anomalous, interval_length):
In the next 4 methods you're not using `interval_length`. I think it'd make sense to rename it to `_` to indicate that it is not used.
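A sketch of the suggested rename, assuming the function keeps its three-argument signature so that it stays interchangeable with the other anomaly functions. The non-anomalous branch is taken from the diff; the anomalous branch is a placeholder assumption, not the PR's logic.

```python
def doubled_dim2(curve_values, anomalous, _):
    # `_` replaces the unused `interval_length`; callers still pass three
    # positional arguments, so existing call sites keep working.
    if not anomalous:
        return curve_values, -1, -1
    # Placeholder anomalous branch (assumption for illustration only):
    # double every value and label the whole interval as anomalous.
    doubled = [2 * v for v in curve_values]
    return doubled, 0, len(doubled)


print(doubled_dim2([1, 2], False, 5))
print(doubled_dim2([1, 2], True, 5))
```

One trade-off: after the rename, callers can no longer pass `interval_length=...` by keyword, and a function can only have one parameter named `_`.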
if not anomalous:
    return curve_values, -1, -1
else:
    # The curve in the second dimension occures a few timestamps later
occurs
# Add anomaly labels with slight padding (dont start with the first interval value).
# The padding is curve_length / padding_factor
if create_anomaly:
    assert end > start and start >= 0, f'Invalid anomaly indizes: {start} to {end}'
You can simplify that to `assert end > start >= 0`
Also: indices
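Both suggestions in this thread can be sketched together. Python's chained comparison `end > start >= 0` is equivalent to `end > start and start >= 0`, with `start` evaluated only once. `check_anomaly_indices` is a hypothetical helper name used for illustration; in the PR the assertion sits inline.

```python
def check_anomaly_indices(start, end):
    """Hypothetical helper: validate the bounds of an anomaly interval."""
    # Chained comparison: same truth value as `end > start and start >= 0`.
    assert end > start >= 0, f'Invalid anomaly indices: {start} to {end}'
    return start, end


print(check_anomaly_indices(3, 10))
```

Note that `assert` statements are skipped when Python runs with `-O`, so this check is a development aid rather than input validation.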
Do you mind uncommenting |
Addresses #40
Comments are missing
Code might be confusing -> I'll refactor it with the purpose of reusing it for other types of MV outliers.
UPDATE:
Already implemented anomalies: