How to get exact outliers in Univariate Time Series using OutlierDetector? #139
Comments
Hi @sthirumoorthi, thanks for opening the issue. Our tutorial for outlier detection (2. Outlier Detection) should be useful. Please let us know if you have any other questions.
Hello, thanks for the quick response. I was referring to the outlier detection tutorial available at this link. However, the model returns a range of outliers instead of a single outlier value or index. In my example dataset, I expected the model to return only "500" and "73" (or their indexes) as outliers, but it returned additional values/indexes. ts_outDetection.outliers[0] -> Would it be possible to check the output of this model and share your comments, please?
@sthirumoorthi, thanks for sharing your results. Did you transform your data to TimeSeriesData before applying the detector?
Hi @MoKazemi9, thanks for checking the details. Yes, I did the transformation before applying the detector. #transform the data for outlier detection #Outlier Detection model for the 'Daily Total Female Birth' dataset My complete Python file is available in my GitHub repository, together with the test dataset, for your reference.
I think @sthirumoorthi is asking why the detector returns the index (
Hi @sanelemahlalela, getting the value or index is not the problem. The model returns a range of values/indexes (multiple timestamps) instead of only the actual outliers. In the above example, the model returns the timestamps below as outliers, whereas I expected only one timestamp ('2019-01-09 00:00:00', value '500'). ts_outDetection.outliers[0] -> As I understand it, there are no hyper-parameters to adjust (except iqr_mult) to improve the accuracy of the model. So I want to understand how the outlier detection logic works for this model and what needs to be done to detect the exact outliers.
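One possible way to isolate only the spike, sketched below, is to detrend with a rolling median instead of a rolling mean before applying the IQR rule, since a median over seven points is barely moved by a single extreme value. This is a swapped-in technique written with plain pandas and NumPy, not an option of the Kats OutlierDetector. The function name flag_outliers_median, the window of 7, and the synthetic series (the first seven values from the issue repeated weekly, with one injected spike) are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption, not the CSV from the issue): the first seven
# values of the example repeated as a weekly pattern for nine weeks.
pattern = [35, 32, 30, 31, 44, 29, 45]
index = pd.date_range("2019-01-01", periods=63, freq="D")
series = pd.Series(pattern * 9, index=index, dtype=float)
series.iloc[8] = 500.0  # inject a single spike on 2019-01-09


def flag_outliers_median(s, window=7, iqr_mult=3.0):
    """Hypothetical helper: IQR rule on residuals from a rolling-median trend."""
    # A centered rolling median is robust to one extreme value in the window.
    trend = s.rolling(window, center=True).median()
    resid = s - trend
    q25, q75 = np.nanpercentile(resid, [25, 75])
    iqr = q75 - q25
    lower, upper = q25 - iqr_mult * iqr, q75 + iqr_mult * iqr
    # NaN residuals at the edges compare as False and are never flagged.
    mask = (resid < lower) | (resid > upper)
    return list(resid[mask].index)


flagged = flag_outliers_median(series)
print(flagged)
```

On this synthetic series the sketch flags only 2019-01-09. Real data with stronger seasonality or several neighbouring anomalies may need a different window or threshold.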
@sanelemahlalela, thanks for checking the details. In your case, the detector returns the expected results. I am not sure why it does not work for my test data. Could it be an issue with my test data, or with the way I coded the logic? I double-checked the Python code and couldn't find any issue. I understand that you picked a portion of my test data for your test. Could you please try the actual data (the CSV file) from my repository?
Hello, I'm trying to evaluate the outlier detection framework for my project, but the model appears to return a range of outliers rather than the exact index. Below are the details of my dataset.
2019-01-01 | 35
2019-01-02 | 32
2019-01-03 | 30
2019-01-04 | 31
2019-01-05 | 44
2019-01-06 | 29
2019-01-07 | 45
2019-01-08 | 43
2019-01-09 | 500
2019-01-10 | 27
2019-01-11 | 38
.....
I would expect the model to return "500" as the outlier and "2019-01-09" as its date, but the model returns the following:
ts_outDetection.outliers[0] ->
[Timestamp('2019-01-06 00:00:00'),
Timestamp('2019-01-07 00:00:00'),
Timestamp('2019-01-08 00:00:00'),
Timestamp('2019-01-09 00:00:00'),
Timestamp('2019-01-10 00:00:00'),
Timestamp('2019-01-11 00:00:00'),
Timestamp('2019-01-12 00:00:00')]
Can someone help me understand the outlier detection concept in Kats, or direct me to the reference documentation (if any), please?
Let me know if you need more details.
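A likely explanation for the seven-day window is sketched below. It assumes that the detector first removes a trend estimated with a centered moving average (as a classical seasonal decomposition does, here with a 7-day window for daily data) and then applies an IQR threshold to the detrended values. Under that assumption, a single spike inflates the trend for every window that contains it, so its neighbours get large negative residuals and are flagged too. The code is a simplified stand-in written with pandas and NumPy, not the Kats implementation. The helper name flag_outliers_mean and the synthetic series are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption, not the CSV from the issue): the first seven
# values of the example repeated as a weekly pattern for nine weeks.
pattern = [35, 32, 30, 31, 44, 29, 45]
index = pd.date_range("2019-01-01", periods=63, freq="D")
series = pd.Series(pattern * 9, index=index, dtype=float)
series.iloc[8] = 500.0  # inject a single spike on 2019-01-09


def flag_outliers_mean(s, window=7, iqr_mult=3.0):
    """Hypothetical helper: IQR rule on residuals from a moving-average trend."""
    # The centered moving average absorbs part of the spike in every window
    # that contains it (three days before through three days after).
    trend = s.rolling(window, center=True).mean()
    resid = s - trend
    q25, q75 = np.nanpercentile(resid, [25, 75])
    iqr = q75 - q25
    lower, upper = q25 - iqr_mult * iqr, q75 + iqr_mult * iqr
    # NaN residuals at the edges compare as False and are never flagged.
    mask = (resid < lower) | (resid > upper)
    return list(resid[mask].index)


flagged = flag_outliers_mean(series)
print(flagged)
```

On this synthetic series the sketch flags 2019-01-06 through 2019-01-12, the same seven-day window reported above: the spike has a large positive residual and its six neighbours have large negative ones.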