Summary
Experimenting with summary_drift_report, I found cases of two very similar distributions that are tagged as "severe drift".
In the image below, we have two profiles of the same features with 5k samples each, drawn from the same distribution. As I understand the report, the features are tagged as "severe drift" because their p-values fall within the "severe drift" range. However, the two distributions are very similar.
We can see the similarity in the double_histogram plot:
I also stored the features from both profiles and ran a KS test on them, which yielded a "no-drift" result, with a p-value of 0.36.
Steps to Reproduce it
To generate the distributions with a higher number of samples, and to be able to access them as lists after logging them, I changed the code in the Profile_Viewer_In_Notebook a bit:
Which yielded the profile in the images below.
With the lists of values for 1mixture_distribution, I ran the KS test like the following:
Which yielded the following result:
{'data': {'is_drift': 0, 'distance': array([0.0184], dtype=float32), 'p_val': array([0.36134896], dtype=float32), 'threshold': 0.01}, 'meta': {'name': 'KSDrift', 'detector_type': 'offline', 'data_type': None, 'version': '0.8.1'}}
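For reference, here is a minimal, dependency-free sketch of a two-sample KS test on raw values, to make the numbers above easier to sanity-check. It is not the KSDrift call that produced the result above, and it is not how summary_drift_report computes its p-values. The function names `ks_statistic` and `ks_pvalue` are hypothetical, the p-value uses the asymptotic Kolmogorov series with a common small-sample correction (so it only approximates an exact test), and a standard normal stands in for the real feature values.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed values.
    return max(
        abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
        for x in a + b
    )


def ks_pvalue(d, n, m):
    """Approximate two-sided p-value for KS distance d (asymptotic formula)."""
    en = math.sqrt(n * m / (n + m))
    # Small-sample correction; an approximation, not an exact test.
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is 1 to many digits.
        return 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, 101)
    )
    return min(max(p, 0.0), 1.0)


# Assumption: a standard normal stands in for the real feature values.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]
target = [rng.gauss(0.0, 1.0) for _ in range(5000)]

distance = ks_statistic(reference, target)
p_value = ks_pvalue(distance, len(reference), len(target))
print(f"distance={distance:.4f} p_value={p_value:.4f}")
```

Calling `ks_pvalue(0.0184, 5000, 5000)` with the distance reported above gives roughly 0.36, which is in line with the KSDrift output and well above its 0.01 threshold.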
What is the expected correct behavior?
Maybe I'm misunderstanding the results, but the drift results differ from what I'd expect. Since both samples are drawn from the same distribution, there should be no drift alerts between them.