Bugfix for issue #306 - [BUG] The change points detected are different from the points reported from MOA, using the same example #306 #309

Open
wants to merge 3 commits into
base: master

@denisesato denisesato commented Mar 25, 2022

This PR fixes two differences found between the MOA and scikit-multiflow implementations:

  1. When a concept drift is detected, the statistics should be reset. Otherwise, the detector keeps reporting drifts after the first detection. One example is the "drift_stream.npy" file used in the unit test: the stream contains a single drift that starts at index 999 (where the distribution of values changes), yet the current implementation detects four drifts.

With the current source code:

[Image: drift_stream npy]

After resetting the statistics on the add_element method:

[Image: drift_stream_new_implementation]
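To illustrate why the reset matters, here is a minimal sketch using a toy detector. It is not the ADWIN algorithm and not the scikit-multiflow code: `ToyMeanDetector`, its parameters, and the synthetic stream are hypothetical and exist only for this example. The detector flags a value that deviates from the running mean; if the running statistics are not reset, the old concept keeps contaminating the mean and the same drift is reported repeatedly.

```python
class ToyMeanDetector:
    """Hypothetical toy detector (NOT ADWIN): flags a value that is far
    from the running mean of the values seen since the last reset."""

    def __init__(self, threshold=0.5, min_samples=5, reset_on_drift=True):
        self.threshold = threshold
        self.min_samples = min_samples
        self.reset_on_drift = reset_on_drift
        self.count = 0
        self.total = 0.0

    def add_element(self, value):
        # Only test for drift once enough samples have been collected.
        drift = (
            self.count >= self.min_samples
            and abs(value - self.total / self.count) > self.threshold
        )
        if drift and self.reset_on_drift:
            # Forget the statistics of the old concept.
            self.count = 0
            self.total = 0.0
        self.count += 1
        self.total += value
        return drift


def detected_indices(stream, reset_on_drift):
    detector = ToyMeanDetector(reset_on_drift=reset_on_drift)
    return [i for i, v in enumerate(stream) if detector.add_element(v)]


# Synthetic stream with a single change at index 20.
stream = [0.0] * 20 + [1.0] * 20

with_reset = detected_indices(stream, reset_on_drift=True)
without_reset = detected_indices(stream, reset_on_drift=False)
print("with reset:   ", with_reset)
print("without reset:", without_reset)
```

With the reset, the toy detector reports the change once; without it, the same change is reported many times, which mirrors the behaviour described above.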

  2. In the method detected_change, the loop does not iterate over the complete bucket because of this line:
    for k in range(cursor.bucket_size_row-1):

I changed this line because the range function excludes its stop value, so the original loop stops before reaching the last item of the bucket. New code line:
for k in range(cursor.bucket_size_row):
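The off-by-one can be checked in isolation with a short sketch. The value of `bucket_size_row` below is a made-up stand-in for `cursor.bucket_size_row`, chosen only for illustration:

```python
# Hypothetical stand-in for cursor.bucket_size_row.
bucket_size_row = 4

# Original loop bounds: range() excludes its stop value, so the last index is skipped.
old_indices = list(range(bucket_size_row - 1))

# Fixed loop bounds: every index from 0 to bucket_size_row - 1 is visited.
new_indices = list(range(bucket_size_row))

print(old_indices)  # [0, 1, 2]
print(new_indices)  # [0, 1, 2, 3]
```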

Changes proposed in this pull request:

  • Change the add_element() method of the ADWIN class to reset the statistics when a concept drift is detected.
  • Change the detected_change method of the ADWIN class to iterate over the complete bucket.

Checklist

Implementation

  • Implementation is correct (it performs its intended function).
  • Code is consistent with the framework.
  • Code is properly documented.
  • PR description covers ALL the changes performed.
  • Files changed (update, add, delete) are in the PR's scope (no extra files are included).

Tests

  • [X] New functionality is tested.
  • [X] Tests are created for the new functionality or existing tests are updated accordingly.
  • [X] ALL tests pass with no errors.
  • CI/CD pipelines run with no errors.
  • Test Coverage is maintained (coverage may drop by no more than 0.2%).

1) Reset the statistics after a drift has been detected in the method add_element (ADWIN class). This is necessary to avoid repeatedly detecting a drift after the first detection.
2) Make k iterate through the last item of the bucket in the method detected_change (ADWIN class).
The unit test was also changed to reflect the new expected behavior.
[BUG] The change points detected are different from the points reported from MOA, using the same example scikit-multiflow#306