Wrong Label on method SOGAAL and MOGAAL #237

Closed
luisfelipe18 opened this issue Sep 29, 2020 · 9 comments

@luisfelipe18

In the original paper from Xiangnan He, inliers are marked as 1 and outliers as 0. I checked their original code against the pyod code line by line, and they are almost the same.
At https://github.com/yzhao062/pyod/blob/94c27ef3841de4b0e5a732f9c720092a24397633/pyod/models/mo_gaal.py#L79 you state 0 for inliers and 1 for outliers, but the original paper (page 3, section 3.1, third and fourth lines) says 0 is an outlier and 1 is an inlier: https://arxiv.org/pdf/1809.10816.pdf
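To make the two conventions concrete, here is a minimal sketch of the mapping between them. The helper name `paper_to_pyod_labels` and the sample labels are hypothetical and only illustrate the 1/0 swap described above; this is not code from pyod or from the paper's repository.

```python
def paper_to_pyod_labels(paper_labels):
    """Map the paper's convention (1 = inlier, 0 = outlier) to the
    convention stated in the pyod source (0 = inlier, 1 = outlier)."""
    return [1 - int(label) for label in paper_labels]


# Hypothetical labels in the paper's convention: two outliers among five points.
paper_labels = [1, 1, 0, 1, 0]
pyod_labels = paper_to_pyod_labels(paper_labels)
print(pyod_labels)
```

Applying the same function twice returns the original labels, so it can be used in either direction.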

@yzhao062
Owner

This is a good point and potentially a bug. I did not implement this algorithm myself, so I need to do some probing. In the worst case it is incorrect, and we need to flip the labels and also the scores.

yzhao062 added the bug label Sep 29, 2020
@luisfelipe18
Author

image

On line 11, principalDf["resultado"] = principalDf["resultado"].map({0:"b",1:"r"}), I am mapping 0 to "b" and 1 to "r".

As you can see, "b", which is equivalent to 0, appears in the lower proportion. (These 0 and 1 values come from model.predict.) Since the docs say 0 must be an inlier and it appears in the lower proportion, I started to think that something is wrong.
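The proportion check described above can also be done numerically rather than by eye. A minimal sketch, assuming the predictions are a plain list of 0/1 values; the `predicted` list below is made up for illustration and is not the output from the screenshot.

```python
from collections import Counter


def label_proportions(labels):
    """Return the fraction of samples assigned to each predicted label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical predictions standing in for the output of model.predict.
predicted = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
proportions = label_proportions(predicted)
minority_label = min(proportions, key=proportions.get)
print(proportions, minority_label)
```

If outliers are expected to be rare, the minority label should be the one the docs call the outlier. Here the minority label is 0, which mirrors the situation that raised the suspicion.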

@anranhui

I also encountered this error: in the original SO-GAAL, 1 is normal data, but in pyod, 0 is normal data.

@zhaoxing-zstar

I compared the results of SO_GAAL with those of other algorithms, and I think the score for SO_GAAL should be flipped (0 for outlier, 1 for normal). Maybe overriding the _process_decision_score method would work.

@yzhao062
Owner

I doubt that, guys...
This is the SO-GAAL example with simple synthetic data.

image

If I flip the score by multiplying it by -1, the performance looks incorrect.
image

Maybe I am missing something?
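For anyone who wants to repeat this kind of check without the plots, here is a minimal sketch of the idea: compute ROC AUC for the scores and for the negated scores against ground-truth labels (1 = outlier). The `roc_auc` helper is a plain pairwise implementation written for this illustration, and the labels and scores are made up; they are not the values from the example above.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (outlier, inlier) pairs in which the
    outlier gets the higher score; ties count as half a win."""
    outlier_scores = [s for y, s in zip(y_true, scores) if y == 1]
    inlier_scores = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for o in outlier_scores:
        for i in inlier_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(outlier_scores) * len(inlier_scores))


# Hypothetical ground truth (1 = outlier) and outlier scores.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.8, 0.7, 0.9]

auc_original = roc_auc(y_true, scores)
auc_flipped = roc_auc(y_true, [-s for s in scores])
print(auc_original, auc_flipped)
```

Negating the scores always gives 1 minus the original AUC, so whichever orientation lands well above 0.5 on labeled data is the one consistent with "higher score means more anomalous".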

@zhaoxing-zstar

My fault, you're right.

@anranhui

anranhui commented Apr 20, 2022 via email

@yzhao062
Owner

Looking forward to hearing more about the progress. Good luck with the paper.
