Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is the integer cast in abs() really intended? #42

Closed
LTLA opened this issue Jan 8, 2021 · 3 comments
Closed

Is the integer cast in abs() really intended? #42

LTLA opened this issue Jan 8, 2021 · 3 comments

Comments

@LTLA
Copy link

LTLA commented Jan 8, 2021

For various reasons, I am writing a C++ implementation of som.c, and I noticed

FlowSOM/src/som.c

Lines 133 to 134 in 6f79b8b

tmp = data[i + j*n] - codes[cd + j*ncodes];
change += abs(tmp);

tmp is double-precision but abs() takes integer arguments. This call will truncate tmp to integer before actually computing the absolute difference. Nothing will be added to change if all tmp lie in (-1, 1). As it stands now, the iterations will terminate if all tmp have absolute values less than 1. Is this intentional? Did you mean to use fabs() instead?

Regardless, I don't understand the motivation behind the stop condition of change < 1. A threshold of 1 seems pretty arbitrary when the scale of the input data could be anything. Why not just terminate when the change in the code coordinates drops below a certain level (e.g., defined as some small fraction of the standard deviation across codes)?

@SamGG
Copy link
Contributor

SamGG commented Jan 8, 2021

Interesting point.
I don't think there is such a stop condition in the implementation of the kohonen package.
https://github.com/rwehrens/kohonen/blob/28eb3599bdc423e94a4a1e09310ffd9de667ead8/kohonen/src/supersom.cpp#L93-L173

@SofieVG
Copy link
Owner

SofieVG commented Jan 19, 2021

Dear @LTLA,

This integer cast was indeed not intended and has already been changed to fabs() on my local installation. Apparently that update hasn't made it to this github yet, but I will update soon.

The idea behind the change parameter is indeed to check for convergence, but I agree that the value 1 can be kind of arbitrary, especially when the package is getting wider use. I think we choose this value back then because it made sense for the datasets we were working with at the time. I'll consider updating this in the future, either at least make it a parameter, or indeed, as you are suggesting, computing it based on the data range itself.

Kind regards,
Sofie

@LTLA
Copy link
Author

LTLA commented Jan 21, 2021

Okay, good to see that it's fixed.

@LTLA LTLA closed this as completed Jan 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants