Neighborhood Function Selection #23

Closed
brogie62 opened this issue Nov 21, 2015 · 14 comments

Comments

@brogie62

Could you allow for user selection of neighborhood function beyond Gaussian? I would like the option of bubble but there are other functions that different users may prefer.

Thanks for the great work!

@peterwittek
Owner

Could you give us a reference to this neighbourhood function? I am not familiar with it. Thanks.

@brogie62
Author

[image: extract from a paper describing the bubble neighborhood function]
It is a simple cut-off or square function where all nodes within the radius are affected equally. It is the only neighborhood function in the Matlab Neural Network Toolbox and in the R package 'kohonen'. It is one of 2 choices (along with Gaussian) in the R package 'som'.
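To make the difference concrete, here is a minimal sketch in Python (NumPy) of the two neighborhood shapes on a small rectangular map. It is not Somoclu's code: the helper names (`grid_distances`, `bubble`, `gaussian`) are made up for illustration, and the Gaussian uses one common textbook form that may differ from what any particular package implements.

```python
import numpy as np

def grid_distances(n_rows, n_columns, bmu):
    """Euclidean grid distance from every node to the node at `bmu` (row, col)."""
    rows, cols = np.indices((n_rows, n_columns))
    return np.hypot(rows - bmu[0], cols - bmu[1])

def bubble(dist, radius):
    """Cut-off neighborhood: weight 1 inside the radius, 0 outside."""
    return (dist <= radius).astype(float)

def gaussian(dist, radius):
    """Smoothly decaying neighborhood (assumed textbook form, not a package's exact formula)."""
    return np.exp(-dist**2 / (2.0 * radius**2))

# Hypothetical 5 x 5 map with the best matching unit in the centre.
dist = grid_distances(5, 5, bmu=(2, 2))
print(bubble(dist, radius=1.0))
print(np.round(gaussian(dist, radius=1.0), 3))
```

With the bubble function every node inside the radius receives the same update, whereas the Gaussian weights fall off gradually with grid distance.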

@peterwittek
Owner

Commit 1e6988b implements choosing between bubble and Gaussian neighborhood functions across all interfaces (CLI, API, Python, R, MATLAB). Please test it. Once it works, I will close this issue.

For future reference, the above extract is from this paper. The paper concludes that "[t]he bubble neighboring function yields the worst result for all the data sets analyzed." Nevertheless, we have it now.

Please note that Somoclu does not have a compact support for the neighborhood function by default. This means that in Eq. (2), N_c contains all nodes. To recover the behavior in Eq. (2), request a compact support with the corresponding optional parameter.
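As a rough illustration of what compact support changes, the sketch below computes Gaussian weights with and without a cut-off. The function name, the `compact_support` flag, and the choice to cut exactly at the radius are assumptions made for this example, not Somoclu's actual interface or formula.

```python
import numpy as np

def gaussian_weights(dist, radius, compact_support=False):
    """Gaussian neighborhood weights; optionally zero outside the radius.

    Assumption: the cut-off sits exactly at `radius`.
    """
    dist = np.asarray(dist, dtype=float)
    weights = np.exp(-dist**2 / (2.0 * radius**2))
    if compact_support:
        # Only nodes within the radius keep a non-zero weight.
        weights = np.where(dist <= radius, weights, 0.0)
    return weights

dist = np.arange(6, dtype=float)  # hypothetical grid distances 0..5
full = gaussian_weights(dist, radius=2.0)
cut = gaussian_weights(dist, radius=2.0, compact_support=True)
print(np.round(full, 4))
print(np.round(cut, 4))
```

Without the cut-off every node gets a small but non-zero weight, which corresponds to N_c containing all nodes; with it, only nodes inside the radius are updated.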

Also note that the Gaussian neighborhood in Somoclu is slightly different from Eq. (3): the eta rank does not appear in the denominator.

@brogie62
Author

Thanks Peter. I am aware of the conclusions from that paper. Parameter optimization studies conducted on my data sets give the reverse.

@peterwittek
Owner

This surely isn't the best paper ever written on self-organizing maps, so I am not surprised at your findings. Please give it a go with the new bubble neighborhood function and let me know if it works as expected.

@brogie62
Author

I am trying to install your updated R package but am getting an error:
'Error in getOctD(x, offset, len) : invalid octal digit'
I have never installed from source before so I am guessing I am doing something wrong.
I have previously been running from the CLI but thought R would be a faster test than re-compiling.

@peterwittek
Owner

I tested the R version and I did not have any problems (GCC 5.2.2 and R 3.2.2, freshly installed Rcpp). @xgdgsc, do you have any insight on this?

@xgdgsc
Collaborator

xgdgsc commented Nov 24, 2015

Might be file corruption? How did you install the package? What operating system and which version of R do you use?

@brogie62
Author

Ubuntu 14.04
R 3.0.2
gcc 4.8.2

This is on an AWS instance. It may be that I am installing the wrong file or using poor command syntax. I used R CMD INSTALL, trying both the Rsomoclu.R and the Rsomoclu.cpp files, and got the same error for both.

@peterwittek
Owner

I just noticed there was an unrelated typo in the R version. Please use the latest git version; I fixed the problem.

This is what I do:

git clone https://github.com/peterwittek/somoclu
cd somoclu
./autogen.sh
./configure
make r
R

Then from the R command line, I install Rcpp from CRAN and Rsomoclu from source:

install.packages("Rcpp")
install.packages("./src/Rsomoclu_1.5.tar.gz", repos=NULL, type="source")

This works for me. Sorry for the trouble, I hope you get it working. Once we go through with this basic testing, we can do a minor release to make it available on CRAN.

@brogie62
Author

Compiled and tested Rsomoclu. I got very different results using the Gaussian neighborhood than I did from the CLI version of Somoclu, so I decided to test the prior version (1.5) of Rsomoclu. With the same parameters, the outcome is very different between the two interfaces, R and CLI. The CLI results are more in line with my experience using other SOM packages. Rsomoclu also outputs the BMUs as a list of length 2 x input rows, not as 'x' and 'y' columns like the CLI. I am not sure whether Rsomoclu orders it as xxxxx... then yyyyy... or as xyxyxyxy... The column is headed 'x'.

@peterwittek
Owner

The main difference between the R version and all the other versions is that the R wrapper uses R's random number generator for initializing the map. This might result in vastly different maps.

You are right, the encoding for BMUs is xyxy...
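For anyone post-processing such output, an interleaved xyxy... vector can be split into separate coordinate columns with a reshape. The sketch below uses Python with NumPy and made-up values, on the assumption that the flat vector has been exported from R; in R itself, `matrix(v, ncol = 2, byrow = TRUE)` should give the same layout.

```python
import numpy as np

# Hypothetical flat BMU vector in x0, y0, x1, y1, x2, y2 order.
flat_bmus = np.array([3, 0, 1, 4, 2, 2])

# One row per input vector, columns are (x, y).
bmus = flat_bmus.reshape(-1, 2)
x, y = bmus[:, 0], bmus[:, 1]
print(bmus)
```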

This is becoming unrelated to the original issue, so please open a new one if you see problems with the R wrapper. We do not really have R expertise, so what we can do is quite limited, but opening an issue can never hurt.

@peterwittek
Owner

Did you have a chance to test the bubble neighborhood function?

@peterwittek
Owner

Actually, the bubble neighborhood function only makes sense with a compact support. I changed the code accordingly (commit 3a25793), and with that, I believe this issue is resolved.
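One way to see why: if the neighborhood set contains all nodes, a cut-off function has nothing left to cut off, so every node gets the same weight. The sketch below illustrates that reading in Python; `bubble_weights` and its `compact_support` flag are hypothetical names for this example and do not reproduce the code in the commit.

```python
import numpy as np

def bubble_weights(dist, radius, compact_support=True):
    """Bubble weights; without compact support the neighborhood is the whole map."""
    dist = np.asarray(dist, dtype=float)
    if compact_support:
        return (dist <= radius).astype(float)
    # Assumed reading: every node counts as a neighbor, so all weights are equal.
    return np.ones_like(dist)

dist = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # hypothetical grid distances
print(bubble_weights(dist, radius=2.0))
print(bubble_weights(dist, radius=2.0, compact_support=False))
```

In the second case the radius has no effect at all, and every node would be pulled toward each input by the same amount.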


3 participants