Convergence on single neuron with large vectors #43

Closed
cbanbury opened this issue Nov 18, 2016 · 13 comments

@cbanbury
Contributor

I've been playing with this a bit more and it works well for the canonical example of mapping colours. However, when I feed data with more variables (~40) into the SOM, all of the inputs tend to converge on a single neuron.

You seem to have had this issue before in #17; I'm wondering if it is again related to normalisation?

Should probably have:

  • options for normalisation, and a re-evaluation of the normalisation strategy
  • more extensive tests for data with larger vectors
@nmondon
Member

nmondon commented Nov 18, 2016

Hi Carl!

Thanks for the feedback, you're right on both points:

  • I will make the normalization optional, and maybe other algorithmic steps too; that will let us figure out the root of your issue
  • yes, we should improve the test coverage. One of the issues I ran into at the time was the limited duration of a test under mocha (2 seconds max, if I recall correctly).

@cbanbury
Contributor Author

I have tried commenting out the normalisation line:

this.data = this.normalize(data, scales);

but still see the same convergence. I'll have a play with different normalisation methods externally.
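A minimal sketch of what normalising externally could look like, assuming plain min-max scaling of each variable to [0, 1]. The `minMaxNormalize` helper is hypothetical and not part of kohonen; other strategies (z-scores, per-vector scaling) may suit spectra better.

```javascript
// Hypothetical helper, not part of kohonen: min-max scale each column to [0, 1].
function minMaxNormalize(rows) {
  if (rows.length === 0) return [];
  const dims = rows[0].length;
  const mins = new Array(dims).fill(Infinity);
  const maxs = new Array(dims).fill(-Infinity);

  // First pass: find the minimum and maximum of every column.
  for (const row of rows) {
    row.forEach((v, j) => {
      if (v < mins[j]) mins[j] = v;
      if (v > maxs[j]) maxs[j] = v;
    });
  }

  // Second pass: rescale every value against its column's range.
  return rows.map(row =>
    row.map((v, j) => {
      const range = maxs[j] - mins[j];
      // A constant column has zero range; map it to 0 instead of dividing by zero.
      return range === 0 ? 0 : (v - mins[j]) / range;
    })
  );
}

const scaled = minMaxNormalize([[0, 10], [5, 10], [10, 10]]);
// scaled is [[0, 0], [0.5, 0], [1, 0]]
```

The zero-range guard matters here: dividing 0 by 0 would itself produce NaN values in the data handed to the SOM.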

Regarding the timeout, I think you can set the timeout for one or more tests manually. I've been trying to find some test data, how about using astronomical spectra:

http://cdsarc.u-strasbg.fr/viz-bin/Cat?III/92#sRM2.1

This paper did something similar to classify stellar types using SOM.

@nmondon
Member

nmondon commented Nov 18, 2016

Thanks, let me know if you find something!

I'll have a look, I'm sure that will be an interesting test case :)

@nmondon
Member

nmondon commented Nov 28, 2016

I was quite busy this past week, but I will have more time for this in the coming days!

@nmondon
Member

nmondon commented Nov 28, 2016

Wow, 2799 dimensions in the stellar dataset!

@cbanbury
Contributor Author

Ha, yes, it might be a bit overkill for a test, though in theory it should still work. It would be nice to see what the limits are for this kind of thing in JavaScript.

@nmondon
Member

nmondon commented Nov 28, 2016

Vector operations seem to be the problem (combined with normalized values)... Even with a single iteration, all the data converge to the same neuron because the dist method returns NaN... I'm not sure yet
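To illustrate why NaN distances would send every input to the same neuron, here is a small sketch of a best-matching-unit search. The `bestMatchingUnit` function is hypothetical and kohonen's actual search may be written differently; the point is only that every comparison involving NaN is false in JavaScript.

```javascript
// Hypothetical best-matching-unit search: returns the index of the smallest distance.
function bestMatchingUnit(distances) {
  let best = 0;
  for (let i = 1; i < distances.length; i++) {
    // Any comparison involving NaN evaluates to false,
    // so a NaN never replaces the current best and is never replaced itself.
    if (distances[i] < distances[best]) best = i;
  }
  return best;
}

const healthy = bestMatchingUnit([0.9, 0.2, 0.5]); // 1
const broken = bestMatchingUnit([NaN, NaN, NaN]);  // 0, regardless of the input vector
```

With a search shaped like this, every input would land on the first neuron as soon as the distances are NaN, no matter how many iterations are run.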

@nmondon
Member

nmondon commented Nov 29, 2016

I got it, it's a BIG mistake in the eigenvector generation!!
Basically, I generate vectors of dimension N, where N is the number of input data points rather than the number of their dimensions... :ashamed
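A guard along the following lines would catch this kind of mix-up between sample count and dimension count early. It is only a sketch: `checkNeuronDimensions` is a hypothetical helper, and it assumes the data is an array of equal-length numeric arrays.

```javascript
// Hypothetical guard, not part of kohonen or ml-pca.
function checkNeuronDimensions(data, neuronVectors) {
  const sampleCount = data.length;   // N: number of input vectors
  const dimension = data[0].length;  // D: number of variables per input vector

  neuronVectors.forEach((vector, i) => {
    if (vector.length !== dimension) {
      throw new Error(
        `neuron ${i} has ${vector.length} components, expected ${dimension} ` +
        `(the data has ${sampleCount} samples of dimension ${dimension})`
      );
    }
  });
  return dimension;
}

// 2 samples with 3 variables each: neuron vectors must have 3 components, not 2.
const data = [[1, 2, 3], [4, 5, 6]];
checkNeuronDimensions(data, [[0, 0, 0], [1, 1, 1]]); // passes, returns 3
```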

@nmondon
Member

nmondon commented Nov 29, 2016

It was working because:

  • of the order of the dist method's parameters
  • and because the input vectors had a lower dimension than the neurons' vectors...

Basically, I could have initialized my neurons' vectors randomly and it would have been the same...

Convergence on a single neuron occurs as soon as the number of dimensions exceeds the number of input samples, which makes the dist method return NaN.

I'm gonna add a decent test coverage on that!
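The role of the parameter order can be reproduced with a naive Euclidean distance. This is a sketch under an assumption: it supposes that `dist` loops over the components of its first argument, which is not confirmed to be how kohonen implements it.

```javascript
// Hypothetical distance that iterates over the first argument only
// (an assumption for illustration, not kohonen's actual code).
function dist(input, neuron) {
  let sum = 0;
  for (let i = 0; i < input.length; i++) {
    // neuron[i] is undefined once i >= neuron.length, and number - undefined is NaN.
    const d = input[i] - neuron[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

const sameLength = dist([0, 0, 0], [1, 2, 2]);   // 3
const shorterInput = dist([0, 0], [1, 2, 2]);    // finite: the extra neuron component is ignored
const shorterNeuron = dist([0, 0, 0], [1, 2]);   // NaN: the input has more components than the neuron
```

Under that assumption, neuron vectors sized by the number of samples go unnoticed while the inputs are shorter than the neurons, and produce NaN as soon as the inputs have more dimensions than there are samples.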

@cbanbury
Contributor Author

Oops! At least it's a fairly easy fix. 😸

@nmondon
Member

nmondon commented Nov 30, 2016

@cbanbury I ended up opening an issue on the ml-pca repo (mljs/pca#9) because I was not sure about the behavior of their eigenvectors...
but it was actually my mistake.

After fixing this, I ran the stars example, and the results are not bad for a first attempt. I've started a visualisation in a dedicated repo: https://github.com/seracio/kohonen-stars (beware: the vis works, but the SOM calculation is based on a not-yet-released version of kohonen - https://github.com/seracio/kohonen/tree/45-api-redesign)

@cbanbury
Contributor Author

Awesome stuff! I have a feeling that I've run into a similar issue with the ml-pca package, so perhaps their docs need more clarity.

The visualisation looks great, and nice to have as an example for using the package.

@nmondon
Member

nmondon commented Dec 1, 2016

v0.7.0 is out. In the end it only fixes this bug; the API redesign will come in v1!

@nmondon nmondon closed this as completed Dec 1, 2016