Improve decomposition documentation #1159
+1. This would help with the docs, as dimensionality reduction & denoising (PCA/RPCA) are cleanly separated from NMF and ICA, which are what you'd actually use for interpretation! |
Not sure about taking NMF out of decomposition as performing a decomposition is what it actually does. |
@francisco-dlp so does ICA? Or at least, the distinction needs to be clearer
After all, you can also have Sparse Component Analysis (SCA) for sparsity constraints. |
ICA estimates a mixing matrix to unmix the sources. Those sources in our case are often the result of dimensionality reduction through matrix decomposition methods. But I see your point, for some applications NMF could be seen as a blind source separation method and users may benefit from making it clearer that just NMF may be enough in some cases. |
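For illustration, a minimal scikit-learn sketch (toy signals; all names illustrative) of what "estimates a mixing matrix to unmix the sources" means in practice; FastICA exposes the estimated mixing matrix as `mixing_`:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy blind source separation: mix two known sources, then unmix them.
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]  # two independent sources
A = np.array([[1.0, 0.5], [0.5, 1.0]])            # ground-truth mixing matrix
X = S @ A.T                                       # observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)  # estimated sources (up to scale and order)
A_est = ica.mixing_           # estimated mixing matrix
```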
Yes - that sort of distinction sounds sensible. |
Just to flag: this was discussed quite a bit in #969. A key point is that basically any decomposition approach in scikit-learn is available, and the documentation needs to at least direct people to the relevant pages there so that people know what the algorithms are. Personally I'm not so keen on the isolation of ICA as something different from the others and would push to remove the method blind_source_separation() completely. In scikit-learn, for example, ICA sits within decomposition, as sklearn.decomposition.FastICA, just like sklearn.decomposition.PCA --- to me this is its proper place. In HyperSpy the separation comes from a practical point: typical workflows originally having been "do PCA, then do ICA" (I know it's for good reasons regarding noise), but nevertheless, as far as I'm concerned, they're both decomposition approaches of different flavours and we should be consistent with sklearn. |
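For reference, a sketch of the typical HyperSpy workflow being described (method names as in the HyperSpy API; the exact algorithm strings and arguments may differ between versions, so treat this as illustrative):

```python
import hyperspy.api as hs

s = hs.load("spectrum_image.hspy")  # hypothetical file name

# Step 1: matrix decomposition for dimensionality reduction ("do PCA")
s.decomposition(algorithm="svd")
s.plot_explained_variance_ratio()  # pick the number of components

# Step 2: blind source separation on those components ("then do ICA"),
# exposed through a separate method rather than through decomposition()
s.blind_source_separation(number_of_components=3)
s.plot_bss_results()
```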
I agree with @dnjohnstone's point about |
Let's see if we agree with the following:
|
I think I see your point but it seems most of the argument here is about which bit actually constitutes ICA... It seems to me that the general trend has been towards including 'the stuff that you need to do to your data to get independent components out' under the ICA banner. Probably this is because, even if less formally clear, it's what most people want in practice. The situation seems to be summed up pretty well in Hyvärinen's most recent review, where he says "Most ICA algorithms divide the estimation of the model into two steps: a preliminary whitening and the actual ICA estimation. Whitening means that the data are first linearly transformed by a matrix V such that Z = VX is white... Such a matrix V can be easily found by PCA..." It seems you're keen to keep the "actual ICA estimation" in its own method, which may have formal merit, but this seems a bit out of step with much of the literature I'm reading, which follows Hyvärinen's "most ICA algorithms". There are also many papers that simply consider all these situations as matrix factorisations under alternative constraints and point out that the ICA version only works when you've done your preprocessing. Perhaps most importantly though, it is out of step with scikit-learn, which seems the natural thing to keep consistent with. |
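To make the quoted two-step picture concrete, a minimal numpy sketch of PCA whitening (following the quote's notation: rows of X are variables, columns are observations, and V is the whitening matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 1000))         # 5 variables, 1000 observations
X = X - X.mean(axis=1, keepdims=True)  # whitening assumes centred data

# PCA whitening via the SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)
n = X.shape[1]
V = (np.sqrt(n) / s)[:, None] * U.T    # the whitening matrix "V" of the quote

Z = V @ X                              # Z = VX is white:
print(np.allclose(Z @ Z.T / n, np.eye(5)))  # covariance of Z is the identity
```

The "actual ICA estimation" would then run on Z.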
Of course whitening is a required pre-processing step for ICA and other BSS algorithms. Obviously, HyperSpy does whiten the data before passing it to ICA. However, this does not make ICA a decomposition algorithm: the SVD is performed on the input of ICA solely for whitening purposes, and this step is separate from the dimensionality reduction, which happens in decomposition. It's not that I want to keep things as they are because I prioritise formal accuracy over practicality. It's just that matrix decomposition and mixing matrix estimation are two different things that need to be performed in two separate steps in order to keep the required flexibility. I don't know why sklearn decided to place ICA under the decomposition umbrella. However, in their case it doesn't really matter, as performing decomposition and ICA remains a two-step process. If we place ICA in decomposition, ICA becomes a one-step process, and we'll lose flexibility and clarity along the way. |
Why disregard what sklearn has decided, i.e. to place ICA under the decomposition umbrella? That seems to be the key point. What makes this different from the usual arguments that "machine learning things should be left in their proper place", i.e. sklearn? |
In either case, this is all stuff that isn't clear in the docs at present! :-) |
The point is that we shouldn't base our decision on what others did without knowing their motivation. Currently, I don't know why they took their decision. If somebody finds out, please bring it to this discussion. Otherwise, I don't think it is acceptable to use "X did Y" as an argument in this discussion. @tjof2, I fully agree, this isn't clear in the docs. |
scikit-learn/scikit-learn#858 changed |
Ok... another point then: is signal decomposition = matrix decomposition (taken as a synonym for matrix factorisation)? Would anyone object to the sentence "the signal was decomposed into its independent components after dimensionality reduction using SVD", using decompose as a synonym for separate? Also, how would this reduce flexibility? Surely it would still be possible to do s.decomposition('svd'), then sm = s.get_decomposition_model(), and then s.decomposition('ICA'). I think there are also issues with the suggestion that ICA is the only way to blindly separate sources. |
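Spelled out, the sequence proposed above would look like the following sketch (the algorithm strings are the proposal's, passed by keyword to match the actual signature; the final ICA call is commented out because the released API spells it blind_source_separation() instead):

```python
import hyperspy.api as hs

s = hs.load("spectrum_image.hspy")  # hypothetical file name

s.decomposition(algorithm="svd")    # dimensionality reduction
sm = s.get_decomposition_model()    # denoised reconstruction of the signal

# Proposed: ICA through the same entry point, on the denoised model
# s.decomposition("ICA")
```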
Hoping I'll be able to address this soon. |
Adding here since it's the most general issue:
(what are the variables and trials axes?) |
Agree @thomasaarholt, should be signal and navigation axes I think? Another way of saying |
I believe you're right, yes. We could use a dictionary to convert signal and navigator to the appropriate names (I don't know which one is which!). |
Pretty sure trials == samples in "sklearn" lingo, so variables == features. |
aaaaaand samples=navigator? :p |
I believe so! |
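Following up on the dictionary idea, a hypothetical lookup table built from the exchange above (trials == samples == navigation; variables == features == signal; names and structure are illustrative):

```python
# Hypothetical terminology map between HyperSpy and scikit-learn lingo.
TERMS = {
    "navigation": {"legacy": "trials", "sklearn": "samples"},
    "signal": {"legacy": "variables", "sklearn": "features"},
}

print(TERMS["navigation"]["sklearn"])  # samples
print(TERMS["signal"]["legacy"])       # variables
```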
As @jeinsle pointed out, it would be nice if we clarified the use and connections between the following terminologies, which I believe are two sets of words for the same thing. I am not sure that these are the correct way around:
|
Regarding centering, also worth documenting this in a neat way: https://gitter.im/hyperspy/hyperspy?at=5e8360d4f7cff9290c851d83
|
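The linked Gitter discussion is not reproduced here, but as a general illustration of the choice involved, a sketch of the two ways an unfolded dataset can be mean-centred before a decomposition (rows are navigation positions, columns are signal channels; data are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, size=(100, 64))    # 100 positions x 64 channels

X_nav = X - X.mean(axis=0, keepdims=True)  # subtract the mean spectrum
X_sig = X - X.mean(axis=1, keepdims=True)  # subtract each position's mean
```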
Currently the documentation for decomposition is very disorganised. The docstring currently looks like the following. I think it could be useful to separate out a single general section, and a section/group for each algorithm, so that it's clearer which arguments one should be adjusting to get a correct result.
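A hypothetical sketch of the proposed docstring layout (section names and parameters are illustrative, not HyperSpy's actual signature):

```python
def decomposition(algorithm="svd", output_dimension=None, **kwargs):
    """Perform a matrix decomposition of the data.

    Parameters
    ----------
    algorithm : str
        Which decomposition algorithm to run, e.g. "svd" or "nmf".
    output_dimension : int, optional
        Number of components to keep.

    Notes
    -----
    Algorithm-specific keyword arguments would then be documented in
    per-algorithm groups here, e.g. an SVD/PCA group (centring,
    Poissonian-noise normalisation) and an NMF group (tolerance,
    maximum number of iterations), so readers only need to look at
    the arguments that apply to the algorithm they chose.
    """
```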