Changed the way the cost of the unmixing matrix is calculated in ICA #45
Conversation
Test failures on 0.5 are unrelated, AFAICT.
@alexmorley Could you rebase this PR?
Yup, should get to it today.
src/ica.jl (outdated)
-    for i = 1:m
-        s += abs(w[i] - wp[i])
-    end
+    s = abs(abs(dot(w, wp)) - 1)
This is the main change.
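For context, here is a minimal, self-contained sketch of the per-component check before and after the change. The names `old_cost` and `new_cost` are mine, not from the PR; `w` and `wp` are the current and previous weight vectors, as in the diff above.

```julia
using LinearAlgebra

# Before: L1 distance between the current and previous weight vector.
old_cost(w, wp) = sum(abs.(w .- wp))

# After (this PR): deviation of |dot(w, wp)| from 1. For unit vectors
# this is zero exactly when w equals wp up to sign.
new_cost(w, wp) = abs(abs(dot(w, wp)) - 1)
```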
Why calculate it with a dot product? It has become less clear, especially with the two abs functions.
It's quite difficult to explain here, but I have a Jupyter notebook showing why, if you want. Basically, the way it's calculated at the moment doesn't asymptote at zero; i.e. even when the model is barely changing, the tolerance can stay at something > 1. I think no one realised this before because there was no warning if convergence didn't happen.
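To illustrate the point with a toy example (mine, not from the notebook mentioned above): a FastICA weight vector is only determined up to sign and can flip sign between iterations, so the old measure stays large even once the direction has converged, while the dot-product measure goes to zero.

```julia
using LinearAlgebra

w  = normalize([0.6, 0.8])   # converged unit weight vector
wp = -w                      # previous iterate: same direction, flipped sign

sum(abs.(w .- wp))           # old measure: 2.8, never approaches zero
abs(abs(dot(w, wp)) - 1)     # new measure: 0.0, invariant to the sign flip
```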
Also, the dot product is how the reference cited in the code comments suggests doing it, and it's what scikit-learn does too.
After some research, I suggest changing the convergence calculation to the following: `maximum(abs.(abs.(diag(W*Wp')) .- 1.0))`. The reason is that the correct weight matrix `W`, after normalization, should have the property `WW' = I`. The expression `abs.(diag(W*Wp')) .- 1` gives a more precise value of how far changes in the weights move `W` from its orthonormalized form. We can use `maximum` or `sum` to get the maximal or total difference for the convergence estimate.
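A minimal sketch of this suggested matrix-level check (the helper name `converged` and the keyword `tol` are illustrative, not from the package; swapping `maximum` for `sum` would give the total rather than the worst-case deviation, as noted above):

```julia
using LinearAlgebra

# diag(W * Wp') holds the dot products of corresponding rows of W and Wp;
# for row-normalized matrices each entry is ±1 exactly at convergence.
converged(W, Wp; tol=1e-6) = maximum(abs.(abs.(diag(W * Wp')) .- 1.0)) < tol
```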
Your convergence calculation always converges, and does so faster, which could lead to incorrect results. Look below: the first column is the current change rate, the second is your implementation, and the last is the calculation above.
julia> M = fit(ICA, X, k; do_whiten=true, tol=2.5, maxiter=30)
2.7182540174611347, 0.2996264154133659, 0.5276962145093531
3.0873453111737414, 0.002352424311681034, 0.6247855046549973
3.1447356280525156, 0.0014036589171737557, 0.6537159231416463
3.1679711437656857, 0.0009662565084542774, 0.6806973376704752
3.166754646302584, 0.0008037999032722842, 0.6980296646003903
3.145146606504619, 0.000841724592342219, 0.7016499929651104
3.1039783936998644, 0.001008690856632688, 0.689136642633039
3.0425278504440394, 0.0012098909759400422, 0.6593491061607
2.9601111617728346, 0.0014419722505419896, 0.6129391462450918
2.8579888825199515, 0.0016480056551534394, 0.5531533509183671
2.741180919705083, 0.0016715155873130438, 0.4860398699178754
2.7160674402877802, 0.0014642018514219313, 0.41912963747827925
2.7737243520216177, 0.0011096563568189222, 0.3588329065421515
2.810103653533086, 0.0007531663381148412, 0.30837943597137607
2.8288396596248577, 0.00048629306431180463, 0.2677750273106896
2.8342414310712805, 0.0003591663933480982, 0.23521688719459
2.830031988895299, 0.00027982622456790285, 0.20851744678389883
2.818982528539496, 0.00022793739209669983, 0.1858470400602602
2.80302631989497, 0.00019477312380877798, 0.16591852239611582
2.7834958042716638, 0.00017363798248259954, 0.14792017366690646
2.7613314524069534, 0.0001598235921874691, 0.13138665300581553
2.7372298175083327, 0.00015012259501212544, 0.11608115245647521
2.711738561083237, 0.0001424029114446279, 0.10190561630994
2.685313250988781, 0.00013530892706092867, 0.08883713077569821
2.658348559087098, 0.00012805496272139116, 0.07688481738321817
2.6311932069878727, 0.00012027189363195134, 0.06606213951454254
2.6041553815819003, 0.00011188308615894815, 0.056370719426909965
2.5775033728225365, 0.0001029999342684329, 0.04779263155835933
2.551464635233311, 9.400275450666129e-5, 0.04028867399246405
2.526225211066467, 8.48707803052795e-5, 0.03380050619913799
That sounds even more sensible to me 🖒 will get to it today.
Just fails on nightly due to use of …
- Added testsets.
- Removed `Printf` dependency.
Change in how the cost of the unmixing matrix is calculated.
The previous implementation won't converge to a cost near zero.
Both the reference in the code comments and the scikit-learn implementation use the dot product (as in this PR).
NB: this is a fork of #37, so that should be merged first.