-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Convergence Exceptions for PPCA (EM & Bayes methods) & ICA (fastica!) #37
Conversation
Thanks! I'm not familiar with that code, but the changes look OK to me. However you likely need to increase the minimum StatsBase version (should check when |
It would indeed make sense to print information about how close to convergence the algorithm was. This can easily be added to StatsBase via a new field to |
Sorry I deleted my comment above ^^^ Ah yeah OK good idea. I can do that later on. I think it would be a good idea to also add an option to stop these errors being thrown as I have already run into situations where I don't care that much if its converged or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, thanks! (And an excellent first contribution, I might add.)
@ararslan Do you agree with the change in how the change in unmixing matrix is calculated? How it was before wouldn't converge to a near-zero tol? I checked the reference above and the scikit-learn implementation and they both suggest to use the dot product. |
@test isa(M, ICA) | ||
@test indim(M) == m | ||
@test outdim(M) == k | ||
@test mean(M) == mv | ||
W = M.W | ||
@test W'C * W ≈ eye(k) | ||
|
||
@test_throws StatsBase.ConvergenceException fit(ICA, X, k; do_whiten=true, tol=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the most recent changes, the call to fit
does not fail to converge and so the test fails.
I'm not intimately familiar with that algorithm so I can't speak to the merits of using a dot product, but I think that change would be better left to a separate PR. |
bd133f4
to
0e7c84f
Compare
src/ica.jl
Outdated
@@ -60,7 +60,7 @@ function fastica!(W::DenseMatrix{Float64}, # initialized component matrix, | |||
X::DenseMatrix{Float64}, # (whitened) observation sample matrix, size(m, n) | |||
fun::ICAGDeriv, # approximate neg-entropy functor | |||
maxiter::Int, # maximum number of iterations | |||
tol::Real, # convergence tolerance | |||
tol::Float64, # convergence tolerance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better not restrict accepted types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a workaround because ConvergenceException
expects tol
& chg
to be the same type but can do a conversion instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not matter. fastica!
works only with Float64
data - see above types of W
and X
parameters. ica
should be refactored to accept data of any floating type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but tol
could be of a different type from the data itself, as long as comparison operations are supported.
src/ica.jl
Outdated
@@ -88,6 +88,7 @@ function fastica!(W::DenseMatrix{Float64}, # initialized component matrix, | |||
end | |||
|
|||
# main loop | |||
chg = 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change it to NaN
but it makes chg
available outside scope of the loop & also (when NaN
covers the case of zero iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks!
@@ -49,7 +50,7 @@ C = cov(X, 2) | |||
|
|||
# FastICA | |||
|
|||
M = fit(ICA, X, k; do_whiten=false) | |||
M = fit(ICA, X, k; do_whiten=false, tol=Inf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because, the way the cost of the model is currently calculated, it's not guaranteed to converge otherwise. As suggested above I could make another PR to change the calculation to use the dot product (see this commit on my fork: alexmorley@f59b63e) and delete the tol=Inf
change in the tests in that PR.
Or I could delete it here and make the change from the above commit in this PR.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please make another PR. Mixing different issues is the recipe not to get a PR merged quickly. :-)
The same applies to the i = 0
change elsewhere (which BTW introduces a trailing whitespace).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK there's another PR in #45 .
src/ppca.jl
Outdated
break | ||
end | ||
L_old = L | ||
i += 1 | ||
end | ||
converged || throw(ConvergenceException(tot, chg, Float64(tol))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you call Float64
here and not above? BTW, it would be better to do oftype(chg, tol)
in case chg
ends up being of a different type than Float64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woops... should be on both and will change to use oftype
.
src/ica.jl
Outdated
@@ -88,6 +88,7 @@ function fastica!(W::DenseMatrix{Float64}, # initialized component matrix, | |||
end | |||
|
|||
# main loop | |||
chg = 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks!
src/ppca.jl
Outdated
break | ||
end | ||
L_old = L | ||
i += 1 | ||
end | ||
converged || throw(ConvergenceException(tot, chg, oftype(chg,tol))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space after comma for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! The CI failure is about Compat on Julia 0.5, which is unrelated AFAICT. Probably time to drop 0.5 support.
Looks fine. |
See #36
Todo
add ConvergenceExceptions to ICA & PPCA methods
correct the calculation of the change in model for fastica!Seperate PR.add option to turn off convergence exceptions(can just put tol=Inf)add extra field to exception (in statsbase) to see how close to convergence the algorithm was
(pending travis) Add lastchange to exceptions now that statsbase is update as above (commit: JuliaStats/StatsBase.jl@4d03581)