Feature request: KFOCI #10
Comments
Thanks very much, excellent idea! I had not been aware of KPC.
Norm
…On Wed, Dec 20, 2023 at 7:55 AM Benjamin Lang ***@***.***> wrote:
There is an R package, KPC <https://cran.r-project.org/web/packages/KPC/>,
that implements a more general and improved version of FOCI, called KFOCI
(Kernel FOCI), proposed by Huang et al. The improvement over
existing methods in certain settings is quite remarkable, so I believe
functions similar to the existing FOCI ones would be a great addition.
What do you think?
For categorical variables that have a natural order, it may even be possible
to avoid creating dummy variables by encoding them as integers.
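
The integer encoding suggested above could look roughly like the following. This is a minimal sketch in Python, not code from KPC or from this package; the helper `encode_ordered` and the example levels are hypothetical and only illustrate the idea of replacing an ordered factor by its level positions.

```python
def encode_ordered(values, levels):
    """Map each value to its 1-based position in `levels` (lowest level first).

    `levels` must list every category in its natural order; this helper is
    hypothetical and only illustrates the encoding idea.
    """
    position = {level: i + 1 for i, level in enumerate(levels)}
    return [position[v] for v in values]


# An ordered categorical variable becomes a single integer column
# instead of one dummy column per level.
sizes = ["small", "large", "medium", "small"]
codes = encode_ordered(sizes, levels=["small", "medium", "large"])
print(codes)
```

Unordered categorical variables have no meaningful level positions, so they would still need dummy variables.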
Great, happy to hear that! Some more food for thought: if binary/categorical variables are included, calling KFOCI involves randomness (due to tie-breaking in the k-NN graph). So it could make sense to call KFOCI several times on the same data set and then condense or visualize the results. For condensing, some sort of stability selection could be used, e.g. as proposed in Section 2.3 of https://onlinelibrary.wiley.com/doi/10.1002/sim.8955. That proposal is made in a slightly different context, but it sounds generic and could be applicable to KFOCI (and FOCI) as well. Unfortunately, I do not know how this "stable set" would behave; it may not be a good idea, because it could violate the nice property of Theorem 7 from Huang et al. Any thoughts?
Unfortunately, I don't have time to go through the theory in Huang et al,
and anyway, remember that I was not involved in the theory behind FOCI.
However, re Kormaksson et al, I can at least offer a comment. Note the
function qeFOCImult. It runs FOCI on m cores, resulting in m sets of
features. The user can specify whether to take the union (aggressive) or
intersection (conservative) of the m sets. It would seem that what Sec. 2.3
of Kormaksson et al does is somewhat similar in spirit to taking the
intersection in qeFOCImult.
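
To make the union/intersection choice concrete, here is a small Python sketch. It is not the qeFOCImult implementation; the function `combine_feature_sets` and the feature names are hypothetical, and the m feature sets are simply assumed to be given.

```python
def combine_feature_sets(feature_sets, how="intersection"):
    """Combine several selected-feature sets into one.

    how="union"        keeps a feature if any run selected it (aggressive).
    how="intersection" keeps a feature only if every run selected it
                       (conservative).
    """
    sets = [set(s) for s in feature_sets]
    if not sets:
        return set()
    if how == "union":
        return set().union(*sets)
    if how == "intersection":
        return set.intersection(*sets)
    raise ValueError("how must be 'union' or 'intersection'")


# Three hypothetical runs, each returning a set of selected features.
runs = [["x1", "x3", "x5"], ["x1", "x3"], ["x1", "x2", "x3"]]
aggressive = combine_feature_sets(runs, how="union")
conservative = combine_feature_sets(runs, how="intersection")
print(sorted(aggressive), sorted(conservative))
```

The intersection corresponds to requiring a selection frequency of 100% across runs, which is why it resembles a strict form of stability selection.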
Thanks, point taken! I appreciate your comment; it goes exactly in the direction I was aiming for.
Good, please let me know what you find works well.
I think it is fair to say that KFOCI performs better than FOCI. Still, I found that it lacks 'power' for linear or monotone relationships (this is in line with other observations, e.g. in "A survey of some recent developments in measures of […]").
I also found that the algorithm in Kormaksson et al (with r = 0.5) works quite well when applied to multiple independent runs of KFOCI on the same data set. One disadvantage is that the ordering of the variables gets lost; this may, however, be resolved by saving the ranks from each run and then investigating the tuples of ranks via […]
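
The idea of condensing several runs while keeping rank information could be sketched as follows in Python. This is an illustration under stated assumptions, not the exact procedure of Kormaksson et al: here r is interpreted as a minimum selection frequency, the mean rank is just one possible way to summarize the rank tuples, and the function `stable_set` and the example runs are hypothetical.

```python
from collections import defaultdict


def stable_set(runs, r=0.5):
    """Condense several ordered selections into one ordered 'stable set'.

    runs: list of lists; each inner list holds the features selected in one
          run, in selection order (each feature at most once per run).
    r:    assumed meaning: minimum fraction of runs that must select a
          feature for it to be kept.
    Returns the kept features, most frequently selected first and, among
    equally frequent ones, lowest mean rank first.
    """
    if not runs:
        return []
    ranks = defaultdict(list)
    for selected in runs:
        for rank, feature in enumerate(selected, start=1):
            ranks[feature].append(rank)  # remember the rank in this run
    n_runs = len(runs)
    kept = {f: rk for f, rk in ranks.items() if len(rk) / n_runs >= r}
    return sorted(kept, key=lambda f: (-len(kept[f]), sum(kept[f]) / len(kept[f])))


# Four hypothetical runs on the same data set.
runs = [
    ["x1", "x3", "x5"],
    ["x3", "x1"],
    ["x1", "x3", "x2"],
    ["x1", "x2", "x3"],
]
selected = stable_set(runs, r=0.5)
print(selected)
```

With r = 1 this reduces to the intersection of all runs, and any r small enough to admit features selected only once gives their union, so the threshold interpolates between the conservative and aggressive choices.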