-
I'm working with externally generated clustering partitions, where each partition is stored in a file with tab-separated labels on each line (identical to the output format of mcl with the --abc flag). My goal is to use these partitions as input for RCL, so I need to convert them to the .mci format that RCL accepts. I initially tried mcxload -etc for this conversion. It did convert the files to .mci format, but the RCL output was problematic: it produced the same clustering result for every resolution level. On investigation, I found that the .mci output of mcxload -etc differs from what the rcl mcl command writes when it clusters. How can I convert my externally generated clustering partitions to the .mci format that RCL requires?
-
indicating that they share the same row domain (which corresponds to the labels; the 2 3 and 7 indicate numbers of clusters).
should get the setup going, and I hope this works. Let me know, of course, if there are issues or if you don't have the network yet in the right format. Apologies for the not overly user-friendly experience.
-
Thank you @micans for this thorough explanation! I did already have my network in .mci format, so this was exactly what I needed.
-
Cool, let me know if there are issues. I'm also mildly interested to know if there are no issues and the output is usable, both technically and in whether the resulting RCL clustering makes sense in relation to the input. Although RCL does little more than stack the inputs along (to my mind) the path of least resistance. Anyway, that's up to you. Edit: I'm pretty sure you have this sorted, but for future reference and other people's benefit, it is important that all the clusterings (…
-
Sounds all good! Thanks for the fix, now merged.
-
I'd like to add a caveat for a particular case where this RCL implementation is probably not giving helpful advice, although it does provide the information needed. If you have only a few (ballpark five or fewer) input clusterings for a large-ish data set (say 10k or more; for expression data I assume your data set is somewhere in the range 5k-40k genes), then the RCL values are potentially not very differentiated. RCL will always construct a tree simply by taking the sorted RCL values and applying single link clustering. If there are ties, it will just arbitrarily take one of the tying values. If there are many ties, this could lead to a situation where two clusters of non-trivial size are created and then merged, all at the same linkage value. However, this situation will be visible from the …
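To make the tie scenario concrete, here is a small generic sketch of single link clustering over sorted pair values. It is not RCL's code: the input layout (node pairs with a numeric value, higher meaning more strongly linked) and the function names single_link_merges and merges_per_value are assumptions made for illustration only.

```python
from collections import Counter


def single_link_merges(n_nodes, scored_pairs):
    """Single link clustering with union-find.

    scored_pairs: iterable of (u, v, value); a higher value is assumed to
    mean a stronger link. Returns one (value, larger_size, smaller_size)
    tuple per merge, in the order the merges happen.
    """
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    # sorted() is stable, so tied values keep their input order; that
    # order is arbitrary with respect to the data.
    for u, v, value in sorted(scored_pairs, key=lambda t: -t[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # already in the same cluster
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        merges.append((value, size[ru], size[rv]))
        parent[rv] = ru
        size[ru] += size[rv]
    return merges


def merges_per_value(merges):
    """Count how many merges happened at each linkage value."""
    return Counter(value for value, _, _ in merges)


# Six nodes, every pair value tied at 5.
pairs = [(0, 1, 5), (1, 2, 5), (3, 4, 5), (4, 5, 5), (2, 3, 5)]
merges = single_link_merges(6, pairs)
```

In this toy input two clusters of size three are built and then joined, all at the single value 5, which is the undifferentiated situation described above. A value with many merges in merges_per_value is a hint that the tree around it carries little information.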
For the mcxload step, make sure to have a tab file ready (one that maps labels to consecutive indexes), make sure that each input file uses only those labels and all of those labels, and use that tab file as follows with mcxload.
My tab file mydata.tab looks like this:
My mycls1.label clustering file:
Then
You also need the network in mci format; do you already have that? For now I assume you do. …
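The label requirements above (each clustering file uses only the tab file's labels, and all of them) can be checked before running mcxload. The following is a minimal Python sketch, not part of mcl or RCL. It assumes the tab file holds one index<TAB>label pair per line and that each clustering file holds one cluster per line with tab-separated labels; all function names are hypothetical.

```python
from collections import Counter


def read_tab(lines):
    """Return {label: index} from 'index<TAB>label' lines (assumed layout)."""
    tab = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        index, label = line.split("\t", 1)
        tab[label] = int(index)
    return tab


def read_clusters(lines):
    """Return a list of clusters, each a list of labels (one cluster per line)."""
    clusters = []
    for line in lines:
        labels = [field for field in line.rstrip("\n").split("\t") if field]
        if labels:
            clusters.append(labels)
    return clusters


def indexes_are_consecutive(tab):
    """True if the tab indexes form an unbroken run of integers."""
    values = sorted(tab.values())
    return values == list(range(values[0], values[0] + len(values))) if values else True


def check_clustering(tab, clusters):
    """Report labels that break the 'only those labels and all those labels' rule."""
    seen = Counter(label for cluster in clusters for label in cluster)
    return {
        "unknown": sorted(l for l in seen if l not in tab),       # not in the tab file
        "missing": sorted(l for l in tab if l not in seen),       # in no cluster
        "duplicated": sorted(l for l, c in seen.items() if c > 1),  # in more than one place
    }
```

With real files, pass open file handles, for example read_tab(open("mydata.tab")) and read_clusters(open("mycls1.label")). A clustering is ready for conversion when all three lists in the report are empty.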