Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kBET #364

Open
wants to merge 9 commits into
base: master
from

Conversation

Projects
None yet
5 participants
@flying-sheep
Copy link
Member

commented Nov 15, 2018

Currently it’s almost a MVP.

What do you people think (especially @mbuttner)? Possible improvements:

  • Heuristic for neighborhood size
  • Expose more data: E.g. a column in .obs with the corrected p values
  • Better docs
  • groupby
  • Mean of multiple samples instead of whole data
  • Adapt to cells that appear in no neighborhoods
  • Speedups?

I guess at least the heuristic and the docs have to go in!

I also don’t know if relying on the previous neighbors call is a good idea. Its default is 15 Neighbors, and even for a small (200 cells) dataset, the number the heuristic determined was 75.

@flying-sheep flying-sheep force-pushed the kbet branch 2 times, most recently from 4f50737 to 5c4ca79 Nov 19, 2018

@mbuttner

This comment has been minimized.

Copy link
Member

commented Dec 12, 2018

Hi,
I have checked your code and it looks good to me, very neat, actually.
Very important from my point of view is the neighbourhood size estimation - 15 neighbours is too few even for small data sets.
If you do not have a heuristic, yet, I suggest to use a rough estimate for the neighbourhood size similarly to the R implementation as the quarter of the mean batch size:
floor(mean(batch_size)/4)
I should come up with a better estimate, though.

I can help you with the docs, if you like!

Lastly, I'd find it important that you can apply kBET group-wise. For example, you may add a groupby parameter and then kBET automatically computes a rejection rate (or acceptance rate) for each cluster. I hope that's not too much overhead.

@falexwolf

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

How are things going?

I think all of this looks fine except for two things:

  • we should not have a function pp.kbet_neighbors as it won't be used by anything else, pp.neighbors exists as it's a detrimental preprocessing step for 10 other tools; tl.kbet should make the call of kbet_neighbors internally and that should be a private function
  • the heading cluster scores is very confusing; I'd put into a metrics section and the notion that has been established for these kind of things (mostly by the Seurat developers, I think) is "alignment metric"; I can easily make the change if we're ready to go...

PS: Why are there no contributions by you, @mbuttner? After all, it's your method and paper. 😉

@mbuttner

This comment has been minimized.

Copy link
Member

commented Jan 4, 2019

@grst

This comment has been minimized.

Copy link
Contributor

commented Jan 11, 2019

Hi @flying-sheep,

does the base functionality already work, or would you still advise me to use the R package if I want to use kBET?

@flying-sheep

This comment has been minimized.

Copy link
Member Author

commented Jan 11, 2019

I’d advise to use the R package. The score is still off and I didn’t figure out why. Sorry.

/edit: it works now. I’d like to have a good toy example for the tests that actually has a batch effect, then I’d merge this

@flying-sheep flying-sheep force-pushed the master branch 2 times, most recently from 3efb194 to fc84096 Feb 12, 2019

@gokceneraslan

This comment has been minimized.

Copy link
Collaborator

commented Feb 21, 2019

Some(most?) batch correction methods produce outputs in a lower dimensional space (like CCA or scVI), which are more likely to be stored in .obsm. Therefore, if you guys decide to call kbet_neighbors and pp.neighbors internally, do not forget to provide options like use_rep in sc.tl.kbet() to make things like sc.pp.neighbors(..., use_rep='X_CCA') possible.

Another idea is to keep multiple kNN graphs in .uns['neighbors'] e.g. .uns['neighbors']['default'] and .uns['neighbors']['kbet'].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.