The method for annotating genes with cell types #812
Thanks for the PR! Gene-set based annotation would be pretty useful to have here. Is there any chance there's a preprint we could look at for a little more context on the method?
@ivirshup the paper is currently being written. I will get back to you once it is ready.
Any update on this? Can you add a test (probably reusing the example already in the method docstring)?
@fidelram, we are still working on the paper/preprint. I will post it soon. I will add tests. For the tests to work, should I add my library to requirements.txt? I noticed that other external packages are not included in the project requirements.
Yeah, please add it together with a comment mentioning where it is needed (e.g
@PrimozGodec, probably don't add this to
import pytest
from importlib.util import find_spec

@pytest.mark.skipif(find_spec('pointannotator') is None, reason="pointannotator not installed")
You can add a requirement for the package to this line in
Maybe we should eventually have a second requirements file for CI testing, like we do for anndata.
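The skip condition above relies on `find_spec` returning `None` when a package cannot be located. A minimal sketch of that check in isolation, using only the standard library; the helper name `has_module` and the `OPTIONAL_DEPS` mapping are illustrative assumptions, not part of Scanpy:

```python
from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be located.

    find_spec only looks the module up; it does not import it.
    """
    return find_spec(name) is not None


# Hypothetical usage: record which optional dependencies are available,
# so tests that need them can be skipped when they are missing.
OPTIONAL_DEPS = {name: has_module(name) for name in ("json", "pointannotator")}
```

A flag such as `not OPTIONAL_DEPS["pointannotator"]` could then be passed as the condition to `pytest.mark.skipif`.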
14432ce to afe7fa9
I added unit tests and reformatted the code.
Thank you. We’re using pytest though, so please write the tests that way:

import pandas as pd
import pytest
from anndata import AnnData

@pytest.fixture
def markers():
    return pd.DataFrame(
        ...
    )

@pytest.fixture
def adata():
    ...
    return AnnData(data.values, var=data.columns.values)

def test_remove_empty_column(adata, markers):
    ...
    annotations = annotator(adata, markers, num_genes=20)
    ...
    # one row of annotations per cell (observation) in adata
    assert len(annotations) == len(adata)
    ...
Only remaining thought: I have slight concerns about the name being too generic, but then again, this does exactly what people expect a “cell type annotator based on marker genes” to do.
We decided not to add more packages to
With this PR we propose adding our new annotation method to Scanpy:
Annotator marks the data with cell type annotations based on marker genes.
Over-expressed genes are selected with the Mann-Whitney U test, and cell
types are assigned with the hypergeometric test. The function first selects
genes from the gene expression data with the Mann-Whitney U test, then
annotates them with the hypergeometric test, and finally filters out cell
types that have zero scores for all cells. The results are scores that
indicate how probable each cell type is for each cell.
We hope you like the method and will merge it into Scanpy.
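To make the hypergeometric step described above concrete, here is a minimal standard-library sketch. It is not the pointannotator implementation: it assumes the over-expressed genes of a cell have already been selected (for example by a Mann-Whitney U test), and the names `hypergeom_pvalue`, `score_cell`, and the toy gene and cell type labels are hypothetical. It returns raw enrichment p-values, where a smaller value means stronger enrichment; how the actual method turns these into its final scores is not covered here.

```python
from math import comb


def hypergeom_pvalue(k, n_selected, n_markers, n_genes):
    """P(X >= k) for X ~ Hypergeometric(n_genes, n_markers, n_selected).

    X is the number of marker genes found among the selected genes when
    n_selected genes are drawn without replacement from n_genes genes.
    """
    total = comb(n_genes, n_selected)
    upper = min(n_selected, n_markers)
    tail = sum(
        comb(n_markers, i) * comb(n_genes - n_markers, n_selected - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def score_cell(selected_genes, markers_by_type, all_genes):
    """Return {cell_type: enrichment p-value} for one cell."""
    universe = set(all_genes)
    selected = set(selected_genes) & universe
    scores = {}
    for cell_type, markers in markers_by_type.items():
        markers = set(markers) & universe
        overlap = len(selected & markers)
        scores[cell_type] = hypergeom_pvalue(
            overlap, len(selected), len(markers), len(universe)
        )
    return scores


# Toy data (assumed for illustration): 20 genes, two cell types.
all_genes = [f"g{i}" for i in range(20)]
markers_by_type = {
    "T cell": ["g0", "g1", "g2"],
    "B cell": ["g10", "g11", "g12"],
}
# Genes assumed to be over-expressed in this cell.
scores = score_cell(["g0", "g1", "g2", "g3"], markers_by_type, all_genes)
```

In this toy case all three "T cell" markers are among the four selected genes, so that cell type gets a small p-value, while "B cell" has no overlap and gets a p-value of 1.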