diff --git a/docs/api.rst b/docs/api.rst
index bf1f73e..f76e7fb 100644
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -6,4 +6,6 @@ API
 
    api/averages
    api/subsample
+   api/cluster_with_annotations
+   api/fetch_atlas
    api/*
diff --git a/docs/api/average_atlas.rst b/docs/api/average_atlas.rst
new file mode 100644
index 0000000..1d5c96e
--- /dev/null
+++ b/docs/api/average_atlas.rst
@@ -0,0 +1,4 @@
+northstar\.average_atlas
+---------------------------
+
+.. autofunction:: northstar.average_atlas
diff --git a/docs/api/averages.rst b/docs/api/averages.rst
index 7a2d506..a2a6fda 100644
--- a/docs/api/averages.rst
+++ b/docs/api/averages.rst
@@ -7,4 +7,6 @@ northstar\.Averages
    :show-inheritance:
 
    .. automethod:: __init__
-   .. automethod:: __call__
+   .. automethod:: fit
+   .. automethod:: fit_transform
+   .. automethod:: embed
diff --git a/docs/api/cluster_with_annotations.rst b/docs/api/cluster_with_annotations.rst
new file mode 100644
index 0000000..2573864
--- /dev/null
+++ b/docs/api/cluster_with_annotations.rst
@@ -0,0 +1,11 @@
+northstar\.ClusterWithAnnotations
+---------------------------------
+
+.. autoclass:: northstar.ClusterWithAnnotations
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+   .. automethod:: __init__
+   .. automethod:: fit
+   .. automethod:: fit_transform
diff --git a/docs/api/subsample.rst b/docs/api/subsample.rst
index b995768..8b494c8 100644
--- a/docs/api/subsample.rst
+++ b/docs/api/subsample.rst
@@ -7,4 +7,6 @@ northstar\.Subsample
    :show-inheritance:
 
    .. automethod:: __init__
-   .. automethod:: __call__
+   .. automethod:: fit
+   .. automethod:: fit_transform
+   .. automethod:: embed
diff --git a/docs/api/subsample_atlas.rst b/docs/api/subsample_atlas.rst
new file mode 100644
index 0000000..ddb9e7a
--- /dev/null
+++ b/docs/api/subsample_atlas.rst
@@ -0,0 +1,4 @@
+northstar\.subsample_atlas
+---------------------------
+
+.. autofunction:: northstar.subsample_atlas
diff --git a/docs/examples.rst b/docs/examples.rst
index cf7920f..3cd1d99 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -13,3 +13,7 @@ You can use a custom atlas:
 You can also harmonize your atlas and target dataset (to be annotated) with another tool and then use northstar for clustering only. The advantage is that northstar's clustering algorithm is aware of the atlas annotations, therefore it is guaranteed to neither split not merge atlas cell types:
 
 - :doc:`External data harmonization `
+
+You can also use northstar just as an API interface to our precompiled list of annotated atlases. This can be used to download averages and subsamples (we call them **atlas landmarks**) and use them to do whatever you want (e.g. classify using another tool, harmonize, look up marker genes, etc):
+
+- :doc:`Fetch a precompiled atlas landmark `
diff --git a/docs/examples/averages_custom_atlas.rst b/docs/examples/averages_custom_atlas.rst
index 59818cd..8b24337 100644
--- a/docs/examples/averages_custom_atlas.rst
+++ b/docs/examples/averages_custom_atlas.rst
@@ -1,4 +1,4 @@
-Mapping data onto custom atlas
+Fetch a precompiled atlas landmark
 ========================================
 In this example, we will map cells onto a custom atlas, using the `Averages` class:
 
@@ -7,56 +7,35 @@ In this example, we will map cells onto a custom atlas, using the `Averages` cla
     import anndata
     import northstar
 
-    # Read in the new data to be annotated
-    # Here we assume it's a loom file, but
-    # of course it can be whatever format
-    newdata = anndata.read_loom('...')
+    # Initialize the class
+    af = northstar.AtlasFetcher()
 
-    # Read in the atlas with annotations
-    atlas_full = anndata.read_loom('...')
+    # Get a list of the available landmarks
+    landmarks = af.list_atlases()
 
-    # Make sure the 'CellType' column is set
-    # if it has another name, rename it
-    atlas_full.obs['CellType'] = atlas_full.obs['cluster'].astype(str)
-
-    # Subsample the atlas, we don't need
-    # 1M cells to find out 5 cell types
-    atlas_ave = northstar.average_atlas(
-        atlas_full,
-    )
-
-    # Prepare the classifier
-    # We exclude the fetal cells to focus
-    # on adult tissue. To keep the fetal
-    # cells, just take away the _nofetal
-    # It is common to balance all cell types
-    # with the same number of cells to keep
-    # a high resolution in the PC space,
-    model = northstar.Averages(
-        atlas=atlas_ave,
-        n_cells_per_type=20,
+    # Get one of them
+    atlas_sub = af.fetch_atlas(
+        'Darmanis_2015',
+        kind='subsample',
     )
 
-    # Run the classification
-    model.fit(newdata)
-
-    # Get the inferred cell types
-    cell_types_newdata = model.membership
+You can also fetch multiple atlases at once. They will be merged together. Because not all genes are present in all atlases, you can decide what to do for the genes that are missing from some atlases. In this example, we keep all genes and, for each atlas, we pad the missing genes with zeros.
 
-    # Get UMAP coordinates of the atlas
-    # and new data (joint embedding)
-    embedding = model.embed('umap')
+.. code-block:: python
 
-Notice that the `n_cells_per_type=20` is an easy way to balance the importance of each cell type in the final PC space, however it is not absolutely necessary. In fact, a larger number for some cell types will increase their weight in the PC space and therefore might increase the ability of assign cells to those types. The easiest way to use unbalanced averages is to do as follows:
+    import anndata
+    import northstar
 
-.. code-block:: python
+    # Initialize the class
+    af = northstar.AtlasFetcher()
 
-    # Let's assume you are most interested in B cells and T cells
-    atlas_ave.obs['NumberOfCells'] = 20
-    atlas_ave.obs.loc[['B cell', 'T cell'], 'NumberOfCells'] = 100
+    # Get a list of the available landmarks
+    landmarks = af.list_atlases()
 
-    model = northstar.Averages(
-        atlas=atlas_ave,
+    # Get two atlases (merged)
+    atlas_sub = af.fetch_multiple_atlases(
+        ['Darmanis_2015', 'Enge_2017'],
+        kind='subsample',
+        join='union',
     )
 
-Notice that we omitted the `n_cells_per_type` argument in this case. The rest of the code stays the same.
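
Note on `join='union'`: the new example text says that all genes are kept and that, for each atlas, missing genes are padded with zeros. The sketch below illustrates that merging rule in plain Python. It is not northstar's implementation: the function `merge_atlases`, the `{cell: {gene: count}}` layout, the toy counts, and the `'intersection'` option are all hypothetical and exist only to contrast with the union behaviour described in the docs.

```python
# Illustrative sketch only: this is NOT northstar's implementation.
# Each "atlas" is modelled as {cell_name: {gene: count}} (hypothetical layout).

def merge_atlases(atlases, join="union"):
    """Merge atlases on genes; with join='union', pad missing genes with 0."""
    # Collect the set of genes measured in each atlas
    gene_sets = [
        {gene for counts in atlas.values() for gene in counts}
        for atlas in atlases
    ]
    if join == "union":
        genes = sorted(set().union(*gene_sets))
    elif join == "intersection":  # hypothetical option, shown for contrast
        genes = sorted(set.intersection(*gene_sets))
    else:
        raise ValueError(f"unknown join: {join!r}")

    merged = {}
    for atlas in atlases:
        for cell, counts in atlas.items():
            # Genes absent from this atlas are padded with zeros
            merged[cell] = {gene: counts.get(gene, 0) for gene in genes}
    return genes, merged


# Toy data (made up): two atlases that share only GAPDH
atlas_a = {"a_cell1": {"GAPDH": 5, "CD19": 2}}
atlas_b = {"b_cell1": {"GAPDH": 3, "CD3E": 7}}

genes, merged = merge_atlases([atlas_a, atlas_b], join="union")
print(genes)
print(merged["b_cell1"])
```

With the union rule every merged cell carries the same gene list, so downstream tools can treat the result as one rectangular table; the cost is that a padded zero is indistinguishable from a measured zero.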