# Census summary cell counts example

*Goal:* demonstrate basic use of the `census_summary_cell_counts` dataframe.

Each Cell Census contains a top-level dataframe summarizing counts of various cell labels. You can read this into a Pandas DataFrame:

In [1]:
import cell_census

with cell_census.open_soma() as census:
    census_summary_cell_counts = census["census_info"]["summary_cell_counts"].read().concat().to_pandas()

    # Dropping the soma_joinid column as it isn't useful in this demo
    census_summary_cell_counts = census_summary_cell_counts.drop(columns=["soma_joinid"])

census_summary_cell_counts

| | organism | category | ontology_term_id | unique_cell_count | total_cell_count | label |
|---|---|---|---|---|---|---|
| 0 | Homo sapiens | all | na | 28250282 | 43420131 | na |
| 1 | Homo sapiens | assay | EFO:0008722 | 206279 | 260396 | Drop-seq |
| 2 | Homo sapiens | assay | EFO:0008780 | 25652 | 51304 | inDrop |
| 3 | Homo sapiens | assay | EFO:0008913 | 133511 | 133511 | single-cell RNA sequencing |
| 4 | Homo sapiens | assay | EFO:0008919 | 44721 | 161998 | Seq-Well |
| ... | ... | ... | ... | ... | ... | ... |
| 1206 | Mus musculus | tissue_general | UBERON:0002113 | 164881 | 188361 | kidney |
| 1207 | Mus musculus | tissue_general | UBERON:0002365 | 15577 | 31154 | exocrine gland |
| 1208 | Mus musculus | tissue_general | UBERON:0002367 | 37715 | 130135 | prostate gland |
| 1209 | Mus musculus | tissue_general | UBERON:0002368 | 13322 | 26644 | endocrine gland |


This dataframe is precomputed from the experiments in the Cell Census, and is intended to simplify quick looks at the Census contents.
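As an illustration of such a quick look, the sketch below filters the summary to one organism and category and sorts by cell count. To stay runnable without network access it uses a small stand-in dataframe named `sample` (a hypothetical name) built from the first five rows of the output above; it is not the full summary.

```python
import pandas as pd

# Stand-in data: the first five rows of the output shown above.
# With a live Census you would use census_summary_cell_counts instead.
sample = pd.DataFrame(
    {
        "organism": ["Homo sapiens"] * 5,
        "category": ["all", "assay", "assay", "assay", "assay"],
        "ontology_term_id": ["na", "EFO:0008722", "EFO:0008780", "EFO:0008913", "EFO:0008919"],
        "unique_cell_count": [28250282, 206279, 25652, 133511, 44721],
        "total_cell_count": [43420131, 260396, 51304, 133511, 161998],
        "label": ["na", "Drop-seq", "inDrop", "single-cell RNA sequencing", "Seq-Well"],
    }
)

# Keep only the human per-assay rows, largest total count first.
is_human_assay = (sample["organism"] == "Homo sapiens") & (sample["category"] == "assay")
assay_counts = sample[is_human_assay].sort_values("total_cell_count", ascending=False)

print(assay_counts[["label", "unique_cell_count", "total_cell_count"]])
```

The same boolean mask and `sort_values` call should work unchanged on `census_summary_cell_counts`, since it has the same columns.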

You can compute similar grouped statistics yourself using the Pandas `groupby` function.

The code below reproduces the above counts using the full `obs` dataframe in the `homo_sapiens` experiment.

Keep in mind that the Cell Census is very large, so a query can return a significant amount of data. You can manage this by narrowing the request with the `column_names` and `value_filter` arguments.

In [2]:
with cell_census.open_soma() as census:
    human = census["census_data"]["homo_sapiens"]
    obs_df = human.obs.read(column_names=["cell_type_ontology_term_id", "cell_type"]).concat().to_pandas()

# Evaluate outside the `with` block so the notebook displays the result
obs_df.groupby(by=["cell_type_ontology_term_id", "cell_type"], as_index=False, observed=True).size()
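
The narrowing mentioned above can be sketched by combining `column_names` with a `value_filter`. This is a minimal sketch, not a definitive recipe: the helper names `tissue_filter` and `read_cell_types_for_tissue` are hypothetical, and it assumes `obs` has a `tissue_general` column (suggested by the `tissue_general` category in the summary) that contains the label `kidney` for human cells.

```python
def tissue_filter(tissue: str) -> str:
    """Build a value_filter expression selecting one tissue_general label.

    Assumption: obs has a `tissue_general` column holding labels such as 'kidney'.
    """
    return f"tissue_general == '{tissue}'"


def read_cell_types_for_tissue(tissue: str = "kidney"):
    """Count cells per cell type for one tissue (requires network access)."""
    # Imported here so tissue_filter() can be used without the package installed.
    import cell_census

    with cell_census.open_soma() as census:
        human = census["census_data"]["homo_sapiens"]
        obs_df = (
            human.obs.read(
                # Fetch only the columns needed for the grouping ...
                column_names=["cell_type_ontology_term_id", "cell_type"],
                # ... and only the rows matching the filter.
                value_filter=tissue_filter(tissue),
            )
            .concat()
            .to_pandas()
        )

    return obs_df.groupby(
        by=["cell_type_ontology_term_id", "cell_type"], as_index=False, observed=True
    ).size()
```

Calling `read_cell_types_for_tissue("kidney")` should download far less data than the unfiltered read, because both the columns and the rows are restricted before anything is converted to Pandas.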